[rfc-i] Use of unicode

Iljitsch van Beijnum iljitsch at muada.com
Tue Apr 24 12:51:33 PDT 2012

On 24 Apr 2012, at 21:31 , Andrew Sullivan wrote:

>> Non-Latin characters on the other hand are just random drawings to
>> the people unfamiliar with the script in question;

> With all due respect, that's sort of an overblown oversimplification
> of the internationalization problems.

Except for a small exposure to ancient Greek, all the languages I'm familiar with use the Latin script. So I don't know what problems people using non-Latin scripts have. (Although Dutch uses some non-essential stuff that wasn't always available on computers until the advent of Unicode, and even that can be annoying.)

But the perspective of someone who doesn't know any script other than Latin is an important one, as there are a few billion people like that in this world. And most people are unfamiliar with most scripts: even though a Chinese speaker knows Chinese, she probably doesn't know Arabic. So the problem of seeing characters that you don't understand well enough to type, or even to recognize conclusively, is universal.

>> That's not the point. The point is that the stripping of these
>> characters must leave the RFC in a usable state, hence proving that
>> the characters weren't essential.

> That's just begging the question.  First of all, you have provided no
> evidence at all that stripping the non-ASCII characters "must leave
> the RFC in a usable state".

That's because it's an assertion.

I'm operating under the assumption that requiring the ability to recognize non-Latin characters in order to be able to implement an RFC makes such an RFC unimplementable for many people.

> If, as a matter of fact, one needs the
> characters for some purpose (for example, to illustrate the difference
> between two characters),

How is the difference in visual appearance between two characters relevant? We deal with protocols running on computers (using the term slightly loosely). Those recognize characters by their code points, which can be represented in ASCII.
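To illustrate what "represented in ASCII" could look like, here is a minimal Python sketch. The function name to_ascii_notation and the <U+XXXX NAME> output format are my own choices for illustration, not anything an RFC prescribes. It replaces every non-ASCII character with its code point and Unicode character name, so two characters that look identical on screen remain distinguishable in plain ASCII text.

```python
import unicodedata


def to_ascii_notation(text):
    """Replace each non-ASCII character with an ASCII-only <U+XXXX NAME> tag.

    Hypothetical helper for illustration; the tag format is an assumption.
    """
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII passes through unchanged.
            out.append(ch)
        else:
            # Fall back to a placeholder if the character has no name.
            name = unicodedata.name(ch, "UNNAMED")
            out.append("<U+%04X %s>" % (ord(ch), name))
    return "".join(out)


# Latin capital A and Cyrillic capital A usually render alike,
# but their code points differ, and that is what a computer compares.
latin_a = "\u0041"
cyrillic_a = "\u0410"

print(to_ascii_notation(latin_a))
print(to_ascii_notation(cyrillic_a))
```

The first call passes the Latin letter through untouched; the second produces a tag carrying the code point U+0410 and the character's name, which a reader can act on without recognizing the glyph.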

Of course I'm not opposed to such examples where they make sense, but a visual evaluation of characters can never be part of the normative protocol description because I don't see how it can be sufficiently precise. So it has to be a supporting example with normative language expressed in regular Latin script text.

> If your argument in response is that you need to be able to do
> something with the document over your impossibly slow link from
> super-secret place

That was just an example to show how you could lose non-ASCII characters in practice; it's immaterial to the rest of the discussion.
