[rfc-i] Use of unicode

Iljitsch van Beijnum iljitsch at muada.com
Tue Apr 24 12:12:11 PDT 2012


On 24 Apr 2012, at 5:48, Martin J. Dürst wrote:

>>> Even when all non 7-bit ASCII characters are summarily stripped from
>>> an RFC it must be possible to implement that RFC.

> So let's move 20 or 30 years back and say "even when all upper-case characters are summarily stripped from an RFC it must be possible to implement that RFC". How much sense does that make?

It's really not the same thing. ASCII has supported both upper and lower case since its inception, even though some of its predecessors didn't. So we know the format can handle both, and they're a normal part of the English language that everyone understands, so we use caps.

Non-Latin characters, on the other hand, are just random drawings to people unfamiliar with the script in question; they can't do anything useful with them except _maybe_ copy and paste them. But even copying them from a printed page to the computer is impossible. Not hard or inconvenient: impossible. If your keyboard doesn't have the characters and you don't know what they are, there is no way you can enter them. So unless they are part of an important example that is meaningless without them, it is essential that non-Latin characters only appear as an addition to the Latin version of the same name or text.

> Do you assume that a reader would be aware of the fact that non-ASCII is stripped? If yes, shouldn't they just go make sure they have the non-stripped version?

That's not the point. The point is that stripping these characters must leave the RFC in a usable state, which proves that the characters weren't essential. Suppose I ssh over a GPRS link to a box that has RFCs on it, because that's all I have available in the super secret facility where I'm installing an implementation of an RFC. I may not see the non-ASCII characters, yet I should still be able to get the information I need out of the RFC.
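
As a rough illustration of that test (a sketch only; the function names are made up for this example, and no such tool is part of the RFC process), the following Python drops everything outside 7-bit ASCII and lists the lines that changed. Whether what remains is still usable is a judgment a human reader has to make:

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside 7-bit ASCII (code points 0..127)."""
    return "".join(ch for ch in text if ord(ch) < 128)


def lines_changed_by_stripping(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, original, stripped) for each line that loses characters."""
    changed = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = strip_non_ascii(line)
        if stripped != line:
            changed.append((number, line, stripped))
    return changed


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any real RFC.
    sample = (
        "Author: Martin J. Dürst\n"
        "Example label: 日本語 (nihongo)\n"
        "Plain ASCII line\n"
    )
    for number, original, stripped in lines_changed_by_stripping(sample):
        print(f"line {number}: {original!r} -> {stripped!r}")
```

In the sample, the second line keeps its Latin transliteration after stripping, which is the "addition to the Latin version" arrangement argued for above; the first line loses a letter of the name and has no ASCII fallback.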

