[rfc-i] Re: Draft: Representation of Unicode and UTF-8 characters

John C Klensin john+rfc at jck.com
Mon May 17 13:18:06 PDT 2004



--On Mon, 17 May 2004 09:26:48 -0600 (MDT), Alex Rousskov 
wrote...

> On Sat, 15 May 2004, Henrik Levkowetz wrote:
>
>> "<U+" occurs on 7 lines in RFCs, all in the same one (RFC
>> 3454), and all using it to indicate Unicode.
>>
>> "<[A-H0-9][A-H0-9]( [A-H0-9][A-H0-9])*>" occurs on 444 lines
>> in RFC's, most of which use it to indicate carriage advance,
>> form feed, or html markup (e.g. <H1>).  All those RFC's are
>> numbered below 1000.
>
> Thanks for checking this, Henrik. It looks like there is no
> serious issue with past RFCs, although conflicts with popular
> HTML tags and less popular bracketed ASCII characters are
> rather unfortunate.

Actually, Alex and Henrik, U+ notation (without the brackets)
also appears in RFC 3743 and, I think, in several other places.
Henrik's scan will, of course, not pick those up.

FWIW.
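
To make the gap concrete, here is a minimal Python sketch of the
two searches.  The first pattern is the literal "<U+" from
Henrik's note.  The second pattern, the helper name
count_matching_lines, and the sample text are all assumptions
made up for illustration; the sample lines are not quoted from
any RFC, and this is not the tool Henrik actually ran.

```python
import re

# The literal string Henrik reports searching for: "<U+".
BRACKETED = re.compile(r"<U\+")

# Assumed pattern for the bare form: "U+" followed by 4 to 6 hex
# digits, and not immediately preceded by "<".
BARE = re.compile(r"(?<!<)\bU\+[0-9A-Fa-f]{4,6}\b")


def count_matching_lines(pattern, text):
    """Count the lines of text on which pattern occurs at least once."""
    return sum(1 for line in text.splitlines() if pattern.search(line))


# Made-up sample text, one line per notation style.
sample = (
    "The character <U+00DF> is mapped to two characters.\n"
    "The code point U+4E00 begins a large block.\n"
    "No code point notation on this line.\n"
)

bracketed_lines = count_matching_lines(BRACKETED, sample)
bare_lines = count_matching_lines(BARE, sample)
print(bracketed_lines, bare_lines)
```

A scan built only on the first pattern never sees the second
sample line, which is the case I am pointing at above.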

I also share the concern about overengineering this, as well as
Paul Hoffman's concern about changing mnemonic, but incorrectly
written, names into ones that look really ugly and lose mnemonic
value.  For names containing non-ASCII characters, it would seem
appropriate to me to have not only "author's choice" but also
some convention for representing both "spellings" in the text.

    john


