[rfc-i] Re: Draft: Representation of Unicode and UTF-8 characters

John C Klensin john+rfc at jck.com
Mon May 17 13:18:06 PDT 2004

--On Mon, 17 May 2004 09:26:48 -0600 (MDT), Alex Rousskov 

> On Sat, 15 May 2004, Henrik Levkowetz wrote:
>> "<U+" occurs on 7 lines in RFCs, all in the same one (RFC
>> 3454), and all using it to indicate Unicode.
>> "<[A-H0-9][A-H0-9]( [A-H0-9][A-H0-9])*>" occurs on 444 lines
>> in RFC's, most of which use it to indicate carriage advance,
>> form feed, or html markup (e.g. <H1>).  All those RFC's are
>> numbered below 1000.
> Thanks for checking this, Henrik. It looks like there is no
> serious issue with past RFCs, although conflicts with popular
> HTML tags and less popular bracketed ASCII characters are
> rather unfortunate.

Actually, Alex, Henrik, U+ notation (without the brackets) also 
appears in RFC 3743 and, I think, in several other places.  That 
will, of course, not be picked up by Henrik's scan.
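
To illustrate why a search for the bracketed form misses these uses, 
here is a minimal sketch in Python.  The two patterns, the helper 
count_matching_lines, and the sample lines are all assumptions made 
up for illustration; this is not the scan Henrik actually ran.

```python
import re

# Assumed patterns, for illustration only.
# Bracketed form, e.g. <U+00DF>
BRACKETED = re.compile(r"<U\+[0-9A-F]{4,6}>")
# Bare form, e.g. U+4E00, explicitly not preceded by "<"
BARE = re.compile(r"(?<!<)\bU\+[0-9A-F]{4,6}\b")


def count_matching_lines(lines, pattern):
    """Return how many lines contain at least one match for pattern."""
    return sum(1 for line in lines if pattern.search(line))


# Hypothetical sample lines, not taken from any RFC.
sample = [
    "the code point <U+00DF> is mapped",   # bracketed form
    "U+4E00 through U+9FFF are included",  # bare form only
    "<H1>Heading</H1>",                    # HTML markup, neither form
]

print("bracketed:", count_matching_lines(sample, BRACKETED))
print("bare:", count_matching_lines(sample, BARE))
```

A scan keyed on "<U+" sees only the first sample line; the second 
needs a separate pattern for the unbracketed notation.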


I also share the concern about overengineering this as well as 
Paul Hoffman's concern about changing mnemonic, but 
incorrectly-written, names into ones that look really ugly and 
lose mnemonic value.  For names containing non-ASCII characters, 
it would seem to me to be appropriate to have, not only 
"author's choice", but also some convention for representing 
both "spellings" in the text.


