[rfc-i] Re: Draft: Representation of Unicode and UTF-8
John C Klensin
john+rfc at jck.com
Mon May 17 13:18:06 PDT 2004
--On Mon, 17 May 2004 09:26:48 -0600 (MDT), Alex Rousskov
> On Sat, 15 May 2004, Henrik Levkowetz wrote:
>> "<U+" occurs on 7 lines in RFCs, all in the same one (RFC
>> 3454), and all using it to indicate Unicode.
>> "<[A-H0-9][A-H0-9]( [A-H0-9][A-H0-9])*>" occurs on 444 lines
>> in RFC's, most of which use it to indicate carriage advance,
>> form feed, or html markup (e.g. <H1>). All those RFC's are
>> numbered below 1000.
> Thanks for checking this, Henrik. It looks like there is no
> serious issue with past RFCs, although conflicts with popular
> HTML tags and less popular bracketed ASCII characters are
> rather unfortunate.
Actually, Alex, Henrik, U+ notation (without the brackets) also
appears in RFC 3743 and, I think, in several other places. That
will, of course, not be picked up by Henrik's scan.
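As a rough illustration of that point, a scan that catches U+ notation both with and without the angle brackets might look like the sketch below. The pattern, the name find_uplus, and the assumption of 4 to 6 hex digits after "U+" are all hypothetical choices made for this example; none of them are taken from Henrik's actual scan.

```python
import re

# Assumed pattern (not Henrik's): "U+" followed by 4 to 6 hex digits,
# optionally wrapped in angle brackets.
UPLUS = re.compile(r"<?U\+([0-9A-Fa-f]{4,6})>?")

def find_uplus(text):
    """Return the hex code point strings written in U+ notation in text."""
    return [m.group(1).upper() for m in UPLUS.finditer(text)]

sample = "The characters <U+00DF> and U+4E00 are both mentioned; <H1> is not."
print(find_uplus(sample))  # ['00DF', '4E00']
```

Making the brackets optional is what lets the bare form be counted; bracketed hex or HTML-like tokens such as <H1> are ignored because they lack the "U+" prefix.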
I also share the concern about overengineering this as well as
Paul Hoffman's concern about changing mnemonic, but
incorrectly-written, names into ones that look really ugly and
lose mnemonic value. For names containing non-ASCII characters,
it would seem to me to be appropriate to have, not only
"author's choice", but also some convention for representing
both "spellings" in the text.