[rfc-i] UTF-8 and Unicode examples

Henning Schulzrinne hgs at cs.columbia.edu
Tue May 4 11:30:31 PDT 2004


> 
> Using [] instead of <> might be a good idea to reduce the number
> of confused applications that would try to XML-ify the escape
> sequence.

The symbols < and > already appear all the time today, e.g., as arrow 
heads in ASCII art drawings. Examples are typically CDATA and part 
of <artwork>, so I don't see this as a big issue. In the XML input text, 
you may indeed have to escape the < as &lt;, but we have to do this all 
the time today when we code literal XML tags.
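
To make the escaping point concrete, here is a minimal sketch in Python, using only the standard library. The helper name artwork_element and the sample drawing are made up for illustration, and this is not how xml2rfc itself works; it only shows that < and > in artwork survive a round trip through an XML parser once they are escaped.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def artwork_element(drawing: str) -> str:
    """Wrap plain-text artwork in an <artwork> element (hypothetical helper).

    escape() rewrites &, < and > as &amp;, &lt; and &gt;, so arrow heads
    in the drawing cannot be mistaken for markup.
    """
    return "<artwork>" + escape(drawing) + "</artwork>"


# Sample ASCII-art line with arrow heads (illustrative only).
drawing = "client --> server <-- proxy"

xml_text = artwork_element(drawing)
print(xml_text)

# An XML parser hands back the original characters unchanged.
recovered = ET.fromstring(xml_text).text
print(recovered == drawing)
```

A CDATA section would avoid the escaping in the source, at the cost of having to watch for a literal "]]>" inside the drawing.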

There is no hope of designing text output that will not trip up an XML 
processor - it is not XML, after all.

Given that the Unicode spec suggests the <> format, it would need a 
stronger case, in my view, to choose another rendering.

> 
> Should tools like xml2rfc accept/interpret raw UTF-8, the escape
> sequence above, or both? This matters because these tools produce both
> ASCII text and HTML versions of specs.

I would not want to derail the short-term need to be able to express 
Unicode and UTF-8 examples by commingling it with the much bigger 
long-term question of whether non-ASCII characters should be 
first-class RFC citizens. There are many reasons why that may not be a 
good idea, none of which affect the issue I'm concerned about here: 
expressing protocol examples.

> 
> Alex.


