[rfc-i] UTF-8 and Unicode examples
hgs at cs.columbia.edu
Tue May 4 11:30:31 PDT 2004
> Using  instead of <> might be a good idea to reduce the number
> of confused applications that would try to XML-ify the escape
The symbols < and > already appear all the time in, e.g., ASCII art
drawings, as arrow heads. Examples are typically CDATA and part
of <artwork>, so I don't see this as a big issue. In the XML input text,
you may indeed have to escape the < as &lt;, but we have to do this all
the time today when we code literal XML tags.
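
As a rough illustration of that escaping step (a sketch only: the sample
string and the helper name escape_for_xml are made up for this example,
and using Python's standard xml.sax.saxutils module is my assumption, not
something any RFC tool is known to do):

```python
from xml.sax.saxutils import escape, unescape


def escape_for_xml(text: str) -> str:
    """Escape &, < and > so the text can be placed in XML character data."""
    # escape() replaces "&" first, then "<" and ">", with their entities.
    return escape(text)


# Hypothetical ASCII-art line that uses < and > as arrow heads.
art = "A ---> B <--- C"

escaped = escape_for_xml(art)
print(escaped)

# Unescaping recovers the original text, so nothing is lost in the XML source.
print(unescape(escaped) == art)
```

Text wrapped in a CDATA section would not need this step, which is why
artwork examples usually get by without it.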
There is no hope of designing text output that will not trip up an XML
processor - it is not XML, after all.
Given that the Unicode spec suggests the <> format, it would need a
stronger case, in my view, to choose another rendering.
> Should tools like xml2rfc accept/interpret raw UTF-8, the escape
> sequence above, or both? This matters because these tools produce both
> ASCII text and HTML versions of specs.
I would not want to derail the short-term need to be able to express
Unicode and UTF-8 examples by commingling it with the much bigger
long-term question of whether non-ASCII characters should be
first-class RFC citizens. There are many reasons that this may not be a
good idea, none of which affect the issue of expressing protocol
examples I'm concerned about here.