[rfc-i] Unicode or UTF-8
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Mar 27 20:08:54 PDT 2012
This is a followup to
http://www.ietf.org/jabber/logs/rfcform (which currently doesn't exist;
hope this gets fixed).
The minutes note:
8:00 - 18:10 = Questions/Comments
13) ? from Jabber: let's not talk about encodings. Unicode not UTF-8.
Having a fixed encoding is *extremely* helpful (if you don't believe
that, just think about how easy US-ASCII is in comparison with the mess
of encodings we have for non-ASCII text).
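To make the point concrete, here is a minimal sketch in Python. The sample string and the helper name decode_fixed are only illustrative assumptions, not anything from the minutes. It shows that the same text becomes different bytes under different encodings, and that a wrong guess on the reading side silently produces garbage:

```python
# Illustrative sample text (an assumption); any non-ASCII string would do.
text = "Dürst"

# The same characters become different byte sequences per encoding.
utf8_bytes = text.encode("utf-8")      # 'ü' takes two bytes: 0xC3 0xBC
latin1_bytes = text.encode("latin-1")  # 'ü' takes one byte: 0xFC


def decode_fixed(data: bytes) -> str:
    """Hypothetical reader that relies on one fixed encoding (UTF-8)."""
    return data.decode("utf-8")


# With a fixed encoding, the reader needs no label and no guessing.
roundtrip = decode_fixed(utf8_bytes)

# Without one, a wrong guess yields mojibake instead of an error.
misread = utf8_bytes.decode("latin-1")
```

With US-ASCII there is only one possible reading of the bytes, which is exactly the property a single fixed encoding gives back for non-ASCII text.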
For the IETF, that would be UTF-8, because of RFC 2277
(http://tools.ietf.org/html/rfc2277) and everything after. That should
apply to xml2rfc source, the equivalent of the current .txt, and HTML at
least, as far as these are going to be used.
This may not apply to all formats, though. I'm not sure what .docx uses
internally. I'm not sure what PDF uses internally if you tell it to keep
all the text in Unicode. [Leonard?] For these cases, there's of course
absolutely no point in forcing them to use UTF-8 if they use another
encoding of Unicode.
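As a rough illustration of why another encoding of Unicode is harmless, the sketch below (Python, with an arbitrary sample string as an assumption) encodes the same text as UTF-8 and as UTF-16 and converts one into the other without loss. It says nothing about what .docx or PDF actually use internally:

```python
# Illustrative sample text (an assumption).
text = "Dürst"

# Two different encodings of the same Unicode characters.
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# The byte sequences differ, but both decode to the same characters.
same_text = utf8.decode("utf-8") == utf16.decode("utf-16-le") == text

# Converting between Unicode encodings is a lossless, mechanical step.
converted = utf16.decode("utf-16-le").encode("utf-8")
```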
So I'd think the conclusion should be:
UTF-8 where there's an (unnecessary!) choice, whatever Unicode encoding
was chosen for formats where a single choice has already been made.