[rfc-i] Potential RFC format approach: HTML

Julian Reschke julian.reschke at gmx.de
Sat Mar 24 02:02:16 PDT 2012


On 2012-03-24 09:45, Fred Baker wrote:
>>> - Make the decision that the files will ALWAYS be encoded UTF-8.
>>
>> It's not strictly necessary as long as the encoding is properly
>> declared, but certainly helpful.
>
> Let me express my ignorance of character sets.
>
> One of the issues that is raised periodically is that people would like
> to be able to spell their own names correctly. I can in fact spell my
> middle name (Jürgens); I'm not sure UTF-8 helps me if my name is 孙琼.
> Does it?

It does. UTF-8 is a character encoding scheme that maps every Unicode code
point to an octet sequence. Most code points map to sequences longer than
one octet; the ones that map to a single octet are code points 0..127, so
a text file encoded using the US-ASCII character encoding scheme is, by
definition, also a text file using the UTF-8 c.e.s.
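
To make that concrete, here is a minimal Python sketch. The helper name
utf8_octets is made up for illustration and is not taken from any draft.
It counts how many octets each character of a string takes in UTF-8, and
checks that plain ASCII text produces identical octets under both
encodings.

```python
def utf8_octets(text: str) -> list[int]:
    """Return the number of UTF-8 octets used by each character in text."""
    return [len(ch.encode("utf-8")) for ch in text]


# Pure US-ASCII text yields exactly the same octets under both encodings.
ascii_text = "Fred Baker"
assert ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# Names outside US-ASCII still encode, and decode back, without loss.
for name in ("Jürgens", "孙琼"):
    encoded = name.encode("utf-8")
    assert encoded.decode("utf-8") == name
    print(name, utf8_octets(name), len(encoded))
```

Here "ü" takes two octets and each of the two characters in 孙琼 takes
three, while every ASCII letter takes one.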

You may want to have a look at

   <http://tools.ietf.org/html/draft-hoffman-utf8-rfcs-06>

Best regards, Julian


