[rfc-i] Potential RFC format approach: HTML
Julian Reschke
julian.reschke at gmx.de
Sat Mar 24 02:02:16 PDT 2012
On 2012-03-24 09:45, Fred Baker wrote:
>>> - Make the decision that the files will ALWAYS be encoded UTF-8.
>>
>> It's not strictly necessary as long as the encoding is properly
>> declared, but certainly helpful.
>
> Let me express my ignorance of character sets.
>
> One of the issues that is raised periodically is that people would like
> to be able to spell their own names correctly. I can in fact spell my
> middle name (Jürgens); I'm not sure UTF-8 helps me if my name is 孙琼.
> Does it?
It does. UTF-8 is a character encoding scheme that maps every Unicode
code point to an octet sequence, and both characters in 孙琼 are Unicode
code points. Most code points map to sequences longer than one octet;
those that *do* map to single octets are code points 0..127, so a text
file encoded using the US-ASCII character encoding scheme is by
definition also a text file using the UTF-8 c.e.s.
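To make that concrete, here is a rough Python sketch. It is only an illustration, not something taken from the draft; the helper name utf8_octets and the sample list are made up for this example. It encodes a plain ASCII name, your middle name, and 孙琼 as UTF-8 and prints how many octets each one needs.

```python
# Illustrative sketch only: utf8_octets and samples are hypothetical names.

def utf8_octets(text: str) -> bytes:
    """Return the UTF-8 octet sequence for the given string."""
    return text.encode("utf-8")

samples = [
    "Fred",            # code points 0..127 only
    "J\u00fcrgens",    # contains U+00FC (u with diaeresis)
    "孙琼",             # U+5B59 and U+743C
]

for name in samples:
    octets = utf8_octets(name)
    # Print code point count, octet count, and the octets in hex.
    print(f"{name}: {len(name)} code points, {len(octets)} octets, {octets.hex()}")

# A string that is valid US-ASCII yields the same octets under UTF-8.
assert "Fred".encode("ascii") == utf8_octets("Fred")
```

The ASCII-only name comes out as one octet per character, U+00FC takes two octets, and each of the two Chinese characters takes three.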
You may want to have a look at
<http://tools.ietf.org/html/draft-hoffman-utf8-rfcs-06>
Best regards, Julian