[rfc-i] Potential RFC format approach: HTML
Joe Hildebrand
jhildebr at cisco.com
Sat Mar 24 07:38:03 PDT 2012
On 3/24/12 2:45 AM, "Fred Baker" <fred at cisco.com> wrote:
> Let me express my ignorance of character sets.
>
> One of the issues that is raised periodically is that people would like to be
> able to spell their own names correctly. I can in fact spell my middle name
> (Jürgens); I'm not sure UTF-8 helps me if my name is 孙琼. Does it?
Ironically, the email you sent was encoded as UTF-8. :)
Spend five minutes reading the table at:
http://en.wikipedia.org/wiki/Utf8#Description
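To make that table concrete, here is a minimal Python sketch (not part of the original message) that encodes both names from the quoted text as UTF-8 and shows the resulting bytes. The helper name `utf8_hex` is made up for illustration.

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


# Both names from the quoted message.
for name in ["Jürgens", "孙琼"]:
    encoded = name.encode("utf-8")
    # Round-trip to confirm the encoding loses nothing.
    assert encoded.decode("utf-8") == name
    print(f"{name}: {len(name)} characters, {len(encoded)} bytes: {utf8_hex(name)}")
```

The ü takes two bytes and each of the two Chinese characters takes three, matching the ranges in the table, so either name fits in UTF-8.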
Our only other choice is to allow multiple character encodings. In practice,
that means looking at the first several bytes of any RFC to guess whether
it's UTF-16BE, UTF-16LE, etc., since many folks handle RFCs as plain files,
without the encoding hints a web server would send (which are mostly useless
in practice anyway).
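As a rough illustration of that guessing, here is a minimal Python sketch that checks for a byte order mark (BOM) at the start of a file. The function name `guess_encoding` is hypothetical, and the sketch assumes the file actually starts with a BOM when it is not UTF-8; files without one need heavier heuristics, which is part of the problem.

```python
import codecs

# Checked in order: the UTF-32 BOMs come first because the UTF-32LE BOM
# (FF FE 00 00) begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def guess_encoding(data: bytes) -> str:
    """Guess the encoding of a whole file's bytes (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # No BOM: accept the bytes as UTF-8 only if they decode cleanly.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "utf-8"
```

Even this is only a guess: a UTF-16LE file whose first character is NUL looks like UTF-32LE, and a file with no BOM at all gives the sniffer nothing to go on.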
--
Joe Hildebrand