[rfc-i] Potential RFC format approach: HTML

Joe Hildebrand jhildebr at cisco.com
Sat Mar 24 07:38:03 PDT 2012


On 3/24/12 2:45 AM, "Fred Baker" <fred at cisco.com> wrote:

> Let me express my ignorance of character sets.
> 
> One of the issues that is raised periodically is that people would like to be
> able to spell their own names correctly. I can in fact spell my middle name
> (Jürgens); I'm not sure UTF-8 helps me if my name is 孙琼. Does it?

Ironically, the email you sent was encoded as UTF-8. :)

Spend five minutes reading the table at:
http://en.wikipedia.org/wiki/Utf8#Description
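
To make that concrete, here is a minimal sketch (Python is my choice for
illustration, and the helper name utf8_length is hypothetical) showing that
UTF-8 encodes both names from the question. It assumes the "ü" is the
precomposed code point U+00FC.

```python
def utf8_length(name: str) -> tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes) for a name."""
    encoded = name.encode("utf-8")
    # The round trip must be lossless for the encoding to be usable for names.
    assert encoded.decode("utf-8") == name
    return len(name), len(encoded)


if __name__ == "__main__":
    # "J\u00fcrgens" is Jürgens; "\u5b59\u743c" is 孙琼.
    for name in ("J\u00fcrgens", "\u5b59\u743c"):
        points, size = utf8_length(name)
        print(f"{name}: {points} code points, {size} UTF-8 bytes")
```

The "ü" takes two bytes and each of the two Chinese characters takes three,
which is what the table at that link predicts for those code point ranges.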

Our only other choice is to allow multiple character encodings, which means
that in practice you'd need to look at the first several bytes of any RFC to
guess whether it's UTF-16BE, UTF-16LE, etc. Many folks operate on RFCs just
as files, without the (mostly useless in practice) encoding hints from the
web server.
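
A rough sketch of what that guessing looks like, assuming the file starts
with a byte order mark (BOM). The function name guess_encoding and the
fallback to UTF-8 are my assumptions for illustration, not anything the RFC
Editor has specified. A file with no BOM would need real heuristics, which is
exactly the burden I'd like to avoid.

```python
import codecs

# Longer BOMs come first: the UTF-32LE BOM begins with the UTF-16LE BOM,
# so checking UTF-16 first would misidentify UTF-32LE files.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def guess_encoding(head: bytes, default: str = "utf-8") -> str:
    """Guess an encoding from the first few bytes of a file.

    Only a BOM is inspected; with no BOM the (assumed) default is returned.
    """
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

Every tool that touches an RFC as a plain file would need something like
this before it could read a single character.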

-- 
Joe Hildebrand


