[rfc-i] Unicode or UTF-8

Dave Thaler dthaler at microsoft.com
Wed Mar 28 09:47:01 PDT 2012


Iljitsch van Beijnum writes:
[...]
> If we want to go beyond ASCII, UTF-8 is a no-brainer, because there is no
> difference between a file that is in US ASCII and a file that is in UTF-8 but just
> happens to have no code points > 127 
[...] 

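Not entirely true.  A file that is in UTF-8 may start with a 3-byte BOM (EF BB BF)
that identifies it as being encoded in UTF-8, so it is not necessarily
byte-identical to the US-ASCII file even when it has no code points > 127.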
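
A minimal Python sketch of the difference, for illustration only: the helper
name has_utf8_bom and the sample text are assumptions made up for this example,
and "utf-8-sig" is Python's name for UTF-8 written with a leading BOM.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (EF BB BF)."""
    # has_utf8_bom is a hypothetical helper, not taken from any spec or library.
    return data.startswith(codecs.BOM_UTF8)


# Sample text containing only code points <= 127 (made up for this sketch).
text = "plain ASCII text\n"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")          # UTF-8 without a BOM
utf8_sig_bytes = text.encode("utf-8-sig")  # UTF-8 with a leading BOM

# Without a BOM, the two encodings produce the same bytes.
print(ascii_bytes == utf8_bytes)

# With a BOM, the UTF-8 file is 3 bytes longer and no longer pure ASCII.
print(has_utf8_bom(utf8_sig_bytes), len(utf8_sig_bytes) - len(ascii_bytes))
```

So the "no difference" claim holds only for UTF-8 written without a BOM; a
strict US-ASCII reader would reject the three BOM bytes, since each is > 127.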

-Dave


