[rfc-i] Unicode or UTF-8

Dave Thaler dthaler at microsoft.com
Wed Mar 28 09:47:01 PDT 2012

Iljitsch van Beijnum writes:
> If we want to go beyond ASCII, UTF-8 is a no-brainer, because there is no
> difference between a file that is in US ASCII and a file that is in UTF-8 but just
> happens to have no code points > 127 

Not entirely true.  A file that is in UTF-8 may start with a 3-byte BOM (EF BB BF)
that identifies it as being encoded in UTF-8, so it is not necessarily
byte-identical to a US-ASCII file even when it has no code points > 127.
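
To make the BOM point concrete, here is a minimal Python sketch. It is an
illustration only, not anything from the thread: the helper name has_utf8_bom
and the sample string are made up for the example. It relies on Python's
standard "utf-8-sig" codec, which prepends the BOM when encoding.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

# Sample text containing only code points <= 127 (an assumption for the demo).
text = "hello"

ascii_bytes = text.encode("ascii")
utf8_plain = text.encode("utf-8")         # no BOM
utf8_with_bom = text.encode("utf-8-sig")  # BOM prepended by the codec

# Without a BOM, the ASCII and UTF-8 encodings are byte-identical.
assert ascii_bytes == utf8_plain

# With a BOM, the file is three bytes longer and no longer valid US-ASCII.
assert utf8_with_bom == b"\xef\xbb\xbf" + ascii_bytes
assert has_utf8_bom(utf8_with_bom)
assert not has_utf8_bom(utf8_plain)

try:
    utf8_with_bom.decode("ascii")
    decodes_as_ascii = True
except UnicodeDecodeError:
    # The BOM bytes are all above 0x7F, so a strict ASCII decoder rejects them.
    decodes_as_ascii = False
```

A reader that strictly expects US-ASCII would therefore reject the BOM-prefixed
file, while a reader using "utf-8-sig" would strip the BOM and recover the same
text.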

