[rfc-i] Byte order marks

Joe Touch touch at ISI.EDU
Wed Nov 5 10:43:31 PST 2008


Simon Josefsson wrote:
> Joe Touch <touch at ISI.EDU> writes:
...
> One reason the IETF have not had much progress on the UTF-8 issue may be
> that these interoperability discussions are brought up every time.  By
> conducting an experiment with .utf8 alternate versions of documents,
> we'd gain deployment experience, which I believe will resolve the
> uncertainty factor in a positive way.
> 
> Of course, I support and would prefer to specify UTF-8 (without BOM)
> directly for *.txt if it is possible to get consensus on that.  I am
> against UTF-8 with BOM as a format for *.txt based on what I know about
> deployed implementations today (i.e., the software I tested in another
> e-mail: none of them needed BOM to detect UTF-8 data).
> 
>> And when we do go to UTF-8 as a primary, we should, IMO, use .utf8 as
>> the suffix there. I.e., that's a separate issue.
> 
> Here I disagree.  I have never seen the .utf8 extension used for this
> purpose.  I edit many text files in UTF-8 that are called *.txt and
> haven't seen any problems with that.

OK, so there are separate issues:

- if we use UTF-8 as supplemental:
	- skip the BOM
	- store as .txt, but flag as UTF-8 somehow,
		either by file location or by name (e.g., name-utf8.txt)
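
As a rough illustration of the supplemental option above, here is a minimal Python sketch. It assumes the "name-utf8.txt" naming convention; the helper names (supplemental_name, encode_without_bom, is_valid_utf8) and the file name rfc1234.txt are hypothetical and not part of any agreed procedure. It shows writing UTF-8 without a BOM and checking for UTF-8 by validating the bytes rather than by looking for a BOM.

```python
import codecs

# The three bytes EF BB BF that some tools prepend to UTF-8 files.
UTF8_BOM = codecs.BOM_UTF8


def supplemental_name(txt_name: str) -> str:
    """Hypothetical naming rule: 'rfc1234.txt' -> 'rfc1234-utf8.txt'."""
    stem, dot, ext = txt_name.rpartition(".")
    if not dot:
        # No extension at all: just append the flag.
        return txt_name + "-utf8"
    return f"{stem}-utf8.{ext}"


def encode_without_bom(text: str) -> bytes:
    """Encode text as UTF-8, dropping a leading U+FEFF if one is present."""
    if text.startswith("\ufeff"):
        text = text[1:]
    # The plain "utf-8" codec does not add a BOM ("utf-8-sig" would).
    return text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Detect UTF-8 by strict decoding; no BOM is required for this check."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    name = supplemental_name("rfc1234.txt")
    body = encode_without_bom("\ufeffJosefsson wrote: na\u00efve caf\u00e9\n")
    print(name, body.startswith(UTF8_BOM), is_valid_utf8(body))
```

Validation by strict decoding is only a heuristic for "this is UTF-8": plain ASCII passes it too, which is why some out-of-band flag (location or name) is still useful.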

How we do things if UTF-8 becomes primary can be decided at that point,
based on the experience gained from using it as supplemental.

Joe


