[rfc-i] Byte order marks
Joe Touch
touch at ISI.EDU
Wed Nov 5 10:43:31 PST 2008
Simon Josefsson wrote:
> Joe Touch <touch at ISI.EDU> writes:
...
> One reason the IETF have not had much progress on the UTF-8 issue may be
> that these interoperability discussions are brought up every time. By
> conducting an experiment with .utf8 alternate versions of documents,
> we'd gain deployment experience, which I believe will resolve the
> uncertainty factor in a positive way.
>
> Of course, I support and would prefer to specify UTF-8 (without BOM)
> directly for *.txt if it is possible to get consensus on that. I am
> against UTF-8 with BOM as a format for *.txt based on what I know about
> deployed implementations today (i.e., the software I tested in another
> e-mail: none of them needed BOM to detect UTF-8 data).
>
>> And when we do go to UTF-8 as a primary, we should, IMO, use .utf8 as
>> the suffix there. I.e., that's a separate issue.
>
> Here I disagree. I have never seen the .utf8 extension used for this
> purpose. I edit many text files in UTF-8 that are called *.txt and
> haven't seen any problems with that.
OK, so there are separate issues:
- if we use UTF-8 as supplemental
    - skip the BOM
    - store as .txt, but flag as UTF-8 somehow,
      either by file location or by name (e.g., name-utf8.txt)
How we do things if UTF-8 becomes primary can be decided at that point,
based on the experiment of using it as supplemental...
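
A minimal sketch of the supplemental case, assuming the name-utf8.txt
naming mentioned above. The function names and the example file name are
made up for illustration; nothing here is an agreed convention. It writes
the supplemental copy as UTF-8 without a BOM and shows that a strict
decode is enough to recognize the data as UTF-8:

```python
import codecs
from pathlib import Path


def utf8_name(txt_path):
    """Hypothetical naming rule: name.txt -> name-utf8.txt."""
    p = Path(txt_path)
    return p.with_name(p.stem + "-utf8" + p.suffix)


def write_supplemental(txt_path, text):
    """Write the supplemental copy as UTF-8 with no BOM; return its path."""
    out = utf8_name(txt_path)
    # The "utf-8" codec never emits a BOM ("utf-8-sig" would).
    out.write_bytes(text.encode("utf-8"))
    return out


def has_bom(data):
    """True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def looks_like_utf8(data):
    """True if the bytes decode as strict UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

For example, write_supplemental("rfc0000.txt", text) would create
rfc0000-utf8.txt alongside the ASCII original, and has_bom() on its
contents would be False.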
Joe