[rfc-i] Byte Order Marks for UTF-8

Tim Bray tbray at textuality.com
Wed Jul 18 08:24:37 PDT 2012


Actually, the BOM is super-useful for UTF-16; fortunately, there’s
very little of that being sent over the wires.  And any software whose
correct functioning requires a BOM in UTF-8 is pretty badly broken.
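
To make "accept a BOM but don't require one" concrete, here is a minimal
sketch in Python. It is only an illustration, not anyone's actual tool: the
helper name decode_utf8 and the sample bytes are made up for the example. It
relies on Python's standard "utf-8-sig" codec, which skips a leading BOM when
one is present and otherwise decodes like plain UTF-8.

```python
import codecs


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 bytes, accepting but not requiring a leading BOM.

    Hypothetical helper for illustration only.
    """
    # "utf-8-sig" drops one leading EF BB BF if it is there and
    # otherwise behaves like the plain "utf-8" codec.
    return data.decode("utf-8-sig")


# Sample inputs (assumed for the example): same text, with and without a BOM.
with_bom = codecs.BOM_UTF8 + "<doc/>".encode("utf-8")
without_bom = "<doc/>".encode("utf-8")

print(decode_utf8(with_bom) == decode_utf8(without_bom))
```

Either input decodes to the same text, so nothing downstream depends on
whether some earlier tool added or stripped the BOM.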

If you’re using an actual XML processor and get out of its way, it’ll
get the encoding right; unfortunately, Web Architecture says (not
unreasonably) that you’re supposed to believe the encoding information
in the headers even when it’s wrong, which can make things break.
Sigh.
  -T
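
A rough sketch of "getting out of the way", assuming Python's standard
xml.etree.ElementTree parser: hand it the raw bytes rather than a string you
decoded yourself, so it can use the BOM or the XML declaration. The function
name parse_xml_bytes and the sample documents are hypothetical, and this
deliberately ignores the HTTP-header question.

```python
import codecs
import xml.etree.ElementTree as ET


def parse_xml_bytes(data: bytes) -> ET.Element:
    """Parse XML from undecoded bytes (hypothetical helper)."""
    # Passing bytes leaves encoding detection to the parser, which
    # looks at a leading BOM and the encoding declaration, if any.
    return ET.fromstring(data)


# Sample documents (assumed for the example), all containing U+00E9.
utf8_plain = "<a>\u00e9</a>".encode("utf-8")
utf8_bom = codecs.BOM_UTF8 + utf8_plain
latin1 = b'<?xml version="1.0" encoding="ISO-8859-1"?><a>\xe9</a>'

for doc in (utf8_plain, utf8_bom, latin1):
    print(parse_xml_bytes(doc).text == "\u00e9")
```

The parser reaches the same text in all three cases without the caller
naming an encoding.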

On Wed, Jul 18, 2012 at 8:11 AM, Phillip Hallam-Baker <hallam at gmail.com> wrote:
> I think the tools need to accept a BOM but not require one.
>
> It took me quite a while to work out why my XML sometimes worked and
> sometimes did not. It turned out that my tools added a BOM which was
> being stripped out again in some cases and not in others.
>
> Much better if nobody had introduced BOMs into XML in the first place.
> The declaration defines the encoding for a start. But they did and we
> need to live with the consequences.
>
>
> On Wed, Jul 18, 2012 at 11:01 AM, Paul Hoffman <paul.hoffman at vpnc.org> wrote:
>> On Jul 18, 2012, at 12:47 AM, Martin J. Dürst wrote:
>>
>>> I'm reluctant for now to add a BOM, because that might give some hiccups to some kinds of processing.
>>
>> Given that some people said "this works fine if there is a BOM", I think that experimenting with BOMs is actually worthwhile.
>>
>> --Paul Hoffman
>> _______________________________________________
>> rfc-interest mailing list
>> rfc-interest at rfc-editor.org
>> https://www.rfc-editor.org/mailman/listinfo/rfc-interest
>
>
>
> --
> Website: http://hallambaker.com/

