[rfc-i] Byte Order Marks for UTF-8
tbray at textuality.com
Wed Jul 18 08:24:37 PDT 2012
Actually, the BOM is super-useful for UTF-16; fortunately, there’s
very little of that being sent over the wires. And any software whose
correct functioning requires a BOM in UTF-8 is pretty badly broken.
If you’re using an actual XML processor and you get out of its way, it’ll
get the encoding right; unfortunately, Web Architecture says (not
unreasonably) that you’re supposed to believe the encoding information
in the headers even when it’s wrong, which can make things break.
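To illustrate the point about BOMs: a reader that uses a BOM when one is present, but never demands one, could look roughly like the Python sketch below. The function name decode_with_optional_bom and its default parameter are made up for illustration; the BOM constants come from the standard codecs module. This is only a sketch under those assumptions, and it ignores both UTF-32 and the XML encoding declaration, which a real XML processor would also consult.

```python
import codecs


def decode_with_optional_bom(data: bytes, default: str = "utf-8") -> str:
    """Decode bytes, honoring a BOM if present but not requiring one.

    Hypothetical helper for illustration only. UTF-32 is deliberately
    ignored: its little-endian BOM begins with the same two bytes as
    the UTF-16 LE BOM, so a fuller version would test UTF-32 first.
    """
    # UTF-8 BOM: carries no byte-order information, so just drop it.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # UTF-16: here the BOM is what tells us the byte order.
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM at all: fall back to the caller's default encoding.
    return data.decode(default)
```

With this, input that starts with EF BB BF and input that does not both decode to the same text, so it no longer matters whether some tool upstream added or stripped the BOM.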
On Wed, Jul 18, 2012 at 8:11 AM, Phillip Hallam-Baker <hallam at gmail.com> wrote:
> I think the tools need to accept a BOM but not require one.
> It took me quite a while to work out why my XML sometimes worked and
> sometimes did not. It turned out that my tools added a BOM which was
> being stripped out again in some cases and not in others.
> Much better if nobody had introduced BOMs into XML in the first place.
> The declaration defines the encoding for a start. But they did and we
> need to live with the consequences.
> On Wed, Jul 18, 2012 at 11:01 AM, Paul Hoffman <paul.hoffman at vpnc.org> wrote:
>> On Jul 18, 2012, at 12:47 AM, Martin J. Dürst wrote:
>>> I'm reluctant for now to add a BOM, because that might cause hiccups in some kinds of processing.
>> Given that some people said "this works fine if there is a BOM", I think that experimenting with BOMs is actually worthwhile.
>> --Paul Hoffman
>> rfc-interest mailing list
>> rfc-interest at rfc-editor.org
> Website: http://hallambaker.com/