[rfc-i] Byte order marks
paul.hoffman at vpnc.org
Tue Nov 4 09:14:40 PST 2008
At 5:33 PM +0100 11/4/08, Simon Josefsson wrote:
>1) Concatenating the document becomes non-trivial and incompatible with
> how ASCII documents are concatenated.
> Example of when this can cause problems: your other e-mail containing
> a MIME part with the draft with a leading UTF-8 BOM is displayed in
> my MUA with a Unicode ZWNBSP symbol.
It was application/octet-stream. A sane MUA should not have concatenated it with anything. Does your MUA also concatenate PDFs to text files?
> I do not recall anything in the
> MIME standard that requires MUAs to strip leading UTF-8 BOM before
> concatenating MIME parts.
Correct. Nor should it.
> Indeed, RFC 3629 section 6 requires that
> it is treated as ZWNBSP.
Further, I don't see the problem with concatenating. The BOM will either not print (if the display/print device understands that it is a zero-width character) or it will show as a single blurp on a blank line. What is the problem with that?
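To make the concatenation case concrete, here is a minimal Python sketch of what happens when a BOM-prefixed UTF-8 part is appended to a plain ASCII part. The function names and sample strings are hypothetical, made up for illustration. The second function shows one way a tool could drop the signature from non-leading parts if it wanted to; nothing in MIME or RFC 3629 requires it.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def concat_naive(parts):
    """Join byte strings exactly as given, the way ASCII files are concatenated."""
    return b"".join(parts)


def concat_strip_inner_boms(parts):
    """Join byte strings, dropping a leading BOM from every part after the first.

    Hypothetical helper: an optional cleanup step, not a standard requirement.
    """
    out = []
    for i, part in enumerate(parts):
        if i > 0 and part.startswith(BOM):
            part = part[len(BOM):]
        out.append(part)
    return b"".join(out)


ascii_part = b"Plain ASCII text.\n"
utf8_part = BOM + "Na\u00efve UTF-8 text.\n".encode("utf-8")

# Naive join: the BOM survives mid-stream and decodes to U+FEFF (ZWNBSP).
naive_text = concat_naive([ascii_part, utf8_part]).decode("utf-8")

# Stripping join: no U+FEFF is left in the result.
clean_text = concat_strip_inner_boms([ascii_part, utf8_part]).decode("utf-8")
```

In the naive result the U+FEFF sits at the start of the second line, which is the "zero-width or single glyph" case described above.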
>2) BOM was not designed for UTF-8 auto-detection.
Um, so? It works fine for UTF-8 auto-detection.
>3) As far as I can tell, RFC 3629 section 6 says that protocols SHOULD
> forbid use of BOM signatures when the data format is specified to be
> UTF-8, which seems to hold here. See:
Right. But the data format proposed here is not UTF-8. It is "either US-ASCII or UTF-8". That's the whole point.
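A rough Python sketch of how a reader might tell the two cases apart, assuming (as an illustration only, not as the text of any proposal) that a leading UTF-8 BOM marks the file as UTF-8 and that its absence means the file must be pure US-ASCII. The name `read_draft` is hypothetical.

```python
import codecs


def read_draft(data: bytes):
    """Return (encoding_label, text) for a document that is either
    US-ASCII or BOM-prefixed UTF-8.

    Assumed rule: leading BOM means UTF-8, otherwise strict US-ASCII.
    Raises UnicodeDecodeError if the bytes do not match the detected format.
    """
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" decodes UTF-8 and strips the leading signature.
        return "utf-8", data.decode("utf-8-sig")
    # No signature: anything outside 7-bit ASCII is rejected.
    return "us-ascii", data.decode("ascii")


label_a, text_a = read_draft(b"Network Working Group\n")
label_u, text_u = read_draft(codecs.BOM_UTF8 + "J\u00f6rg\n".encode("utf-8"))
```

Under this assumed rule, non-ASCII bytes without a signature are treated as an error rather than guessed at, which keeps the check to a three-byte prefix comparison.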
>4) It is still not clear that having a UTF-8 BOM does anything useful.
> Which implementations are affected? I believe modern versions of
> Microsoft tools, which used to be the problematic tools, have been
It is important that drafts and RFCs be readable on software other than just the latest tools.
--Paul Hoffman, Director