[rfc-i] Byte order marks

Paul Hoffman paul.hoffman at vpnc.org
Tue Nov 4 09:14:40 PST 2008


At 5:33 PM +0100 11/4/08, Simon Josefsson wrote:
>1) Concatenating the document becomes non-trivial and incompatible with
>   how ASCII documents are concatenated.
>
>   Example of when this can cause problems: your other e-mail containing
>   a MIME part with the draft with a leading UTF-8 BOM is displayed in
>   my MUA with a Unicode ZWNBSP symbol. 

It was application/octet-stream. A sane MUA should not have concatenated it with anything. Does your MUA also concatenate PDFs to text files?

>   I do not recall anything in the
>   MIME standard that requires MUAs to strip leading UTF-8 BOM before
>   concatenating MIME parts. 

Correct. Nor should it.

>   Indeed, RFC 3629 section 6 requires that
>   it is treated as ZWNBSP.

Exactly.

Further, I don't see the problem with concatenating. The BOM will either not print (if the display/print device understands that it is a zero-width character) or it will show as a single blurp on a blank line. What is the problem with that?
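
To make the concatenation case concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not something posted in the thread: the two byte strings are made-up stand-ins for two drafts that each start with the UTF-8 signature, and the helper name concatenate is hypothetical. It shows that a BOM-aware decoder removes only the leading signature, so the second one survives in the middle of the text as U+FEFF (ZWNBSP).

```python
import codecs

# UTF-8 signature bytes EF BB BF, as defined in the standard library.
BOM = codecs.BOM_UTF8


def concatenate(parts):
    """Hypothetical helper: join byte strings the way ASCII files are joined."""
    return b"".join(parts)


# Made-up sample data standing in for two drafts with a leading BOM each.
part_a = BOM + "First draft\n".encode("utf-8")
part_b = BOM + "Second draft\n".encode("utf-8")

joined = concatenate([part_a, part_b])

# "utf-8-sig" strips a signature only at the very start of the data.
text = joined.decode("utf-8-sig")

# The second signature is still present as U+FEFF before "Second draft".
print(repr(text))
```

Whether the leftover U+FEFF is visible depends on the display or print device, which is the point under discussion here.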

>2) BOM was not designed for UTF-8 auto-detection.

Um, so? It works fine for UTF-8 auto-detection.

>3) As far as I can tell, RFC 3629 section 6 says that protocols SHOULD
>   forbid use of BOM signatures when the data format is specified to be
>   UTF-8, which seems to hold here.  See:
>
>   http://tools.ietf.org/html/rfc3629#section-6

Right. But the data format proposed here is not UTF-8. It is "either US-ASCII or UTF-8". That's the whole point.
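
A rough sketch of what "either US-ASCII or UTF-8" could look like to a reader of the file, written in Python. This is only an assumption about how a tool might use the signature; the function name classify_draft and the three return labels are hypothetical and do not come from any proposal text.

```python
import codecs


def classify_draft(data: bytes) -> str:
    """Hypothetical classifier: label a draft as 'utf-8', 'us-ascii', or 'unknown'.

    Assumes the convention discussed in this thread: a leading UTF-8
    signature (EF BB BF) marks a UTF-8 draft, and its absence means
    the draft should be plain US-ASCII.
    """
    if data.startswith(codecs.BOM_UTF8):
        try:
            # Confirm the bytes really are well-formed UTF-8.
            data.decode("utf-8")
        except UnicodeDecodeError:
            return "unknown"
        return "utf-8"
    if all(byte < 0x80 for byte in data):
        # No signature and only 7-bit bytes: treat as US-ASCII.
        return "us-ascii"
    # No signature but 8-bit bytes present: outside the assumed convention.
    return "unknown"


print(classify_draft(b"Plain ASCII draft\n"))
print(classify_draft(codecs.BOM_UTF8 + "Jos\u00e9\n".encode("utf-8")))
```

Under these assumptions, a file with no signature and no bytes above 0x7F is handled exactly as ASCII drafts are today.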

>4) It is still not clear that having a UTF-8 BOM does anything useful.
>   Which implementations are affected?  I believe modern versions of
>   Microsoft tools, which used to be the problematic tools, have been
>   fixed.

It is important that drafts and RFCs be readable on software other than just the latest tools.

--Paul Hoffman, Director
--VPN Consortium

