[rfc-i] Fwd: I-D ACTION:draft-hoffman-utf8-rfcs-03.txt
duerst at it.aoyama.ac.jp
Mon Oct 20 03:54:33 PDT 2008
At 19:11 08/10/20, Julian Reschke wrote:
>Simon Josefsson wrote:
>> Julian Reschke <julian.reschke at gmx.de> writes:
>>> 1) "text/plain; charset=utf-8" doesn't work well once the file is stored
>>> locally in the file system, and the encoding information is lost. The
>>> proper fix for this is to use a UTF-8 BOM, which enables at least the
>>> standard Windows applications (Notepad/Wordpad) to do the right thing.
>> I think UTF-8 BOM is a poor solution to almost any problem.
>> Does modern Notepad/Wordpad require UTF-8 BOM to parse UTF-8 text?
>If by "modern" you mean "Vista", I don't know. The versions I have
>default to the system locale if no BOM is present (I think).
My understanding was that if there is significant text upfront
that makes it possible/easy to identify the data as UTF-8, then
a BOM is not needed. That understanding is based on WinXP, not Vista.
Also, it's not the result of rigorous tests.
In our case, "significant non-ASCII text upfront" will usually
not apply; a single or a few non-ASCII characters in an author's
name don't cut it, and there is too much boilerplate upfront.
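
To make the distinction above concrete, here is a minimal sketch of the two checks being discussed: an explicit BOM test, and a fallback that looks for valid non-ASCII UTF-8 in the first part of the file. It does not claim to reproduce what Notepad or Wordpad actually do. The function name `sniff_utf8`, the 4096-byte window, and the threshold of three non-ASCII characters are all assumptions chosen for illustration.

```python
import codecs


def sniff_utf8(data: bytes, window: int = 4096, min_non_ascii: int = 3) -> str:
    """Classify the start of a file (hypothetical helper, illustrative only).

    Returns one of:
      "bom"            - the data starts with the UTF-8 BOM (EF BB BF)
      "not-utf8"       - the inspected window is not valid UTF-8
      "ascii-only"     - valid, but no non-ASCII characters to go on
      "weak-evidence"  - valid UTF-8, but fewer than min_non_ascii characters
      "utf8-evidence"  - valid UTF-8 with enough non-ASCII characters
    """
    if data.startswith(codecs.BOM_UTF8):
        return "bom"

    head = data[:window]
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # If the window cuts the file short, a multi-byte sequence split at
        # the window boundary should not count as an error, so only mark the
        # input as final when the whole file fits in the window.
        text = decoder.decode(head, final=len(data) <= window)
    except UnicodeDecodeError:
        return "not-utf8"

    non_ascii = sum(1 for ch in text if ord(ch) > 127)
    if non_ascii == 0:
        return "ascii-only"
    if non_ascii < min_non_ascii:
        return "weak-evidence"
    return "utf8-evidence"


if __name__ == "__main__":
    print(sniff_utf8(codecs.BOM_UTF8 + b"Network Working Group\n"))
    print(sniff_utf8(b"Network Working Group\n"))
    print(sniff_utf8("M. D\u00fcrst\n".encode("utf-8")))
    print(sniff_utf8("M. D\u00fcrst\n".encode("latin-1")))
```

Under these assumptions, a document whose first few kilobytes are ASCII boilerplate plus one accented author name lands in "ascii-only" or "weak-evidence", which is the case where a detector without a BOM has little to go on.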
>> I think we should just say that people should upgrade to tools that
>> detect and handle UTF-8 text without need for UTF-8 BOM's.
>That would be guessing based on statistics, right?
#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst at it.aoyama.ac.jp