[rfc-i] New version: draft-hoffman-utf8-rfcs-04.txt

Julian Reschke julian.reschke at gmx.de
Tue Nov 4 00:37:49 PST 2008

Joe Touch wrote:
>>> I thought there was consensus to NOT include the BOM in these files (at
>>> least there were three of us who spoke up on the issue).
>> There were people in favor (such as myself), and people arguing against
>> it. Nobody has declared consensus.
> I saw only three messages about it...

Then look again. I suggested it, so I think it's legitimate to count 
that as supporting it.

>>> If support for UTF-8 was in fact as universal as asserted in this doc,
>>> why is a BOM needed at all?
>> That has nothing to do with UTF-8 support being universal or not.
>> The issue is that once encoding information is lost (such as when
>> transferred via FTP, or loaded from the file system), many clients use a
>> default encoding. 
> So the default is ASCII, not UTF-8.

No, there is no default in practice. It varies across operating systems 
and locales.

>> Some of those clients however look at the start of the
>> file, detect the BOM, and use UTF-8 instead (such as Notepad and Wordpad).
> "some" sounds like a strange reason; if it were "most" or "nearly all"
> that'd be different...

It explicitly takes care of your concerns with respect to the default 
editors shipping with Windows. So even if it only affects *some* tools, 
it will affect *many* recipients.
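
A minimal sketch of the sniffing behaviour described above, assuming a client that falls back to a locale default when no BOM is present. The function name sniff_encoding and the cp1252 fallback are illustrative assumptions, not something taken from the draft or from any particular editor:

```python
import codecs

def sniff_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Pick a decoder: UTF-8 if the data starts with the UTF-8 BOM,
    otherwise a (hypothetical) locale-dependent fallback."""
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" decodes UTF-8 and strips the leading BOM.
        return "utf-8-sig"
    return fallback

text = "Ren\u00e9"  # "René", one non-ASCII character

with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
without_bom = text.encode("utf-8")

# With the BOM the client recovers the original text.
decoded_with = with_bom.decode(sniff_encoding(with_bom))

# Without it, the fallback encoding is applied and the text is garbled.
decoded_without = without_bom.decode(sniff_encoding(without_bom))
```

Under these assumptions the BOM-prefixed bytes decode back to the original string, while the same bytes without the BOM are misread by the fallback decoder.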

>> So this is a problem of text/plain (not having optional inline encoding
>> information such as XML or HTML), not a problem of UTF-8.
> Text/plain is ASCII; UTF-8 creates the problem by deliberately
> overloading text/plain to also mean UTF-8.

UTF-8 doesn't create any problem that wouldn't also be present with 
any other superset of ASCII, such as ISO-8859-1.
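
To illustrate the "superset of ASCII" point, here is a small sketch (the sample strings are arbitrary assumptions): pure ASCII bytes decode identically under all three encodings, and ambiguity only appears once a non-ASCII character is involved.

```python
# Pure ASCII content: the bytes are identical under every ASCII superset.
ascii_bytes = "Network Working Group".encode("ascii")
decoded = {enc: ascii_bytes.decode(enc)
           for enc in ("ascii", "utf-8", "iso-8859-1")}
all_same = len(set(decoded.values())) == 1

# Non-ASCII content: the byte sequences differ between the supersets,
# so a reader has to know (or guess) which encoding was used.
name = "M\u00fcller"  # "Müller"
utf8_bytes = name.encode("utf-8")          # b'M\xc3\xbcller'
latin1_bytes = name.encode("iso-8859-1")   # b'M\xfcller'

# Reading UTF-8 bytes as ISO-8859-1 silently yields the wrong text.
misread = utf8_bytes.decode("iso-8859-1")

# Reading ISO-8859-1 bytes as UTF-8 fails, since 0xFC is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_as_utf8_ok = False
```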

My understanding is that you're opposed to using text/plain with anything 
except ASCII. I can understand this position, but let's be clear that it's 
against *any* encoding other than ASCII, so it's against solving the 
I18N problem within the context of text/plain. From that point of view, 
the only alternatives seem to be not solving the problem, or moving to 
something other than text/plain.

BR, Julian
