[rfc-i] Fwd: I-D ACTION:draft-hoffman-utf8-rfcs-03.txt

Martin Duerst duerst at it.aoyama.ac.jp
Sun Oct 19 18:20:01 PDT 2008

At 03:54 08/10/20, Joe Touch wrote:

>Julian Reschke wrote:

>The "competing proposal" here is to continue to use the current subset
>of ASCII (presumably that's what you mean by 'text/plain'); so clearly
>the lack of limitations (needing to use a browser - which isn't
>available on all platforms,

Can you tell me about platforms where a browser isn't available?

>> As the IETF itself requires all new work to allow non-ASCII characters, 
>> and the UTF-8 spec is a full standard, we really should eat our own dog 
>> food. 
>It is a **very bad** idea to have documents describing a system "eat our
>own dog food". The whole point is to be able to read the docs to build
>our systems, not to use the docs as a proof-of-concept that our system
>decisions work.

This is true in general. You have to give new technology some
bootstrap time, and have to be somewhat more conservative for
standards documents than for fancy, cutting-edge stuff.

HOWEVER, this doesn't mean that one has to be overly super-conservative.

UTF-8 isn't Experimental, it isn't Proposed, it isn't Draft, it's
a full IETF Standard. Also, there is a BCP that requires new protocols
to support UTF-8, and recommends old protocols to be upgraded if they
don't support UTF-8 yet. Also, UTF-8 isn't something completely different
from everything we have had, it's just a graceful extension of US-ASCII.
So we are way above the level of dogfood, it's food we are already
eating ourselves and requiring others (humans) to eat!
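
To make the "graceful extension" point concrete, here is a minimal
Python sketch. The sample strings are only illustrative; the behaviour
shown is that of the standard ASCII and UTF-8 codecs.

```python
# US-ASCII text is byte-for-byte identical when encoded as UTF-8.
ascii_text = "Internet-Draft"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)  # True

# A non-ASCII character is encoded using only bytes >= 0x80,
# so it can never be mistaken for an ASCII character.
name = "D\u00fcrst"  # illustrative sample: "D", u-umlaut, "rst"
utf8_bytes = name.encode("utf-8")
high_bytes = [b for b in utf8_bytes if b >= 0x80]
print(utf8_bytes)       # b'D\xc3\xbcrst'
print(len(high_bytes))  # 2
```

An ASCII-only tool therefore sees every ASCII character of a UTF-8
document unchanged; only the extra bytes are unfamiliar to it.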

Also, the proposal on the table doesn't require using as much true
UTF-8 as possible. Quite the contrary: it only proposes to use
UTF-8 where it's truly useful, namely for non-ASCII examples where
these help in understanding the spec, and for writing the names and
addresses of the people involved. In each case, some fallback is to be provided,
so that the document can be read and understood even if the true
UTF-8 is printed as garbage or even if it should disappear.
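
A rough sketch of why such a fallback keeps the text readable, again in
Python. The layout "name (fallback)" is just an assumption made for this
illustration, not something the draft prescribes.

```python
name = "D\u00fcrst"   # the true UTF-8 form
fallback = "Duerst"   # ASCII fallback written next to it (assumed layout)
line = f"{name} ({fallback})"

raw = line.encode("utf-8")

# Case 1: a tool misreads the UTF-8 bytes as Latin-1 and prints garbage.
garbled = raw.decode("latin-1")

# Case 2: a tool silently drops every non-ASCII byte.
stripped = raw.decode("ascii", errors="ignore")

# In both cases the ASCII fallback survives untouched.
print(fallback in garbled)   # True
print(fallback in stripped)  # True
print(stripped)              # Drst (Duerst)
```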

As an example of a bootstrapping delay, I can tell you that the W3C,
which also tries to live by "eat your own dog food", was very careful
not to simply use UTF-8 in the first spec in which they talked about
using UTF-8. The first spec for HTML internationalization, RFC 2070,
was all in US-ASCII. The second one, HTML4, was in Latin 1. The first
W3C Recommendations that used UTF-8 (and used it sparingly) came much
later.

Have you ever read any of the documents that were given as cases
where examples in non-ASCII would be useful, such as the IDN, IRI,
or EAI specs? My impression is that you and others are opposed to
trying UTF-8 in part because you don't really see the benefit,
only the costs.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     
