[rfc-i] Fwd: I-D ACTION:draft-hoffman-utf8-rfcs-03.txt

Martin Duerst duerst at it.aoyama.ac.jp
Tue Oct 7 02:27:48 PDT 2008

At 17:31 08/10/07, Tim Bray wrote:
>On Mon, Oct 6, 2008 at 7:49 PM, Martin Duerst <duerst at it.aoyama.ac.jp> wrote:
>> Just a few comments:
>I agree with all of Martin's comments, except as noted:
>> 2.1: I think there are other places where Non-ASCII makes sense.
>> E.g.: Acknowledgment section, in the actual text for document
>> titles (or names) quoted there, and so on.
>OK, except for "... and so on".  Let's keep it circumscribed.

Fine with me.

>> 2.2:
>>   Specifications encoded in UTF-8 should not contain the encodings of
>>   certain Unicode codepoints.  The codepoint ranges given in this
>>   section are inclusive:
>> I read the 'inclusive' as "these are okay". Also, "the encodings of"
>> looks redundant in this context. What about simply:
>>   Specifications using UTF-8 must not use the following codepoint
>>   ranges:
>"ranges, given inclusively":

"inclusively" confused me. Why would it be necessary? All ranges in
Unicode are given in that way. That's what average people
(including and especially non-geeks) expect.
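
To illustrate what "inclusive" means here, a minimal sketch of a range
check in which both endpoints belong to the range. The range below (the
C1 control block, U+0080 through U+009F) is only an assumed example and
is not taken from the draft's list; the names EXAMPLE_RANGES and
in_any_range are hypothetical.

```python
# Illustrative only: the C1 control block, not the draft's actual list.
EXAMPLE_RANGES = [(0x0080, 0x009F)]


def in_any_range(codepoint, ranges=EXAMPLE_RANGES):
    """Return True if codepoint lies in any range, endpoints included."""
    return any(low <= codepoint <= high for low, high in ranges)


def excluded_codepoints(text, ranges=EXAMPLE_RANGES):
    """List the codepoints in text that fall inside the given ranges."""
    return [f"U+{ord(ch):04X}" for ch in text if in_any_range(ord(ch), ranges)]
```

With this reading, U+0080 and U+009F are both excluded, and U+00A0 is
the first codepoint past the range.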

>> 2.2: I would add a general sentence saying that there are many other
>> kinds of codepoints (e.g. unassigned, control- or formatting-like,
>> compatibility,...) that should only be used with great care if at all.
>> You have something about compatibility characters, but I think wording
>> that in a more general fashion is better.
>Disagree, stated so generally.  Compatibility characters are
>well-defined.  If you have some specific codepoints or ranges to
>exclude, please argue for them specifically.

I don't want to start that slippery slope. But I know I could
keep us busy for quite a while. What I think is important is
to make clear to the authors that even if they respect the
explicit restrictions, being careful is still a good thing.

>> 2.3, or somewhere else: I'd like to see a strong recommendation
>> against "smart quotes", or otherwise some discussion of them
>> (we could propose that "smart quotes" are okay for textual
>> quotation, but not for protocol examples when they are supposed
>> to represent "'" or '"'). Smart quotes easily get into IETF
>> documents, and may be rather difficult to detect automatically.
>Why?  What's the problem with these?  (A similar argument would apply
>to hyphens and the various flavors of dashes).

Yes indeed. That's why I think it is desirable to be clear
on this matter, because I'm very sure this will come up in
practice sooner rather than later:
Do we exclude these (except in non-ASCII examples,...), so that
the text itself looks like it always did? Do we tolerate them
if an author is careful and uses them appropriately, because the
result may really look better? Or do we encourage their use?
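
Whichever policy is chosen, a simple scan can at least flag such
characters for review. The following is a rough sketch, not a proposal
for the draft: the set of "suspect" characters is my own assumption
(the common curly quotes plus the hyphen and dash block U+2010 through
U+2015), and the function name is hypothetical.

```python
import unicodedata

# Assumed set for illustration; a real policy might list more or fewer.
SUSPECTS = {
    "\u2018", "\u2019",  # single curly quotes
    "\u201C", "\u201D",  # double curly quotes
    "\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2015",  # hyphens, dashes
}


def find_typographic_punctuation(text):
    """Return (line, column, codepoint, name) for each suspect character.

    Line and column numbers are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECTS:
                hits.append((line_no, col, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    sample = "He said \u201Chello\u201D.\nRange: 1\u20135"
    for hit in find_typographic_punctuation(sample):
        print(hit)
```

A scan like this only reports where the characters occur. It cannot
tell a legitimate textual quotation from a protocol example where
"'" or '"' was intended, so a human still has to decide.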

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst at it.aoyama.ac.jp     
