[rfc-i] Fwd: I-D ACTION:draft-hoffman-utf8-rfcs-03.txt

Tim Bray tbray at textuality.com
Tue Oct 7 01:31:52 PDT 2008

On Mon, Oct 6, 2008 at 7:49 PM, Martin Duerst <duerst at it.aoyama.ac.jp> wrote:

> Just a few comments:

I agree with all of Martin's comments, except as noted:

> 2.1: I think there are other places where Non-ASCII makes sense.
> E.g.: Acknowledgment section, in the actual text for document
> titles (or names) quoted there, and so on.

OK, except for "... and so on".  Let's keep it circumscribed.

> 2.2:
>   Specifications encoded in UTF-8 should not contain the encodings of
>   certain Unicode codepoints.  The codepoint ranges given in this
>   section are inclusive:
> I read the 'inclusive' as "these are okay". Also, "the encodings of"
> looks redundant in this context. What about simply:
>   Specifications using UTF-8 must not use the following codepoint
>   ranges:

"ranges, given inclusively":

> 2.2: I would add a general sentence saying that there are many other
> kinds of codepoints (e.g. unassigned, control- or formatting-like,
> compatibility,...) that should only be used with great care if at all.
> You have something about compatibility characters, but I think wording
> that in a more general fashion is better.

Disagree, stated so generally.  Compatibility characters are
well-defined.  If you have some specific codepoints or ranges to
exclude, please argue for them specifically.

> 2.3, or somewhere else: I'd like to see a strong recommendation
> against "smart quotes", or otherwise some discussion of them
> (we could propose that "smart quotes" are okay for textual
> quotation, but not for protocol examples when they are supposed
> to represent "'" or '"'). Smart quotes easily get into IETF
> documents, and may be rather difficult to detect automatically.

Why?  What's the problem with these?  (A similar argument would apply
to hyphens and the various flavors of dashes).  -Tim
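
For anyone who wants to see what automatic detection of such characters
might look like, here is a minimal sketch. The set of flagged codepoints
and the name find_suspect_chars are assumptions made for illustration;
neither comes from the draft. It only locates curly quotes and the
hyphen/dash family, and cannot tell a textual quotation from a protocol
example.

```python
import unicodedata

# Hypothetical list of characters to flag; not taken from the draft.
SUSPECT = {
    "\u2018", "\u2019",  # curly single quotes
    "\u201C", "\u201D",  # curly double quotes
    "\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2015",  # hyphens, dashes
}


def find_suspect_chars(text):
    """Return (line, column, codepoint, name) for each flagged character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((lineno, col, f"U+{ord(ch):04X}", unicodedata.name(ch)))
    return hits
```

Line and column numbers are 1-based, so the output can be matched against
an editor's position display.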
