[rfc-i] Character sets, was Comments on draft-iab-rfcformat
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Wed Dec 19 01:33:18 PST 2012
On 2012/12/19 14:38, Nico Williams wrote:
> What's a reasonable script to allow in an RFC? I'm not sure, but
> let's set that aside for a minute. I would rather instead first start
> with this rule: all normative text in an RFC MUST be in English, as
> well as all informative guidance to the implementor, and so on. The
> only text allowed to not be in English should be examples (e.g., in
> RFCs dealing with globalization) and names. This rule *alone* is
> sufficient to render all of the important bits of an RFC readable (to
> English readers, of course).
> ALSO, it may well be easier to say that it must be possible to
> machine-render an ASCII-only version of any RFC such that, for
> example, non-ASCII characters get replaced with corresponding HTML
> entities (e.g., &aacute;). This may not be great for reading, but
> given the *language rule* it should still suffice for those who cannot
> display odd scripts or even non-ASCII at all. And guess what: it is
> absolutely possible to replace non-ASCII Unicode with, e.g., HTML
HTML entities have several problems. First, named entities don't exist
for many characters; numeric character references (&#xhex;) work in all
cases. Second, there may be quite a few cases where a document also uses
the escaped form literally, as text that is itself meant to be read as an
escape, and readers will then be unable to tell the two uses apart.
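To make the first two points concrete, here is a minimal sketch in Python. It is only an illustration, not a proposal for the actual RFC tooling. The function name to_ascii and its prefer_named parameter are hypothetical; codepoint2name is the HTML 4.01 entity table from the standard library module html.entities.

```python
from html.entities import codepoint2name  # HTML 4.01 named entities only


def to_ascii(text: str, prefer_named: bool = False) -> str:
    """Replace each non-ASCII character with an HTML-style escape.

    Hypothetical helper for illustration: uses a named entity when one
    exists and prefer_named is set, otherwise a hexadecimal numeric
    character reference, which is available for every code point.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                      # plain ASCII passes through
        elif prefer_named and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            out.append(f"&#x{cp:X};")           # always possible
    return "".join(out)


# First problem: a named entity exists for u-umlaut but not for a kanji.
print(to_ascii("Dürst", prefer_named=True))     # D&uuml;rst
print(to_ascii("青", prefer_named=True))         # &#x9752;

# Second problem: a document that already contains the escape literally
# becomes indistinguishable from one that contained the real character.
print(to_ascii("D&#xFC;rst") == to_ascii("Dürst"))  # True
```

The last line shows the confusion: after conversion, a reader cannot tell whether the source contained the character or the escape sequence itself, unless the converter also escapes "&", which in turn changes text that was plain ASCII to begin with.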
Third, expressed with a personal example but intended as a general
comment, if my name doesn't appear as Dürst, I don't care too much
whether it appears as D&uuml;rst, D&#xFC;rst, D?rst, D\xyzrst,
or whatever other strange stuff. The important thing at that point
should be to make the readers realize that something isn't right, so
that they can switch to another version/tool (everybody has a browser
these days) if they think they need it, not to help the reader get back
to the right character via table lookup or some such.