[rfc-i] Character sets, was Comments on draft-iab-rfcformat

Nico Williams nico at cryptonector.com
Tue Dec 18 21:38:25 PST 2012

I'm one of those troglodytes who uses Unix or Unix-like OSes and reads
RFCs (and I-Ds) on ttys.  And yet I have nothing against non-ASCII
Unicode (UTF-8) in RFCs and I-Ds.  Like others', my terminal
emulator(s) can display non-ASCII just fine.  No, I don't think anyone
still depends on or otherwise prefers to use old *hardware* terminals,
and I couldn't care less if they do.  And no, we've not heard from anyone
who does prefer to use old hardware terminals.

So, count me in the group who would like to see all reasonable scripts
allowed in RFCs.

What's a reasonable script to allow in an RFC?  I'm not sure, but
let's set that aside for a minute.  I would rather instead first start
with this rule: all normative text in an RFC MUST be in English, as
well as all informative guidance to the implementor, and so on.  The
only text allowed to not be in English should be examples (e.g., in
RFCs dealing with globalization) and names.  This rule *alone* is
sufficient to render all of the important bits of an RFC readable (to
English readers, of course).

ALSO, it may well be easier to say that it must be possible to
machine-render an ASCII-only version of any RFC such that, for
example, non-ASCII characters get replaced with corresponding HTML
entities (e.g., &aacute;).  This may not be great for reading, but
given the *language rule* it should still suffice for those who cannot
display odd scripts or even non-ASCII at all.  And guess what: it is
absolutely possible to replace non-ASCII Unicode with, e.g., HTML
entities.
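
To make the machine-rendering idea concrete, here is a minimal sketch in
Python of an ASCII-only fallback.  The function name
to_ascii_with_entities is made up for illustration and is not part of
any RFC tooling; the only library piece assumed is the standard
html.entities.codepoint2name table.  Characters with a named HTML
entity get that name, and everything else falls back to a hexadecimal
numeric character reference.

```python
from html.entities import codepoint2name


def to_ascii_with_entities(text: str) -> str:
    """Return an ASCII-only rendering of text (hypothetical helper).

    ASCII characters pass through unchanged.  Non-ASCII characters are
    replaced with a named HTML entity when one exists, otherwise with
    a hexadecimal numeric character reference.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # plain ASCII, keep as-is
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # e.g. U+00E1 -> &aacute;
        else:
            out.append(f"&#x{cp:X};")  # no named entity, use numeric form
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii_with_entities("caf\u00e9 \u65e5\u672c"))
```

Note that this sketch does not escape a literal "&" already present in
the input, so a real converter would need to decide how to handle that
case if the ASCII output must be unambiguously reversible.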


PS:  Back to "what's a reasonable script to allow in an RFC?"  I'm not
sure, but let's start with: all the Western ones (Latin, Cyrillic,
Greek, and variants thereof), as well as at least the CJK that was in
Unicode 2.0 if not 3.2, and *possibly* the Middle Eastern scripts
(e.g., Arabic and Hebrew).  That leaves Indic and other scripts out
for now, but only because I've no idea how prevalent font and renderer
support for those scripts is.  The "reasonable script" set is likely
to get larger over time, as font and renderer support for them becomes
more universal.  If need be we can come up with rules for establishing
whether a script is sufficiently universally supported, but it'd be
*much* easier to just leave the judgement to the RFC-Editor and be
done; that way there are no details to haggle over.  Easier still is to
not even go here.  Just go with the English language rule for
normative text described above.
