[rfc-i] Character sets, was Comments on draft-iab-rfcformat

Ted Lemon mellon at fugue.com
Wed Dec 19 08:24:13 PST 2012

On Dec 19, 2012, at 11:14 AM, Brian E Carpenter <brian.e.carpenter at gmail.com> wrote:
> But very specifically, a format that
> router developers can't display on their everyday equipment would
> bother me a lot. That would suggest we are partitioning our space.
> So I think we still need a lowest common denominator.

Realistically, a router developer who doesn't have access to a web browser isn't going to be very effective. Web browsers, by definition, support UTF-8, and in practice they support it well. For example, five years ago all the work I did with the Tibetan language required installing special fonts, and it didn't work in some browsers. Now I don't know of any browser that fails to render Tibetan text. I mention Tibetan because it's (a) relatively obscure and (b) really hard to render correctly.

So if pretty much any random browser can render Tibetan Unicode web pages, I don't think it's reasonable to claim that Unicode is too poorly implemented for us to rely on interested consumers of these documents having access to it.

To my mind, the argument about Unicode should focus on the screen reader: blind consumers of RFCs should be able to have the RFC read to them comprehensibly. This means that glyphs with similar shapes can't be used interchangeably, as we used to do with the 1 and the l on typewriters. Screen readers that can read any text likely to appear in an RFC are widespread, though, so this argues for a requirement that the correct glyphs be used so that screen readers can read them, not for a requirement that UTF-8 or Unicode be avoided entirely.
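
To make the look-alike point concrete, here is a minimal Python sketch that prints the code point and Unicode name of a few characters that are easy to confuse by eye. The pair list and the helper names (LOOKALIKES, describe, report) are illustrative assumptions, and the Latin/Cyrillic pair is an added example beyond the typewriter case above. It only shows that the characters are distinct to software; it says nothing about how any particular screen reader pronounces them.

```python
import unicodedata

# Pairs that look alike in many fonts but are different characters.
# (Illustrative list; only the 1/l pair comes from the discussion above.)
LOOKALIKES = [
    ("1", "l"),        # digit one vs. lowercase L (the typewriter case)
    ("0", "O"),        # digit zero vs. capital O
    ("a", "\u0430"),   # Latin small a vs. Cyrillic small a
]


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def report(pairs):
    """Build one line per pair showing both characters' identities."""
    return [f"{describe(left)}  |  {describe(right)}" for left, right in pairs]


for line in report(LOOKALIKES):
    print(line)
```

Software that consumes the text sees these as unrelated characters, which is why substituting one for the other can change what gets read aloud.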
