[rfc-i] open issues: character sets of examples

Andrew Sullivan ajs at anvilwalrusden.com
Thu May 31 11:02:13 PDT 2012


Dear colleagues,

This is another attempt to discuss an issue raised by the RSE as not
having consensus.  In this case, it is "Want the ability to denote
protocol examples using the character sets those examples support";
and, by implication, "Want broader character encoding for body of
document".

I don't personally care about diagrams; I don't think in diagrams, and
I don't find them that helpful.  I am most comfortable with words.  As
a result, I find examples helpful, and they are most helpful when they
actually portray the case under discussion.
Since I sometimes work on internationalization issues, this
necessarily entails Unicode code points outside the ASCII repertoire.

The counter-argument appears to be that there is no reason to do this,
because one can specify the code points without actually displaying
them.  While this is true, it is not terribly convincing to say (for
instance) that U+02BC looks a lot like U+0027.  If, however, I say
that the character U+02BC, MODIFIER LETTER APOSTROPHE (ʼ) often
resembles the character U+0027, APOSTROPHE ('), then the claim will
perhaps be more convincing (to those using Unicode in their display).
"Ah," the counter-argument says, "but not everyone is using Unicode!"
Surely, however, this is a case where an encoding tag solves that
problem?  We seem to be capable of handling this in nearly every
browser I have seen in many years.  Even my email client of choice --
mutt -- has been able to cope with this for over 10 years on every
terminal I have used.  Perhaps someone can make the counter-argument
clearer to me?
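
To make the example concrete, here is a minimal Python sketch of the
two characters mentioned above.  It is only an illustration: the helper
name "describe" is hypothetical, and the choice of UTF-8 is an
assumption about which encoding a document would be tagged with.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return code point, Unicode name, glyph, and UTF-8 bytes for one character."""
    code_point = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch)
    utf8_bytes = ch.encode("utf-8").hex(" ")  # assumed encoding: UTF-8
    return f"{code_point} {name} ({ch}) UTF-8: {utf8_bytes}"


if __name__ == "__main__":
    # The two look-alike characters from the paragraph above.
    for ch in ("\u02bc", "\u0027"):
        print(describe(ch))
    # U+02BC MODIFIER LETTER APOSTROPHE (ʼ) UTF-8: ca bc
    # U+0027 APOSTROPHE (') UTF-8: 27
```

The two lines of output differ in code point, name, and byte sequence,
yet the glyphs in parentheses are nearly indistinguishable in many
fonts, which is the point that a code-point-only description cannot
show.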

Best regards,

A

-- 
Andrew Sullivan
ajs at anvilwalrusden.com

