[rfc-i] How lack of Unicode support in IDs is detrimental to design

Joe Hildebrand (jhildebr) jhildebr at cisco.com
Fri Jul 27 14:57:00 PDT 2012


On 7/27/12 3:34 PM, "Dave Crocker" <dhc at dcrocker.net> wrote:


>We've been using the current document format (with ASCII-only
>limitations) to specify binary-based protocols for more than 40 years.
>They've included examples.  Somehow, folks have managed to write them
>and other folk have managed to understand them well enough to achieve
>interoperability.

We've done it, but it's been painful.

>I am not understanding what it is that makes UTF-8 more special than IP
>or TCP binary headers, or ASN.1 encodings or...

Humans (including QA teams) enter these strings, potentially by
cut-and-paste, into user interfaces.  They don't do this with ASN.1 or
packet formats.

>It's not that UTF-8 support in RFCs is a bad idea, it's that claiming
>that its absence is a serious problem is.

I would argue that some of the drafts that specify higher-level protocols
that allow non-ASCII characters have not achieved the level of
interoperability that they should have, because their examples lack
clarity.
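
A minimal sketch of the kind of ambiguity at issue, assuming Python 3. The
sample string and the helper name ascii_only_rendering are hypothetical
illustrations, not taken from any particular draft. It shows how an
ASCII-only document has to spell out a non-ASCII example, and how two strings
that look identical when pasted into a user interface can differ on the wire:

```python
import unicodedata


def ascii_only_rendering(text: str) -> str:
    """Spell out non-ASCII characters as <U+XXXX>, as an ASCII-only document must.

    Hypothetical helper for illustration only.
    """
    return "".join(ch if ord(ch) < 128 else f"<U+{ord(ch):04X}>" for ch in text)


# Hypothetical example value, written two ways that render identically on screen.
composed = "caf\u00e9"     # "e with acute" as a single code point
decomposed = "cafe\u0301"  # plain "e" followed by a combining acute accent

# What the reader of an ASCII-only document sees for each form.
print(ascii_only_rendering(composed))
print(ascii_only_rendering(decomposed))

# What actually goes on the wire as UTF-8 (hex-encoded bytes).
print(composed.encode("utf-8").hex())
print(decomposed.encode("utf-8").hex())

# The two forms are not equal until they are normalized to the same form.
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

A tester who pastes the visible string has no way to tell which of the two
byte sequences the example intended unless the document can show the actual
characters or spells out every code point.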

Folks who never have to deal with things that humans type tend not to
understand the pain that our QA team feels.

-- 
Joe Hildebrand



