[rfc-i] How lack of Unicode support in IDs is detrimental to design

Phillip Hallam-Baker hallam at gmail.com
Fri Jul 27 08:39:57 PDT 2012


I am just writing a draft that describes how to implement a PIN based
challenge response.

To establish an initial connection to the server, the user presents a
PIN value such as CS0F40-30LV09-K000.

Now the specification states that the PIN code is in UTF8. It probably
does not make much sense for a French implementation to use accented
characters, but I would hope that an implementer would have the sense to
use the Greek alphabet for a Greek language deployment, Cyrillic for
Russian and so on.


Now I am writing a reference implementation along with the draft to
generate the examples and I am about to extend the implementation to
offer Cyrillic, Greek and Arabic as well. All the examples in my drafts
are generated from live code that runs each time the text is built.

This is all going to be perfectly readable in the HTML version without
extra effort but it simply makes no sense in the ASCII version unless
I write a lot of additional code to present the UTF8 in the ASCII
format and write the draft to cope.
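
To give a rough idea of the extra code involved, here is a minimal
Python sketch of one way to present a non-ASCII PIN in a 7-bit ASCII
draft. The helper name, the <U+XXXX> notation and the Greek PIN value
are all assumptions made up for illustration; none of them come from
the draft or the reference implementation.

```python
def to_ascii_presentation(text: str) -> str:
    """Replace each non-ASCII character with a <U+XXXX> escape."""
    return "".join(
        ch if ord(ch) < 128 else f"<U+{ord(ch):04X}>"
        for ch in text
    )


# Hypothetical Greek-alphabet PIN, invented for this example.
greek_pin = "ΔΛΣ123-ΩΠΦ456"

# An ASCII-only PIN passes through unchanged; the Greek one is escaped.
print(to_ascii_presentation("CS0F40-30LV09-K000"))
print(to_ascii_presentation(greek_pin))
```

The escaped form is unambiguous but much harder to read than the
original characters, which is exactly the cost being described.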

The path of least resistance is to simply write everything in ASCII
like the draft. That means not having non-ASCII examples, which in
turn means that the implementer reading the draft is never faced with
the fact that using UTF8 for the PIN and account name is more than
just a different encoding for the ASCII character set. They should
actually be supporting those other characters as well.
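
A small Python sketch of the difference an implementer would see with
a non-ASCII example in front of them. The Cyrillic PIN is a made-up
value and describe() is a hypothetical helper; only the ASCII PIN is
taken from the text above.

```python
def describe(pin):
    """Return (character count, UTF-8 byte count, is pure ASCII)."""
    return len(pin), len(pin.encode("utf-8")), pin.isascii()


ascii_pin = "CS0F40-30LV09-K000"     # example PIN from the text
cyrillic_pin = "ДЖЯ040-30ЛФ09-Ш000"  # hypothetical, for illustration only

# For the ASCII PIN, characters and bytes coincide.
print(describe(ascii_pin))      # (18, 18, True)

# For the Cyrillic PIN, each Cyrillic letter takes two bytes in UTF-8,
# so code that assumes one byte per character gets the length wrong.
print(describe(cyrillic_pin))   # (18, 24, False)
```

Code tested only against ASCII examples never exercises the second
case, so fixed-size buffers or byte-wise validation can pass every
example in the draft and still fail on a real deployment.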

And this is before we get to the added complication of bidirectional
(left-to-right versus right-to-left) text.


Not being able to express these ideas in drafts means not being able
to communicate them effectively.

-- 
Website: http://hallambaker.com/

