[rfc-i] How lack of Unicode support in IDs is detrimental to design

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Sun Jul 29 23:58:13 PDT 2012

On 2012/07/28 6:04, Martin Rex wrote:
> Andrew Sullivan wrote:

>> and he was quite correctly pointing out that providing zero examples
>> to show how this could happen is an excellent way to ensure that some
>> underpaid contract programmer with half an attention span will
>> implement the protocol in the future without handling the UTF-8
>> encoded protocol fields, and then there'll be an interoperability
>> problem.
> That is a complete misunderstanding of technology.
> 100% of ALL programmers (that includes the professional ones),
> will be completely unable to enter on their keyboards 90% of the
> Unicode glyphs that Phil might be tempted to use.

That's just fine. Copy-and-paste usually works fine. And there are also 
tools called character pickers.

> And from those
> few lucky ones that are able to enter it, >50% might find themselves
> unable to put it into their implementation, e.g. because from the zoo
> of platforms that a C89 source code is compiled on, only the ASCII
> subset of characters can be used inside code statements.

There are many different programming languages, with many different 
limitations and caveats. And of course examples need not go into the 
actual implementation code. But they should go into the tests, where 
indeed it's good to have tests for actual UTF-8, not just for hex.
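
As a rough illustration of that split, here is a minimal Python sketch. The
function decode_field is a hypothetical stand-in for whatever routine an
implementation uses to read a UTF-8 protocol field; it is not taken from any
particular protocol or library. The implementation itself stays pure ASCII,
while one test carries a literal non-ASCII string and another spells the same
bytes in hex.

```python
# Hypothetical helper (an assumption for illustration): decode a protocol
# field that arrives as raw bytes and is specified to be UTF-8.
def decode_field(raw: bytes) -> str:
    # Strict decoding: malformed UTF-8 raises UnicodeDecodeError
    # instead of being silently replaced.
    return raw.decode("utf-8")


def test_literal_utf8():
    # The non-ASCII character appears directly in the test source.
    assert decode_field("Dürst".encode("utf-8")) == "Dürst"


def test_hex_escaped():
    # The same field written as hex bytes, for ASCII-only source files.
    # U+00FC (u with diaeresis) is encoded in UTF-8 as C3 BC.
    assert decode_field(b"D\xc3\xbcrst") == "D\u00fcrst"


def test_malformed_is_rejected():
    # A lone lead byte is not valid UTF-8 and must not decode.
    try:
        decode_field(b"\xc3")
    except UnicodeDecodeError:
        return
    raise AssertionError("malformed UTF-8 was accepted")


test_literal_utf8()
test_hex_escaped()
test_malformed_is_rejected()
```

Only the tests need a source file that can hold non-ASCII text; where even
that is not possible, the hex form covers the same bytes.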

Regards,   Martin.