[rfc-i] a short note about the use of non-ASCII characters (was Re: [xml2rfc] will <vspace=n> disappear)

Heather Flanagan (RFC Series Editor) rse at rfc-editor.org
Mon Oct 27 14:01:49 PDT 2014


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/27/14 4:53 PM, Nico Williams wrote:
> On Mon, Oct 27, 2014 at 09:00:17PM +0100, Carsten Bormann wrote:
>> On 27 Oct 2014, at 18:30, Julian Reschke <julian.reschke at gmx.de>
>> wrote:
>>> 
>>>> All my source files of course use Unicode (Universität, —, „,
>>>> «, all that). Today, xml2rfc converts this to the necessary
>>>> ASCII fallbacks for publication (Ari Keränen’s name being the
>>>> only place where that didn’t work well, so far). With XML as
>>>> a publication format, I’ll need a XML to XML processor to do
>>>> that.
>>> 
>>> No, you won't, as the new format will allow non-ASCII
>>> characters in contact information anyway.
>> 
>> Actually, I do, because text is rarely ASCII*). (Also, human
>> names do turn up in other places than contact information, but
>> that’s secondary.)
>> 
>> Grüße, Carsten
>> 
>> *) See
>> https://svn.tools.ietf.org/svn/wg/core/draft-ietf-core-block-xx.xml
>> for a random example.  This one doesn’t even have µW in it, a unit
>> that happens to be of great importance in the field I’m working
>> in.
> 
> +1.  But this is really a different thread, and IIUC the consensus
> is set for now.  We'll experiment with author contact info and
> we'll extend non-ascii support to other content later.

Actually, that's not entirely accurate.  See
http://tools.ietf.org/pdf/draft-flanagan-nonascii-03.pdf.  The rules
are more permissive than just author name; it's more a question of
when the tool chain used by the RFC Editor can support the necessary
checking and representation.

That draft will stay a draft until such time as we have running code
to see how things work; expect some changes as reality hits the plan.

> 
> (IMO that's a bit overly conservative.  Who can't read Unicode
> nowadays, at least for common scripts (and math symbols, and...)
> for which there are widely-available fonts?  It'd be more
> interesting to find a subset of Unicode that is extremely likely to
> be supported, then permit that for all RFC content, leaving the
> remainder of Unicode for author names and contact information.)

My biggest challenge in this area is a question of what typeface to
use that includes enough Unicode characters to be useful, has an
acceptable license, and includes at least a mono-space font and
hopefully a variable-width font as well.  But that's a subject for its
own thread, and it's something I'm talking to a typography expert about.

- -Heather

-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.22 (Darwin)
Comment: GPGTools - http://gpgtools.org

iQEcBAEBAgAGBQJUTrK9AAoJEER/xjINbZoGyacH/ixfGBHngy61cW+8u6baxJ7N
eRzSJWlNSJF5KYIJSE5mSnKP8r6PNas/YlVOnMlyQ+BsE/FXHFocnvKwpHpBpB01
Z9W7g0Q3VdgC1pACP62LQPfJqMJ/tP0LtEp4w78KEYEAU7TRfLMcSI1aRKMpbmtQ
1qVkd0e/nemGo2emlKsNWa6cC/cdjg44Am7vX8ZEvJcVhqKWiUxiP6KHrloIKpSj
L2Z2bDWyLx41jMSGc24lu0qnJ9Q0ybhxG7jqTlAb3h4JNak94R9JEdA4JoaQpPdJ
AsUGkJg4JuUYeGw6nVcq3iKR2hLnLzwSYvhD6X8MagUO1MsrZI/3NypZKhlWY/U=
=/TpD
-----END PGP SIGNATURE-----

