[rfc-i] [IAB Trac] #270: Resources associated with allowing UTF-8 in RFCs (Section 3.3)

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Wed Feb 27 00:51:31 PST 2013


Hello Heather, others,

On 2013/02/27 8:50, IAB issue tracker wrote:
> #270: Resources associated with allowing UTF-8 in RFCs (Section 3.3)
>
>
> Comment (by hlflanagan at gmail.com):
>
>   After some discussion, I have revised the ASCII/UTF-8 requirement in
>   Section 3.2 to read:
>
>   The official language of the RFC Series is English.  ASCII is required for
>   all "normative" text, i.e., text that must be read to understand or
>   implement the technology described in the RFC.  UTF-8/Unicode text will be
>   allowed for Author names and addresses and non-normative text within an
>   RFC.  Author names and addresses will require an ASCII equivalent for
>   indexing purposes.

I'm not sure that "Author names and addresses will require an ASCII 
equivalent for indexing purposes." is exactly the right wording. There 
are three issues:

1) "for indexing purposes" does not say whether the equivalent will be 
part of the published text, or will only be part of the metadata (maybe 
hidden somewhere in some versions of the published document).

2) "for indexing purposes" is not a convincing justification. Sorting of 
Unicode text (for indexing) can be done in various ways these days, and 
search engines and the like take care of quite a bit of variability.

3) The distinction between ASCII and non-ASCII is a remnant of the 
old format, not what's actually needed here. No decent publisher (e.g. 
ACM, IEEE) or other standards organization (e.g. W3C) would do things 
such as:
   Patrick Fältström (Patrick Faltstrom)
   Martin Dürst (Martin Duerst)
On the other hand, most publishers wouldn't use non-Latin names in 
prominent places at all. I very much think non-Latin names and addresses 
should be allowed, but with Latin equivalents close by. This is what W3C 
has been doing for quite a while (see e.g. 
http://www.w3.org/TR/2003/REC-SVG11-20030114/). So what we need for 
author names and addresses is a distinction between Latin and non-Latin.
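
To make the Latin/non-Latin distinction concrete, here is a minimal sketch of
how a tool might decide whether a name needs a Latin equivalent. The function
names are hypothetical and not part of any RFC tooling. The check relies on a
heuristic (Unicode character names of Latin-script letters start with "LATIN"),
which is an assumption and not a full script-property lookup.

```python
import unicodedata


def is_latin_script(text: str) -> bool:
    """Return True if every alphabetic character in text is a Latin letter.

    Heuristic (an assumption of this sketch): the Unicode character names
    of Latin-script letters start with "LATIN". Spaces, punctuation, digits
    and combining marks are ignored.
    """
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def needs_latin_equivalent(name: str) -> bool:
    """Hypothetical policy check: only non-Latin names need an equivalent."""
    return not is_latin_script(name)


if __name__ == "__main__":
    for name in ["Martin Dürst", "Patrick Fältström", "Пример", "例"]:
        print(name, needs_latin_equivalent(name))
```

Under this check, names written in Latin script with diacritics would pass
as-is, while names in other scripts would be flagged as needing a Latin
equivalent close by. An ASCII-based check would flag both kinds.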

Regards,    Martin.

>   Based on conversations with Henrik Levkowetz regarding allowing UTF-8 in
>   to the RFC, this will require some code changes as well as some training
>   of the RFC Editor staff, but it is definitely within our reach to allow
>   for this change.


