[rfc-i] [IAB Trac] #270: Resources associated with allowing UTF-8 in RFCs (Section 3.3)
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Wed Feb 27 00:51:31 PST 2013
Hello Heather, others,
On 2013/02/27 8:50, IAB issue tracker wrote:
> #270: Resources associated with allowing UTF-8 in RFCs (Section 3.3)
> Comment (by hlflanagan at gmail.com):
> After some discussion, I have revised the ASCII/UTF-8 requirement in
> Section 3.2 to read:
> The official language of the RFC Series is English. ASCII is required for
> all "normative" text, i.e., text that must be read to understand or
> implement the technology described in the RFC. UTF-8/Unicode text will be
> allowed for Author names and addresses and non-normative text within an
> RFC. Author names and addresses will require an ASCII equivalent for
> indexing purposes.
I'm not sure that "Author names and addresses will require an ASCII
equivalent for indexing purposes." is exactly the right wording. There
are three issues:
1) "for indexing purposes" does not say whether the equivalent will be
part of the published text, or will only be part of the metadata (maybe
hidden somewhere in some versions of the published document).
2) "for indexing purposes" does not provide a justification. Sorting of
Unicode text (for indexing) can be done in various ways these days.
Search engines and the like take care of quite a bit of variability.
3) The distinction between ASCII and non-ASCII is a holdover from the
old format, not what's actually needed here. No decent publisher (e.g.
ACM, IEEE) or other standards organization (e.g. W3C) would do things like:
Patrick Fältström (Patrick Faltstrom)
Martin Dürst (Martin Duerst)
On the other hand, most publishers wouldn't use non-Latin names in
prominent places at all (I very much think non-Latin names and addresses
should be allowed, but with Latin equivalents close by; this is what
W3C has been doing for quite a while, see e.g.
http://www.w3.org/TR/2003/REC-SVG11-20030114/). So what we need for
author names and addresses is a distinction between Latin and non-Latin.
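
To make the Latin/non-Latin distinction concrete, here is a minimal
sketch in Python. It is only an illustration, not anything the RFC
Editor tooling is known to do. The function names are hypothetical, and
the check relies on a heuristic: a letter counts as Latin if its Unicode
character name starts with "LATIN" (the standard library does not expose
the Unicode Script property directly).

```python
import unicodedata


def is_latin_text(text: str) -> bool:
    """Return True if text has letters and all of them are Latin-script.

    Heuristic (an assumption of this sketch): a letter is Latin if its
    Unicode character name begins with "LATIN". Spaces, punctuation,
    digits, and combining marks are ignored.
    """
    letters = [ch for ch in text if ch.isalpha()]
    return bool(letters) and all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    )


def needs_latin_equivalent(name: str) -> bool:
    """Hypothetical policy check: non-Latin names need a Latin equivalent."""
    return not is_latin_text(name)
```

Under this check, "Patrick Fältström" and "Martin Dürst" count as Latin
and would need no extra rendering, while a name written in Cyrillic or
in Han characters would be flagged as needing a Latin equivalent nearby.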
> Based on conversations with Henrik Levkowetz regarding allowing UTF-8 in
> to the RFC, this will require some code changes as well as some training
> of the RFC Editor staff, but it is definitely within our reach to allow
> for this change.