[rfc-i] Comments on draft-flanagan-nonascii-03 (was: Re: Font selection for RFCs)
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Wed Oct 29 00:19:15 PDT 2014
Hello Heather, others,
Some comments on draft-flanagan-nonascii-03.
Note: I may have made some of these comments on earlier versions, but I
hope I can explain them better this time.
Looking at the examples in section 3.2, I see things like:
P. Fältström (P. Faltstrom)
PROPOSED/NEW: The following people contributed significant text to
early versions of this draft: Patrik Fältström (Patrik Faltstrom),
陈智昌 (William Chan), and Fred Baker.
The repetition of almost the same text (Fältström and Faltstrom) is
highly annoying. Those not used to accents, umlauts, and diacritics will
mostly just see the same thing twice. Those used to the above will of
course find the second instance totally superfluous. So in effect it
It is also not something I have seen in any serious publication. Both
ACM and IEEE, and I'm sure many other serious publications, would just
use Fältström. Same for books and the like, and for publications from
organizations such as W3C,...
At best, people will find it annoying. At worst, we will become a
laughing stock, with everybody joking about how the IETF, despite a lot
of effort, still cannot wean itself off its beloved ASCII :-).
The only reason I have heard for this was that the document should still
be searchable with ASCII-only. Search engines can deal with accents
without problems, so let's call this the 'grep on a plane' use case.
What I propose is that for XML and HTML, we hide the ASCII somewhere
where it's still searchable (e.g. in an attribute). For plain text,
maybe we can create an Appendix that only contains ASCII texts so that
they get caught with grep. Human beings can ignore it.
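
A minimal sketch of the attribute idea, in Python with the standard
xml.etree.ElementTree module. The element name "name" and the attribute
name "ascii" are hypothetical and chosen only for illustration; they are
not taken from the draft or from any RFC vocabulary.

```python
import xml.etree.ElementTree as ET


def name_element(display: str, ascii_form: str) -> ET.Element:
    """Build an element that shows `display` and carries `ascii_form` in an attribute.

    "name" and "ascii" are hypothetical names used for this sketch only.
    """
    el = ET.Element("name", {"ascii": ascii_form})
    el.text = display
    return el


el = name_element("Patrik Fältström", "Patrik Faltstrom")

# Serialize to a str so that non-ASCII characters are kept as-is.
serialized = ET.tostring(el, encoding="unicode")
print(serialized)

# A plain ASCII substring search, which is what grep does, still matches
# because the ASCII form is present in the attribute value.
print("Faltstrom" in serialized)
```

Readers see only the element content, while an ASCII-only search over the
source file still finds the name through the attribute.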
As a separate issue, I'd suggest that in the Acknowledgments example,
instead of "陈智昌 (William Chan)", we use "William Chan (陈智昌)",
because this way, the text flows better. This is also what the W3C or
people such as Don Knuth use. Of course if somebody like William prefers
"陈智昌 (William Chan)", we can allow that, but it shouldn't be what we
On 2014/10/29 06:28, Heather Flanagan (RFC Series Editor) wrote:
> On 10/28/14 4:43 PM, Nico Williams wrote:
>> On Mon, Oct 27, 2014 at 8:45 PM, "Martin J. Dürst"
>> <duerst at it.aoyama.ac.jp> wrote:
>>> Then there is the issue of different languages having different font
>>> preferences. The most widely known example is Chinese vs. Japanese.
>>> the characters are the same, and legible either way, the fonts that the
>>> Chinese are used to and the fonts that the Japanese are used to are quite
>>> different. Showing a Japanese text with a standard Chinese font feels
>>> and the same the other way round. That's why XML and HTML have
>>> attributes, and we should make sure they are usable. [...]
>> This is probably a problem that I bet the RSE will have to deal with.
>> I'm glad you pointed it out.
There is a very small example in the nonascii draft:
"Patrik Fältström (Patrik Faltstrom), 陈智昌 (William Chan), and
The attached .gif shows how this appears in Firefox on Windows. It's
very much the same on IE and Safari, but not on Chrome or Opera (old or
new). It's also visible for me here when writing this mail (Thunderbird,
so not surprising that it's similar to Firefox).
What's the problem? (Please check the .gif; the mileage in your browser
may vary.) The first character in William's name (his family name) is
shown in a different font (thinner and with serifs) than the last two
characters (his given name, thicker and no serifs). It would be similar
to having your family name written e.g. in Times Roman and your given
name in a bold Helvetica.
Why does this happen? I'm on a Japanese OS, and that makes (some)
browsers try to prefer Japanese fonts over Chinese ones, but they have
different coverage. William's family name isn't used in Japanese, and so
isn't covered by most Japanese fonts. So the rendering falls back to a
Chinese font, where the character is available. But these fonts differ
When I enclosed William's Chinese name in <span lang='zh'> </span>
(or <span lang='zh-Hans'> </span>; zh is the code for Chinese, zh-Hans
is the code for Simplified Chinese), the browser used a Chinese font
from the start and the problem went away. So I strongly suggest that we
allow the xml:lang attribute on all relevant elements (we probably
already do), and make sure that we recommend its use in particular for
names (and other text) in Chinese and Japanese and other languages that
might benefit from it for glyph selection.
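
A minimal sketch of the markup described above, in Python. The helper
name lang_span is hypothetical; the only things taken from this message
are the <span> element, the lang attribute, and the tags zh and zh-Hans.

```python
from html import escape


def lang_span(text: str, lang: str) -> str:
    """Wrap `text` in an HTML span that declares its language.

    `lang_span` is a hypothetical helper for illustration; `lang` is a
    language tag such as "zh" or "zh-Hans".
    """
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# Tag the whole name so the browser can pick one Chinese font for all of it.
tagged = lang_span("陈智昌", "zh-Hans")
print(tagged)
```

Tagging the complete name, rather than individual characters, is what
lets the browser choose a single font for all three characters.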