[rfc-i] Comments on draft-iab-rfcformatreq

Martin Rex mrex at sap.com
Fri Dec 21 02:03:31 PST 2012

Brian E Carpenter wrote:
> On 20/12/2012 17:18, Ted Hardie wrote:
> > On Wed, Dec 19, 2012 at 11:44 PM, Martin Rex <mrex at sap.com> wrote:
> > 
> >> I don't call them anything.  They're non-ASCII, cannot be
> >> typed on my keyboard, and are completely irrelevant to the languages
> >> that I understand.
> >
> > The idea that we don't use the trema in English to indicate diaeresis is
> > just naïve.  It's used in names, to indicate that two adjacent vowels do
> > not form a diphthong, and to return to voice what would otherwise be a
> > silent e.

Since I'm a native German speaker, I know the term "Umlaut" for the accented
characters (ä,ö,ü,Ä,Ö,Ü), but not the name for the accent itself, and my
(German) keyboard does not have a means to create the accent alone (there is
also no codepoint for just this accent in ISO 8859-1 (aka Latin-1) or 8859-15).

> In British English we write "naive" and "Noel", but if I had chosen to use
> my private address for RFCs published while I lived near Zürich, I would have
> needed to include the street name "Im Broëlberg" (which Germans may find hard
> to stomach, but it is the correct spelling of a street in Kilchberg, Zürich).
> It matters, because "Broelberg" could be misunderstood as an Anglicisation
> of "Brölberg", which is not the same thing.

That sounds really weird to me.  maps.google.com and www.openstreetmap.org
know Broelberg and Brölberg.

The variant you list with the accent on top of the letter e looks really
weird (and somewhat implausible).  Switzerland isn't Germany, but anyhow.

Historically, the accent above the three German umlauts is a small
letter "e" written in the old German script "Sütterlin", so it used to
be short strokes rather than dots for many decades.  This also explains why
umlauts can be expanded into two Latin characters (ä->ae), simply by
converting the degenerated "e" represented by the accent back into
a full-blown "e" character.

Look at the lower case "e" here and the shape of the accents on the umlauts:

A lowercase "e" character with two dots above would indicate a term
from a language other than German.
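
To make the expansion rule concrete, here is a minimal sketch in Python.
The table and the function name expand_umlauts are my own illustration,
not taken from any standard, and the sketch assumes that only the six
characters listed above are expanded.  An "e" with two dots is not in the
table, so it passes through unchanged.

```python
import unicodedata

# Hypothetical mapping for illustration only: the six German umlauts
# and their conventional two-letter ASCII expansions.
UMLAUT_TO_ASCII = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}

def expand_umlauts(text: str) -> str:
    """Expand German umlauts into two Latin letters each."""
    # Compose "o" + combining diaeresis into a single "ö" first, so that
    # both encodings of the same character hit the table.
    composed = unicodedata.normalize("NFC", text)
    return "".join(UMLAUT_TO_ASCII.get(ch, ch) for ch in composed)

print(expand_umlauts("Brölberg"))
print(expand_umlauts("Broëlberg"))
```

Note that the expansion only works in one direction: given the ASCII string
"Broelberg" alone, one can not tell whether the "oe" stands for an umlaut or
for two separate vowels.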

> A rule that we can use Unicode characters for such reasons seems necessary.

I don't mind Unicode characters in the illustrative parts -- when they're
used sparingly (be conservative in what you send out), but they ought to be
absent from the normative parts, where they're unnecessary and mainly
create ambiguity and impair communication.

A few years ago I was working on a project involving an engineer
from Russia.  His emails showed up with a long row of question marks
as the sender's real name in my Outlook 2003 inbox.  If I double-clicked on
the question marks, I got a popup window where the name was displayed
in Cyrillic characters.  But does that help in any way?  Nope.
Both look like random junk to me: I cannot recognize either, I cannot
pronounce it, I cannot type it.  Of his own accord, that engineer
used all Latin characters for his name "Sergey" at the end of his emails.

I still believe that interop is an important goal in the IETF, and that
means that all relevant information needs to be provided in a form that
is highly interoperable (English, ASCII).  Any non-English and non-ASCII
information needs to be clearly separate and optional, so that it can
be safely ignored by those who do not recognize it.

