[rfc-i] Pagination requirements

Martin Rex mrex at sap.com
Tue May 15 17:07:19 PDT 2012


Peter Saint-Andre wrote:
> 
> Julian Reschke wrote:
> >
> > It's simply extremely hard to explain in a specification how to deal
> > with non-ASCII characters if you can't use them.
> > 
> > Read RFC 3987, then please come back and explain why this is not a problem.

Could you point to a specific location in that document where you believe
that the lack of non-ASCII glyphs is a problem?  I don't see one.


> 
> Yes, more and more specifications in the Apps Area, RAI Area, and even
> Security Area need to say something about internationalization. They
> don't all need to include non-ASCII characters, but for those that do
> this issue needs to be addressed.

I'm pretty sure that there are fewer than 1 in 200 RFCs where this
would make sense at all.


> P.S. Here's a relevant paragraph from a discussion about visually
> similar characters in the security considerations of an I-D I'm working
> on (draft-ietf-precis-framework)...
> 
>    However, the problem is made more serious by introducing the full
>    range of Unicode code points into protocol strings.  For example, the
>    characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the
>    Cherokee block look similar to the US-ASCII characters "STPETER" as
>    they might look when presented in a "creative" font.
> 
> It would be helpful to include the actual characters, not just the
> Unicode codepoint numbers:

Helpful?  How?  Your example comes out as garbled noise here:

> 
>    However, the problem is made more serious by introducing the full
>    range of Unicode code points into protocol strings.  For example, the
>    characters U+13DA U+13A2 U+13B5 U+13AC U+13A2 U+13AC U+13D2 from the
>    Cherokee block, i.e., "á\217\232á\216¢á\216µá\217\213á\216¢á\217\213á\217\222", look similar to the US-ASCII
>    characters "STPETER" as they might look when presented in a
>    "creative" font.

I absolutely do not see a need to "demonstrate" this.  In particular,
most fonts do not even have glyphs for such codepoints, and a lot of
software is not capable of displaying them anyway.
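
The garbled string quoted above is what UTF-8 bytes tend to look like on a
display that only handles ISO 8859-1.  A minimal Python sketch of that
effect, assuming the display prints Latin-1 characters as-is and shows all
other bytes as octal escapes (the `garble` helper is hypothetical and is not
something used in this thread):

```python
def garble(text: str) -> str:
    """Render the UTF-8 bytes of `text` as a Latin-1-only display might.

    Assumption: printable ASCII and Latin-1 bytes (0xA0 and above) are shown
    as characters; every other byte is shown as a backslash octal escape.
    """
    out = []
    for b in text.encode("utf-8"):
        if 0x20 <= b < 0x7F or b >= 0xA0:
            out.append(chr(b))
        else:
            out.append("\\%03o" % b)
    return "".join(out)


# First Cherokee code point from the quoted draft text.
print(garble("\u13da"))
# Plain US-ASCII passes through unchanged.
print(garble("STPETER"))
```

For U+13DA this yields `á\217\232`, which matches the first group in the
quoted text.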

When I create an HTML document with your example codepoints:

    ASCII:     STPETER<p>
    Cherokee: &#x13DA;&#x13A2;&#x13B5;&#x13AC;&#x13A2;&#x13AC;&#x13D2;<p>

MSIE displays empty square boxes, and FF 3.6 displays square boxes with
tiny hex digits inside.  Maybe you should have used glyphs from a less
exotic block than Cherokee, e.g. Cyrillic (but it's lacking the R):

    Cyrillic: &#x0405;&#x0422;&#x0420;&#x0415;&#x0422;&#x0415;R<p>
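
The numeric character references in the examples above can be resolved
without relying on font support.  A small sketch using only the Python
standard library (the variable name is illustrative):

```python
import html
import unicodedata

# Resolve the numeric character references from the Cyrillic example.
cyrillic = html.unescape("&#x0405;&#x0422;&#x0420;&#x0415;&#x0422;&#x0415;R")

# Print each code point with its Unicode character name, so the
# look-alikes can be identified even when no glyph is available.
for ch in cyrillic:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Listing the names makes it visible that only the final "R" is a Latin
letter; the other six are Cyrillic capitals.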

Anyway, I consider it completely unnecessary to try (and fail) to demonstrate
that different Unicode codepoints can have similar glyphs.
It is perfectly sufficient to state that this is the case and leave all
the rest to the Unicode SDO.


-Martin

