julian.reschke at gmx.de
Mon Feb 29 11:01:54 PST 2016
On 2016-02-29 19:46, Heather Flanagan (RFC Series Editor) wrote:
>> "Searches against RFC indexes and database tables need to return
>> expected results and support appropriate Unicode string matching
>> It's not clear what that means, in particular unless we define expected
> People expect search engines to be able to perform searches such that
> searching on "GEANT", for example, will return matches for both "GEANT"
> and "GÉANT". The reverse would also be true. I expect this is
> established enough behavior that we do not need to define it in more
> detail (insert implied question here).
OK, but how exactly does that affect the vocabulary?
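
For illustration only, here is a rough sketch of the kind of matching described above, where "GEANT" and "GÉANT" find each other. It assumes that "appropriate Unicode string matching" means comparing strings after canonical decomposition, removal of combining marks, and case folding; the draft does not say this, and the function names `fold` and `matches` are made up for the example.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a search key: NFD-decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, candidate: str) -> bool:
    """True if the folded query occurs in the folded candidate (assumed rule)."""
    return fold(query) in fold(candidate)


if __name__ == "__main__":
    print(matches("GEANT", "GÉANT"))  # search without the accent
    print(matches("GÉANT", "GEANT"))  # and the reverse
```

Note that this only handles characters with a canonical decomposition; letters such as "ø" or "ß" would need additional rules, which is part of why "expected results" may need a definition.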
>> "For names that include characters outside of the Unicode Latin and
>> Latin Extended script, an author-provided, ASCII-only identifier is
>> required to assist in search and indexing of the document."
>> It would be good to be more precise about what non-ASCII characters are
>> allowed (range?).
> My understanding is that "Latin Extended" is a reasonable way to capture
> Basic Latin (ASCII)
> Latin-1 Supplement
> Latin Extended-A
> Latin Extended-B
> Latin Extended-C
> Latin Extended-D
> Latin Extended-E
> Latin Extended Additional
OK, so those are the code ranges as per <http://www.unicode.org/charts/>; we
may want to include them here.
(I also note that there's an "IPA Extensions" code page I'll have to
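
To make the range question concrete, here is a small sketch that lists the blocks named above with their code point ranges as given in the Unicode charts, and reports which characters of a name fall outside them. The helper names (`block_of`, `outside_latin`, `needs_ascii_identifier`) are hypothetical, and treating whole blocks as "allowed" is an assumption: blocks also contain control characters and unassigned code points, so the draft might want something narrower.

```python
from typing import List, Optional

# Block ranges as listed in the Unicode code charts.
LATIN_BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x2C60, 0x2C7F, "Latin Extended-C"),
    (0xA720, 0xA7FF, "Latin Extended-D"),
    (0xAB30, 0xAB6F, "Latin Extended-E"),
]


def block_of(ch: str) -> Optional[str]:
    """Return the name of the listed block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in LATIN_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def outside_latin(name: str) -> List[str]:
    """Characters of name that are in none of the listed blocks."""
    return [ch for ch in name if block_of(ch) is None]


def needs_ascii_identifier(name: str) -> bool:
    """Assumed reading of the draft: required if any character is outside."""
    return bool(outside_latin(name))
```

With this table, IPA Extensions (U+0250 to U+02AF) is not covered, so a character such as "ɐ" would be reported as outside the allowed set.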
>> "Keywords and citation tags must be ASCII only."
>> What does "Keywords" refer to? The things we put into the xml2rfc
>> <keyword> element?
OK. Maybe state that explicitly, as the keywords are currently invisible in
the specs, so people might not get what this is about...
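
As a sketch of what such a check could look like, assuming "Keywords" does mean the content of xml2rfc <keyword> elements: the snippet below parses a document and returns the keywords that are not ASCII-only. The sample XML and the function name `non_ascii_keywords` are invented for the example; it needs Python 3.7 or later for `str.isascii()`.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical, heavily trimmed xml2rfc-style input for illustration.
SAMPLE = (
    "<rfc><front><title>Example</title>"
    "<keyword>unicode</keyword>"
    "<keyword>géant</keyword>"
    "</front></rfc>"
)


def non_ascii_keywords(xml_text: str) -> List[str]:
    """Return the text of every <keyword> element that is not pure ASCII."""
    root = ET.fromstring(xml_text)
    keywords = [(el.text or "") for el in root.iter("keyword")]
    return [kw for kw in keywords if not kw.isascii()]


if __name__ == "__main__":
    print(non_ascii_keywords(SAMPLE))
```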
Best regards, Julian