[rfc-i] [IAB Trac] #269: Discussion of UTF-8 in RFCs (Section 3.3)
pkyzivat at alum.mit.edu
Wed Feb 27 14:55:13 PST 2013
My basic concern is that we provide sufficient tools to ensure that
drafts conform to our formatting rules. If the decision about validity
of UTF-8 usage is specified as being subjective, then we must identify
who is expected to apply the subjective judgement, and give them
sufficient tools to do the job. idnits could serve as such a tool by
issuing warnings, but a high false positive rate would increase reviewer
burden and make it more likely that real errors are missed.
So anything that can be done to automate more of this checking and
reduce the false positive rate will be important.
One possibility would be to tag sections in the markup version of the
draft with metadata describing whether they are normative or not. But
then idnits would need to use the markup version rather than the display
version. And tagging sections as normative or not could be burdensome.
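To make the tagging idea concrete, here is a rough sketch of what such a
check might look like. It is not how idnits works today; the Section
record, its "normative" flag, and the utf8_warnings function are all
hypothetical names made up for illustration, and it assumes the markup
has already been parsed into tagged sections.

```python
from typing import List, NamedTuple


class Section(NamedTuple):
    title: str
    normative: bool  # hypothetical metadata tag taken from the markup source
    text: str


def utf8_warnings(sections: List[Section]) -> List[str]:
    """Return one warning per line with non-ASCII text in a normative section."""
    warnings = []
    for section in sections:
        if not section.normative:
            continue  # sections tagged informative are skipped in this sketch
        for lineno, line in enumerate(section.text.splitlines(), start=1):
            # Collect the distinct non-ASCII characters on this line.
            non_ascii = sorted({ch for ch in line if ord(ch) > 127})
            if non_ascii:
                warnings.append(
                    f"{section.title}, line {lineno}: non-ASCII characters "
                    f"{''.join(non_ascii)!r}"
                )
    return warnings
```

Note that a per-section tag would still flag an example string that sits
inside a normative sentence, so it would reduce the number of warnings
without removing the need for a reviewer's judgement.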
On 2/28/13 6:28 AM, Dave Thaler wrote:
> Paul Kyzivat wrote:
>> On 2/28/13 1:39 AM, Heather Flanagan (RFC Series Editor) wrote:
>>> On 2/26/13 11:52 PM, Brian E Carpenter wrote:
>>>> A test case if I may. Is this normative or informative use of UTF-8?
>>>> "UTF-8 strings MUST be allowed (for example, 'smörgåsbord')."
>>>> The first half is clearly normative but the second half, IMHO, isn't.
>>>> I'm not trying to be clever here; I am genuinely unsure what your
>>>> text means to me as an author or reviewer.
>>> This is the same type of judgement call made on a regular basis when
>>> it comes to deciding things like references. I expect most cases will
>>> be obvious, and edge cases will need to be discussed, exactly as they
>>> are today.
>> While I think we need to allow this, the subjective nature of the decision means
>> that idnits can't be authoritative about it. It will probably need to flag most
>> instances of UTF-8 as warnings, and in some documents there could be
>> *many* of those. The only clean solution I see to that is to permit UTF-8
> Some portions, like the page header, abstract, and references section could
> be omitted from idnits checking for UTF-8.
> For other sections which could have both normative and non-normative
> content, I agree it would be good for idnits to generate warnings
> (much as it does for certain classes of IP addresses whenever they're
> used in a doc).
> I disagree that permitting UTF-8 everywhere is "clean" :)