[rfc-i] Input Syntax vs Canonical Form/rfcedstyle vs Output Formats [was: Re: Comments on draft-hoffman-xml2rfc-06]

Elwyn Davies elwynd at folly.org.uk
Thu May 1 16:19:54 PDT 2014


Hi.

On Wed, 2014-04-30 at 20:46 -0700, Paul Hoffman wrote:
> As always, thanks. Comments below; other stuff is now in the pre-draft for -07.
> 
:-)

> --Paul Hoffman

Splitting out a couple of the issues and your responses:
> 
> On Apr 30, 2014, at 5:54 PM, Elwyn Davies <elwynd at dial.pipex.com> wrote:
> 
<<snip>>
> > s2.7.3, "initials": I guess the spacing stuff was too much like
> > formatting.  Are we still allowed optional space in the string? (If only
> > for backward compatibility!)
> 
> That's a question for the Style Guide and thus Heather, not me.
> 
<<snip>>
> > s2.16.2: I (still) don't see why the month can't be alternatively
> > specified as a month number (possibly easier for non-mother-tongue
> > authors).  Your argument that this was a style issue doesn't seem to
> > hold water, since the formatter can map from numbers just as easily if
> > that is what the style requires.
> 
> I think I said it was an issue for the Style Guide. That is, take it up in a separate thread for that document and Heather, not this one and me. 
> 
<<snip>>

Also adding in:

> 2.46.8.  'obsoletes' attribute
> 
> 
>    A comma-separated list of RFC numbers or Internet-Draft names.
> 
>    Processors ought to parse the attribute value, so that incorrect
>    references can be detected and, depending on output format,
>    hyperlinks can be generated.  Also, the value ought to be reformatted
>    to insert whitespace after each comma if not already present.
> 
> 
> 2.46.15.  'updates' attribute
> 
> 
>    A comma-separated list of RFC numbers or Internet-Draft names.
> 
>    Processors ought to parse the attribute value, so that incorrect
>    references can be detected and, depending on output format,
>    hyperlinks can be generated.  Also, the value ought to be reformatted
>    to insert whitespace after each comma if not already present.
> 

Stepping back from this...

I think it may be that these four points are pointing out that we (or
maybe just me) need to think a bit more about the role of the canonical
form vs the v3 vocabulary.  

The v3 vocabulary plays three slightly different roles:
- It is an input syntax in which the original authoring is done
- It is an output format in which the canonical form is generated
- It is an input syntax for formatters/renderers to create output from
either the original input or the canonical form.

Jumping ahead a bit, I am sort of assuming that the RFC
editing/production process will want some mechanical assistance with
converting the input documents into the canonical form.  This is (I
guess) a souped up version of the xml2rfc output mode that the existing
tooling has.  That is, it is a sort of formatter or
'canonicalizer' (ugh!).

I would suspect that, among other things, this tool might be:
- Pretty printing the XML
- Inserting the autogenerated items in the vocabulary
- Maybe adding a comment block at the beginning with document ID,
copyright, abstract, etc., etc., in a form that is easy to read for
humans (I could see that this might look a whole lot like the 'front'
part of a current text RFC :-) ).

In addition I could envisage that this formatter should enforce (rather
than just checking) some of the more mechanical RFC editor style rules
(e.g., ToC depth), which is where we come back to the issues/items
above.

My points about s2.7.3 and s2.16.2 are issues with the *input syntax*:
- In s2.7.3, is white space allowed?
- In s2.16.2, could the month be given as a number instead of as text?

The quoted bits from s2.46 are a bit of a mix: they imply that white
space is allowed in the input, but that the canonicalizer at least (and
probably any other formatter) should canonicalize the layout (as well as
checking that the numbers make sense).
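
As a rough illustration of what s2.46.8 and s2.46.15 ask of a processor,
here is a minimal Python sketch.  The function name and the two patterns
are assumptions made for this example only (bare digits for an RFC
number, a lowercase "draft-..." string for an Internet-Draft name); they
are not taken from the draft or from any existing xml2rfc code.

```python
import re

# Assumed input forms (hypothetical, not defined by the draft):
#   RFC number           -> digits only, e.g. "2026"
#   Internet-Draft name  -> "draft-" followed by hyphen-separated tokens
_RFC_NUMBER = re.compile(r"^\d+$")
_DRAFT_NAME = re.compile(r"^draft-[a-z0-9]+(?:-[a-z0-9]+)*$")


def canonicalize_ref_list(value):
    """Parse an 'obsoletes'/'updates' attribute value.

    Returns (canonical, errors): the accepted items rejoined with a
    comma and a single space, plus the items that did not parse.
    """
    accepted = []
    errors = []
    for raw in value.split(","):
        item = raw.strip()          # any amount of white space is tolerated
        if not item:
            continue                # skip empty entries such as "1,,2"
        if _RFC_NUMBER.match(item) or _DRAFT_NAME.match(item):
            accepted.append(item)
        else:
            errors.append(item)     # report rather than silently drop
    return ", ".join(accepted), errors


print(canonicalize_ref_list("2026,3978 ,  draft-foo-bar-01"))
print(canonicalize_ref_list("2026,bogus!"))
```

The input side stays liberal about white space, while the output side
always emits one layout.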

So my point is that the vocabulary should specify that variants are
possible (white space in varying quantities or none; month numbers or
English names); this is only peripherally related to the RFC editor
style.  On the other hand, the canonicalizer could be expected to
generate the canonical form as per rfcedstyle (e.g., period and single
space between initials, comma and single space between items in
updates/obsoletes lists, month names in dates).  These of course still
match the v3 vocabulary.
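
To make the input-variant vs. canonical-output split concrete for
initials and months, a small Python sketch follows.  Both function names
are hypothetical, and the rules encoded (a period followed by one space
between initials; full English month names) are simply the examples
given above, not a statement of what rfcedstyle actually requires.

```python
import re

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]


def canonical_initials(value):
    """Put exactly one space after each period that precedes another initial.

    Only the spacing is touched; strings without periods are returned
    unchanged apart from trimming.
    """
    return re.sub(r"\.\s*(?=\w)", ". ", value.strip())


def canonical_month(value):
    """Accept a month number ("5", "05") or an English name in any case."""
    text = value.strip()
    if text.isdigit():
        number = int(text)
        if 1 <= number <= 12:
            return MONTH_NAMES[number - 1]
        raise ValueError("month number out of range: " + text)
    for name in MONTH_NAMES:
        if name.lower() == text.lower():
            return name
    raise ValueError("unrecognised month: " + text)


print(canonical_initials("J.R.R."))   # J. R. R.
print(canonical_month("05"))          # May
```

Either input variant yields the same canonical output, which is itself
still valid input.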

The canonicalizer could be mentioned in the discussions of formatters.

Regards,
Elwyn



