[rfc-i] RFC editing tools
mellon at fugue.com
Fri Dec 7 11:38:49 PST 2012
On Dec 7, 2012, at 1:53 PM, Paul Hoffman <paul.hoffman at vpnc.org> wrote:
>> Right, then what you are talking about is a presentation form, not a representation form.
> ...or both. We're talking about making it both.
Yes, you're talking about making it both a dessert topping and a floor wax.
> Ummm, no. If the form is both a presentation and a representation, the "produce the presentation form" tool takes the representation form and spits out whatever it wants. Same as it ever was.
No, it's not the same, because the representation isn't in the structure of the HTML, but has to be parsed out of the HTML and the text.
>>> Again: please show an example of "them" so we can understand what you are objecting to.
>> Sections 4.5 and 4.6 of the draft are good examples of what I'm talking about. I think it's no accident that these sections are currently incomplete.
> Are you expecting the author and references to appear without any formatting in the final document? That seems... odd. Why wouldn't the RFC Editor's tools format those to follow all the picky preferences we seem to impose? These are exactly the things we expect to be formatted by the tools.
If the representational form, which you seem to think should be HTML, has to be processed by the RFC Editor into a different HTML file in order to be usable, then there is no benefit to using HTML. You could use any parseable representation format and get the same output.
That being the case, why not choose a representation that's easy to parse, instead of the proposed HTML representation, which will be difficult to parse?
To be clear, what I mean by "easy to parse" is that if the representation is some form of XML, the parser just has to read in the XML, and it has a tree structure containing the document, which can be processed and spat out. The parsing process is trivial.
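As a rough illustration of "the parsing process is trivial", here is a minimal sketch using Python's standard xml.etree.ElementTree. The element and attribute names (rfc, front, author, fullname, org) are made up for the example and are not taken from any actual RFC format:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element and attribute names are assumptions
# chosen for illustration, not an actual RFC vocabulary.
DOC = """<rfc>
  <front>
    <title>Example Document</title>
    <author fullname="A. Author" org="Example Org"/>
  </front>
  <section title="Introduction"><t>Some text.</t></section>
</rfc>"""


def load(doc):
    """Parse the XML text; the result is already a tree of elements."""
    return ET.fromstring(doc)


def authors(root):
    """Read author data straight from attributes; no text parsing needed."""
    return [(a.get("fullname"), a.get("org")) for a in root.iter("author")]


root = load(DOC)
print(root.findtext("front/title"))
print(authors(root))
```

Everything a formatter needs is already in the tree after one call to the parser; any output form can be generated by walking it.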
The format Joe has proposed is "hard to parse" because the parser first has to read in HTML, not XML; HTML itself is hard to parse because some tags (e.g., <br>) do not have to be terminated. But there's a further complexity: once the HTML has been successfully transformed into a tree structure, the parser has to groom that tree, examining each element to see whether the enclosed text requires special parsing and, if so, doing that parsing. There is no grammar for that text; it's just free-form. Only once this is done do we have a tree structure containing the document, which can be processed and spat out in a different form.
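To make the two stages concrete, here is a minimal sketch using Python's standard html.parser. The markup, the "author" class name, and the "Email:" convention are assumptions invented for the example; they are not taken from Joe's draft:

```python
from html.parser import HTMLParser
import re

# Hypothetical presentation-style markup: the class name and the layout of
# the text are assumptions for illustration only.
PAGE = ('<div class="author">A. Author<br>Example Org<br>'
        'Email: author@example.org</div>')


class AuthorText(HTMLParser):
    """Stage 1: collect the text inside <div class="author">, one entry per
    <br>-separated line. Note that <br> is never closed."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.lines = [""]

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "author") in attrs:
            self.inside = True
        elif tag == "br" and self.inside:
            self.lines.append("")

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.lines[-1] += data


def parse_author(page):
    """Stage 2: guess the meaning of each free-form line. There is no
    grammar here, only heuristics (first line is the name, and so on)."""
    parser = AuthorText()
    parser.feed(page)
    parser.close()
    lines = [line.strip() for line in parser.lines if line.strip()]
    record = {"name": lines[0], "org": None, "email": None}
    for line in lines[1:]:
        match = re.match(r"Email:\s*(\S+)", line)
        if match:
            record["email"] = match.group(1)
        elif record["org"] is None:
            record["org"] = line
    return record


print(parse_author(PAGE))
```

The second stage is where the trouble lies: the heuristics work only as long as every author block happens to follow the same unwritten layout.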
So I think it's simply wrong to claim that the HTML format Joe proposes is a representational format. It is a presentational format, and not a bad one. But making it the canonical format for RFCs means that we lose the benefit of a good representational format: that it can be easily transformed for multiple different uses.
What Joe's draft has started to document is what an RFC should look like when it's presented for viewing in a browser, not what the canonical format of an RFC should be.