[rfc-i] Proposed new RFC submission requirements

Joe Hildebrand jhildebr at cisco.com
Sat May 26 00:00:23 PDT 2012


On 5/26/12 12:40 AM, "Joe Touch" <touch at isi.edu> wrote:

> Here's the counterexample:
> 
> heading
> para
> para
> para
> list item
> list item
> list item
> list item
> 
> Is that one list of four items? Is it two lists of two items each? Where is
> the list container? Does the list belong to the paragraph that precedes it, or
> as a separate container belonging to the heading level?

I was only talking about sections.  There's no good way in HTML to do list
items without an ol or ul around them, and I don't believe that Word is
generating lists without wrappers.  In a text-only format, this is exactly
the sort of ambiguity that the doc doesn't have enough structure to resolve.
Absent that data, I'd say that it's one list with four items.
If it doesn't matter to the author enough to use a tool that preserves his
or her intent, then nobody else is likely to care about the difference
downstream.
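
A minimal sketch of that default, assuming the post-processor sees the
document as a flat sequence of (kind, text) pairs.  The kind labels "h",
"p", "li" and "ul" and the name wrap_lists are hypothetical, not anything
Word or xml2rfc emits: every run of adjacent list items becomes one list
container.

```python
from itertools import groupby


def wrap_lists(blocks):
    """Wrap each run of consecutive list items in a single list container."""
    out = []
    # groupby yields one group per run of adjacent blocks of the same kind.
    for kind, run in groupby(blocks, key=lambda block: block[0]):
        run = list(run)
        if kind == "li":
            # No data says otherwise, so the whole run is treated as one list.
            out.append(("ul", [text for _, text in run]))
        else:
            out.extend(run)
    return out


blocks = [
    ("h", "heading"),
    ("p", "para"), ("p", "para"), ("p", "para"),
    ("li", "list item"), ("li", "list item"),
    ("li", "list item"), ("li", "list item"),
]
wrapped = wrap_lists(blocks)
```

With the counterexample above as input, the last element of wrapped is a
single "ul" holding all four items.  Splitting it into two lists of two
would need information the flat input does not carry.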

>>> E.g., Word doesn't use that structure.
>> 
>> You post-process the output of Word anyway.  Whoever writes the
>> post-processing tool is going to have to write a few lines of code.
> 
> Some of it is easy - as you note, I can generate tags that contain sections
> within the headings that delimit them.
> 
> I cannot generate section containers for groupings that cannot be indicated by
> Word - as per the list above.

If Word is generating <li> without a <ul> or <ol> around it, there's a bug.

> Further, why group all the paragraphs under one heading? At least one output
> from Word treats them as one long paragraph with BRs in between, rather than
> as individual paragraphs.

English text contains paragraphs.  In RFCs, we often group multiple
paragraphs together into a section; the lineprinter format uses a blank line
to delineate a paragraph boundary.
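
As a rough illustration of the blank-line convention (a sketch only; the
function name split_paragraphs is made up for this example), paragraphs can
be recovered from the text format like this:

```python
import re


def split_paragraphs(text):
    """Split plain text into paragraphs at blank lines, joining wrapped lines."""
    # A paragraph boundary is a newline, optional whitespace, then a newline.
    chunks = re.split(r"\n\s*\n", text.strip())
    # Hard line wraps inside a paragraph are folded back into single spaces.
    return [" ".join(line.strip() for line in chunk.splitlines())
            for chunk in chunks]


sample = "First line\nwrapped.\n\nSecond paragraph.\n"
paragraphs = split_paragraphs(sample)
```

Here paragraphs holds two entries, one per blank-line-delimited block.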

Word knows how to deal with paragraphs.  It's inserting <br>s in order to
gain control over line splitting, which is one of the problems we're trying
to solve in the "reflowing" discussion.

> That isn't the same structure most people use when writing XML2RFC, but it is
> just as valid.

If you know the size of the screen of the device that the reader is going to
use, and there's only one such size, perhaps.  We don't live in that world
anymore.

>>> I could add it for heading sections
>>> (it's easy to generate from nested navigation tags), but cannot generate it
>>> for lists or other sections - there's no way to differentiate between a set
>>> of
>>> paragraphs that are not related and ones that are, so there's no way to
>>> group
>>> them in a container.
>> 
>> I assume the sections are separated by a header, which has a depth
>> associated with it?  Everything between headers is in the same section.
> 
> But not necessarily the same container.

You can intuit a container (and add it if need be) if the sections are
separated.  I'd walk through the logic for it, but you haven't been
interested in algorithms to this point.  Perhaps you could either care, or
take my word for it?
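
For readers who do want the logic, one possible version is sketched below.
It assumes the flat output is a sequence of ("h", depth, title) and
("p", text) tuples; that tuple layout, the dictionary keys, and the name
nest_sections are illustrative assumptions, not the output of any
particular tool.

```python
def nest_sections(blocks):
    """Nest a flat run of headings and paragraphs into section containers."""
    root = {"title": None, "depth": 0, "body": [], "sections": []}
    stack = [root]  # the chain of currently open sections, outermost first
    for block in blocks:
        if block[0] == "h":
            _, depth, title = block
            # A heading closes every open section at the same or deeper level.
            while stack[-1]["depth"] >= depth:
                stack.pop()
            section = {"title": title, "depth": depth,
                       "body": [], "sections": []}
            stack[-1]["sections"].append(section)
            stack.append(section)
        else:
            # Everything between two headings belongs to the innermost section.
            stack[-1]["body"].append(block[1])
    return root


flat = [
    ("h", 1, "Introduction"), ("p", "para"),
    ("h", 2, "Terminology"), ("p", "para"),
    ("h", 1, "Security Considerations"), ("p", "para"),
]
tree = nest_sections(flat)
```

The sketch assumes heading depths start at 1.  It keeps a section's own
paragraphs and its subsections in separate lists, so it does not record
how the two were interleaved in the source.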

> You've only given the same reason repeatedly - editing. Support for editing
> was not given for any formats except authoring, which we all seem to agree
> ought to be up to authors.

No, I've given two.  Programmatic editing is one, and information extraction
is the other.  The extraction function has nothing to do with editing, since
it does not modify the file.

-- 
Joe Hildebrand


