[rfc-i] Potential RFC format approach: HTML

Joe Hildebrand jhildebr at cisco.com
Sat Mar 24 07:32:06 PDT 2012

On 3/24/12 2:33 AM, "Julian Reschke" <julian.reschke at gmx.de> wrote:

> +1 in general. The requirement "expectation that it will continue to
> work" isn't really helping, because in practice, anything that is widely
> used today isn't going to go away.

I was thinking of stuff on the way out, like blink, and using strong instead

> I believe what's as important is to profile the document structure,
> define default CSS (we want some uniformity, right?), and recommend
> certain ways to use metadata. The good thing is, if an xml2rfc++
> vocabulary is used as a starting point, we just need to tweak the
> XML->HTML generation tool.


>> - Make the decision that the files will ALWAYS be encoded UTF-8.
> It's not strictly necessary as long as the encoding is properly
> declared, but certainly helpful.

Settling on exactly one makes writing tools MUCH easier.  If it's UTF-8,
then a lot of the existing Unix text infrastructure, like grep, will mostly
work.  Since English is the primary specification language of the IETF,
UTF-8 is a clear winner in size, as well.
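
As a rough illustration of the size and grep points (a sketch only: the
sample sentence is made up, and UTF-16 is just one alternative encoding
picked for comparison):

```python
# Compare encoded sizes of a mostly-ASCII English sentence.
# The sample text is an invented example, not taken from any RFC.
sample = "Implementations MUST ignore fields they do not understand."

utf8_bytes = sample.encode("utf-8")
utf16_bytes = sample.encode("utf-16-le")  # little-endian, no byte order mark

utf8_size = len(utf8_bytes)
utf16_size = len(utf16_bytes)

# For pure-ASCII text, UTF-8 uses one byte per character and
# UTF-16 uses two, so the UTF-16 form is twice as large.
print(utf8_size, utf16_size)

# Byte-oriented tools in the style of grep can match ASCII patterns
# directly against UTF-8 data; the same pattern does not appear as a
# contiguous byte run in the UTF-16 form.
print(b"MUST" in utf8_bytes, b"MUST" in utf16_bytes)
```

Text with a lot of non-ASCII characters would narrow or reverse the size
gap, which is why the English-first assumption matters here.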

>> - Define a very strict structure, using a microformat/semantic style, that
>> makes it easy to pull out information semantically with a little bit of
>> jQuery in post-processing tools.  Make a choice about XML-style
>> well-formedness, which is probably not needed.
> +-0; it's possible to embed all metadata in the HTML, but that makes us
> depend on conventions (microformats) or specs that are currently a bit
> in flux (RDFa vs Microdata). I'd prefer to focus on the set of
> information we need, and to make sure that they are properly captured in
> the source format (as opposed to the HTML which I currently would see as
> a publication format only).

Ah.  I'd like HTML to be the source format if possible.  I'm not talking
about getting overly standards-y in the definition; when I said
"microformat/semantic style", I meant using divs marked up with meaningful
semantic class names, rather than picking a side in RDFa vs. Microdata.
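
To make that concrete, here is a minimal sketch of pulling information
out by class name.  It uses Python's standard html.parser rather than
jQuery, and the class names (rfc-title, rfc-author, rfc-section,
rfc-heading) are hypothetical placeholders, not a proposed vocabulary:

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text inside every <div> carrying a given class name."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0      # div nesting depth while inside a matching div
        self.results = []   # one text string per matching div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            # Nested div inside a match: track depth so we know when
            # the matching div itself closes.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.results[-1] += data


def extract(html, wanted_class):
    """Return the stripped text of each div with the given class."""
    parser = ClassTextExtractor(wanted_class)
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.results]


# Hypothetical document fragment using the placeholder class names.
doc = """
<div class="rfc-title">Example Protocol</div>
<div class="rfc-author">A. Author</div>
<div class="rfc-section"><div class="rfc-heading">1. Introduction</div>Text.</div>
"""

print(extract(doc, "rfc-author"))
print(extract(doc, "rfc-heading"))
```

The point is only that a strict, predictable structure keeps this kind
of post-processing short, without needing RDFa or Microdata.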

>> - Ensure that the document is self-contained, embedding styles and images,
>> so that we can do signatures easily.  This means that newer files may look
>> different from older files, and that's ok.  Greasemonkey or its successors
>> can swap in whatever styles you want.
> Again, if we have a separate source format that requirement wouldn't
> apply to the output format.

I'd like to be able to pass single files around, but I agree I could be
talked out of this requirement if need be.

>> - Ensure that we know how to refer to pieces of the document in ways other
>> than page and line number.  Doing this programmatically is easy with the
>> right structure, but we may need to think about what humans do.  It may be
>> easiest to just give the humans good tools to do review comments, like the
>> WHATWG does.
> Or like what I've started using recently in rfc2629.xslt? Example:
> <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html>

Yes, exactly.

>> - Ensure we've got good diff tools.  There are several adequate ones for
>> HTML already.
> I believe this is where using (X)HTML as the one and only format gets
> tricky. The HTML diff tools I have seen get easily confused (for
> instance, whatever is used for the one variant of AUTH48 diffs).

Just because we're using bad diff tools doesn't mean that a) good ones don't
exist, or b) they're impossible to write.  The W3C has a list of some
candidates here:


> We need to be extra careful not to give up on rfcdiff until we have
> something that works as well. This would mean keeping text/plain as one
> output format (maybe not "the" output format) for the time being.

That seems prudent.

>> - Tweak the output from xml2rfc to generate the agreed-upon structure,
>> either for transition, or for people that want to have a high probability of
>> passing the nit checks the first time.
> And I volunteer to tune rfc2629.xslt for different HTML output
> requirements. (And of course I'd like to participate in defining what
> that format would be).

Of course.  I actually use your XSLT stuff myself, and had rudely assumed
that you'd probably be the first to implement anyway. :)

>> - Begin the transition to HTML being the authoring format as well as the
>> archive format.  Help people set up their graphical HTML editors to generate
>> the structure.  With a good template, snippets, instructions, and hooks to
>> the nit tool, this seems achievable.  It should also be possible to write a
>> pretty comprehensive editing suite that runs right in the browser, if
>> someone has the time.
> I'm very skeptical about HTML being the authoring format (as opposed to
> the publication format).

That should probably be our point of discussion this week.  I'm betting
there are several points of common ground we can work from to come up with
an approach.

>> - Spend a little money to hire a real designer to make the output sing.  The
>> goals would be readability and the perception of expertise and stability.
> Maybe.

Or induce one of the designers we know to help, of course.

Joe Hildebrand
