[rfc-i] Potential RFC format approach: HTML

Julian Reschke julian.reschke at gmx.de
Sat Mar 24 01:33:57 PDT 2012


On 2012-03-24 07:12, Joe Hildebrand wrote:
> It should surprise nobody that I think HTML is a good fit for the
> requirements I've heard thus far for a new RFC format.  Here is a set of
> steps that might be able to get us there.
>
>
> - Subset HTML.  The subset would be decided upon based on current wide
> support, expectation that it will continue to work, repeatability, stability
> of output, etc.  However, as long as old-ish browsers can render the output,
> they might not need to get the full experience.

+1 in general. The requirement "expectation that it will continue to 
work" isn't really helping, because in practice, anything that is widely 
used today isn't going to go away.

I believe it's just as important to profile the document structure, 
define default CSS (we want some uniformity, right?), and recommend 
certain ways to use metadata. The good thing is, if an xml2rfc++ 
vocabulary is used as a starting point, we just need to tweak the 
XML->HTML generation tool.

> - Make the decision that the files will ALWAYS be encoded UTF-8.

It's not strictly necessary as long as the encoding is properly 
declared, but certainly helpful.
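
As an illustration of what "properly declared" could mean for a tool, here is a minimal sketch (Python, standard library only) that reports the encoding a document declares in a <meta> element. The name declared_encoding is made up for this example; it is not part of xml2rfc or any other existing tool, and it ignores HTTP headers and byte order marks.

```python
from html.parser import HTMLParser


class _CharsetFinder(HTMLParser):
    """Records the first encoding declared in a <meta> element."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # Short form: <meta charset="utf-8">
            self.charset = attrs["charset"].strip().lower()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Long form: <meta http-equiv="Content-Type"
            #                  content="text/html; charset=...">
            content = attrs.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip().strip("\"'").lower()


def declared_encoding(html_text):
    """Return the declared encoding in lower case, or None if absent."""
    finder = _CharsetFinder()
    finder.feed(html_text)
    return finder.charset
```

A checker could accept any document for which this returns a value, or insist on "utf-8" if the stricter rule is adopted.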

> - Define a very strict structure, using a microformat/semantic style, that
> makes it easy to pull out information semantically with a little bit of
> jQuery in post-processing tools.  Make a choice about XML-style
> well-formedness, which is probably not needed.

+-0; it's possible to embed all metadata in the HTML, but that makes us 
depend on conventions (microformats) or specs that are currently a bit 
in flux (RDFa vs Microdata). I'd prefer to focus on the set of 
information we need, and to make sure that they are properly captured in 
the source format (as opposed to the HTML which I currently would see as 
a publication format only).
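
To make the dependency on conventions concrete, here is a rough sketch of a post-processing tool that pulls metadata out of the HTML by class name. The "rfc-" class prefix and the names MetadataExtractor and extract_metadata are purely hypothetical; nothing like them has been agreed on. The sketch also assumes that every non-void element is explicitly closed, which is exactly the kind of well-formedness choice mentioned above.

```python
from html.parser import HTMLParser

# Elements that never have an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class MetadataExtractor(HTMLParser):
    """Collects text of elements whose class starts with 'rfc-' (assumed)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}  # key -> list of text values, in document order
        self._stack = []    # one entry per open element: its key or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        key = next((c[4:] for c in classes if c.startswith("rfc-")), None)
        self._stack.append(key)
        if key is not None:
            self.metadata.setdefault(key, []).append("")

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost open element that carries a key.
        for key in reversed(self._stack):
            if key is not None:
                self.metadata[key][-1] += data
                break


def extract_metadata(html_text):
    """Return a dict mapping each metadata key to its stripped text values."""
    parser = MetadataExtractor()
    parser.feed(html_text)
    parser.close()
    return {k: [v.strip() for v in vals]
            for k, vals in parser.metadata.items()}
```

The extraction only works as long as every producer sticks to the same class names, which is the convention risk described above.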

> - Ensure that the document is self-contained, embedding styles and images,
> so that we can do signatures easily.  This means that newer files may look
> different from older files, and that's ok.  Greasemonkey or its successors
> can swap in whatever styles you want.

Again, if we have a separate source format that requirement wouldn't 
apply to the output format.

> - Define requirements for how we can add new capabilities over time.
>
> - Define requirements for accessibility.

Yes.

> - Build libraries that know about the structure for several programming
> languages.  Survey folks that parse the current text format for their
> requirements.  It seems like JavaScript might be a good starting point.
>
> - Write a tool that checks the HTML for the structure, then the several
> hundred nits that will grow up over time.
>
> - Ensure that we know how to refer to pieces of the document in ways other
> than page and line number.  Doing this programmatically is easy with the
> right structure, but we may need to think about what humans do.  It may be
> easiest to just give the humans good tools to do review comments, like the
> WHATWG does.

Or like what I've started using recently in rfc2629.xslt? Example: 
<http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-latest.html>
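
For the programmatic side of referring to parts of a document, a small sketch: given the HTML, list a fragment URL for every element that carries an id attribute. The names AnchorCollector and fragment_links are invented for this example, and it assumes the generating tool emits stable id values, which is not guaranteed by HTML itself.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class AnchorCollector(HTMLParser):
    """Collects the id attribute of every element, in document order."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id:
            self.ids.append(element_id)


def fragment_links(base_url, html_text):
    """Return one 'URL#id' reference per element that has an id."""
    collector = AnchorCollector()
    collector.feed(html_text)
    base, _ = urldefrag(base_url)  # drop any fragment already on the URL
    return [f"{base}#{element_id}" for element_id in collector.ids]
```

A review comment could then cite one of these references instead of a page and line number.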

> - Ensure we've got good diff tools.  There are several adequate ones for
> HTML already.

I believe this is where using (X)HTML as the one and only format gets 
tricky. The HTML diff tools I have seen get easily confused (for 
instance, whatever is used for the one variant of AUTH48 diffs).

We need to be extra careful not to give up on rfcdiff until we have 
something that works as well. This would mean keeping text/plain as one 
output format (maybe not "the" output format) for the time being.
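
One reason plain text is easy to diff is that a line-based comparison needs no knowledge of markup. The following sketch uses Python's difflib for that; it is not rfcdiff and does not reproduce its output, and the function name text_diff and the default file names are placeholders.

```python
import difflib


def text_diff(old_text, new_text, old_name="old.txt", new_name="new.txt"):
    """Return a unified diff of two plain-text renderings as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # input lines carry no newline, so add none
        )
    )
```

Running the same comparison over two HTML files would also flag every change in tags and attributes, whether or not the visible text changed.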

> - Tweak the output from xml2rfc to generate the agreed-upon structure,
> either for transition, or for people that want to have a high probability of
> passing the nit checks the first time.

And I volunteer to tune rfc2629.xslt for different HTML output 
requirements. (And of course I'd like to participate in defining what 
that format would be).

> - Begin the transition to HTML being the authoring format as well as the
> archive format.  Help people set up their graphical HTML editors to generate
> the structure.  With a good template, snippets, instructions, and hooks to
> the nit tool, this seems achievable.  It should also be possible to write a
> pretty comprehensive editing suite that runs right in the browser, if
> someone has the time.

I'm very skeptical about HTML being the authoring format (as opposed to 
the publication format).

> - Spend a little money to hire a real designer to make the output sing.  The
> goals would be readability and the perception of expertise and stability.

Maybe.

> - Allow the RFC Editor to start accepting HTML-formatted entries that pass
> all the nit checks.  For a transition period, the text format could still be
> required, but in order to get the tooling situation straightened out, the
> Editor would have to start accepting HTML-only submissions earlier than some
> might find comfortable.

I think it would be a step in the right direction if we could submit 
HTML right now as a supplement to text/plain.

> There are likely other steps, and there's a lot of personal opinion strewn
> throughout this proposal, but I figured it would be nice to have a starting
> point.

Thanks for the start!

Best regards, Julian


