[rfc-i] Potential RFC format approach: HTML

Joe Hildebrand jhildebr at cisco.com
Sat Mar 24 08:02:03 PDT 2012

On 3/24/12 4:09 AM, "Dave Crocker" <dhc at dcrocker.net> wrote:

> HTML isn't even close to trivial, by that metric.  Worse, as soon as you say
> "subset" you mandate specialized software.

You've said that before, which leads me to believe I haven't gotten my ideas
across to you adequately.  We're not talking about creating a parallel set
of HTML with new web browsers.  We're talking about using the *existing*
bits of HTML in a way that will be the most stable, easy to process, and
easy to render.  There's a bunch of crappy HTML that would just be
prohibited in an RFC.

> "HTML" in the open Internet is spectacularly variable.  It works because HTML
> engines know to process quite a bit of that variation.  The remainder doesn't
> get processed properly.
> This establishes an extremely unstable processing base, no matter how widely
> usable it is at any given moment, such as "today".

This is what the nit checking tool would address.  We've got enough HTML
expertise in the group that we can generate and enforce REALLY good HTML.
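
A minimal sketch of the kind of check such a nit tool might perform,
assuming a hypothetical allow-list of elements and a short list of banned
attributes.  No profile has been agreed in this thread, so ALLOWED_TAGS,
BANNED_ATTRS, and the check() helper are illustrative names only:

```python
from html.parser import HTMLParser

# Hypothetical profile: these sets are assumptions for illustration,
# not an agreed RFC HTML subset.
ALLOWED_TAGS = {
    "html", "head", "title", "meta", "body", "section",
    "h1", "h2", "h3", "p", "pre", "code", "em", "strong", "a",
    "ul", "ol", "li", "dl", "dt", "dd", "table", "tr", "th", "td",
}
BANNED_ATTRS = {"style", "onclick", "onload"}


class NitChecker(HTMLParser):
    """Collects a nit for every element or attribute outside the profile."""

    def __init__(self):
        super().__init__()
        self.nits = []

    def handle_starttag(self, tag, attrs):
        line, _offset = self.getpos()
        if tag not in ALLOWED_TAGS:
            self.nits.append(f"line {line}: element <{tag}> is not in the profile")
        for name, _value in attrs:
            if name in BANNED_ATTRS:
                self.nits.append(
                    f"line {line}: attribute '{name}' on <{tag}> is not allowed"
                )


def check(html_text):
    """Return a list of nit messages; an empty list means no nits found."""
    checker = NitChecker()
    checker.feed(html_text)
    checker.close()
    return checker.nits


if __name__ == "__main__":
    for nit in check('<p style="color:red">text</p><font>old</font>'):
        print(nit)
```

The point of the sketch is only that rejecting markup outside a fixed list
takes a few dozen lines on top of a standard parser, not a new browser.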

> In an environment like that, any assurances of future support -- say 30 or 50
> years from now, nevermind 100 -- is problematic.

I am 100% certain that soberly created HTML of today has a vastly higher
chance of being readable 50 years from now than RDFa, for example.  And
seeing as how I can barely read our current line-printer format on many of
my devices, I don't believe that the status quo solves this problem.

>> I believe what's as important is to profile the document structure, define
>> default CSS (we want some uniformity, right?), and recommend certain ways
>> to use
> css.  Excellent.  A parallel item to support and hope is compatible decades
> from now, if the item can be found.

The CSS is just for pretty-fication.  The document will be completely
readable without the CSS.  You should turn off CSS in your browser to keep
us honest.

>>> - Define a very strict structure, using a microformat/semantic style, that
>>> makes it easy to pull out information semantically with a little bit of
>>> jQuery in post-processing tools. Make a choice about XML-style
>>> well-formedness, which is probably not needed.
>> +-0; it's possible to embed all metadata in the HTML, but that makes us
>> depend on conventions (microformats) or specs that are currently a bit in
>> flux (RDFa vs
> flux.  exactly the right word.

The flux is in one tiny part of the web community that doesn't have to
affect us.  We'll avoid the flux.
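
As a sketch of how little machinery the "strict structure" idea quoted
above needs, metadata can be pulled out of plain class attributes with a
stock parser.  The class names rfc-title and rfc-author, and the extract()
helper, are hypothetical; no such convention has been defined here:

```python
from html.parser import HTMLParser

# Hypothetical convention: class name -> metadata field (an assumption).
FIELDS = {"rfc-title": "title", "rfc-author": "author"}
# Elements that never have an end tag, so they must not affect nesting depth.
VOID = {"br", "hr", "img", "meta", "link", "input"}


class MetadataExtractor(HTMLParser):
    """Collects the text content of elements carrying a known class."""

    def __init__(self):
        super().__init__()
        self.found = {}
        self._field = None  # field currently being collected, if any
        self._depth = 0     # nesting depth inside that element

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._field is not None:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in FIELDS:
                self._field = FIELDS[cls]
                self._depth = 1
                self.found.setdefault(self._field, []).append("")
                break

    def handle_endtag(self, tag):
        if tag in VOID or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self.found[self._field][-1] += data


def extract(html_text):
    """Return {field: [values]} for well-formed input using the convention."""
    parser = MetadataExtractor()
    parser.feed(html_text)
    parser.close()
    return {k: [v.strip() for v in vals] for k, vals in parser.found.items()}
```

This assumes well-formed input (every non-void start tag has an end tag),
which is exactly what a nit checker would be enforcing.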

> For the current exercise, the requirement to attain long-term viability that
> is on a par with what was achieved with the original ASCII base, is simplicity
> and stability.

Yes.  And I believe my proposal does that.  Outside the IETF, it is pretty
common to consider HTML a long-term viable format.  Inside the IETF, we
allow folks who don't know much about HTML to argue successfully for the
status quo.

This is the point in the conversation where someone usually pulls out the
expertise card, and sets an impossibly high bar to be a part of the
discussion.  I'm expecting the next argument to be that none of us are
archival librarians, so we have to accept the status quo.

I think we *do* have the expertise.  I think we're having a sober and
rational discussion about one possible approach and its implications.
Although it would be nice to hear from folks with other skillsets, I don't
think we have to stop work waiting for these near-mythical librarians to
present themselves in order to make progress.

> It's not enough to be clever with something that can "be made to work".  It
> might or might not work tomorrow.  Anyone thinking otherwise needs to produce
> a comparable historical basis for their belief.

I don't see this approach as clever.  I see it as dumbed down to the point
where it will work.  I see several places where I haven't been clear enough
for you to understand why this is the case, but that's a failure of the
messenger, not the message.

Joe Hildebrand
