[rfc-i] Potential RFC format approach: HTML

Phillip Hallam-Baker hallam at gmail.com
Sat Mar 24 09:01:10 PDT 2012

I find it very hard to believe that any civilization that has managed
to decode Linear B would forget how to decode the RFC series.

The question is not whether it will be possible to read the documents
but whether it will be easy to do so without recourse to special

The IETF RFCs are the only documents I need to use a special
configuration to print successfully. They are the only documents I
have ever encountered that work on the assumption that the reader will
print them using US paper sizes and a certain font.

W3C has already addressed the same set of problems and has developed a tool set.

The main constraint to be addressed is that W3C wants to use its
own style sheet. This means that it has to strip out any other
formatting introduced by Word etc.

That is not difficult to do.

The main constraint when using HTML is going to be deciding what
formatting features are going to be supported in the stylesheet. The
stylesheet itself is rather less relevant as the reader can even
change it if they like.

So the only features of HTML that I think we would end up using in the
source document are <h1> through <h5>, <p>, <dl>, <ol>, <ul> and
tables, stuff that hasn't really changed in ten years. That plus the
attributes style= and id=.

That plus whatever is the flavor of the month for sub and superscripts
would do fine.
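
As a rough illustration of how small that subset is, here is a minimal
sketch of a checker that flags anything outside it. The allowlist is my
reading of the elements named above; the child elements (dt, dd, li, tr,
th, td) and the sub/sup tags are assumptions, and check_subset is a
made-up name, not part of any existing tool.

```python
from html.parser import HTMLParser

# Assumed allowlist based on the elements named in the message.
# dt/dd/li/tr/th/td and sub/sup are assumptions added for illustration.
ALLOWED_TAGS = {
    "h1", "h2", "h3", "h4", "h5", "p", "dl", "dt", "dd", "ol", "ul", "li",
    "table", "tr", "th", "td", "sub", "sup",
}
ALLOWED_ATTRS = {"style", "id"}


class SubsetChecker(HTMLParser):
    """Collects every tag or attribute that falls outside the allowlist."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"tag <{tag}>")
        for name, _value in attrs:
            if name not in ALLOWED_ATTRS:
                self.problems.append(f"attribute {name}= on <{tag}>")


def check_subset(html_text):
    """Return a list of problems; an empty list means the text conforms."""
    checker = SubsetChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems
```

Run over a Word export, a checker like this would list each stray
<font> tag or class= attribute that needs stripping before the common
style sheet is applied.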

The real issue would then be what set of styles we would have. We
would certainly want to define separate styles for code. I seem to
remember W3C has separate styles for code that is an example and code
that is normative, which is useful.

The other set of considerations I see would be outside the remit of
the IETF standard but matter a lot to some people writing specs,
particularly specs with XML or other schemas or with worked examples.

The way I generate those is to have an HTML document with just the
running text with <include> directives specifying the places where
segments are to be pulled from another source.

This approach makes spec writing a lot easier as it means that every
time I build the spec I can run the regression tests to generate and
check all my examples and then pull in a completely checked and
consistent set of schemas, examples etc.
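
The build step described above could look something like the sketch
below. The directive syntax (<include src="..."/>) and the name
expand_includes are assumptions made for illustration; the message does
not say what the actual directives look like or how the real tool works.

```python
import html
import re
from pathlib import Path

# Hypothetical directive syntax: <include src="relative/path"/>
INCLUDE_RE = re.compile(r'<include\s+src="([^"]+)"\s*/?>')


def expand_includes(text, base_dir):
    """Replace each include directive with the escaped contents of its file."""
    base = Path(base_dir)

    def replace(match):
        # A missing file raises FileNotFoundError, which fails the build.
        segment = (base / match.group(1)).read_text(encoding="utf-8")
        # Escape so that XML schemas and examples appear as literal text.
        return html.escape(segment, quote=False)

    return INCLUDE_RE.sub(replace, text)
```

The point of the design is that the running text never contains a
hand-copied example: whatever the regression tests last wrote to disk is
what ends up in the spec.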

On Sat, Mar 24, 2012 at 6:09 AM, Dave Crocker <dhc at dcrocker.net> wrote:
> On 3/24/2012 9:33 AM, Julian Reschke wrote:
>>> - Subset HTML. The subset would be decided upon based on current wide
>>> support, expectation that it will continue to work, repeatability,
>>> stability
>>> of output, etc. However, as long as old-ish browsers can render the
>>> output,
>>> they might not need to get the full experience.
>> +1 in general. The requirement "expectation that it will continue to work"
>> isn't
>> really helping, because in practice, anything that is widely used today
>> isn't
>> going to go away.
> Oh boy.
> In spite of the fact that you are an experienced and prudent guy, let me
> suggest your above assertion is not terribly prudent.
> The strength of the ASCII base has been its minimal processing requirements
> and its excellent, long-term stability.
> HTML isn't even close to trivial, by that metric.  Worse, as soon as you say
> "subset" you mandate specialized software.
> "HTML" in the open Internet is spectacularly variable.  It works because
> HTML engines know to process quite a bit of that variation.  The remainder
> doesn't get processed properly.
> This establishes an extremely unstable processing base, no matter how widely
> usable it is at any given moment, such as "today".
> In an environment like that, any assurances of future support -- say 30 or
> 50 years from now, nevermind 100 -- is problematic.
>> I believe what's as important is to profile the document structure, define
>> default CSS (we want some uniformity, right?), and recommend certain ways
>> to use
> css.  Excellent.  A parallel item to support and hope is compatible decades
> from now, if the item can be found.
>>> - Define a very strict structure, using a microformat/semantic style,
>>> that
>>> makes it easy to pull out information semantically with a little bit of
>>> jQuery in post-processing tools. Make a choice about XML-style
>>> well-formedness, which is probably not needed.
>> +-0; it's possible to embed all metadata in the HTML, but that makes us
>> depend
>> on conventions (microformats) or specs that are currently a bit in flux
>> (RDFa vs
> flux.  exactly the right word.
> For the current exercise, the requirement to attain long-term viability that
> is on a par with what was achieved with the original ASCII base, is
> simplicity and stability.
> It's not enough to be clever with something that can "be made to work".  It
> might or might not work tomorrow.  Anyone thinking otherwise needs to
> produce a comparable historical basis for their belief.
> d/
> --
>  Dave Crocker
>  Brandenburg InternetWorking
>  bbiw.net

Website: http://hallambaker.com/
