[rfc-i] A proposal: HTML

Julian Reschke julian.reschke at gmx.de
Tue May 29 04:52:59 PDT 2012


On 2012-05-29 11:44, Iljitsch van Beijnum wrote:
> On 24 May 2012, at 23:38 , Julian Reschke wrote:
>
>>> <p class="hiddenfrontmatter">
>>>    <form name="rfcattributes">
>>>      <input type="hidden" name="toc" value="yes">
>>>      <input type="hidden" name="ipr" value="trust200902">
>>>      <input type="hidden" name="docname" value="draft-ietf-behave-ftp64-12">
>>>      <input type="hidden" name="category" value="std">
>>>    </form>
>
>>>    <form name="author1">
>>>      <input type="hidden" name="surname" value="Van Beijnum">
>>>      <input type="hidden" name="fullname" value="Iljitsch van Beijnum">
>>>      <input type="hidden" name="shortfullname" value="I. van Beijnum">
>>>      <input type="hidden" name="altfullname" value="Ильи́ч van Beijnum">
>>>      <input type="hidden" name="organization" value="Institute IMDEA Networks">
>>> </form>
>
>> Hiding things makes it more likely to break them.
>
> The reason I made this hidden was because I can't think of a way to make this stuff human readable in a reasonable way as well as machine readable in a reasonable way. So we should probably have the machine readable stuff hidden from humans, and then a nicely laid out but not marked up human readable version of the same information visible.

RDFa, Microdata and Microformats should allow this.

>> There are better ways to do this, such as RDFa, microformats or microdata. Unfortunately these are multiple, competing ways, so there will be disagreement about which to pick. This is why I personally would prefer to stick with what we have (RFC2629) and to carefully extend it.
>
> Making a new format gives us the opportunity to make a clean break, so we should do that where it makes sense. To me, it certainly makes sense for this kind of stuff, because it is just so painful to author it in XML2RFC format. Because this stuff is specific to RFCs, there are also no opportunities to convert from other formats, so it really has to be easy to do this by hand. I'm now actually starting to think the above is still too difficult, so make it more like this:
>
> rfc-toc: yes
> rfc-ipr: trust200902
> draft-name: draft-ietf-behave-ftp64-12
> rfc-intended-status: standard
>
> rfc-author1-surname: Van Beijnum
> rfc-author1-fullname: Iljitsch van Beijnum
> rfc-author1-nonlatin-name: Ильи́ч van Beijnum
>
> etc

That means that the submission format needs conversion to become a 
suitable consumption format, right?
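
To make the conversion step concrete, here is a minimal sketch in Python, assuming the "key: value" lines from the quoted example. The function name and the shape of the resulting dictionary are hypothetical, not part of any existing tool.

```python
import re

def parse_submission_header(text):
    """Collect 'key: value' lines into a dict; group author fields by author number."""
    doc = {"authors": {}}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a 'key: value' pair (e.g. "etc").
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        match = re.fullmatch(r"rfc-author(\d+)-(.+)", key)
        if match:
            # rfc-author1-surname -> authors[1]["surname"]
            author = doc["authors"].setdefault(int(match.group(1)), {})
            author[match.group(2)] = value
        else:
            doc[key] = value
    return doc

sample = """\
rfc-toc: yes
rfc-ipr: trust200902
draft-name: draft-ietf-behave-ftp64-12
rfc-intended-status: standard

rfc-author1-surname: Van Beijnum
rfc-author1-fullname: Iljitsch van Beijnum

etc
"""
header = parse_submission_header(sample)
print(header["draft-name"], header["authors"][1]["surname"])
```

Whatever the details, some step like this has to run before the submitted text can be turned into the format people actually read.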

> The advantage of this is that you can write it using a word processor where the styles are converted to CSS classes without the need to create any HTML by hand. So this would be a good submission format, although of course not a good archival/presentation format.
>
>> What you consider a feature others (IMHO correctly) see as a defect in HTML 4, and HTML 5 happens to have ... <section> (<http://dev.w3.org/html5/spec/single-page.html#the-section-element>).
>
> Interestingly, both in older HTML and in word processors, there is no way to differentiate between:
>
> <section l1>
>    text
>      <section l2>
>        text
>        text
>      </section>
> </section>
>
> and:
>
> <section l1>
>    text
>      <section l2>
>        text
>      </section>
>    text
> </section>

Of course there is in HTML; <div> groups things.
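
As a rough illustration (the markup strings and the helper name are made up for this example), a few lines of Python with the standard html.parser module show that the two cases come out differently once <div> is used for grouping:

```python
from html.parser import HTMLParser

class DivDepth(HTMLParser):
    """Record, for each non-blank text node, how many <div> elements enclose it."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "div":
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.found.append((text, self.depth))

def div_depths(html):
    parser = DivDepth()
    parser.feed(html)
    parser.close()
    return parser.found

# First case: the trailing text belongs to the inner (l2) section.
trailing_in_subsection = "<div><p>a</p><div><p>b</p><p>c</p></div></div>"
# Second case: the trailing text belongs to the outer (l1) section again.
trailing_in_section = "<div><p>a</p><div><p>b</p></div><p>c</p></div>"

print(div_depths(trailing_in_subsection))
print(div_depths(trailing_in_section))
```

The last text node is reported at depth 2 in the first case and at depth 1 in the second, which is exactly the distinction in question.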

> ...
>>> <!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
>>>    <!ENTITY rfc959 PUBLIC '' "http://xml.resource.org/public/rfc/bibxml/reference.RFC.0959.xml">
>>> ]>
>
>> No, you don't need that at all. You can simply copy & paste the contents of <http://xml.resource.org/public/rfc/bibxml/reference.RFC.0959.xml> into your document.
>
> It doesn't make sense to include extensive metadata for each reference in the source of a draft / RFC document. It just clutters up the document.

The references sit in the back section. Why is this a problem?

> Remember that references are basically human-parsable hyperlinks. If we then add a machine-parsable actual hyperlink to either the document itself or its canonical location, there is no need to include any additional data except what is presented on screen / paper for human consumption.

Yes. But it has drawbacks, such as the document losing information when 
you're offline.

>>> so I get to use these in my text: <xref target="RFC0959" />. I guess the decision on whether the leading zero should or shouldn't be there was solved by randomly requiring it to be present or requiring it to be absent.
>
>> That's a tool problem. The fact that nobody has written that tool could mean that the problem isn't as big as you think. I personally have no problem maintaining my references by hand.
>
> In the grand scheme of things this isn't the worst problem ever, but just the fact that the format requires me to write the RFC number with and without the leading zero in different places shows that not much care went into all of this, and if we are going to come up with a new format we need to do better.

No, the format doesn't require you to do that. It's an artefact of the 
citation library you are using.
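
A small helper can hide the library's naming convention entirely. The sketch below is in Python; the function name is hypothetical, and the four-digit zero padding is an assumption inferred from the single RFC 959 example in this thread.

```python
def rfc_reference(number):
    """Return the xref anchor and bibxml URL for an RFC given its plain number.

    Assumption: the citation library pads RFC numbers to four digits,
    as in the RFC 959 example (anchor RFC0959, file reference.RFC.0959.xml).
    """
    padded = f"{number:04d}"
    anchor = f"RFC{padded}"
    url = f"http://xml.resource.org/public/rfc/bibxml/reference.RFC.{padded}.xml"
    return anchor, url

anchor, url = rfc_reference(959)
print(anchor)
print(url)
```

With something like this the author only ever writes the plain number, and the padding stays a detail of the library rather than of the format.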

>>> So it's entirely conceivable that the same HTML file can be displayed like this:
>
>>> http://tools.ietf.org/html/rfc2460
>
>>> or like this:
>
>>> http://pretty-rfc.herokuapp.com/RFC2460
>
>> Almost. These two use entirely different HTML formats.
>
> Yes. And my point is that you can make the eventual output look like many different things by just changing a CSS file rather than changing the HTML itself. I acknowledge the fact that that isn't the case here.

Indeed.

Best regards, Julian

