[rfc-i] Proposed new RFC submission requirements

Joe Hildebrand jhildebr at cisco.com
Sat May 26 23:41:44 PDT 2012


On 5/26/12 11:26 PM, "Joe Touch" <touch at isi.edu> wrote:

> Was there some part of "check the example I posted" above that was ambiguous?
> Of course that's the cruft - I even said that above.

I've now lost track of what you wanted me to check.  I think I got the main
point, which is that Word generates HTML that isn't up to the current
industry best practices.

>>> <p class=MsoNormal>This document replaces TCP MD5 as follows [RFC2385]:</p>
>> 
>> What does MsoNormal mean?  Why no quotes on the class name?  Why no link on
>> RFC2385?
> 
> No quotes because none are required. That's "Normal" in Microsoft Office (it
> prepends Mso). No link on the ref because that's not how it was generated.
> 
> Again, ignore the cruft.

There's nothing there but cruft.  If the question is "is it possible to
massage cruft into something useful?", then the answer is yes.

If the question is "is that worth my time to write the code for?", then the
answer is no, because I don't see myself using that toolset.

If the question is "can someone else be taught to write that code", the
answer is yes. 

> In case that's too subtle -

I get it, you think I'm an idiot.  That's not very subtle.

> again, that's *representative* of a very widely
> used commercial tool's interpretation of how to generate HTML.

In case I was being too subtle, your tool is generating HTML that's so bad
that I suggest you find a better tool.

> It is impossible to differentiate between lists that could be adjacent in the
> HTML but are semantically separate.

Give me one example of a real-world document that has two lists next to one
another that are semantically separate, please.  I don't believe that such a
beast exists.

>> Why make this more difficult for them than is needed, if it's roughly the
>> same amount of work to come up with an adequate format in the first place?
> 
> Because it's not the same amount of work.

I disagree.  There, we've both used a scientific approach to generate our
estimates.  I'm planning on refuting your argument by making code available
that you think is impossible.

>> One of the signs of good architecture ...
> 
> This isn't an academic exercise in good architecture. It's the design of a
> production publication system.

Almost everything I do is an exercise in trying to attain good architecture.
Every day.  I'm not going to stop because you don't value it.

> If you want to create a new publication architecture, by all means please do.
> We should consider it for RFCs after it's been widely used for around 10-15
> years.

Done.  HTML.  Let's consider it.

And by the way, thank you for the further condescension.  It gives me the
energy to keep going.

> We are trying to get better, but not to be a research project.

It's possible to get better by seeing what the rest of the industry is
doing, and taking the parts that make sense.  That isn't research.  That's
engineering.

>> Serious question: do you refer to list item numbers frequently?  Would you
>> mind pointing me to a document that does this, so I can reason about ways to
>> approach this?
> 
> Same for Section numbers. You know, the ones that everyone wants to now use as
> a substitute for page numbers?

No, seriously.  Please answer the question instead of using sarcasm to try to
make me stop asking it.  Please give an example of a place where you referred
to a list item by number.

>>> When Word inserts a BR it's because the author used a CTL-CR. That's because
>>> the line break has meaning that the author wants to convey, which also means
>>> that it should NOT be reflowed at the discretion of the viewer.
>> 
>> Again serious, can you point me to a document where this was important?  The
>> only places I can think of at the moment all belong in pre elements.
> 
> Again, it's in the example I gave. It's used to generate breaks in the
> elements of lists to separate the list item from its description.

The br isn't semantic there, so no, it's not in that example.  It's a crutch
because your tool can't do proper lists.  Please give a proper example where
you needed a br semantically.

> It's not the 
> only way to do it - HTML has about a dozen ways to do almost anything. And
> it's almost never true that only one way is "correct".

There are often ways that are better than others, for which you would get
wide consensus amongst folks that work on HTML every day.  On the preference
between these two:

<p>o<span>foo</span></p>

<li>foo</li>

There would be thunderous unanimity.
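
As a rough illustration of massaging the first form into the second, here is
a minimal Python sketch.  It is not a finished converter.  It assumes the
fake bullet is a literal "o" at the start of the paragraph, that the item
text sits in a single span with no nested tags, and that adjacent bullet
paragraphs belong to one list.  The function name is made up for this
example.

```python
import re

FLAGS = re.IGNORECASE

# Assumed shape of a fake bullet: <p ...>o<span ...>item text</span></p>,
# where the item text contains no nested tags.
BULLET_P = re.compile(
    r"<p\b[^>]*>\s*o\s*<span\b[^>]*>([^<]*)</span>\s*</p>", FLAGS
)

# A run is one or more fake-bullet paragraphs separated only by whitespace.
BULLET_RUN = re.compile("(?:" + BULLET_P.pattern + r"\s*)+", FLAGS)


def fake_bullets_to_list(html):
    """Rewrite each run of adjacent fake-bullet paragraphs as a single <ul>."""

    def convert(match):
        text = match.group(0)
        # Keep whatever whitespace followed the run so the layout survives.
        tail = text[len(text.rstrip()):]
        items = BULLET_P.findall(text)
        body = "".join("<li>" + item.strip() + "</li>" for item in items)
        return "<ul>" + body + "</ul>" + tail

    return BULLET_RUN.sub(convert, html)


sample = (
    "<p class=MsoNormal>o<span>foo</span></p>\n"
    "<p class=MsoNormal>o<span>bar</span></p>\n"
    "<p>Plain text.</p>"
)
print(fake_bullets_to_list(sample))
```

A regular expression is enough for a sketch like this; anything meant for
real documents would want a proper HTML parser and a decision about when
two adjacent runs should stay separate lists.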

> Until you know you're sure, it's premature to assert that you know you can do
> it.

It's also premature for you to assert it's impossible.

> I've shown a number of ways in which containment adds requirements that aren't
> needed, and in some cases get in the way of copy/paste operations later.


> No, I don't think "information extraction" is the primary purpose of going to
> the new format. 

Nor do I, but if we can get it, it will make a bunch of things easy that are
possible only with a lot of work today.

> If it is, call me when the publishing community converges on a
> format for that.

They have.  It's HTML.  The "publishing community" is now called "the
web".

-- 
Joe Hildebrand


