[rfc-i] Reminder: xml2rfc v2 transition

Brian E Carpenter brian.e.carpenter at gmail.com
Sun Apr 6 20:36:59 PDT 2014


On 07/04/2014 13:15, Tony Hansen wrote:
> On 4/6/14, 4:36 PM, Brian E Carpenter wrote:

...
>> I think the main issues in moving from v1 to v2 (or v1/strict) were:

BTW, I ran my tests on the oldest xml file I could find on my disk, dated 2005,
which still works OK with v1 in 'fast' mode.

>>
>> ENTITY declarations must be inside <!DOCTYPE rfc SYSTEM
>> "rfc2629.dtd"[   ]>
> 
> I think this falls under the category of only accepting "valid XML".

It does. But even with the improved v2 diagnostics, the resulting
error message isn't transparent.

ERROR: Unable to parse the XML document: INPUT
 INPUT: Line 5: StartTag: invalid element name
 INPUT: Line 5: StartTag: invalid element name
 INPUT: Line 5: Extra content at the end of the document
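
For illustration, a check of this kind could point at the real cause. This is only a minimal Python sketch: `stray_entities` is a hypothetical helper, not part of xml2rfc, and the regex approach is an assumption that ignores comments and other corner cases.

```python
import re

# Hypothetical helper, not part of xml2rfc.
# Matches a DOCTYPE that has an internal subset: <!DOCTYPE ... [ ... ]>
DOCTYPE_SUBSET = re.compile(r"<!DOCTYPE[^\[>]*\[(.*?)\]\s*>", re.DOTALL)
ENTITY_DECL = re.compile(r"<!ENTITY\s+(\S+)")

def stray_entities(xml_text):
    """Return names of ENTITY declarations found outside the DOCTYPE subset."""
    # Drop the DOCTYPE (including its internal subset); any ENTITY
    # declaration that remains is in the wrong place.
    outside = DOCTYPE_SUBSET.sub("", xml_text)
    return ENTITY_DECL.findall(outside)
```

Any name it returns belongs to a declaration that needs to move inside the square brackets of the DOCTYPE.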

> 
>> <list> </list> must be inside <t> </t>
>> (and other similar nesting restrictions)
> 
> ok

And here, by the way, the v2 diagnostics are quite helpful.
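
For anyone who wants to check this nesting rule before running the tool, here is a rough Python sketch using the standard library. `misplaced_lists` is a hypothetical name, it only covers the <list>-inside-<t> rule, and it assumes the file already parses as XML.

```python
import xml.etree.ElementTree as ET

def misplaced_lists(xml_text):
    """Return the parent tag of every <list> that is not directly inside a <t>."""
    root = ET.fromstring(xml_text)
    return [
        parent.tag
        for parent in root.iter()      # visit every element as a potential parent
        for child in parent            # look at its direct children only
        if child.tag == "list" and parent.tag != "t"
    ]
```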

> 
>> Does not tolerate non-RFC ASCII characters
> 
> I'm not sure I understand this item.

I beg your pardon. It's actually non-ASCII. I'm guessing
it's a bit of Microsoftese white space. In any case, v1
tolerated it.

 INPUT: Line 1004: Input is not proper UTF-8, indicate encoding ! Bytes: 0xA0 0xA0 0x20 0x20
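
Byte 0xA0 is the no-break space in ISO-8859-1 and Windows-1252, which fits the guess above. A small Python sketch for locating such bytes before running the tool follows; `non_utf8_lines` is a hypothetical helper, and it only reports the first bad sequence on each line.

```python
def non_utf8_lines(data: bytes):
    """Return (line_number, offending_bytes) for lines that are not valid UTF-8."""
    problems = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            # err.start/err.end delimit the first undecodable sequence.
            problems.append((number, line[err.start:err.end]))
    return problems
```

It expects the raw file contents, for example from `open(path, "rb").read()`, so that nothing is decoded before the check runs.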

> 
>> Needs <![CDATA[  ]]> for some artwork that didn't need it for v1.
> 
> I think this also falls under the category of only accepting "valid XML".

Correct, but again the diagnostic is unhelpful:

 INPUT: Line 210: xmlParseEntityRef: no name
 INPUT: Line 210: xmlParseEntityRef: no name
 INPUT: Line 221: xmlParseEntityRef: no name

(This is an obscure way of saying 'Your artwork contains ampersands'.)
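
A sketch of how the offending lines could be found ahead of time, in Python. `bare_ampersand_lines` is a hypothetical helper; the pattern is an approximation of the XML reference syntax, and ampersands inside comments would also be flagged.

```python
import re

# An "&" that does not start a named, decimal, or hexadecimal reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_:][\w.:-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")
CDATA = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)

def bare_ampersand_lines(xml_text):
    """Return 1-based line numbers containing a bare '&' outside CDATA."""
    # Blank out CDATA sections but keep their newlines so that the
    # reported line numbers still match the original file.
    masked = CDATA.sub(lambda m: "\n" * m.group(0).count("\n"), xml_text)
    return [
        number
        for number, line in enumerate(masked.split("\n"), start=1)
        if BARE_AMP.search(line)
    ]
```

Each reported line needs either `&amp;` in place of the ampersand or a CDATA wrapper around the artwork.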

BTW - none of this is a complaint. Just a suggestion that the
RFC Editor could post a few helpful hints about converting files
to fully conform.

    Brian


