[rfc-i] URL checking

Joe Touch touch at ISI.EDU
Tue Oct 18 05:38:36 PDT 2005



Keith Moore wrote:
>>> Greetings again. During the editing process, does the RFC Editor
>>> check whether the URLs in references work? I just found a bad URL in
>>> a document that is in the queue. (I told the authors, of course, and
>>> they'll fix it in AUTH48.)
>>
>> If so, why would it matter? It's just as likely to break by the time it
>> goes from the queue to final (i.e., URLs aren't persistent anyway - why
>> bother checking at all?)
>
> it's useful as a sanity check against typographical or transcription
> errors.  also, URIs are not all created equal.  if you take the time to
> understand why URIs aren't always persistent and make an effort to make
> your URIs persistent, you'll probably succeed to a reasonable degree.
> 
> Keith

Rather than continuing the debate on whether URIs are persistent or how
persistent, let me ask a different question:

- Would a test that says "page found" be sufficient? That is, is
anything other than a connection error or page-not-found fine?

If so, it's fairly simple to extract all URLs with a script and test
them, provided they're full URLs (http://). We can't necessarily test
short ones (e.g., listed as 'www.postel.org'), since many are just DNS
names that need not start with www.

Would that kind of test be sufficient?
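
As a rough illustration of the test described above, here is a minimal
Python sketch. It is not an existing RFC Editor tool; the names URL_RE,
extract_urls, and page_found are hypothetical, and the regular expression
is only an approximation of what counts as a full URL (it also accepts
https://, which is an assumption beyond the http:// case mentioned above).
Anything other than a connection error or a 404 counts as "found".

```python
import re
import urllib.error
import urllib.request

# Full URLs only; bare names such as 'www.postel.org' are deliberately skipped.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the distinct full URLs in text, in order of first appearance."""
    seen = []
    for match in URL_RE.finditer(text):
        # Drop punctuation that usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.append(url)
    return seen


def page_found(url, opener=urllib.request.urlopen, timeout=10):
    """Return True unless the fetch ends in a connection error or a 404.

    opener is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # The server answered; only page-not-found counts as a failure.
        return err.code != 404
    except OSError:
        # URLError, timeouts, refused connections, DNS failures.
        return False
```

Running page_found over extract_urls(draft_text) would flag typos and
transcription errors, but it says nothing about whether the page is still
the one the author meant to cite.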

Joe