[rfc-i] URL checking
touch at ISI.EDU
Tue Oct 18 05:38:36 PDT 2005
Keith Moore wrote:
>>> Greetings again. During the editing process, does the RFC Editor
>>> check whether the URLs in references work? I just found a bad URL in
>>> a document that is in the queue. (I told the authors, of course, and
>>> they'll fix it in AUTH48.)
>> If so, why would it matter? It's just as likely to break by the time it
>> goes from the queue to final (i.e., URLs aren't persistent anyway - why
>> bother checking at all?)
> it's useful as a sanity check against typographical or transcription
> errors. also, URIs are not all created equal. if you take the time to
> understand why URIs aren't always persistent and make an effort to make
> your URIs persistent, you'll probably succeed to a reasonable degree.
Rather than continuing the debate on whether URIs are persistent or how
persistent, let me ask a different question:
- Would a test that says "page found" be sufficient? I.e., is anything
other than a connection error or page-not-found acceptable?
If so, it's fairly simple to write a script that extracts all URLs and
tests them, provided they're full URLs (http://). We can't necessarily
test short ones (e.g., listed as 'www.postel.org'), since those are just
DNS names, and DNS names need not start with www, so a script can't
reliably pick them out of the text.
Would that kind of test be sufficient?
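
For concreteness, here is a minimal sketch of such a test in Python. It is only an illustration of the idea above, not an existing RFC Editor tool: the names URL_RE, extract_urls, and page_found are hypothetical, and the 10-second timeout and the punctuation trimming are assumptions.

```python
import http.client
import re
import urllib.error
import urllib.request

# Only full URLs with an explicit http:// or https:// scheme are matched.
# Bare names such as 'www.postel.org' are deliberately skipped.
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text):
    """Return the full URLs found in text, in order, without duplicates."""
    urls = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def page_found(url, timeout=10):
    """Return True unless the fetch ends in a connection error or a 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # The server answered; only page-not-found counts as a failure here.
        return err.code != 404
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, timeout, or a broken response.
        return False


def check_document(text):
    """Map each full URL in text to True (found) or False (broken)."""
    return {url: page_found(url) for url in extract_urls(text)}
```

Under this reading of "page found", a server error or an authentication challenge still passes, since the server did respond; a stricter test could reject every status outside the 2xx range.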