[rfc-i] URL checking
touch at ISI.EDU
Wed Oct 19 17:29:31 PDT 2005
Paul Hoffman wrote:
> At 12:56 PM -0400 10/19/05, Keith Moore wrote:
>>I suspect the real cost is in context switching. Every time the RFC
>>Editor finds something that causes the document to be deferred, it is
>>necessary to change the state of the document, send notices to the
>>appropriate parties, etc. The checks must then be made again when the
>>new document and/or corrections are submitted, and the person doing the
>>check may have to rediscover the particulars of that document.
> Such a context switch only happens if this is the *only*
> process-stopping bug in the document. Otherwise, it could clearly be
> ganged with another.
> It seems that the cost of a context switch instead of publishing an
> RFC with a known-bad reference is worth it.
>>How many URLs per the last 100 RFCs?
>> henrik at shiraz $ egrep -h "(http|ftp|https)://" $(ls rfc4???.txt | tail -100) | grep -v "http://www.ietf.org/ipr" | wc -l
> Tool weenies: gotta love 'em. :-)
>>So that's about 1.6 URLs per RFC.
> Meaning, about one additional minute per RFC. Maybe this would not be
> prohibitive (or even measurable).
One minute to check that a page comes up.
More than one to check that it is a sensible page.
Don't even count the minutes to see if each citation matches the
appropriate page. That requires going back through the doc to read the
context of the citations, as well as reading the context of the page.
IMO - cheap to verify it's a real page; everything else is prohibitive,
bordering on the same cost as the spell/grammar check.
(PS - this is my personal opinion)
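
For concreteness, here is a rough Python sketch of the cheap part only: pull the URLs out of a document and test whether each one comes up. It is an illustration, not anything the RFC Editor runs. The function names (extract_urls, page_comes_up), the 10-second timeout, and the trailing-punctuation trimming are all assumptions of mine. Note that it counts individual URLs, whereas the egrep pipeline quoted above counts matching lines, so the two totals can differ.

```python
import http.client
import re
import urllib.error
import urllib.request

# Same schemes as the egrep in the thread, extended to capture the whole URL.
URL_RE = re.compile(r"(?:https?|ftp)://[^\s<>\"')\]]+")

# Boilerplate link that the quoted pipeline filters out with grep -v.
IPR_PREFIX = "http://www.ietf.org/ipr"


def extract_urls(text):
    """Return the URLs found in text, skipping the IPR boilerplate link."""
    urls = []
    for match in URL_RE.finditer(text):
        # Trim sentence punctuation that commonly trails a URL in prose.
        url = match.group(0).rstrip(".,;:")
        if not url.startswith(IPR_PREFIX):
            urls.append(url)
    return urls


def page_comes_up(url, timeout=10):
    """Cheap check only: True if fetching the URL succeeds.

    This says nothing about whether the page is sensible or whether it
    matches the citation; those still need a human reader.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # FTP responses carry no HTTP status, so treat None as success.
            status = getattr(response, "status", None)
            return status is None or 200 <= status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return False
```

Running extract_urls over the text of each RFC and calling page_comes_up on each result would cover the "a page comes up" step. Everything past that (is it a sensible page, does it match the citation) is the part that stays expensive.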