[rfc-i] URL checking

Henrik Levkowetz henrik at levkowetz.com
Wed Oct 19 11:05:56 PDT 2005

on 2005-10-19 18:30 Paul Hoffman said the following:
> At 6:42 PM -0700 10/18/05, Joe Touch wrote:
>>I suspect the cost of manual checking by others is prohibitive
> How much would it cost? How much is prohibitive? Determining both of 
> these answers is fairly trivial.
> It should take about an hour to scan the last 50 issued RFCs and 
> extract all the URLs to get a count to determine the average number 
> of URLs per RFC.

How many URLs are there in the last 100 RFCs?

  henrik@shiraz $ egrep -h  "(http|ftp|https)://" $(ls rfc4???.txt| tail -100) | grep -v "http://www.ietf.org/ipr" | wc -l

Are there any special ones which recur and should be excluded from that count?

  henrik@shiraz $ egrep -h  "(http|ftp|https)://" $(ls rfc4???.txt| tail -100) | grep -v "http://www.ietf.org/ipr" | sed -r 's/^.*((http|https|ftp):\/\/[^ "!]*).*$/\1/' | sort | uniq -c | sort | tail
      2 http://www.ops.ietf.org/mib-review-tools.html
      3 http://infinite.example.com/keywordinfo.html
      3 http://www.cablelabs.com/specifications/archives.
      3 http://www.cablemodem.com.
      3 http://www.example.com/vts
      3 http://www.ietf.org/ID-Checklist.html
      3 http://www.w3.org/2001/XMLSchema-instance
      4 http://www.itu.int
      5 http://www.w3.org/2001/XMLSchema

No, looks pretty good.
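
Two entries in the list above end in a sentence-final period, so the same
URL written with and without the period would be tallied separately. A
minimal sketch of a tally that strips trailing punctuation first, assuming a
grep that supports -o (GNU and BSD grep do; it is not in POSIX). The function
name tally_urls is made up for illustration:

```shell
# tally_urls FILE...
# Print "count URL" lines for every URL occurrence, most frequent last.
tally_urls() {
    cat -- "$@" |
        # Emit each URL on its own line, so lines with several URLs count fully.
        grep -oE '(https?|ftp)://[^ "!<>]+' |
        # Drop trailing punctuation that belongs to the sentence, not the URL.
        sed 's/[.,;:)]*$//' |
        # Skip the IPR boilerplate URL, as in the commands above.
        grep -v '^http://www\.ietf\.org/ipr' |
        sort | uniq -c | sort -n
}

# Example call (not run here):
#   tally_urls $(ls rfc4???.txt | tail -100) | tail
```

Stripping a trailing ")" or ":" can in principle truncate a URL that really
ends in one, so the result is still worth a quick look by eye.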

So that's about 1.6 URLs per RFC.
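
The per-RFC average can also be computed directly rather than by dividing by
hand. A rough sketch, again assuming grep -o is available; urls_per_rfc is a
hypothetical helper name. Note that it counts URL occurrences, whereas the
wc -l pipeline above counts matching lines, so the two can differ slightly
when a line holds more than one URL:

```shell
# urls_per_rfc FILE...
# Print the average number of URL occurrences per file, to one decimal place.
urls_per_rfc() {
    [ "$#" -gt 0 ] || { echo "usage: urls_per_rfc FILE..." >&2; return 1; }
    # Count every URL occurrence except the IPR boilerplate URL.
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    total=$(cat -- "$@" |
        grep -oE '(https?|ftp)://[^ "!<>]+' |
        grep -vc '^http://www\.ietf\.org/ipr' || true)
    # Divide by the number of files given on the command line.
    LC_ALL=C awk -v t="$total" -v n="$#" 'BEGIN { printf "%.1f\n", t / n }'
}

# Example call (not run here):
#   urls_per_rfc $(ls rfc4???.txt | tail -100)
```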

> It is safe to estimate that resolving a URL and 
> looking at the first screen takes less than 30 seconds. Dividing the 
> average number of URLs per RFC by 120, then multiplying the result by 
> the dollars per hour the staff who is doing this gets paid, then
> multiplying the result by the average number of RFCs per year, tells 
> the cost per year of the RFC Editor adding this service. This can 
> then be compared to what the IAOC considers prohibitive.

Those would be interesting figures, yes.
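
The arithmetic Paul describes is easy to script. The 120 is the number of
URLs checked per hour at 30 seconds each. A minimal sketch; the function name
annual_cost is made up, and the hourly rate (50) and RFCs per year (300) in
the example call are pure placeholders, not figures from this thread:

```shell
# annual_cost URLS_PER_RFC DOLLARS_PER_HOUR RFCS_PER_YEAR
# Yearly cost of checking every URL, at 30 seconds (1/120 hour) per URL.
annual_cost() {
    LC_ALL=C awk -v u="$1" -v r="$2" -v n="$3" \
        'BEGIN { printf "%.2f\n", u / 120 * r * n }'
}

# Example call with the 1.6 URLs per RFC measured above and the
# placeholder rate and volume:
#   annual_cost 1.6 50 300    # prints 200.00
```

With those placeholder inputs the checking time is 48 seconds per RFC, or
four hours per year.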
