[rfc-i] "canonical" URI for RFCs, BCPs

Julian Reschke julian.reschke at gmx.de
Mon Feb 1 09:37:27 PST 2010

Paul Hoffman wrote:
> At 6:02 PM +0100 2/1/10, Julian Reschke wrote:
>> John R Levine wrote:
>>> ...
>>>> For the exceptions above, that should return the PS. For all
>>>> others, it should return the TXT.
>>> Honestly, that strikes me as being the absolute worst of all possible worlds.  You have a URL that might return a text file, might return a PostScript file, and in the future might return something else.  How is that useful to anyone?
>>> ...
>> It would be useful for anyone who wants to produce "the" canonical URI for a given RFC -- this is what started the whole discussion.
> Joe wants the canonical URL to use the http: scheme and to end with ".txt", even if the type of document is Postscript. John was arguing that such a canonical URL would be terrible. I agree with John: if we can choose our canonical URLs, making them look like they are text *even when they return another type* shows less engineering skill than one would expect.

Yes, indeed.

Again: from <http://www.w3.org/Provider/Style/URI>:

"...What to leave out

Everything! After the creation date, putting any information in the name 
is asking for trouble one way or another.

     * Authors name- authorship can change with new versions. People 
quit organizations and hand things on.
     * Subject. This is tricky. It always looks good at the time but 
changes surprisingly fast. I discuss this more below.
     * Status- directories like "old" and "draft" and so on, not to 
mention "latest" and "cool" appear all over file systems. Documents 
change status - or there would be no point in producing drafts. The 
latest version of a document needs a persistent identifier whatever its 
status is. Keep the status out of the name.
     * Access. At W3C we divide the site into "Team access", "Member 
access" and "Public access". It sounds good, but of course documents 
start off as team ideas, are discussed with members, and then go public. 
A shame indeed if every time some document is opened to wider discussion 
all the old links to it fail! We are switching to a simple date code now.
     * File name extension. This is a very common one. "cgi", even 
".html" is something which will change. You may not be using HTML for 
that page in 20 years time, but you might want today's links to it to 
still be valid. The canonical way of making links to the W3C site 
doesn't use the extension. (how?)
     * Software mechanisms. Look for "cgi", "exec" and other give-away 
"look what software we are using" bits in URIs. Anyone want to commit to 
using perl cgi scripts all their lives? Nope? Cut out the .pl. Read the 
server manual on how to do it.
     * Disk name - gimme a break! But I've seen it. ..."
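
To make the "no file name extension" point concrete, here is a minimal Python sketch of how one extensionless canonical URI could map to either the TXT or the PS representation, with the media type carried in Content-Type rather than in the name. Everything in it is an assumption for illustration: the host example.org, the /rfc/<number> path layout, the AVAILABLE table and its entries, and the simplified Accept parsing (which ignores the specificity rules of real HTTP content negotiation).

```python
# Hypothetical inventory: which representations exist for each document.
# The numbers and entries are illustrative only, not a real list.
AVAILABLE = {
    1000: ["text/plain"],
    1001: ["application/postscript"],
    1002: ["text/plain", "application/postscript"],
}

# Extensions appear only in the server-side file name, never in the URI.
EXTENSIONS = {"text/plain": ".txt", "application/postscript": ".ps"}


def canonical_uri(number):
    """Build the extensionless URI (host and path layout are assumed)."""
    return f"http://example.org/rfc/{number}"


def parse_accept(header):
    """Return (media_range, q) pairs, highest q first (simplified)."""
    prefs = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media = fields[0].strip().lower()
        if not media:
            continue
        q = 1.0
        for field in fields[1:]:
            name, _, value = field.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media, q))
    # sorted() is stable, so equal q values keep header order.
    return sorted(prefs, key=lambda pair: -pair[1])


def select(number, accept="*/*"):
    """Pick (content_type, file_name) for a request, or None if no match."""
    for media, q in parse_accept(accept):
        if q <= 0:
            continue
        for candidate in AVAILABLE[number]:
            wildcard = media.endswith("/*") and candidate.startswith(media[:-1])
            if media in ("*/*", candidate) or wildcard:
                return candidate, f"rfc{number}{EXTENSIONS[candidate]}"
    return None  # a server would answer 406 Not Acceptable here


print(canonical_uri(1001), select(1001))
```

With this layout, a client that asks for the hypothetical document 1001 gets PostScript labelled as application/postscript, and the URI itself never claims to be text.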

BR, Julian
