[rfc-i] "canonical" URI for RFCs, BCPs
julian.reschke at gmx.de
Mon Feb 1 09:37:27 PST 2010
Paul Hoffman wrote:
> At 6:02 PM +0100 2/1/10, Julian Reschke wrote:
>> John R Levine wrote:
>>>> For the exceptions above, that should return the PS. For all
>>>> others, it should return the TXT.
>>> Honestly, that strikes me as being the absolute worst of all possible worlds. You have a URL that might return a text file, might return a PostScript file, and in the future might return something else. How is that useful to anyone?
>> It would be useful for anyone who wants to produce "the" canonical URI for a given RFC -- this is what started the whole discussion.
> Joe wants the canonical URL to use the http: scheme and to end with ".txt", even if the type of document is PostScript. John was arguing that such a canonical URL would be terrible. I agree with John: if we can choose our canonical URLs, making them look like they are text *even when they return another type* shows less engineering skill than one would expect.
Again: from <http://www.w3.org/Provider/Style/URI>:
"...What to leave out
Everything! After the creation date, putting any information in the name
is asking for trouble one way or another.
* Author's name - authorship can change with new versions. People
quit organizations and hand things on.
* Subject. This is tricky. It always looks good at the time but
changes surprisingly fast. I discuss this more below.
* Status - directories like "old" and "draft" and so on, not to
mention "latest" and "cool" appear all over file systems. Documents
change status - or there would be no point in producing drafts. The
latest version of a document needs a persistent identifier whatever its
status is. Keep the status out of the name.
* Access. At W3C we divide the site into "Team access", "Member
access" and "Public access". It sounds good, but of course documents
start off as team ideas, are discussed with members, and then go public.
A shame indeed if every time some document is opened to wider discussion
all the old links to it fail! We are switching to a simple date code now.
* File name extension. This is a very common one. "cgi", even
".html" is something which will change. You may not be using HTML for
that page in 20 years time, but you might want today's links to it to
still be valid. The canonical way of making links to the W3C site
doesn't use the extension. (how?)
* Software mechanisms. Look for "cgi", "exec" and other give-away
"look what software we are using" bits in URIs. Anyone want to commit to
using perl cgi scripts all their lives? Nope? Cut out the .pl. Read the
server manual on how to do it.
* Disk name - gimme a break! But I've seen it.
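
To illustrate the "File name extension" item above, here is a minimal sketch in Python of how an extensionless canonical URI could be mapped onto whichever representation actually exists (TXT for most documents, PS for the exceptions). It is only a sketch under stated assumptions: the base URI, the document ids "rfcA" and "rfcB", the inventory table, and all function names are hypothetical and do not describe any existing server. Falling back to the first available format when nothing in the Accept header matches is also an assumption; a server could answer 406 instead.

```python
# Hypothetical sketch: the base URI, document ids, and inventory are made up.
CANONICAL_BASE = "http://example.org/rfc/"

# Assumed inventory: which media types exist for each document.
AVAILABLE = {
    "rfcA": ["text/plain"],              # ordinary document, text only
    "rfcB": ["application/postscript"],  # an "exception", PostScript only
}

# File extensions appear only on the stored files, never in the canonical URI.
EXTENSIONS = {"text/plain": ".txt", "application/postscript": ".ps"}


def canonical_uri(doc_id):
    """Build the extensionless canonical URI for a document."""
    return CANONICAL_BASE + doc_id


def parse_accept(header):
    """Return (media_range, q) pairs, highest q first (simplified parser)."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        if not fields[0]:
            continue
        q = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        ranges.append((fields[0], q))
    # sorted() is stable, so ranges with equal q keep their header order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def matches(media_range, media_type):
    """True if a media range such as 'text/*' covers the given media type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.split("/")[0] == media_range[:-2]
    return media_range == media_type


def select_representation(doc_id, accept="*/*"):
    """Pick a (media_type, stored_file_name) pair for a canonical URI."""
    available = AVAILABLE[doc_id]
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type, doc_id + EXTENSIONS[media_type]
    # Assumption: serve what exists rather than refuse the request.
    media_type = available[0]
    return media_type, doc_id + EXTENSIONS[media_type]


if __name__ == "__main__":
    for doc in ("rfcA", "rfcB"):
        print(canonical_uri(doc), "->", select_representation(doc, "text/plain"))
```

With this arrangement the link that people publish never changes, and the Content-Type header rather than the URI tells the client what it received.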