[rfc-i] Can the web be archived?
nico at cryptonector.com
Wed Jan 21 08:02:46 PST 2015
On Tue, Jan 20, 2015 at 05:41:04PM -0800, Heather Flanagan (RFC Series Editor) wrote:
> Indeed. This problem is extremely frustrating to STEM (scientific,
> technical, engineering, and medical) publishers, too. Different
> models try to "fix" it, but all models I'm aware of require a
> publisher to Do Something, where something means registering their
> documents with a service or trying to archive a copy of everything
> they reference in their books/journals/articles.
> So far this hasn't proven to be a tractable problem, particularly not
> when anyone and everyone can be a publisher on the web.
We could distinguish between robots that index the web for content
search and robots that index the web for a URN resolution service.
Publishers would only have to assign URNs and publish an updated
robots.txt; the URN resolution robots would only build and serve an
index (and maybe keep backup copies, without serving them). Such a
robot might even be able to assign its own URNs to otherwise URN-less
documents, thus leaving publishers with almost nothing to do.
I know, wishful thinking. But how wishful is it? Web archival sites do
exist, and here we're talking about a rather simpler service.
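To make the idea a little more concrete, here is a rough sketch of the
kind of index such a URN resolution robot might keep. It is only an
illustration under stated assumptions: the UrnIndex class, its method
names, the example.org URLs, and the "urn:example:sha256:" naming scheme
for robot-assigned URNs are all hypothetical, and nothing here describes
an existing service.

```python
import hashlib


class UrnIndex:
    """Toy URN -> URL index of the kind a resolution robot might build."""

    def __init__(self):
        self._locations = {}  # maps URN -> last known URL

    def record(self, url, content, urn=None):
        """Record a crawled document and return the URN it is filed under.

        If the publisher assigned no URN, mint one from the content hash
        (a hypothetical scheme, chosen so that an unchanged document keeps
        the same name when it moves to a new URL).
        """
        if urn is None:
            digest = hashlib.sha256(content).hexdigest()
            urn = "urn:example:sha256:" + digest
        self._locations[urn] = url
        return urn

    def resolve(self, urn):
        """Return the last known URL for a URN, or None if it is unknown."""
        return self._locations.get(urn)


index = UrnIndex()

# A publisher-assigned URN is simply indexed as given.
index.record("http://example.org/papers/1", b"paper text", urn="urn:example:paper-1")

# A URN-less document gets a robot-assigned URN; when the same bytes are
# later found at a new URL, the URN stays the same and the index is updated.
minted = index.record("http://example.org/old/page", b"some page")
index.record("http://example.org/new/page", b"some page")
```

Naming by content hash only tracks documents whose bytes do not change,
so a real service would need some other way to follow revised documents.
That part is left open here.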