[rfc-i] Digital Preservation Considerations for the RFC Series -- draft-flanagan-rfc-preservation-00.txt is posted
johnl at taugh.com
Tue Sep 9 20:04:45 PDT 2014
>As per usual, top-to-bottom reviews* are more than welcome, but if you
>have a particular issue you would like to see discussed in more detail
>by the community, please pull it out into its own thread on this list.
Hi. I think the analysis is fine as far as it goes, but it misses
some stuff. In particular, it seems to me that it's at least as
important to save the format and tool documentation as the software
because a significant way to recover ancient stuff is to reverse
engineer the tools using more modern technology.
To take an example in a different context, if you have a deck of punch
cards, there are approximately no working card readers left to read
what's on them. But it is not hard to kludge something up to run them
through an optical scanner or past a camera, nor is it hard using
modern software to find the punched holes in the images and recover
the text on the cards. But that depends on knowing the specs for the
cards, both the physical specs for the hole positions, and the logical
specs for what combination of holes represents what character.
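To illustrate the logical half of that, here is a minimal sketch in Python of turning punched rows into text. It assumes the image processing has already produced, for each column, the set of punched rows (named 12, 11, 0, 1 through 9). It covers only blanks, digits, and uppercase letters in the common Hollerith card code, since punctuation varied between keypunch models. The names build_table, HOLLERITH, and decode_card are made up for this example.

```python
# Sketch only: decode card columns given as sets of punched row names.
# Assumption: rows are labelled 12, 11, 0, 1..9 and only blanks, digits,
# and uppercase letters are handled; anything else becomes "?".

def build_table():
    table = {frozenset(): " "}              # no punches: blank column
    for d in range(10):
        table[frozenset([d])] = str(d)      # single punch in rows 0-9: digit
    for i, ch in enumerate("ABCDEFGHI", start=1):
        table[frozenset([12, i])] = ch      # zone 12 plus digit 1-9
    for i, ch in enumerate("JKLMNOPQR", start=1):
        table[frozenset([11, i])] = ch      # zone 11 plus digit 1-9
    for i, ch in enumerate("STUVWXYZ", start=2):
        table[frozenset([0, i])] = ch       # zone 0 plus digit 2-9
    return table

HOLLERITH = build_table()

def decode_card(columns):
    """Return the text for a sequence of columns (each an iterable of rows)."""
    return "".join(HOLLERITH.get(frozenset(col), "?") for col in columns)
```

For example, decode_card([{12, 8}, {12, 9}]) gives "HI". Without the written spec for the zone and digit rows, that lookup table would have to be guessed from the cards themselves.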
I have written scripts in modern(ish) languages like perl and python
to translate ancient markup languages into something I can run through
current document formatters. Hence I would encourage the archive to
include the XML separate from the PDF/A, because XML would be a lot
easier to deal with if you're reverse engineering from scratch, and to
preserve the xml2rfc specs in various online and offline formats to
help reconstruction efforts.
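As a rough illustration of why the XML is the easier starting point, the sketch below pulls a title, section headings, and paragraph text out of a small document using nothing but the Python standard library. It assumes the xml2rfc vocabulary in which sections carry a title attribute and paragraphs are t elements. The sample document and the outline function are invented for this example, and a real RFC source has many more elements than this handles.

```python
import xml.etree.ElementTree as ET

# Invented sample in the style of xml2rfc source (assumption: section
# headings are given as a "title" attribute, paragraphs as <t>).
SAMPLE = """<rfc>
  <front><title>Example Document</title></front>
  <middle>
    <section title="Introduction">
      <t>First paragraph.</t>
      <section title="Terminology">
        <t>Nested paragraph.</t>
      </section>
    </section>
  </middle>
</rfc>"""

def outline(xml_text):
    """Return an indented list of the title, headings, and paragraphs."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("front/title", default="").strip()]

    def walk(section, depth):
        lines.append("  " * depth + section.get("title", ""))
        for child in section:
            if child.tag == "t":
                # Collapse whitespace in the paragraph's text content.
                text = " ".join("".join(child.itertext()).split())
                lines.append("  " * (depth + 1) + text)
            elif child.tag == "section":
                walk(child, depth + 1)

    for section in root.findall("middle/section"):
        walk(section, 1)
    return lines
```

Calling print("\n".join(outline(SAMPLE))) prints the document as an indented outline. Recovering the same structure from a rendered PDF would mean inferring headings and paragraph boundaries from layout.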