[rfc-i] rfc-interest Digest, Vol 89, Issue 9

Leonard Rosenthol lrosenth at adobe.com
Thu Mar 22 10:43:24 PDT 2012

>As others have pointed out less directly,
>we have a historical requirement for being able to create new
>documents by hacking away at older ones.  
Seems like a reasonable requirement.

>If the authoring
>format and the archival format are different, then the authoring format effectively has to meet the same archival requirements as the archival format.  
No, not at all.

The needs for an archival version may (and usually will) differ from the needs of an authoring version, because they may/will be used by different people for different purposes.  

Let me give one example based on real-world usage.

The European Patent Office uses an XML format for the authoring and reviewing of patents, but they use PDF/A for the archiving of those patents (with the original XML & any associated resources/assets embedded). Their reasoning is that the archival version is for reference, and thus human readability and self-containment are of primary importance, while the XML is provided for machine processing.  The Government Printing Office in the US takes this a step further by applying a digital signature to the PDFs they distribute (and also send to NARA, the National Archives in the US) so that their authenticity is beyond question.

But again, those are their criteria/requirements - they may not be yours.

>For example, we have to be able to guarantee that editors and processors for the authoring format will remain readily available as far into the future as we care about (plus a bit),
That's an impossible task.  

What you CAN do, however, is ensure that the format itself has the appropriate attributes.  Some of these include:
- self-describing
- open standard
- well known/understood

>We also have to worry about things staying in synch.  For example, if a given source file is pushed through a given processor or set of steps to produce PDF/A today, can it be guaranteed that a processor that will be available and functional a decade or two from now, when applied to the same source, will produce the same PDF/A output?
Is that necessarily a requirement?   If you have the archival version, why do you need to remake it in the same format?   I could see remaking it in another format (aka migration),  but that's different.

>In addition, while I'm actually a big fan of PDF/A for other uses, I think the rather nice "Details Matter" article you cite may not adequately cover the problem.  Adding to it the question that Paul Hoffman (and maybe others) asked, I've noticed that there are a rather large number of programs on the market for various platforms, all of which make extravagant claims about producing PDF files.  In my experience, a rather large fraction of them can't render PDF/A properly and it is really hard to find out whether or not they can do so.
That is true.  Most PDF-producing tools don't also support PDF/A.   That just means you have to choose your tools.  But that is true for any format you select - you will always have a finite number of tools to choose from.   The question, of course, is what is an "acceptable size" for that finite set and what criteria are used to select the tools.

>Worse, very few desktop systems make it easy to use different rendering or processing programs for conventional PDF and PDF/A files.  
I don't understand this.  Can you explain?

>As soon as one starts down that path, reasonable criteria for accessibility and usability may suggest (at least for the present and the near future) that PDF/A has to be treated as a proprietary format that can be rendered accurately only by Acrobat and a small (and generally unidentified) selection of other tools.
In my presentation, I have provided a small list of well known authoring tools, programming libraries, viewers and validation tools ranging from commercial to open source.  This is just the tip of the iceberg, but it does highlight (what I would consider some of) the most common tools in use on various desktop and server platforms today, all of which support PDF/A.

If you need a larger list, the PDF Association (http://www.pdfa.org) maintains a list of hundreds of vendors and products that support PDF/A.

But as I mentioned above - what are the criteria for "enough" and "available" when defining tooling?

