[rfc-i] issue: canonical formats

Leonard Rosenthol lrosenth at adobe.com
Fri Jun 1 08:25:22 PDT 2012

On 5/31/12 9:36 PM, "John Levine" <johnl at taugh.com> wrote:
>The problem with output formats is that they are simultaneously
>overconstrained and undertagged.  Something like PDF/A prints nicely,
>but it is full of stuff like fonts and line and page breaks that
>aren't relevant to the semantics of the document, while missing the
>metadata about the abstract and the postcode.

I would actually argue, especially in light of the other thread on
non-US-ASCII characters, that the Font program is EXTREMELY IMPORTANT, not
only for visual display but also for semantic relevance (as it could/would
incorporate encoding information).

Additionally, it should be recognized that PDF supports a VERY RICH
object-level (as well as document level) metadata model.  One can
associate either simplistic "name/value pairs" or an entire chunk of XMP
(an XML/RDF-based standard, ISO 16684) with any set of graphic objects.
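
To make the XMP side of that concrete, here is a minimal sketch in Python of
what such a chunk of metadata might look like when serialized. It assumes a
Dublin Core title and description are the properties of interest; the helper
names (q, build_xmp_packet) and the sample values are hypothetical, and the
step of attaching the packet to a particular PDF object is not shown.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP packets; the "dc" properties chosen are an assumption.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, local):
    """Return the qualified ElementTree name for prefix:local."""
    return "{%s}%s" % (NS[prefix], local)


def build_xmp_packet(title, abstract):
    """Build a small XMP packet (hypothetical helper) as a Unicode string."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    for name, value in (("title", title), ("description", abstract)):
        prop = ET.SubElement(desc, q("dc", name))
        # Language alternatives let one property carry several translations.
        alt = ET.SubElement(prop, q("rdf", "Alt"))
        li = ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"})
        li.text = value
    body = ET.tostring(xmpmeta, encoding="unicode")
    header = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    return header + "\n" + body + "\n" + '<?xpacket end="w"?>'


if __name__ == "__main__":
    print(build_xmp_packet("Canonical formats", "Abstract text, e.g. Zürich"))
```

The values are carried as Unicode text inside the XML, so non-US-ASCII
characters in an abstract or title survive independently of whichever font
is used to render the page.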

