
Thoughts on Non-Canonical Formats

This page collects thoughts on the expected output formats *other than* XML.

The formats discussed so far are:

  • Well-structured HTML
  • Unpaginated text
  • Paginated text
  • PDF
  • EPUB

Recent discussion has centered on generating the well-structured HTML from the XML, and the remaining formats from the HTML, because more conversion tools usually exist for HTML than for XML.

Well-structured HTML

A strong design goal is that the conversion from canonical XML to HTML be round-trippable: it should be possible to convert the HTML back to XML with no loss of semantic content.
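The round-trip goal can be illustrated with a minimal sketch. It assumes a purely hypothetical mapping in which every XML element becomes an HTML `div` that records the original element name and attributes in `data-xml-*` attributes. The attribute names, the helper functions, and the sample document are illustrative assumptions, not a proposed design.

```python
import xml.etree.ElementTree as ET

# Hypothetical naming convention for carrying XML semantics inside HTML.
TAG_ATTR = "data-xml-tag"
ATTR_PREFIX = "data-xml-attr-"


def xml_to_html(elem):
    """Map an XML element to a <div> that remembers the original tag and attributes."""
    html = ET.Element("div", {TAG_ATTR: elem.tag})
    for name, value in elem.attrib.items():
        html.set(ATTR_PREFIX + name, value)
    html.text, html.tail = elem.text, elem.tail
    for child in elem:
        html.append(xml_to_html(child))
    return html


def html_to_xml(html):
    """Invert xml_to_html using only the information stored in the HTML."""
    elem = ET.Element(html.get(TAG_ATTR))
    for name, value in html.attrib.items():
        if name.startswith(ATTR_PREFIX):
            elem.set(name[len(ATTR_PREFIX):], value)
    elem.text, elem.tail = html.text, html.tail
    for child in html:
        elem.append(html_to_xml(child))
    return elem


def round_trips(xml_text):
    """Return True if XML -> HTML -> XML reproduces the same serialization."""
    original = ET.fromstring(xml_text)
    restored = html_to_xml(xml_to_html(original))
    return (ET.tostring(original, encoding="unicode")
            == ET.tostring(restored, encoding="unicode"))
```

A check like `round_trips(...)` could run over every document as a regression test; a real converter would use meaningful HTML elements rather than generic `div`s, but the invariant being tested would be the same.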

Joe goes here.

Unpaginated Text

Paginated Text

One way to eliminate the need for widow/orphan control and to avoid bad page breaks in figures is simply to accept extra white space at the bottom of a page: when the next paragraph or figure does not fit in the space remaining, it starts on a new page. If a single paragraph or figure is too large to fit on one page, the tool should warn about it every time it emits paginated text output, and the Production Center can then break the paragraph or split the figure in two.
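The approach described above can be sketched as a greedy page filler that never splits a block. This is only a sketch under stated assumptions: the `paginate` function, the representation of blocks as lists of lines, and the 58-line default page height are all hypothetical choices made for illustration.

```python
def paginate(blocks, page_height=58):
    """Place whole blocks (paragraphs or figures) on pages without splitting them.

    blocks: list of blocks, each a list of text lines.
    Returns (pages, warnings). A block taller than a page is emitted alone
    on an oversized page and reported, so that it can be split by hand.
    """
    pages, current, warnings = [], [], []
    for index, block in enumerate(blocks):
        if len(block) > page_height:
            warnings.append(
                f"block {index} has {len(block)} lines; "
                f"page height is {page_height}"
            )
            if current:
                pages.append(current)
                current = []
            pages.append(list(block))
            continue
        # One blank separator line is needed if the page already has content.
        needed = len(block) + (1 if current else 0)
        if len(current) + needed > page_height:
            # Leave the rest of this page as white space and start a new one.
            pages.append(current)
            current = []
        if current:
            current.append("")
        current.extend(block)
    if current:
        pages.append(current)
    return pages, warnings
```

Because the warning list is rebuilt on every run, an oversized paragraph or figure keeps being reported until someone actually splits it in the source.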

PDF

We have talked about using PrinceXML to generate PDF.

We need to support at least two page sizes: US Letter (8.5 x 11 inches) and A4 (210 x 297 mm).
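If the PDF is produced from HTML by a CSS-based formatter, the two page sizes could be selected with a one-line stylesheet per size. The sketch below assumes the formatter honors the CSS paged-media `@page` rule; the `page_css` helper, the size table, and the one-inch default margin are hypothetical.

```python
# Hypothetical table of supported page sizes, expressed as CSS lengths.
PAGE_SIZES = {
    "letter": "8.5in 11in",
    "a4": "210mm 297mm",
}


def page_css(name, margin="1in"):
    """Return a CSS @page rule for the named page size."""
    size = PAGE_SIZES[name]
    return f"@page {{ size: {size}; margin: {margin}; }}\n"
```

Keeping the page size in a separate stylesheet would let the same HTML feed both PDF variants unchanged.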

EPUB

If we also want to produce MOBI (the native Amazon format), we might consider running the free-but-closed-source conversion program from Amazon (http://www.amazon.com/gp/feature.html?ie=UTF8&docId=1000765211). If people are hesitant about us even partially supporting that program, we could instead put a pointer to it on an advisory page on rfc-editor.org.

design/formats.1377044082.txt.gz · Last modified: 2013/08/20 17:14 by rsewikiadmin