This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
design:formats [2013/08/16 16:23] paul created |
design:formats [2013/10/15 19:07] (current) rsewikiadmin |
||
---|---|---|---|
Line 1: | Line 1: | ||
====== Thoughts on Non-Canonical Formats ====== | ====== Thoughts on Non-Canonical Formats ====== | ||
- | |||
This page is for keeping thoughts about the expected output formats *other than* XML. | This page is for keeping thoughts about the expected output formats *other than* XML. | ||
Line 7: | Line 6: | ||
* Well-structured HTML | * Well-structured HTML | ||
- | * Unpaginated text | + | * Text (Unpaginated text and Paginated text) |
- | * Paginated text | + | |
* EPUB | * EPUB | ||
Line 14: | Line 12: | ||
===== Well-structured HTML ===== | ===== Well-structured HTML ===== | ||
- | < | + | Initial proposal: A strong design goal is that the conversion from canonical XML to HTML should be round-trippable, |
+ | Response: | ||
+ | * Round-tripping would require preserving non-semantic information. | ||
+ | * For instance, it'll be hard not to lose information from <author> elements, because the various aspects get rendered into different places. | ||
+ | * If roundtripping is not a goal, we need to make clear what kind of information we want to be represented in the HTML. "All semantic content" | ||
+ | * Semantic information *will* be lost during the transformation. The balancing act is making certain that enough semantic information is kept for making the HTML output useful for html-processing tools. The tough part is how to express that as a requirement. | ||
- | ===== Unpaginated Text ===== | + | For the example of counter=" |
+ | | ||
+ | | ||
+ | when the namespace characters are different. (What do you do with a | ||
+ | | ||
+ | | ||
+ | down. But is it the type of semantic information that *needs* to be | ||
+ | | ||
- | ===== Paginated Text ===== | + | Initial proposal: Consider allowing (eventually) JavaScript | ||
+ | Response: No | ||
+ | * This would make people feel less safe when opening RFCs | ||
+ | * Would make it more difficult to ensure RFCs look the same in all environments | ||
+ | * We wouldn' | ||
+ | * It's completely unnecessary for a simple text document. | ||
+ | * Think of the testing involved. | ||
- | A way to eliminate | + | |
+ | ===== Text ===== | ||
+ | === ASCII vs. UTF-8 for Text Output === | ||
+ | |||
+ | As of 2013-10-09, it is not clear whether the text output will be ASCII or UTF-8. The following assumes ASCII. If the format is UTF-8, then the following is wrong. | ||
+ | |||
+ | The text-only format must have the same character-set limitations as the current RFC format. For new RFCs that have non-ASCII characters in them, each such character must be represented as // | ||
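If the text output does stay ASCII-only, a conversion tool would first need to find the characters that fall outside that limit. The following is only a rough sketch of such a check; the function name `find_non_ascii` is made up for illustration, and it says nothing about how each character should then be represented.

```python
def find_non_ascii(text):
    """Return (index, character) pairs for every non-ASCII character in text.

    Hypothetical helper: it only locates the characters; choosing a
    replacement for each one is left to the author or the tool.
    """
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Example: an author name containing one non-ASCII character.
for index, char in find_non_ascii("P\u00e5l Example"):
    print(f"position {index}: U+{ord(char):04X}")
```

An empty result means the line already fits the ASCII limitation.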
+ | |||
+ | //Dave thinks: disagree with the above paragraph. I'm leaning towards saying there should be a separate UTF-8 (e.g. .utf8) text version. | ||
+ | |||
+ | //Paul thinks: if there are two versions, the .txt should be UTF-8 and the ASCII version should be .asc. If there is an all-ASCII version, we need to ask the authors how they want their names (mis)spelled in ASCII.// | ||
+ | |||
+ | Initial proposal: There should be multiple text outputs: ASCII-only with page breaks, ASCII-only without page breaks, UTF-8 with page breaks, UTF-8 without page breaks. | ||
+ | |||
+ | Response: Limit the .txt output to one option only, as similar as reasonable to what is available today. | ||
+ | |||
+ | ==== Avoiding Bad Breaks | ||
+ | |||
+ | The paginated text format needs to deal with paragraphs or artwork that would be split across a page break. | ||
+ | |||
+ | [PH] Eliminate the problem | ||
+ | |||
+ | [TH] (widow == bottom line of a paragraph that winds up in the next column/ | ||
+ | |||
+ | If you must limit the page size to a maximum of N lines, then you can use a page length of N-1 lines to force another line onto the top of the next page. If headings occur prior to the orphan, then they must be moved to the next page as well. Paragraphs exactly 3 lines long that have been split in either direction would just be moved to the next page, along with any headings. | ||
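The N-1 rule above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not a proposed implementation: the name `paginate` is made up, each paragraph is assumed to be a list of already-wrapped lines, and headings and blank separator lines are not modelled. An orphan is taken to be the first line of a paragraph left alone at the bottom of a page.

```python
def paginate(paragraphs, max_lines):
    """Split paragraphs (lists of lines) into pages of at most max_lines lines,
    never leaving a single line of a paragraph alone on either side of a break."""
    if max_lines < 3:
        raise ValueError("max_lines must be at least 3")
    pages = [[]]
    for para in paragraphs:
        remaining = list(para)
        at_start = True  # True while no line of this paragraph has been placed
        while remaining:
            page = pages[-1]
            room = max_lines - len(page)
            if len(remaining) <= room:
                page.extend(remaining)  # the rest of the paragraph fits
                break
            take = room
            if take > 0 and len(remaining) - take == 1:
                # One line would spill over: end this page one line short
                # (N-1) so that two lines start the next page.
                take -= 1
            if at_start and take < 2:
                # Fewer than two lines would stay behind: move the whole
                # paragraph to the next page instead.
                take = 0
            page.extend(remaining[:take])
            remaining = remaining[take:]
            at_start = at_start and take == 0
            pages.append([])
    return pages


# Two 3-line paragraphs on 5-line pages: the second one moves as a whole.
for number, page in enumerate(paginate([["a1", "a2", "a3"], ["b1", "b2", "b3"]], 5), 1):
    print(number, page)
```

A 3-line paragraph can never be split under these rules, since any split leaves one line alone on one side, which matches the behaviour described above for paragraphs exactly 3 lines long.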
===== PDF ===== | ===== PDF ===== | ||
- | We have talked about using [[http:// | + | Initial proposal: The document needs to include live links |
+ | For linking between RFCs, pointers to RFCs published before the format switchover will point to the TXT version | ||
+ | For linking between RFCs, pointers to RFCs published after the format switchover will point to the PDF version and will allow for pointers to specific sections within a document | ||
+ | The PDF version will include the standard front page header and include page numbers | ||
+ | | ||
- | We need to have at least two formats: US-standard (8.5 x 11) and A4. | + | Response: With HTML as an option, there is not a compelling case to require links in the PDF. One use case described was that of the IESG, several members of which choose to print out the PDF version for review. |
+ | |||
+ | The team also briefly considered how a tool like [[http:// | ||
===== EPUB ===== | ===== EPUB ===== | ||
If we also want to do MOBI (the native Amazon format), we might consider running the free-but-closed-source program from Amazon [[http:// | If we also want to do MOBI (the native Amazon format), we might consider running the free-but-closed-source program from Amazon [[http:// | ||
+ | |||
+ | If the HTML output is designed well, it can be used to create EPUB output with few, if any, additional requirements. | ||