design:text-requirements

Differences

This shows you the differences between two versions of the page.

design:text-requirements [2013/09/03 16:22]
rsewikiadmin
design:text-requirements [2013/10/09 13:57]
paul Added Paul comments
Line 1: Line 1:
-==== Requirements for Text Output ==== +==== Requirements for Text Output ====
  
-Technical requirements: +The following assumes that there will be only one text-only non-canonical format, that its intended consumers are people who like the current format but want to be able to get all the semantic content out of the new canonical format, and that the HTML format will be done in a way where copy-and-paste of text will be easy and predictable.
-The .txt publication format must +
-  * use classic/branded headers on the first page +
-  * have paginated output +
-  * use classic/branded footers on each page +
-  * support UTF-8 encoding+
  
 +The text-only format must have the same page height limitations as the current RFC format. It must contain the same headers and footers and page break character as the current RFC format. There is no requirement that the text-only format use widow and orphan control; the use case for this is to allow similar referencing of paragraphs from the text-only format to the canonical format and non-canonical HTML format.
  
-Policy requirements: +The text-only format must allow for preformatted text and art to be 80 characters wide without wrapping those lines. This is due to the relaxation of the 72-character restriction on preformatted text and art in the new canonical format.
-  * an RFC may include ASCII art OR SVG, but not both +
-  * the .txt publication output will include links to the info pages where there are images +
-    * we could potentially include direct links to the HTML anchors, but am concerned about future failure of links; info page links are likely to be more stable +
-  * non-ASCII characters in the Author's Address section only (for now; this will expand in the future)+
  
 +The text-only format must represent all SVG-based artwork as a "see reference" that goes to a stable URL, hosted on the RFC Editor's web site, that is part of the RFC's information page. The use case here is someone who is reading the text-only document with a modern text editor that can follow URLs. For example, if Figure 3 in the canonical version of RFC 7890 is a piece of SVG art, the text-only format would say something like "See http://www.rfc-editor.org/info/rfc7890/figure3.svg for this art".
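
A minimal sketch of how such a "see reference" line could be generated. The function name ''svg_reference'' is hypothetical, and the URL layout simply copies the "something like" example above; the actual stable URL scheme under the info page has not been decided.

```python
def svg_reference(rfc_number: int, figure_number: int) -> str:
    """Build the placeholder line that stands in for an SVG figure.

    Assumption: the URL pattern mirrors the RFC 7890 example in the text;
    it is illustrative only, not a decided scheme.
    """
    url = (
        f"http://www.rfc-editor.org/info/rfc{rfc_number}"
        f"/figure{figure_number}.svg"
    )
    return f"See {url} for this art"


if __name__ == "__main__":
    print(svg_reference(7890, 3))
```

Keeping the URL construction in one place would make it easy to change the pattern once the info-page layout is settled.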
  
-Current requirements re: page breaking for xml2rfc v2 +=== ASCII vs. UTF-8 for Text Output ===
  
-  #172: page-breaking enhancements +As of 2013-10-09, it is not clear whether the text output will be ASCII or UTF-8. The following assumes ASCII. If the format is UTF-8, then the following is wrong.
-  Improve the PI autobreaks="yes". Currently, it seems to be "yes" by default but has no effect. It should prevent a page break from appearing in +
-  - the middle of a figure (see Figure 1 in the attached file) +
-  - in the middle of a reference (see Section 10.2 in the attached file) +
-  - between a section title and the first sentence (see Section 5 in the attached file). This is the same as #72, which is marked fixed, but doesn't seem to be for this case. +
-  If there is an Appendix A, insert a page break before it. (This could be tied to rfcedstyle="yes", as it's RFC Editor style.) Or, improve the ability to insert a page break -- vspace blankLines="100" +
-doesn't always work.+
  
-Ticket URL: <http://trac.tools.ietf.org/tools/xml2rfc/trac/ticket/172>] +The text-only format must have the same character-set limitations as the current RFC format. For new RFCs that have non-ASCII characters in them, each such character must be represented as //[*U+xxxx*]//, where //xxxx// is a 4- or 6-character hex value. The use case here is that it must be possible to convert all of the encoded versions of the non-ASCII characters in the text-only document exactly to the correct characters in the canonical document. The choice of //[*U+xxxx*]// was made because it is extremely unlikely for that sequence to be part of a normal RFC, even one that talks about Unicode code points by their hex values. For example, an author's name that is represented in the canonical format as "Martin Dürst" would be represented in the text-only format as "Martin D[*U+00FC*]rst". This requires that lines in the text-only format be longer than 80 columns if those lines contain non-ASCII characters.
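
The round trip described above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, hex digits are assumed to be uppercase, and 4 digits are assumed for code points up to U+FFFF with 6 digits otherwise (the text only says "4- or 6-character").

```python
import re

# Matches [*U+xxxx*] or [*U+xxxxxx*]; uppercase hex is an assumption.
_ESCAPE = re.compile(r"\[\*U\+([0-9A-F]{6}|[0-9A-F]{4})\*\]")


def encode_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with its [*U+xxxx*] form."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # ASCII passes through unchanged
        else:
            width = 4 if cp <= 0xFFFF else 6
            out.append(f"[*U+{cp:0{width}X}*]")
    return "".join(out)


def decode_escapes(text: str) -> str:
    """Convert [*U+xxxx*] sequences back to the characters they encode."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


if __name__ == "__main__":
    encoded = encode_non_ascii("Martin Dürst")
    print(encoded)
    print(decode_escapes(encoded))
```

Note that the decoder treats any matching sequence as an escape, so the scheme relies on the stated assumption that //[*U+xxxx*]// does not otherwise occur in an RFC.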
  
 +//Dave thinks: disagree with the above paragraph. I'm leaning towards saying there should be a separate UTF-8 (e.g. .utf8) text version.  And for either version I don't think any U+ sequence should appear for a person's name.//
  
-Current internal guidance to the RPC: +//Paul thinks: if there are two versions, the .txt should be UTF-8 and the ASCII version should be .asc. If there is an all-ASCII version, we need to ask the authors how they want their names (mis)spelled in ASCII.//
- +
-  "3+ lines held together"  i.e., at the bottom of a page, there is not a single line or 2 lines (unless they are their own paragraph). +
-  Keep a section title with the first 3 lines of the section. +
-  Keep a figure on one page. This includes its title (if any). +
-   - if figure does not fit on one page, break between discrete items in the figure +
-  Keep a table on one page. This includes its title (if any). +
-   - if table does not fit on one page, break between rows. +
-  Keep a term with its definition. +
-   - if definition is long (over 5 lines), keep at least the first 3 lines of the definition with the term. +
-  Keep a given reference on one page. +
-  Start Appendix A on a new page. +
- +
-  More subjective: +
-  Keep intro text with what it's introducing. (Example: text that introduces a list, an equation, or a piece of pseudocode) +
-  Break pseudocode nicely.+
  
design/text-requirements.txt · Last modified: 2013/10/15 19:52 by paul