design:text-requirements

Differences

This shows you the differences between two versions of the page.

design:text-requirements [2013/09/03 16:14]
rsewikiadmin
design:text-requirements [2013/10/09 13:57]
paul Added Paul comments
Line 1:
 ==== Requirements for Text Output ====
  
-Technical requirements:
-The .txt publication format must
-  * use classic/branded headers on the first page
-  * have paginated output
-  * use classic/branded footers on each page
-  * support UTF-8 encoding
+The following assumes that there is going to be only one text-only non-canonical format, that its intended consumer is people who like the current format but want to be able to get all the semantic content out of the new canonical format, and that the HTML format will be done in a way where copy-and-paste of text will be easy and predictable.
  
+The text-only format must have the same page height limitations as the current RFC format. It must contain the same headers and footers and page break character as the current RFC format. There is no requirement that the text-only format use widow and orphan control; the use case for this is to allow similar referencing of paragraphs from the text-only format to the canonical format and non-canonical HTML format.
  
-Policy requirements:
-  * an RFC may include ASCII art OR SVG, but not both
-  * the .txt publication output will include links to the info pages where there are images
-    * we could potentially include direct links to the HTML anchors, but am concerned about future failure of links; info page links are likely to be more stable
-  * non-ASCII characters in the Author's Address section only (for now; this will expand in the future)
+The text-only format must allow for preformatted text and art to be 80 characters wide without wrapping those lines. This is due to the relaxation of the 72-character restriction on preformatted text and art in the new canonical format.
+
+The text-only format must represent all SVG-based artwork as a "see reference" that goes to a stable URL, hosted on the RFC Editor's web site, that is part of the RFC's information page. The use case here is someone who is reading the text-only document with a modern text editor that can follow URLs. For example, if Figure 3 in the canonical version of RFC 7890 is a piece of SVG art, the text-only format would say something like "See http://www.rfc-editor.org/info/rfc7890/figure3.svg for this art".
+
+=== ASCII vs. UTF-8 for Text Output ===
+
+As of 2013-10-09, it is not clear whether the text output will be ASCII or UTF-8. The following assumes ASCII. If the format is UTF-8, then the following is wrong.
+
+The text-only format must have the same character-set limitations as the current RFC format. For new RFCs that have non-ASCII characters in them, each such character must be represented as //[*U+xxxx*]//, where //xxxx// is a 4- or 6-character hex value. The use case here is that it must be possible to convert all of the encoded versions of the non-ASCII characters in the text-only document exactly to the correct characters in the canonical document. The choice of //[*U+xxxx*]// was made because it is extremely unlikely for that sequence to be part of a normal RFC, even one that talks about Unicode code points by their hex values. For example, an author's name that is represented in the canonical format as "Martin Dürst" would be represented in the text-only format as "Martin D[*U+00FC*]rst". This requires that lines in the text-only format be allowed to exceed 80 columns when they contain non-ASCII characters.
+
+//Dave thinks: disagree with the above paragraph. I'm leaning towards saying there should be a separate UTF-8 (e.g. .utf8) text version. And for either version I don't think any U+ sequence should appear for a person's name.//
+
+//Paul thinks: if there are two versions, the .txt should be UTF-8 and the ASCII version should be .asc. If there is an all-ASCII version, we need to ask the authors how they want their names (mis)spelled in ASCII.//
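To make the ASCII proposal concrete, here is a minimal Python sketch of the //[*U+xxxx*]// round trip described above. The function names ''escape_non_ascii'' and ''unescape'' are hypothetical, and the choice of 4 hex digits for code points up to U+FFFF and 6 digits above that is an assumption about how "4- or 6-character" would be applied; the page does not settle this.

```python
import re

# Matches the proposed escape form: "[*U+", then 6 or 4 hex digits, then "*]".
# The 6-digit alternative is tried first so longer escapes are not split.
ESCAPE_RE = re.compile(r"\[\*U\+([0-9A-Fa-f]{6}|[0-9A-Fa-f]{4})\*\]")


def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a [*U+xxxx*] escape."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                  # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append(f"[*U+{cp:04X}*]")   # assumed: 4 hex digits in the BMP
        else:
            out.append(f"[*U+{cp:06X}*]")   # assumed: 6 hex digits above the BMP
    return "".join(out)


def unescape(text: str) -> str:
    """Convert [*U+xxxx*] escapes back to the characters they encode."""
    def replace(match: re.Match) -> str:
        cp = int(match.group(1), 16)
        # Leave out-of-range values untouched rather than failing.
        return chr(cp) if cp <= 0x10FFFF else match.group(0)
    return ESCAPE_RE.sub(replace, text)


name = "Martin D\u00fcrst"
ascii_form = escape_non_ascii(name)
print(ascii_form)
print(unescape(ascii_form) == name)
```

Note that a source text that already contains a literal //[*U+xxxx*]// sequence would not survive the round trip; the proposal relies on that sequence being extremely unlikely in a normal RFC.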
  
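The "see reference" line for SVG artwork can be sketched the same way. This is only an illustration: the helper name ''svg_reference'' is hypothetical, and the URL pattern is copied from the RFC 7890 example in the proposal, which itself says only "something like", so the real layout of the info pages may differ.

```python
def svg_reference(rfc_number: int, figure_number: int) -> str:
    """Build the text-only placeholder for a piece of SVG artwork.

    Assumption: the URL layout follows the example given in the proposal
    (info page for the RFC, then figureN.svg); it is not a settled scheme.
    """
    url = (
        f"http://www.rfc-editor.org/info/rfc{rfc_number}"
        f"/figure{figure_number}.svg"
    )
    return f"See {url} for this art"


print(svg_reference(7890, 3))
```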
design/text-requirements.txt · Last modified: 2013/10/15 19:52 by paul