design:text-requirements

Differences

This shows you the differences between two versions of the page.

design:text-requirements [2013/10/04 15:06]
paul
design:text-requirements [2013/10/09 13:57]
paul Added Paul comments
Line 6:
  
 The text-only format must allow for preformatted text and art to be 80 characters wide without wrapping those lines. This is due to the relaxation of the 72-character restriction on preformatted text and art in the new canonical format.
 +
 +The text-only format must represent all SVG-based artwork as a "see reference" that goes to a stable URL, hosted on the RFC Editor's web site, that is part of the RFC's information page. The use case here is someone who is reading the text-only document with a modern text editor that can follow URLs. For example, if Figure 3 in the canonical version of RFC 7890 is a piece of SVG art, the text only format would say something like "See http://www.rfc-editor.org/info/rfc7890/figure3.svg for this art".
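
A minimal sketch of how such a "see reference" line might be generated. The URL layout is extrapolated from the single RFC 7890 example above and the exact wording is only "something like" what the requirement describes, so both are assumptions; the function name ''svg_see_reference'' is hypothetical.

```python
def svg_see_reference(rfc_number: int, figure_number: int) -> str:
    """Build the text-only stand-in for a piece of SVG artwork.

    Assumption: the URL pattern follows the one example on this page
    (http://www.rfc-editor.org/info/rfc7890/figure3.svg).
    """
    url = (
        f"http://www.rfc-editor.org/info/rfc{rfc_number}"
        f"/figure{figure_number}.svg"
    )
    return f"See {url} for this art"


# Figure 3 of RFC 7890, as in the example above.
print(svg_see_reference(7890, 3))
```

Keeping the URL construction in one place would make it easy to change if the RFC Editor settles on a different stable-URL scheme.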
 +
 +=== ASCII vs. UTF-8 for Text Output ===
 +
 +As of 2013-10-09, it is not clear whether the text output will be ASCII or UTF-8. The following assumes ASCII. If the format is UTF-8, then the following is wrong.
  
 The text-only format must have the same character-set limitations as the current RFC format. For new RFCs that have non-ASCII characters in them, each such character must be represented as //[*U+xxxx*]//, where //xxxx// is a 4- or 6-character hex value. The use case here is that it must be possible to convert all of the encoded versions of the non-ASCII characters in the text-only document exactly to the correct characters in the canonical document. The choice of //[*U+xxxx*]// was made because it is extremely unlikely for that sequence to be part of a normal RFC, even one that talks about Unicode code points by their hex values. For example, an author's name that is represented in the canonical format as "Martin Dürst" would be represented in the text-only format as "Martin D[*U+00FC*]rst". This requires that lines in the text-only format be allowed to exceed 80 columns if those lines contain non-ASCII characters.
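
The round-trip requirement in the paragraph above can be sketched as a pair of functions. This is an illustration under stated assumptions, not a proposed implementation: the function names are hypothetical, hex digits are assumed to be uppercase (as in the "Dürst" example), and code points above U+FFFF are assumed to use the 6-digit form.

```python
import re

# Matches [*U+xxxx*] with exactly 6 or exactly 4 uppercase hex digits.
# Uppercase-only is an assumption based on the example on this page.
_ESCAPE = re.compile(r"\[\*U\+([0-9A-F]{6}|[0-9A-F]{4})\*\]")


def to_text_only(canonical: str) -> str:
    """Replace every non-ASCII character with its [*U+xxxx*] form."""
    parts = []
    for ch in canonical:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)
        else:
            # Assumption: 4 hex digits up to U+FFFF, 6 digits beyond that.
            width = 4 if cp <= 0xFFFF else 6
            parts.append(f"[*U+{cp:0{width}X}*]")
    return "".join(parts)


def from_text_only(text: str) -> str:
    """Convert every [*U+xxxx*] sequence back to the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


name = "Martin D\u00fcrst"          # "Martin Dürst" with precomposed U+00FC
encoded = to_text_only(name)
print(encoded)                      # Martin D[*U+00FC*]rst
print(from_text_only(encoded) == name)
```

Each encoded character occupies 10 or 12 columns instead of one, which is why lines containing non-ASCII characters can grow past 80 columns. The sketch does not validate 6-digit values above U+10FFFF, and it would wrongly decode a canonical document that already contains a literal //[*U+xxxx*]// sequence; the requirement treats that case as extremely unlikely.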
  
-The text-only format must represent all SVG-based artwork as a "see reference" that goes to a stable URL, hosted on the RFC Editor's web site, that is part of the RFC's information page. The use case here is someone who is reading the text-only document with a modern text editor that can follow URLs. For example, if Figure 3 in the canonical version of RFC 7890 is a piece of SVG art, the text only format would say something like "See http://www.rfc-editor.org/info/rfc7890/figure3.svg for this art".
 +//Dave thinks: disagree with the above paragraph. I'm leaning towards saying there should be a separate UTF-8 (e.g. .utf8) text version. And for either version I don't think any U+ sequence should appear for a person's name.//
  
 +//Paul thinks: if there are two versions, the .txt should be UTF-8 and the ASCII version should be .asc. If there is an all-ASCII version, we need to ask the authors how they want their names (mis)spelled in ASCII.//
  
design/text-requirements.txt · Last modified: 2013/10/15 19:52 by paul