design:utf8-requirements

Differences

This shows you the differences between two versions of the page.

design:utf8-requirements [2013/10/15 14:23]
rsewikiadmin created
design:utf8-requirements [2013/11/06 13:53]
rsewikiadmin
Line 1: Line 1:
  
-    All documents will be UTF-8 encoded and MUST apply Normalization Form C to all metadata fields such as document name, authors, and references unless a specific exception is granted by the RSE. The body of the document MAY contain other normalization forms as declared necessary by the authors.  Non-ASCII characters are only allowed in Author names, contact information, examples, and References.  Author names will also require an ASCII representation to encourage broader indexing.
 +Author names: Valid Unicode is required, and for non-ASCII names, an ASCII-only identifier is required.
 + 
 +Bibliographic text: The reference must point to something that has been translated to English; whatever subfields
 +are present MUST be available in ASCII (translated to English when appropriate). As long as good sense is used, they MAY also appear in non-ASCII characters at author discretion.  This applies to both normative and informative references.
 + 
 +Keywords: US-ASCII only 
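
The author-name and keyword rules above can be checked mechanically. The following is a minimal sketch, not an official tool; the function names `is_us_ascii` and `check_author` are hypothetical and chosen only for illustration.

```python
def is_us_ascii(text: str) -> bool:
    """True if every character is in the US-ASCII range (U+0000..U+007F)."""
    return text.isascii()  # str.isascii() requires Python 3.7+


def check_author(name: str, ascii_identifier=None) -> bool:
    """Hypothetical check: a non-ASCII name needs an ASCII-only identifier."""
    if is_us_ascii(name):
        return True
    return bool(ascii_identifier) and is_us_ascii(ascii_identifier)


# Keywords must be US-ASCII only.
print(is_us_ascii("internationalization"))
# A non-ASCII author name passes only with an ASCII-only identifier.
print(check_author("Zo\u00eb Example", "Zoe Example"))
```
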
 + 
 +Body: When non-ASCII characters are mentioned, identifiers are required, the characters themselves are encouraged, and Unicode names are allowed.  General use does not require any clarifying identifiers or Unicode names. (Note the distinction between use and mention.)
 + 
 +  The identifier requirement would NOT apply in the use case and WOULD apply in the mention case.  So,
 +    CATEGORY        NUMBER
 +    naïve           300
 +  but
 +    CATEGORY        EXAMPLES
 +    Latin           naïve (U+006E U+0061 U+00EF U+0076 U+0065)
 + 
 +Tables: Tables follow the same rules for identifiers and characters as the body.  If it is sensible (i.e., more understandable for a reader) for a given document to have two tables, one including the identifiers and characters and one with just the characters, that will be allowed on a case-by-case basis.
 + 
 +U+ notation must be used except within a code component, where you must follow the rules of the programming language in which you are writing the code.
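
As an illustration of the mention case, the U+ identifiers for a string can be generated rather than typed by hand. This is a minimal sketch; the helper names `u_plus` and `mention` are hypothetical. Inside the code itself, the language's own escape syntax (`\u00ef` in Python) is used instead of U+ notation, in line with the rule above.

```python
def u_plus(text: str) -> str:
    """Return the code points of `text` in U+ notation, space-separated."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def mention(text: str) -> str:
    """Format a mentioned string as: characters, then identifiers."""
    return f"{text} ({u_plus(text)})"


# "\u00ef" is LATIN SMALL LETTER I WITH DIAERESIS as a single code point.
print(mention("na\u00efve"))
```
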
 + 
 +Normalization forms: If the normalization matters to the content, the authors must submit in a normalization-resistant form.  Do not expect normalization forms to be preserved.
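
To see why normalization forms cannot be relied on, consider the same visible word in two forms. The sketch below uses Python's standard `unicodedata` module; the variable names are illustrative only. Writing the characters as escapes is one possible way to keep an example normalization-resistant, offered here as an assumption rather than a stated requirement.

```python
import unicodedata

# Precomposed form: the i-diaeresis is the single code point U+00EF.
composed = "na\u00efve"
# Decomposed form (NFD): "i" followed by U+0308 COMBINING DIAERESIS.
decomposed = unicodedata.normalize("NFD", composed)

# The two strings render alike but differ in length and content.
print(len(composed), len(decomposed))
print(composed == decomposed)

# Re-normalizing to NFC recovers the precomposed form, so a tool that
# normalizes text in transit silently erases the difference.
print(unicodedata.normalize("NFC", decomposed) == composed)
```
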
 + 
 + 
 +All documents should identify themselves as being UTF-8.  Both the canonical XML format and the non-canonical HTML format must contain metadata that specifies that the encoding is UTF-8. The non-canonical text-only format must begin with a UTF-8 BOM.
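
For the text-only format, the BOM requirement amounts to prefixing the UTF-8 bytes with EF BB BF. The following is a minimal sketch using Python's standard `codecs` module; the function names `encode_text_format` and `has_utf8_bom` are hypothetical.

```python
import codecs


def encode_text_format(body: str) -> bytes:
    """Encode text as UTF-8 with a leading byte order mark (EF BB BF)."""
    return codecs.BOM_UTF8 + body.encode("utf-8")


def has_utf8_bom(data: bytes) -> bool:
    """True if the byte string begins with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)


data = encode_text_format("na\u00efve")
print(has_utf8_bom(data))
# The "utf-8-sig" codec strips the BOM again when decoding.
print(data.decode("utf-8-sig") == "na\u00efve")
```
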
          
-    An implementer must be able to implement the specification without any confusion or ambiguity introduced by the use of UTF-8 rather than ASCII.
 +An implementer must be able to implement the specification without any confusion or ambiguity introduced by the use of UTF-8 rather than ASCII.
          
-    Must be able to reference (cite) the document from elsewhere in a standard way, including from documents that only support ASCII.
 +People must be able to reference (cite) the RFC from elsewhere in a standard way, including from documents that only support ASCII.
          
-    Must be able to reference (cite) other documents in an unambiguous way.
 +The RFC must be able to reference (cite) other documents in an unambiguous way.
          
-    Cross-references (including references to other documents) must be unambiguous even from a printed document.
 +Cross-references (including references to other documents) must be unambiguous even from a printed document.
          
-    Must be able to index the document in various ways, so searching by keyword, author name, etc. can work.
 +Tools must be able to index the RFC in various ways, so searching for keywords, author names, and so on can work.
  
  
design/utf8-requirements.txt · Last modified: 2013/11/06 18:40 by rsewikiadmin