This shows you the differences between two versions of the page.
design:utf-8 [2013/09/04 19:26] dthaler created |
design:utf-8 [2013/10/15 15:54] dthaler |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | Some discussion of requirements, | + | Some discussion of requirements, |
The table below summarizes a taxonomy of cases where (non-ASCII) UTF-8 might or might not be allowed, along with some thoughts. The intent is that each row represents a separate policy decision. | The table below summarizes a taxonomy of cases where (non-ASCII) UTF-8 might or might not be allowed, along with some thoughts. The intent is that each row represents a separate policy decision. | ||
Line 5: | Line 5: | ||
^ Case ^ Section ^ Use ^ Consensus ^ Comments ^ | ^ Case ^ Section ^ Use ^ Consensus ^ Comments ^ | ||
- | | 1a | (title page) | Author name | Yes? | Answer should match (6a) | | + | | 1a | (title page) | Author name | Yes | Answer should match (6a) | |
- | | 1b | (title page) | Author affiliation | | Answer should match (6b) | | + | | 1b | (title page) | Author affiliation | Yes | Answer should match (6b) | |
- | | 1c | (title page) | Document title | | | | + | | 1c | (title page) | Document title | No? | | |
- | | 2 | Abstract | | | | | + | | 2 | Abstract | Prose | No? | Could be same as (3e), but Abstracts may also be separately compiled into other indices so could have a different answer | |
- | | 3a | Body or Appendix | Example string | | E.g. fictional person name, IRI, EAI, domain name, etc. Currently there' | + | | 3a | Body or Appendix | Example string | Yes | E.g. fictional person name, IRI, EAI, domain name, etc. Currently there' |
- | | 3b | Body or Appendix | Code snippet | | | | + | | 3b | Body or Appendix | Code snippet | No | " |
- | | 3c | Body or Appendix | Literal protocol element | | Transliteration | + | | 3c | Body or Appendix | Literal protocol element | Yes? | Required transliteration |
- | | 3d | Body or Appendix | Document title of a cited document | | Answer should match (4c) | | + | | 3d | Body or Appendix | Document title of a cited document | Yes | Answer should match (4c) | |
- | | 3e | Body or Appendix | Prose | No? | e.g. use of " | + | | 3e | Body or Appendix | Prose | No | e.g. use of " |
- | | 4a | References | Author name | Yes? | Answer should match (1a) | | + | | 3f | Body or Appendix | Section title | No? | | |
- | | 4b | References | Author affiliation | | Answer should match (1b) | | + | | 4a | References | Author name | Yes | Answer should match (1a) | |
- | | 4c | References | Document title | | Not necessarily an RFC that's being referenced | | + | | 4b | References | Author affiliation | Yes | Answer should match (1b) | |
- | | 4d | References | Document IRI | No? | | | + | | 4c | References | Document title | Yes | Not necessarily an RFC that's being referenced | |
- | | 5a | Acknowledgements | Person name | Yes? | | | + | | 4d | References | Document IRI | No | | |
- | | 5b | Acknowledgements | Organization name | | | | + | | 5a | Acknowledgements | Person name | Yes | | |
- | | 6a | Authors Addresses | Author name | Yes? | | | + | | 5b | Acknowledgements | Organization name | Yes | | |
- | | 6b | Authors Addresses | Author affiliation | | | | | + | | 6a | Authors Addresses | Author name | Yes | | |
- | | 6c | Authors Addresses | Author email address (EAI) | | | | + | | 6b | Authors Addresses | Author affiliation | Yes | | | |
- | | 6c | Authors Addresses | Author IRI | No? | | | + | | 6c | Authors Addresses | Author email address (EAI) | No? | | |
+ | | 6d | Authors Addresses | Author IRI | No | | | ||
+ | | 6e | Authors Addresses | Author postal address | Yes | | | ||
+ | | 7a | (page footer) | Author surname | Yes? | | | ||
+ | | 7b | (page header) | Abbreviated document name | No? | | | ||
+ | | 8a | (metadata) | Keywords | Yes | | | ||
| | | | ||
+ | |||
+ | Other open questions: | ||
+ | - Where UTF-8 is allowed, which normalization forms are acceptable (NFC, NFD, NFKC, or NFKD)? | ||
+ | - //Paul thinks//: irrelevant. Characters are characters. | ||
+ | - Can you reference an external document that contains a non-ASCII title/
+ | - //Paul thinks//: Yes, definitely. This is important for non-ASCII author names. Transliteration can be guessed at by the RFC author. | ||
+ | - // | ||
+ | - ... | ||
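To make the normalization question above concrete, the following is a minimal sketch that uses Python's standard `unicodedata` module to show how one string differs under the four forms. The sample string and the helper name `normalization_forms` are illustrative assumptions, not part of the proposal.

```python
import unicodedata

def normalization_forms(text: str) -> dict:
    """Return the text under each of the four Unicode normalization forms."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }

# Hypothetical sample: "e" plus a combining acute accent, then the "fi" ligature.
sample = "e\u0301\ufb01"
forms = normalization_forms(sample)

# List the code points of each form so the differences are visible
# even where the rendered glyphs look identical.
for name, value in forms.items():
    print(name, [f"U+{ord(ch):04X}" for ch in value])
```

The canonical forms (NFC, NFD) only compose or decompose the accent, while the compatibility forms (NFKC, NFKD) also replace the ligature with plain letters, so the choice of form can change which characters a document actually contains.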
+ | |||
+ | Strawman requirements (beyond those listed at https:// | ||
+ | - An implementer must be able to implement the specification without any confusion or ambiguity introduced by the use of UTF-8 rather than ASCII. | ||
+ | - Must be able to reference (cite) the document from elsewhere in a standard way, including from documents that only support ASCII. | ||
+ | - Must be able to reference (cite) other documents in an unambiguous way. | ||
+ | - Cross-references (including references to other documents) must be unambiguous even from a printed document. | ||
+ | - Must be able to index the document in various ways, so searching by keyword, author name, etc. can work. | ||
+ | - All documents will be UTF-8 encoded and MUST apply Normalization Form C to all metadata fields such as document name, authors, and references unless a specific exception is granted by the RSE. The body of the document MAY contain other normalization forms as declared necessary by the authors. | ||
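The Normalization Form C requirement on metadata could be checked mechanically. The sketch below is one possible approach, assuming metadata is available as a simple mapping of field names to strings; the field names, sample values, and the helpers `non_nfc_fields` and `to_nfc` are hypothetical.

```python
import unicodedata

def non_nfc_fields(metadata: dict) -> list:
    """Return the names of metadata fields whose values are not already in NFC."""
    return [
        name
        for name, value in metadata.items()
        if unicodedata.normalize("NFC", value) != value
    ]

def to_nfc(metadata: dict) -> dict:
    """Return a copy of the metadata with every value normalized to NFC."""
    return {
        name: unicodedata.normalize("NFC", value)
        for name, value in metadata.items()
    }

# Hypothetical metadata for illustration only.
metadata = {
    "title": "Example Document",
    "author": "Jose\u0301 Example",  # decomposed: "e" + combining acute accent
    "keywords": "caf\u00e9",         # already precomposed
}

bad = non_nfc_fields(metadata)  # fields a checker would flag
fixed = to_nfc(metadata)        # normalized copy
print(bad)
```

A checker along these lines would flag only the decomposed author name; whether a tool should reject such input or silently normalize it is a policy choice the requirement leaves open.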
+ | |||
+ | Strawman principles (similar to RFC 6912 approach): | ||
+ | - If a use of UTF-8 could affect interoperability or block an implementer from implementing the specification, it must be accompanied by an ASCII transliteration. The transliteration must be marked clearly as a transliteration, not as part of the original item.
+ | - Do not assume that any non-ASCII character will necessarily be rendered correctly (or at all).
+ | - ... |
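The transliteration principle above can be sketched as a small helper that pairs a non-ASCII string with an explicitly labeled ASCII fallback. The function name `with_transliteration` and the label wording are assumptions for illustration; the page does not prescribe any particular presentation.

```python
def with_transliteration(original: str, transliteration: str) -> str:
    """Pair a string with a clearly labeled ASCII transliteration.

    Pure-ASCII originals are returned unchanged, since they need no fallback.
    """
    if not transliteration.isascii():
        raise ValueError("transliteration must contain only ASCII characters")
    if original.isascii():
        return original
    # The label marks the fallback as a transliteration rather than
    # as part of the original item.
    return f"{original} (ASCII transliteration: {transliteration})"

# Hypothetical example name.
labeled = with_transliteration("M\u00fcller", "Mueller")
print(labeled)
```

Because the label travels with the text, a reader whose display drops or garbles the non-ASCII characters still sees an unambiguous ASCII form.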