RFC Errata
RFC 7159, "The JavaScript Object Notation (JSON) Data Interchange Format", March 2014
Note: This RFC has been obsoleted by RFC 8259
Source of RFC: json (app)
Errata ID: 4220
Status: Rejected
Type: Technical
Publication Format(s): TEXT
Reported By: mirabilos
Date Reported: 2015-01-05
Rejected by: Barry Leiba
Date Rejected: 2015-03-01
Section 7 says:
[…] All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F). […]
unescaped = %x20-21 / %x23-5B / %x5D-10FFFF
It should say:
[…] All Unicode characters may be placed within the quotation marks, except for the characters that must be escaped: quotation mark, reverse solidus, the control characters (U+0000 through U+001F), and codepoints beyond the BMP (U-00010000 through U-0010FFFF). […]
unescaped = %x20-21 / %x23-5B / %x5D-FFFF
Notes:
ECMA-262 states that "ECMAScript source text is assumed to be a sequence
of 16-bit code units", and everything must be converted to UTF-16 first.
This stems from the fact that Java™, like Win32, used UCS-2, which only
later got extended to allow UTF-16. This means that SMP codepoints must
always be escaped into their twelve-character form.
This is also an interoperability issue: implementations may wish to parse
(or generate) JSON by using a wchar_t data type (in C) which, depending on
the platform, may be only 16 bits wide. ECMAScript allows for this.
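As an illustration of the "twelve-character form" mentioned in the notes, the following Python sketch escapes a single code point above U+FFFF as a UTF-16 surrogate pair written as two \uXXXX escapes. The function name escape_supplementary is hypothetical and chosen only for this example; the sketch assumes Python 3 and its standard json module. It also shows that the unescaped form, which the published grammar permits, decodes to the same string.

```python
import json


def escape_supplementary(ch: str) -> str:
    """Return the twelve-character JSON escape for one code point above U+FFFF.

    Hypothetical helper for illustration; not part of any RFC or library.
    """
    cp = ord(ch)
    if cp < 0x10000:
        raise ValueError("code point is inside the BMP; no surrogate pair needed")
    offset = cp - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return "\\u{:04x}\\u{:04x}".format(high, low)


clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, a code point outside the BMP

escaped = escape_supplementary(clef)             # twelve characters
as_escaped_json = '"' + escaped + '"'            # escaped JSON string
as_raw_json = json.dumps(clef, ensure_ascii=False)  # unescaped JSON string

# Both representations decode to the same single code point.
decoded_from_escaped = json.loads(as_escaped_json)
decoded_from_raw = json.loads(as_raw_json)
```

Under the published grammar a generator may emit either form. The change proposed in this report would have disallowed the unescaped one.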
--VERIFIER NOTES--
This is reporting a disagreement with a decision the working group made in the development of the document, and is not reporting an erratum.