RFC Errata


RFC 8259, "The JavaScript Object Notation (JSON) Data Interchange Format", December 2017

Source of RFC: jsonbis (art)

Errata ID: 7603
Status: Reported
Type: Technical
Publication Format(s): TEXT

Reported By: Guillaume Fortin-Debigaré
Date Reported: 2023-08-13

Section 1 says:

   A string is a sequence of zero or more Unicode characters [UNICODE].

It should say:

   A string is a sequence of zero or more Unicode code points [UNICODE].

Notes:

Surrogate code points are not Unicode characters, as explained here: https://www.unicode.org/glossary/#surrogate_character

However, a surrogate code point outside of a surrogate pair is allowed in JSON strings in both escaped and unescaped forms, according to the ABNF grammar in Section 7 and the warning in Section 8.2, despite a UTF-8 incompatibility for the unescaped form. In addition, the original text contradicts ECMA-404 Section 9, which states: "A string is a sequence of Unicode code points wrapped with quotation marks (U+0022). All code points may be placed within the quotation marks except for the code points that must be escaped: quotation mark (U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F."
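
The distinction between code points and characters can be observed in practice. The following is a minimal sketch, assuming CPython's standard `json` module, which accepts an escaped lone surrogate; other parsers may reject or replace it, as Section 8.2 warns. The helper `utf8_encodable` and the variable names are illustrative only and are not part of the erratum.

```python
import json

# Escaped lone high surrogate: permitted by the RFC 8259 Section 7 grammar,
# but a surrogate code point is not a Unicode character.
lone = json.loads('"\\ud800"')

# A well-formed surrogate pair, for contrast: it decodes to the single
# code point U+1F600.
pair = json.loads('"\\ud83d\\ude00"')


def utf8_encodable(s: str) -> bool:
    """Return True if s can be encoded as UTF-8 (hypothetical helper)."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # Raised when s contains a surrogate code point.
        return False


print(len(lone), hex(ord(lone)))
print(len(pair), hex(ord(pair)))
print(utf8_encodable(lone), utf8_encodable(pair))
```

Under these assumptions, the lone surrogate parses into a one-element string that cannot be written out as UTF-8, which is the incompatibility the note describes for the unescaped form.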
