RFC Errata


Found 8 records.

Status: Verified (3)

RFC 7159, "The JavaScript Object Notation (JSON) Data Interchange Format", March 2014

Note: This RFC has been obsoleted by RFC 8259

Source of RFC: json (app)

Errata ID: 3915
Status: Verified
Type: Technical
Publication Format(s) : TEXT

Reported By: Vasiliy Faronov
Date Reported: 2014-03-07
Verifier Name: Pete Resnick
Date Verified: 2014-05-16

Section 12 says:

   Since JSON's syntax is borrowed from JavaScript, it is possible to
   use that language's "eval()" function to parse JSON texts.

It should say:

   Since JSON's syntax is borrowed from JavaScript, it is possible to
   use that language's "eval()" function to parse most (but not all)
   JSON texts.

Notes:

This wording may be construed as meaning that every compliant JSON text is parseable as JavaScript, which is not the case: <http://timelessrepo.com/json-isnt-a-javascript-subset>. (Actually I would prefer this to be stated clearly elsewhere in the document, e.g. where it says "JSON's design goals were for it to be [...] a subset of JavaScript".)

--VERIFIER NOTES--
As per the above citation, there are characters (in particular line terminators like U+2028 and U+2029) which are permissible in JSON but not permissible in JavaScript. The corrected text makes this clearer.
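
A minimal Python sketch of the point made in the verifier notes: a JSON parser accepts U+2028 and U+2029 unescaped inside a string, so a producer whose output might be handed to JavaScript's "eval()" would have to escape them itself. The helper name `dumps_js_safe` is hypothetical and is not part of any library or of the RFC.

```python
import json

# Line terminators that are legal unescaped in JSON strings but, per the
# verifier notes above, not permissible in JavaScript source.
LINE_SEPARATORS = {"\u2028": "\\u2028", "\u2029": "\\u2029"}

def dumps_js_safe(value) -> str:
    """Serialize to JSON, then escape U+2028/U+2029 (hypothetical helper)."""
    text = json.dumps(value, ensure_ascii=False)
    # In valid JSON these characters can only occur inside strings,
    # so a plain textual replacement does not change the meaning.
    for raw, escaped in LINE_SEPARATORS.items():
        text = text.replace(raw, escaped)
    return text

raw_text = '"line\u2028break"'        # JSON text with a raw U+2028 in a string
parsed = json.loads(raw_text)         # accepted by a JSON parser
safe_text = dumps_js_safe(parsed)     # same value, line separator escaped
```

Both `raw_text` and `safe_text` decode to the same string; only the escaped form avoids the characters named in the verifier notes.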

Errata ID: 4264
Status: Verified
Type: Editorial
Publication Format(s) : TEXT

Reported By: Federico do Pino
Date Reported: 2015-02-05
Verifier Name: Barry Leiba
Date Verified: 2015-03-01

Section 15.2 says:

   [Err3607]  RFC Errata, Errata ID 3607, RFC 3607,
              <http://www.rfc-editor.org>.

   [Err607]   RFC Errata, Errata ID 607, RFC 607,
              <http://www.rfc-editor.org>.

It should say:

   [Err3607]  RFC Errata, Errata ID 3607, for RFC 4627,
              <http://www.rfc-editor.org>.

   [Err607]   RFC Errata, Errata ID 607, for RFC 4627,
              <http://www.rfc-editor.org>.

Notes:

The references point to RFCs by the same numbers as the errata IDs, while the intention was to refer to errata (by the same IDs) reported for the previous JSON RFC, namely RFC 4627. (RFCs 607 and 3607 are completely unrelated.)

The links may also be replaced with direct links to the errata pages, for instance http://www.rfc-editor.org/errata_search.php?eid=607 and http://www.rfc-editor.org/errata_search.php?eid=3607

Errata ID: 4336
Status: Verified
Type: Editorial
Publication Format(s) : TEXT

Reported By: Martin Pain
Date Reported: 2015-04-14
Verifier Name: Barry Leiba
Date Verified: 2015-04-14

Appendix A says:

[NO MENTION OF SECTION 3 OF RFC 4627]

It should say:

   o  Removed method of detection of character encoding from
      section 3 "Encoding" of RFC 4627.

Notes:

Appendix A (listing changes between RFC 4627 and RFC 7159) does not include any comment on the removal of this text from RFC 4627 section 3:

[START QUOTE]
Since the first two characters of a JSON text will always be ASCII
characters [RFC0020], it is possible to determine whether an octet
stream is UTF-8, UTF-16 (BE or LE), or UTF-32 (BE or LE) by looking
at the pattern of nulls in the first four octets.

00 00 00 xx UTF-32BE
00 xx 00 xx UTF-16BE
xx 00 00 00 UTF-32LE
xx 00 xx 00 UTF-16LE
xx xx xx xx UTF-8
[END QUOTE]


The new section 8.1 "Character encoding" states that:

[START QUOTE]
JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32
[END QUOTE]

but, unlike RFC 4627 section 3, it does not say anything about how to distinguish which has been used when parsing a byte string as JSON.


RFC 7159 section 8.1 also says:

[START QUOTE]
Implementations MUST NOT add a byte order mark to the beginning of a
JSON text.
[END QUOTE]

which rules out using a byte order mark for this purpose.


Additionally, RFC 7159 section 11 says:

[START QUOTE]
Note: No "charset" parameter is defined for this registration.
Adding one really has no effect on compliant recipients.
[END QUOTE]

which rules out one means of communicating which character encoding is in use when communicating JSON over HTTP (namely a charset parameter on the media type), and implies that there is another means of detecting the character encoding, but does not say what it is.


I've reported this as an erratum against the appendix because I expect there is an existing means of detecting which of the Unicode character encodings is in use, and I expected the appendix to reference it when explaining the removal of the text quoted above from RFC 4627 section 3. No such explanation is present. It may be that the erratum ought to be filed against section 8.1 instead, to provide a reference there.
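
The null-pattern table quoted above from RFC 4627 section 3 can be sketched in Python as follows. This is an illustration of the removed heuristic only, not a mechanism defined by RFC 7159. The function name `detect_json_encoding` is hypothetical, and falling back to UTF-8 for inputs shorter than four octets is an assumption made here, since the quoted text does not cover that case.

```python
def detect_json_encoding(data: bytes) -> str:
    """Guess the encoding from the nulls in the first four octets."""
    if len(data) < 4:
        return "utf-8"  # assumption: the quoted table needs four octets
    nulls = tuple(octet == 0 for octet in data[:4])
    patterns = {
        (True, True, True, False): "utf-32-be",   # 00 00 00 xx
        (True, False, True, False): "utf-16-be",  # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",   # xx 00 00 00
        (False, True, False, True): "utf-16-le",  # xx 00 xx 00
    }
    # Any other pattern (including xx xx xx xx) is treated as UTF-8.
    return patterns.get(nulls, "utf-8")

sample = '{"a": 1}'
guesses = {
    name: detect_json_encoding(sample.encode(name))
    for name in ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")
}
```

The heuristic relies on the first two characters of the text being ASCII, as the quoted passage states, and on no byte order mark being present.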

Status: Held for Document Update (1)

Errata ID: 4388
Status: Held for Document Update
Type: Editorial
Publication Format(s) : TEXT

Reported By: Kerry Lynn
Date Reported: 2015-06-09
Held for Document Update by: Barry Leiba
Date Held: 2015-06-09

Section 1.2 says:

   This document updates [RFC4627], which describes JSON and registers
   the media type "application/json".


It should say:

   This document replaces [RFC4627], which previously described JSON and
   registered the media type "application/json".


Notes:

The link at https://tools.ietf.org/html/rfc7159 says it obsoletes RFC 4627. I believe this is the intent and result of RFC 7159, and that RFC 4627 need not be consulted except for historical reasons going forward.

Status: Rejected (4)

Errata ID: 3983
Status: Rejected
Type: Technical
Publication Format(s) : TEXT

Reported By: Alain BENEDETTI
Date Reported: 2014-05-09
Rejected by: Pete Resnick
Date Rejected: 2014-05-16

Section 2 says:

JSON-text = ws value ws

It should say:

JSON-text = [BOM] ws value ws

BOM = %xFEFF

Notes:

Section 8.1 states:
[START QUOTE]
(...) implementations that parse JSON texts *MAY* ignore the presence of a byte order mark rather than treating it as an error.
[END QUOTE]

Indeed that means that a BOM *CAN* occur, and *MAY* be accepted instead of returning an error. So, if the BOM can be accepted, the grammar is incomplete as it does not show it.

Furthermore, about BOMs, see my other comments.
--VERIFIER NOTES--
The WG was clear that syntactically, the BOM is not part of a valid JSON-text, but implementation advice included that if one appears, it MAY be ignored. The document is correct as-is.
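
The implementation advice mentioned in the verifier notes (a parser MAY ignore a leading byte order mark) could look like the following Python sketch. The wrapper name `loads_ignoring_bom` is hypothetical. It operates on already-decoded text, where the mark appears as the single character U+FEFF.

```python
import json

BOM = "\ufeff"

def loads_ignoring_bom(text: str):
    """Parse JSON text, skipping one leading U+FEFF if present."""
    if text.startswith(BOM):
        # The BOM is not part of the JSON-text grammar; it is dropped
        # before parsing rather than treated as an error.
        text = text[len(BOM):]
    return json.loads(text)

with_bom = loads_ignoring_bom(BOM + '{"a": 1}')
without_bom = loads_ignoring_bom('{"a": 1}')
```

This keeps the grammar unchanged, as in the rejected erratum's resolution: tolerance of the mark lives in the caller, not in the definition of JSON-text.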

Errata ID: 3984
Status: Rejected
Type: Technical
Publication Format(s) : TEXT

Reported By: Alain BENEDETTI
Date Reported: 2014-05-09
Rejected by: Pete Resnick
Date Rejected: 2014-05-16

Section 7 says:

unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

It should say:

unescaped = %x20-21 / %x23-5B / %x5D-D7FF / %xE000-10FFFF

Notes:

Section 1 says:
[START QUOTE]
A string is a sequence of zero or more Unicode characters [UNICODE]. Note that this citation references the latest version of Unicode rather than a specific release.
[END QUOTE]

Section 8.2 draws attention to the characters not allowed by the Unicode standard:
[START QUOTE]
However, the ABNF in this specification allows member names and string values to contain bit sequences that cannot encode Unicode characters; for example, "\uDEAD" (a single unpaired UTF-16 surrogate).
[END QUOTE]

The ABNF cannot allow non-conformant Unicode code points (section 7) while the document at the same time states conformance to Unicode (section 1).

Therefore, there is an inconsistency between section 1 (and section 8.2, which emphasizes it) and section 7 that must be fixed. Hence the proposed modification to section 7.

The other, less preferable, way to fix this inconsistency in RFC 7159 would have been to NOT require Unicode conformance.
--VERIFIER NOTES--
The WG's consensus was to leave the full range present in the ABNF and add the interoperability guidance about values outside the Unicode accepted range.
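
A short Python sketch of the interoperability concern discussed in this record: the escape "\uDEAD" is grammatical, yet the resulting value is an unpaired surrogate that cannot be encoded as UTF-8. The helper names are hypothetical. The behavior shown is that of Python's standard `json` module, which accepts the lone surrogate; other parsers may behave differently.

```python
import json

def has_lone_surrogate(text: str) -> bool:
    """True if the decoded string contains a code point in D800-DFFF."""
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in text)

def encodes_as_utf8(text: str) -> bool:
    """True if the string can be encoded as UTF-8."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

lone = json.loads('"\\uDEAD"')          # a single unpaired surrogate
pair = json.loads('"\\uD83D\\uDE00"')   # a valid surrogate pair (U+1F600)
```

A receiver that wants to follow the interoperability guidance could reject or replace values for which `has_lone_surrogate` is true.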

Errata ID: 4220
Status: Rejected
Type: Technical
Publication Format(s) : TEXT

Reported By: mirabilos
Date Reported: 2015-01-05
Rejected by: Barry Leiba
Date Rejected: 2015-03-01

Section 7 says:

   […] All Unicode characters may be placed within the
   quotation marks, except for the characters that must be escaped:
   quotation mark, reverse solidus, and the control characters (U+0000
   through U+001F).

[…]

      unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

It should say:

   […] All Unicode characters may be placed within the
   quotation marks, except for the characters that must be escaped:
   quotation mark, reverse solidus, the control characters (U+0000
   through U+001F), and codepoints beyond the BMP (U-00010000
   through U-0010FFFF).

[…]

      unescaped = %x20-21 / %x23-5B / %x5D-FFFF

Notes:

ECMA-262 states that "ECMAScript source text is assumed to be a sequence
of 16-bit code units", and everything must be converted to UTF-16 first.
This stems from the fact that Java™, like Win32, used UCS-2, which only
later got extended to allow UTF-16. This means that SMP codepoints must
always be escaped into their twelve-character form.

This is also an interoperability issue: implementations may wish to parse
(or generate) JSON by using a wchar_t data type (in C) which, depending on
the platform, may be only 16 bits wide. ECMAScript allows for this.
--VERIFIER NOTES--
This is reporting a disagreement with a decision the working group made in the development of the document, and is not reporting an erratum.
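
For illustration, the "twelve-character form" the reporter refers to is two six-character \u escapes forming a UTF-16 surrogate pair. The Python sketch below shows that the escaped and unescaped forms of a code point beyond the BMP decode to the same value, which is consistent with RFC 7159 permitting both. The variable names are illustrative only.

```python
import json

smp = "\U0001F600"  # a code point outside the Basic Multilingual Plane

# With ensure_ascii=True (the default), Python emits a surrogate pair
# as two six-character \u escapes.
escaped = json.dumps(smp)

# With ensure_ascii=False, the character is written unescaped.
unescaped = json.dumps(smp, ensure_ascii=False)

escape_length = len(escaped) - 2  # subtract the two quotation marks
```

A producer targeting consumers limited to 16-bit code units can choose the escaped form without any change to the grammar.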

Errata ID: 4298
Status: Rejected
Type: Technical
Publication Format(s) : TEXT

Reported By: Michael Kay
Date Reported: 2015-03-11
Rejected by: Barry Leiba
Date Rejected: 2015-03-11

Section 7 Strings says:

4HEXDIG

It should say:

Something along the lines of:

4hexdig
hexdig = 0..9 A..F a..f

Notes:

The prose states that within a six-character escape sequence starting \u, the hex digits can be either upper or lower case. But the grammar only allows upper case, because it uses the symbol HEXDIG which is defined in RFC 5234 to include upper case A..F only. There is thus an inconsistency between the grammar and the prose, and although most readers are likely to know which to believe, the spec is formally ambiguous on this point.
--VERIFIER NOTES--
The reporter doesn't understand RFC 5234. Look in Section 2.3 (Terminal Values), and see this:

> NOTE:
>
> ABNF strings are case insensitive and the character set for these
> strings is US-ASCII.

There is no inconsistency, and the text as written allows both upper and lower case for the hex digits.
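
The conclusion above can be checked against a real parser. The Python sketch below decodes the same escape written with upper-case, lower-case, and mixed-case hex digits; it demonstrates the behavior of Python's standard `json` module and is not a proof about the ABNF itself.

```python
import json

upper = json.loads('"\\u00E9"')   # upper-case hex digits
lower = json.loads('"\\u00e9"')   # lower-case hex digits

# Mixed case also works, including within a surrogate pair.
mixed = json.loads('"\\u00e9\\uD83d\\uDe00"')
```

All three forms are accepted, and `upper` and `lower` decode to the same one-character string.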
