RFC Errata


Errata Search

 
Source of RFC  
Summary Table Full Records

RFC 8610, "Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures", June 2019

Source of RFC: cbor (art)

Errata ID: 6527
Status: Held for Document Update
Type: Technical
Publication Format(s) : TEXT

Reported By: Sean Bartell
Date Reported: 2021-04-11
Held for Document Update by: Francesca Palombini
Date Held: 2021-04-23

Section B says:

text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD)

It should say:

SESC = "\" ( %x22 / "/" / "\" /                 ; \" \/ \\
             %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
             (%x75 hexchar) )                   ; \u
hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
               ("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG

Notes:

As discussed during the CBOR interim 2021-04-21 and in the mailing list: https://mailarchive.ietf.org/arch/msg/cbor/ljHAiw-WhNqoIKAzkjZa4WIf56Q/
--
The ABNF used in [RFC8610] for the content of text string literals is rather permissive:

text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD)

This allows almost any non-C0 character to be escaped by a backslash, but critically misses out on the \uXXXX and \uHHHH\uLLLL forms that JSON allows to specify characters in hex. Both can be solved by updating the SESC production to:

SESC = "\" ( %x22 / "/" / "\" / ; \" \/ \\
%x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
(%x75 hexchar) ) ; \u
hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG

Now that SESC is more restrictively formulated, this also requires an update to the BCHAR production used in the ABNF syntax for byte string literals:

bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
bsqual = "h" / "b64"
The updated version explicit allows \', which is no longer allowed in the updated SESC:

BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / "\'" / CRLF

Report New Errata



Advanced Search