RFC Errata
RFC 8610, "Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures", June 2019
Source of RFC: cbor (art)
Errata ID: 6527
Status: Held for Document Update
Type: Technical
Publication Format(s) : TEXT
Reported By: Sean Bartell
Date Reported: 2021-04-11
Held for Document Update by: Francesca Palombini
Date Held: 2021-04-23
Section B says:
text = %x22 *SCHAR %x22 SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC SESC = "\" (%x20-7E / %x80-10FFFD)
It should say:
SESC = "\" ( %x22 / "/" / "\" / ; \" \/ \\ %x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t (%x75 hexchar) ) ; \u hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate) non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) / ("D" %x30-37 2HEXDIG ) high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
Notes:
As discussed during the CBOR interim 2021-04-21 and in the mailing list: https://mailarchive.ietf.org/arch/msg/cbor/ljHAiw-WhNqoIKAzkjZa4WIf56Q/
--
The ABNF used in [RFC8610] for the content of text string literals is rather permissive:
text = %x22 *SCHAR %x22
SCHAR = %x20-21 / %x23-5B / %x5D-7E / %x80-10FFFD / SESC
SESC = "\" (%x20-7E / %x80-10FFFD)
This allows almost any non-C0 character to be escaped by a backslash, but critically misses out on the \uXXXX and \uHHHH\uLLLL forms that JSON allows to specify characters in hex. Both can be solved by updating the SESC production to:
SESC = "\" ( %x22 / "/" / "\" / ; \" \/ \\
%x62 / %x66 / %x6E / %x72 / %x74 / ; \b \f \n \r \t
(%x75 hexchar) ) ; \u
hexchar = non-surrogate / (high-surrogate "\" %x75 low-surrogate)
non-surrogate = ((DIGIT / "A"/"B"/"C" / "E"/"F") 3HEXDIG) /
("D" %x30-37 2HEXDIG )
high-surrogate = "D" ("8"/"9"/"A"/"B") 2HEXDIG
low-surrogate = "D" ("C"/"D"/"E"/"F") 2HEXDIG
Now that SESC is more restrictively formulated, this also requires an update to the BCHAR production used in the ABNF syntax for byte string literals:
bytes = [bsqual] %x27 *BCHAR %x27
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / CRLF
bsqual = "h" / "b64"
The updated version explicit allows \', which is no longer allowed in the updated SESC:
BCHAR = %x20-26 / %x28-5B / %x5D-10FFFD / SESC / "\'" / CRLF