RFC Errata

Found 12 records.

Status: Verified (2)

RFC 5234, "Augmented BNF for Syntax Specifications: ABNF", January 2008

Note: This RFC has been updated by RFC 7405

Source of RFC: IETF - NON WORKING GROUP
Area Assignment: app

Errata ID: 2968
Status: Verified
Type: Technical
Publication Format(s) : TEXT
Reported By: Daniel van Vugt
Date Reported: 2011-09-12
Verifier Name: Pete Resnick
Date Verified: 2013-01-28

Section 4 says:

elements       =  alternation *c-wsp

It should say:

elements       =  alternation *WSP

Notes:

The grammar in section 4 of RFC 5234 is ambiguous. This was discovered by my own parsing code when trying to parse the ABNF grammar with itself. The ambiguity can be seen in a simplified form using the following 10 characters of input:

Input:  X = Y \r \n <SP> ; Z \r \n
Offset: 0 1 2 3  4  5    6 7 8  9

My parser finds these two (ambiguous) solutions...

SOLUTION 1:

rulelist @ 0 len 10
rule @ 0 len 10
rulename @ 0 len 1 "X"
ALPHA @ 0 len 1
star_c_wsp @ 1 len 0
defined_as @ 1 len 1
star_c_wsp @ 2 len 0
elements @ 2 len 4
alternation @ 2 len 1
concatenation @ 2 len 1
repetition @ 2 len 1
element @ 2 len 1
rulename @ 2 len 1 "Y"
ALPHA @ 2 len 1
star_c_wsp @ 3 len 3
c_wsp @ 3 len 3
c_nl @ 3 len 2
CRLF @ 3 len 2
CR @ 3 len 1
LF @ 4 len 1
WSP @ 5 len 1
SP @ 5 len 1
c_nl @ 6 len 4
comment @ 6 len 4 ";Z\r\n"
WSP_or_VCHAR @ 7 len 1
VCHAR @ 7 len 1
CRLF @ 8 len 2
CR @ 8 len 1
LF @ 9 len 1

SOLUTION 2:

rulelist @ 0 len 10
rule @ 0 len 5
rulename @ 0 len 1 "X"
ALPHA @ 0 len 1
star_c_wsp @ 1 len 0
defined_as @ 1 len 1
star_c_wsp @ 2 len 0
elements @ 2 len 1
alternation @ 2 len 1
concatenation @ 2 len 1
repetition @ 2 len 1
element @ 2 len 1
rulename @ 2 len 1 "Y"
ALPHA @ 2 len 1
star_c_wsp @ 3 len 0
c_nl @ 3 len 2
CRLF @ 3 len 2
CR @ 3 len 1
LF @ 4 len 1
star_c_wsp @ 5 len 1
c_wsp @ 5 len 1
WSP @ 5 len 1
SP @ 5 len 1
c_nl @ 6 len 4
comment @ 6 len 4 ";Z\r\n"
WSP_or_VCHAR @ 7 len 1
VCHAR @ 7 len 1
CRLF @ 8 len 2
CR @ 8 len 1
LF @ 9 len 1

The solution to this ambiguity is to change:
elements = alternation *c-wsp
to:
elements = alternation *WSP

--VERIFIER NOTES--

The current document is clearly incorrect. However, though the solution appears correct, it has not been tested.

Errata ID: 3076
Status: Verified
Type: Technical
Publication Format(s) : TEXT
Reported By: Daniel van Vugt
Date Reported: 2012-01-04
Verifier Name: Pete Resnick
Date Verified: 2013-01-28

Section 4 says:

 rulelist       =  1*( rule / (*c-wsp c-nl) )

It should say:

 rulelist       =  1*( rule / (*WSP c-nl) )

Notes:

This erratum is very similar to erratum 2968, but distinct.

The grammar in section 4 is ambiguous. This ambiguity is revealed using 7 characters of input:
';' <CR> <LF> <SP> ';' <CR> <LF>

which produces 2 different matches (please forgive my program output):

rulelist @ 0 len 7
rulelist1 @ 0 len 3
star_c_wsp @ 0 len 0
c_nl @ 0 len 3
comment @ 0 len 3 ";\r\n"
CRLF @ 1 len 2
CR @ 1 len 1
LF @ 2 len 1
rulelist1 @ 3 len 4
star_c_wsp @ 3 len 1
c_wsp @ 3 len 1
WSP @ 3 len 1
SP @ 3 len 1
c_nl @ 4 len 3
comment @ 4 len 3 ";\r\n"
CRLF @ 5 len 2
CR @ 5 len 1
LF @ 6 len 1

-----------

rulelist @ 0 len 7
rulelist1 @ 0 len 7
star_c_wsp @ 0 len 4
c_wsp @ 0 len 4
c_nl @ 0 len 3
comment @ 0 len 3 ";\r\n"
CRLF @ 1 len 2
CR @ 1 len 1
LF @ 2 len 1
WSP @ 3 len 1
SP @ 3 len 1
c_nl @ 4 len 3
comment @ 4 len 3 ";\r\n"
CRLF @ 5 len 2
CR @ 5 len 1
LF @ 6 len 1

-----------

A solution to this ambiguity, which I have verified works, is:
rulelist = 1*( rule / (*WSP c-nl) )

This prevents the c-nl inside c-wsp from getting confused with the c-nl in rulelist.

--VERIFIER NOTES--

The current document is clearly incorrect. However, though the solution appears correct, it has not been tested.
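The ambiguity reported in erratum 3076 can be reproduced with a small derivation counter. The following is only a sketch under stated assumptions, not the reporter's parser: the function name count_rulelist_parses and its "leading" parameter are hypothetical, and the <rule> alternative of <rulelist> is omitted because the 7-character test input contains no rules.

```python
def count_rulelist_parses(text: str, leading: str) -> int:
    """Count derivations of `text` from 1*( *LEAD c-nl ).

    `leading` is "c-wsp" (RFC 5234 as published) or "WSP" (proposed fix).
    Each helper maps a start offset to {end offset: number of derivations}.
    """
    n = len(text)

    def wsp(i):
        return {i + 1: 1} if i < n and text[i] in " \t" else {}

    def crlf(i):
        return {i + 2: 1} if text.startswith("\r\n", i) else {}

    def comment(i):
        # comment = ";" *(WSP / VCHAR) CRLF. The body cannot contain CR or LF,
        # so a greedy scan finds its only possible end.
        if i >= n or text[i] != ";":
            return {}
        j = i + 1
        while j < n and (text[j] in " \t" or "\x21" <= text[j] <= "\x7e"):
            j += 1
        return crlf(j)

    def c_nl(i):
        # c-nl = comment / CRLF; the two alternatives start differently.
        return comment(i) or crlf(i)

    def c_wsp(i):
        # c-wsp = WSP / (c-nl WSP)
        ends = dict(wsp(i))
        for e, w in c_nl(i).items():
            for e2, w2 in wsp(e).items():
                ends[e2] = ends.get(e2, 0) + w * w2
        return ends

    def star(item, i):
        # Zero or more repetitions of `item`, counting every derivation.
        ends = {i: 1}
        frontier = {i: 1}
        while frontier:
            nxt = {}
            for pos, ways in frontier.items():
                for e, w in item(pos).items():
                    nxt[e] = nxt.get(e, 0) + ways * w
            for e, w in nxt.items():
                ends[e] = ends.get(e, 0) + w
            frontier = nxt
        return ends

    def blank(i):
        # One (*LEAD c-nl) group.
        lead = c_wsp if leading == "c-wsp" else wsp
        ends = {}
        for pos, ways in star(lead, i).items():
            for e, w in c_nl(pos).items():
                ends[e] = ends.get(e, 0) + ways * w
        return ends

    # 1*( ... ) is one group followed by zero or more groups.
    total = 0
    for e, w in blank(0).items():
        total += w * star(blank, e).get(n, 0)
    return total


SAMPLE = ";\r\n ;\r\n"  # ';' <CR> <LF> <SP> ';' <CR> <LF>
print(count_rulelist_parses(SAMPLE, "c-wsp"))  # published grammar
print(count_rulelist_parses(SAMPLE, "WSP"))    # proposed correction
```

Under these assumptions the published rule yields two derivations for the sample input and the corrected rule yields one, which is consistent with the two parse trees listed in the report.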

Status: Held for Document Update (4)

RFC 5234, "Augmented BNF for Syntax Specifications: ABNF", January 2008

Note: This RFC has been updated by RFC 7405

Source of RFC: IETF - NON WORKING GROUP
Area Assignment: app

Errata ID: 2820
Status: Held for Document Update
Type: Editorial
Publication Format(s) : TEXT
Reported By: Spiros Bousbouras
Date Reported: 2011-06-03
Held for Document Update by: Pete Resnick
Date Held: 2011-06-11

Section 4 says:

         prose-val      =  "<" *(%x20-3D / %x3F-7E) ">"
                                ; bracketed string of SP and VCHAR
                                ;  without angles

It should say:

         prose-val      =  "<" *(%x20-3D / %x3F-7E) ">"
                                ; bracketed string of SP and VCHAR
                                ;  without ">"

Notes:

"without angles" suggests that ">" and "<"
must not appear but the range of codes given
suggests that only ">" must not appear.
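
The point of erratum 2820 can be checked by transcribing the <prose-val> rule into a regular expression. This is an illustrative sketch; the names PROSE_VAL and is_prose_val are hypothetical and not part of any ABNF tool.

```python
import re

# prose-val = "<" *(%x20-3D / %x3F-7E) ">"
# The character ranges skip only %x3E (">"); %x3C ("<") is inside %x20-3D.
PROSE_VAL = re.compile(r"<[\x20-\x3D\x3F-\x7E]*>")


def is_prose_val(text: str) -> bool:
    """Return True if the whole string matches the prose-val rule."""
    return PROSE_VAL.fullmatch(text) is not None


print(is_prose_val("<a < b>"))  # "<" inside the brackets is accepted
print(is_prose_val("<a > b>"))  # ">" inside the brackets is not
```

So the rule as written excludes only ">" from the bracketed text, which is what the proposed comment wording says.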

Errata ID: 2914
Status: Held for Document Update
Type: Editorial
Publication Format(s) : TEXT
Reported By: Rudolf Dovicin
Date Reported: 2011-08-03
Held for Document Update by: Pete Resnick
Date Held: 2011-08-04

Throughout the document, when it says:

         alternation    =  concatenation
                           *(*c-wsp "/" *c-wsp concatenation)

         concatenation  =  repetition *(1*c-wsp repetition)

         repetition     =  [repeat] element

         repeat         =  1*DIGIT / (*DIGIT "*" *DIGIT)

         element        =  rulename / group / option /
                           char-val / num-val / prose-val

         group          =  "(" *c-wsp alternation *c-wsp ")"

         option         =  "[" *c-wsp alternation *c-wsp "]"

Notes:

Section 4. (ABNF Definition of ABNF) contains at least 2 recursions.
Recursions are not explicitly mentioned in the document, which may be confusing.

Errata ID: 6172
Status: Held for Document Update
Type: Editorial
Publication Format(s) : TEXT
Reported By: Glyn Normington
Date Reported: 2020-05-14
Held for Document Update by: Barry Leiba
Date Held: 2020-05-14

Section Appendix B says:

Note that these rules are only valid for ABNF encoded in 7-bit ASCII or in characters sets that are a superset of 7-bit ASCII.

It should say:

Note that these rules are only valid for ABNF encoded in 7-bit ASCII or in character sets that are a superset of 7-bit ASCII.

Notes:

Typo.

Errata ID: 6173
Status: Held for Document Update
Type: Editorial
Publication Format(s) : TEXT
Reported By: Glyn Normington
Date Reported: 2020-05-14
Held for Document Update by: Barry Leiba
Date Held: 2020-05-14

Section 3.6 says:

The operator "*" preceding an element indicates repetition. The full form is:

 <a>*<b>element

where <a> and <b> are optional decimal values, indicating at least <a> and at most <b> occurrences of the element.

Default values are 0 and infinity so that *<element> allows any number, including zero; 1*<element> requires at least one; 3*3<element> allows exactly 3; and 1*2<element> allows one or two.

It should say:

The operator "*" preceding an element indicates repetition. The full form is:

 <a>*<b>element

where <a> and <b> are optional decimal values, indicating at least <a> and at most <b> occurrences of the element.

The default value of <a> is 0. If <b> is omitted, there is no upper limit to the number of occurrences of the element. Consequently *<element> allows any number, including zero; 1*<element> requires at least one; 3*3<element> allows exactly 3; and 1*2<element> allows one or two.

Notes:

infinity is not a decimal value
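
One way to see why "infinity" is awkward as a default is to model the bounds in code, where an omitted <b> is naturally "no upper limit" rather than a number. The sketch below is an assumption-laden illustration: repeat_bounds is a hypothetical helper, and a bare <n> prefix is treated as <n>*<n>, following the specific-repetition form in RFC 5234 Section 3.7.

```python
import re
from typing import Optional, Tuple

# repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
_REPEAT = re.compile(r"(\d+)|(\d*)\*(\d*)")


def repeat_bounds(prefix: str) -> Tuple[int, Optional[int]]:
    """Return (minimum, maximum) occurrences; maximum None means unbounded."""
    m = _REPEAT.fullmatch(prefix)
    if m is None:
        raise ValueError(f"not a valid repeat prefix: {prefix!r}")
    if m.group(1) is not None:
        n = int(m.group(1))  # bare count: exactly n occurrences
        return (n, n)
    lower = int(m.group(2)) if m.group(2) else 0      # <a> defaults to 0
    upper = int(m.group(3)) if m.group(3) else None   # omitted <b>: no limit
    return (lower, upper)


for prefix in ("*", "1*", "3*3", "1*2"):
    print(prefix, repeat_bounds(prefix))
```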

Status: Rejected (6)

RFC 5234, "Augmented BNF for Syntax Specifications: ABNF", January 2008

Note: This RFC has been updated by RFC 7405

Source of RFC: IETF - NON WORKING GROUP
Area Assignment: app

Errata ID: 1423
Status: Rejected
Type: Technical
Publication Format(s) : TEXT
Reported By: David J. Rutkin
Date Reported: 2008-05-13
Rejected by: Alexey Melnikov
Date Rejected: 2010-09-03

Section 4. says:

repeat         =  1*DIGIT / (*DIGIT "*" *DIGIT)

It should say:

repeat         =  *DIGIT ["*" *DIGIT]

Notes:

In section 4. ABNF Definition of ABNF, on Page 10, the definition of <repeat> appears to be ambiguous.

After many weeks of study of RFC 5234 I am unable to discern which alternation to choose for <repeat> for the following string.

21*ARULENAME

Both alternatives are a valid solution but I was not able to determine from the RFC which one should be chosen. By my way of thinking, the first one encountered should be chosen but when done, the "*" is left and will cause a parsing error. Conversely I could have performed a look ahead check in my parser, but that would produce a less efficient parser, forcing all alternatives to always be processed. In either case, the rule is ambiguous and, in my opinion, requires further definition.

My recommendation would be to change it to

repeat = *DIGIT ["*" *DIGIT]

This solution does not introduce any ambiguity and does not break any components of the definition of ABNF.

I realize that the forced presence of a digit on the <repeat> without the * is no longer present, but it was not necessary since the only use of <repeat> is by <repetition> and its presence is optional.

--VERIFIER NOTES--

The following is the text of an analysis of Erratum entry <http://www.rfc-editor.org/errata_search.php?rfc=5234&eid=1423>, made by Paul Overell:

> Section: 4.
>
> Original Text
> -------------
> repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
>

I can see nothing wrong with this, there is no ambiguity. To be ambiguous it would have to be able to generate a particular string in more than one way.

A <repeat> without a "*" can only be generated by the first alternative. A <repeat> with a "*" can only be generated by the second alternative.

> Corrected Text
> --------------
> repeat = *DIGIT ["*" *DIGIT]

This changes the syntax of <repeat> as it allows the empty string.

> Notes
> -----
> In section 4. ABNF Definition of ABNF, on Page 10, the definition of <repeat> appears to be ambiguous.
>
> After many weeks of study of RFC 5234 I am unable to discern which alternation to choose for <repeat> for the following string.
>
> 21*ARULENAME

This clearly matches the second alternative (*DIGIT "*" *DIGIT) and therefore is an instance of a <repeat>. This string can only be generated using the second alternative.

> Both alternatives are a valid solution but I was not able to determine from the RFC which one should be chosen. By my way of thinking, the first one encountered should be chosen but when done, the "*" is left and will cause a parsing error.

The first cannot be chosen precisely because the "*" is left and causes a parsing error.

> Conversely I could have performed a look ahead check in my parser, but that would produce a less efficient parser, forcing all alternatives to always be processed.

The grammar given for ABNF is not, and does not claim to be, LL(0). I expect the ABNF grammar could be recast as LL(0) but there is no need, it was written for clarity.

I can't see the relevance of parser efficiency here.

> In either case, the rule is ambiguous and, in my opinion, requires further definition.

As discussed above, there is no ambiguity.

> My recommendation would be to change it to

> repeat = *DIGIT ["*" *DIGIT]
>
> This solution does not introduce any ambiguity and does not break any components of the definition of ABNF.

It is not equivalent to the original in that it allows the empty string.

> I realize that the forced presence of a digit on the <repeat> without the * is no longer present, but it was not necessary since the only use of <repeat> is by <repetition> and its presence is optional.

The proposed change to <repeat> breaks the definition of <repetition> as it then becomes ambiguous.

Consider the string "foo". With the proposed change to <repeat> it can be parsed as a <repetition> in two ways: <repeat> <element> or as <element>, the first with the option taken, the second with the option not taken. Two parse trees for the same string. To fix this ambiguity the definition of <repetition> would have to be changed to

repetition = repeat element

I recommend that this erratum be rejected.

There is no ambiguity to fix, the proposed change is unnecessary, and would require an additional change to the definition of <repetition>.
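
The analysis above can be illustrated by transcribing both versions of <repeat> into regular expressions. This is a sketch for illustration only; the pattern names and the helper alternatives_matching are hypothetical.

```python
import re

# Published rule: repeat = 1*DIGIT / (*DIGIT "*" *DIGIT)
FIRST_ALTERNATIVE = re.compile(r"\d+")
SECOND_ALTERNATIVE = re.compile(r"\d*\*\d*")

# Rejected proposal: repeat = *DIGIT ["*" *DIGIT]
PROPOSED = re.compile(r"\d*(?:\*\d*)?")


def alternatives_matching(prefix: str) -> list:
    """List which alternatives of the published rule generate `prefix`."""
    found = []
    if FIRST_ALTERNATIVE.fullmatch(prefix):
        found.append("first")
    if SECOND_ALTERNATIVE.fullmatch(prefix):
        found.append("second")
    return found


print(alternatives_matching("21*"))  # prefix of 21*ARULENAME
print(alternatives_matching("21"))
print(alternatives_matching(""))
print(PROPOSED.fullmatch("") is not None)
```

No string is generated by both alternatives of the published rule, since only the second can contain "*". The proposed rule additionally accepts the empty string, which is what makes [repeat] element ambiguous.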

Errata ID: 3096
Status: Rejected
Type: Technical
Publication Format(s) : TEXT
Reported By: MURATA Yasuhisa
Date Reported: 2012-01-23
Rejected by: Peter Saint-Andre
Date Rejected: 2012-01-26

Section B.1 says:

LWSP           =  *(WSP / CRLF WSP)

It should say:

LWSP           =  1*(WSP / CRLF WSP)

Notes:

RFC 822 said:
linear-white-space = 1*([CRLF] LWSP-char)
--VERIFIER NOTES--
Paul Overell notes the following (and Dave Crocker concurs):

###

The suggested change would give LWSP the same syntactic definition as RFC822's linear-white-space.

However, the successor to RFC822, RFC5322, doesn't use LWSP, it has its own definitions specifying header folding. Nor does RFC5234 itself use LWSP.

There are RFCs that use the existing definition, e.g. RFC6376, RFC5191, RFC5987. These would need to be fixed if we changed the definition of LWSP.

The existing definition of LWSP has been around since 1997, it is not wrong or unreasonable, just different from RFC822's linear-white-space.

###
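
The practical difference between the two definitions is whether the empty string counts as LWSP. A minimal sketch, with hypothetical pattern names, transcribing each rule into a regular expression:

```python
import re

# RFC 5234:  LWSP = *(WSP / CRLF WSP)
LWSP_CURRENT = re.compile(r"(?:[ \t]|\r\n[ \t])*")

# Suggested: LWSP = 1*(WSP / CRLF WSP)
LWSP_SUGGESTED = re.compile(r"(?:[ \t]|\r\n[ \t])+")


def matches(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string is generated by the rule."""
    return pattern.fullmatch(text) is not None


for sample in ("", " ", " \r\n\t"):
    print(repr(sample), matches(LWSP_CURRENT, sample), matches(LWSP_SUGGESTED, sample))
```

Only the empty string separates the two patterns, so a grammar that relies on LWSP being optional would change meaning under the suggested definition.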

Errata ID: 4040
Status: Rejected
Type: Technical
Publication Format(s) : TEXT
Reported By: Chris Morgan
Date Reported: 2014-07-03
Rejected by: Barry Leiba
Date Rejected: 2014-07-04

Section B.1 says:

HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"

It should say:

HEXDIG         =  DIGIT
               / "A" / "B" / "C" / "D" / "E" / "F"
               / "a" / "b" / "c" / "d" / "e" / "f"

Notes:

Various RFCs are quoting HEXDIG under the currently incorrect understanding that it includes a-f as well as 0-9 and A-F, and I believe this is the place where it should be changed, rather than altering them to use a different rule.

Here are a couple of examples of the problem this causes that came quickly to hand:

- RFC 7230: section 1.2 cites "HEXDIG (hexadecimal 0-9/A-F/a-f)", section 4.1 proceeds to use HEXDIG in chunk-size, which in RFC 2616 was HEX which did indeed include a-f.

- RFC 3986: it replaces the hex of RFC 2396, which included a-f, with this HEXDIG of RFC 2234 (of which this document is the latest form). This is then used in pct-encoded (percent-encoding in URLs), IPvFuture and h16 (IPv6 addresses), all of which can reasonably be expected to permit lowercase a-f as well as uppercase.
--VERIFIER NOTES--
In Section 2.3, RFC 5234 explicitly says this:

NOTE:
ABNF strings are case insensitive and the character set for these
strings is US-ASCII.

So the definition of HEXDIG already allows for both upper and lower case (or a mixture).

It's true that some people aren't aware of that, and write their documents without understanding it. But it is not an error in 5234.
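
The verifier's point can be sketched in code: if quoted literals are compared case-insensitively, as Section 2.3 requires, the published HEXDIG rule already accepts lowercase letters. The helper names below are hypothetical, and this is an illustration rather than a reference implementation.

```python
def literal_matches(literal: str, text: str) -> bool:
    """Compare an ABNF quoted string to input, ignoring US-ASCII case."""
    return text.isascii() and literal.lower() == text.lower()


def is_hexdig(ch: str) -> bool:
    """HEXDIG = DIGIT / "A" / "B" / "C" / "D" / "E" / "F" """
    if len(ch) != 1:
        return False
    if "0" <= ch <= "9":  # DIGIT = %x30-39
        return True
    return any(literal_matches(lit, ch) for lit in "ABCDEF")


for ch in ("7", "F", "a", "g"):
    print(ch, is_hexdig(ch))
```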

Errata ID: 4564
Status: Rejected
Type: Technical
Publication Format(s) : TEXT
Reported By: iliwoy
Date Reported: 2015-12-14
Rejected by: Barry Leiba
Date Rejected: 2015-12-14

Section 3.8 says:

         [foo bar]

   is equivalent to

         *1(foo bar).

It should say:

         [foo bar]

   is equivalent to

         *1(foo/bar).

Notes:

--VERIFIER NOTES--
It seems that the reporter misunderstands what "[foo bar]"
means. It means that "foo bar" (the two tokens together, as a unit)
is optional. It is not suggesting an alternative of "foo" or "bar".
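
The distinction can be made concrete with regular expressions. In this sketch the rules foo and bar are assumed, purely for illustration, to match the literal strings "foo" and "bar"; the pattern names are hypothetical.

```python
import re

# [foo bar]    == *1(foo bar): the two-element sequence is optional as a unit.
OPTIONAL_SEQUENCE = re.compile(r"(?:foobar)?")

# *1(foo/bar): at most one of the two alternatives (the rejected reading).
OPTIONAL_ALTERNATIVE = re.compile(r"(?:foo|bar)?")


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole string is generated by the expression."""
    return pattern.fullmatch(text) is not None


for sample in ("", "foo", "bar", "foobar"):
    print(repr(sample), accepts(OPTIONAL_SEQUENCE, sample), accepts(OPTIONAL_ALTERNATIVE, sample))
```

The two expressions agree only on the empty string, so the suggested replacement text would describe a different language.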

Errata ID: 5110
Status: Rejected
Type: Technical
Publication Format(s) : TEXT
Reported By: Allan Hazarian
Date Reported: 2017-09-09
Rejected by: Barry Leiba
Date Rejected: 2020-05-14

Section B.1. says:

BIT            =  "0" / "1"

It should say:

BIT            =  %b0 / %b1

Notes:

The core rule BIT is written in the char-val rule syntax, which produces the ASCII chars %x30 ("0") and %x31 ("1"). For producing the 0 and 1 bit the bin-val rule should be used instead. Since the core rules use the hex-val rule extensively, using the bin-val rule shouldn't be a problem.
--VERIFIER NOTES--
No, the "BIT" construct is part of the "bin-val" construct, which gives "b0" or "b1", and "bin-val" is then part of "num-val", which is what produces "%b0" or "%b1". The ABNF is correct as written.
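
To illustrate the verifier's reading: the characters "0" and "1" matched by BIT are the digits that appear in the text of a bin-val, and the value they denote is computed from them. The sketch below uses a hypothetical helper, decode_bin_val, and handles only the single-value and dotted forms; the range form ("-") is deliberately left out.

```python
import re

# bin-val = "b" 1*BIT [ 1*("." 1*BIT) / ("-" 1*BIT) ], written after "%".
# Only the single-value and dotted (concatenated) forms are handled here.
_BIN_VAL = re.compile(r"%b([01]+(?:\.[01]+)*)", re.IGNORECASE)


def decode_bin_val(token: str) -> str:
    """Return the characters denoted by a %b terminal value."""
    m = _BIN_VAL.fullmatch(token)
    if m is None:
        raise ValueError(f"unsupported or invalid bin-val: {token!r}")
    # Each run of BIT characters is the binary spelling of one code point.
    return "".join(chr(int(bits, 2)) for bits in m.group(1).split("."))


print(decode_bin_val("%b110000"))          # code point 48
print(decode_bin_val("%b1100001.1100010")) # code points 97 and 98
```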

Errata ID: 4361
Status: Rejected
Type: Editorial
Publication Format(s) : TEXT
Reported By: Brian S. Julin
Date Reported: 2015-05-10
Rejected by: Barry Leiba
Date Rejected: 2015-05-12

Section 2.1 says:

Unlike original BNF, angle brackets ("<", ">") are not required.
However, angle brackets may be used around a rule name whenever their
presence facilitates in discerning the use of a rule name.  This is
typically restricted to rule name references in free-form prose

Notes:

Section 2.1 could be more explicit as to whether a parser, encountering a string that could be interpreted by humans as a rule name inside angles, should process it as a <rule-name> or not. The use of the words "typically restricted" allows for leeway in interpretation, and in light of section 4 note 1, implementors may be tempted to disregard the formal definition as being simply informative. Section 3.10 gives <prose-val> looser precedence than rule-name, adding weight to the idea that since there is an ambiguity to be resolved, angle brackets around rule names might be expected to appear in parser input.

It is also worth noting that the rule definition for <prose-val> prohibits the use of angle brackets around rule names within the contents of <prose-val> itself, and that the intended use of the <prose-val> rule is only described via comments.

Corrected text is not provided as the intent is unclear.
--VERIFIER NOTES--
Within an ABNF parsing context, the handling of rulenames with or
without angle-brackets is a small distinction. Parsers can and do
easily handle rulenames without the brackets; obviously they also can
handle the presence of the angle-brackets, although they generally
don't. Omitting brackets from ABNF intended for parsing was generally
received as an improvement over BNF, 40 years ago.
