RFC Errata
Found 3 records.
Status: Reported (3)
RFC 9309, "Robots Exclusion Protocol", September 2022
Source of RFC: IETF - NON WORKING GROUP
Area Assignment: art
Errata ID: 7124
Status: Reported
Type: Technical
Publication Format(s) : TEXT, PDF, HTML
Reported By: Samuel K. Lam
Date Reported: 2022-09-10
Section 5.2 says:
In the following case, /example/page/disallowed.gif MUST be used for the URI example.com/example/page/disallow.gif.
It should say:
In the following case, /example/page/disallowed.gif MUST be used for the URI example.com/example/page/disallowed.gif.
Notes:
The two file names in that sentence ("disallowed.gif" and "disallow.gif")
do not match: "ed" is missing from the second file name.
This error renders the example given in section 5.2 incorrect.
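The effect of the mismatch can be sketched with a simplified longest-match check. This is only an illustration: the rule set below is an assumption built from the file names in the erratum, not text quoted from the RFC, and the helper `longest_match` is hypothetical. It uses plain prefix matching and ignores wildcards, percent-encoding, and tie-breaking.

```python
def longest_match(rules, path):
    """Return the (verb, pattern) rule whose pattern is the longest prefix of path."""
    matching = [(verb, pattern) for verb, pattern in rules if path.startswith(pattern)]
    if not matching:
        return None
    return max(matching, key=lambda rule: len(rule[1]))


# Assumed rule set for illustration only.
RULES = [
    ("allow", "/example/page/"),
    ("disallow", "/example/page/disallowed.gif"),
]

# URI path as published (missing "ed") and as corrected by the erratum.
as_published = longest_match(RULES, "/example/page/disallow.gif")
as_corrected = longest_match(RULES, "/example/page/disallowed.gif")

print(as_published)  # ('allow', '/example/page/')
print(as_corrected)  # ('disallow', '/example/page/disallowed.gif')
```

Under these assumptions, the published path never reaches the longer rule, because "disallowed.gif" is not a prefix of "disallow.gif"; only the corrected path selects it.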
Errata ID: 7128
Status: Reported
Type: Technical
Publication Format(s) : TEXT
Reported By: Yoshiro Yoneya
Date Reported: 2022-09-13
Section 2.2.2 says:
For example:

+==================+=======================+=======================+
| Path             | Encoded Path          | Path to Match         |
+==================+=======================+=======================+
| /foo/bar?baz=quz | /foo/bar?baz=quz      | /foo/bar?baz=quz      |
+------------------+-----------------------+-----------------------+
| /foo/bar?baz=    | /foo/bar?baz=         | /foo/bar?baz=         |
| https://foo.bar  | https%3A%2F%2Ffoo.bar | https%3A%2F%2Ffoo.bar |
+------------------+-----------------------+-----------------------+
| /foo/bar/        | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| U+E38384         |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| bar/%E3%83%84    |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%62%61%7A    | /foo/bar/baz          |
| bar/%62%61%7A    |                       |                       |
+------------------+-----------------------+-----------------------+
It should say:
For example:

+==================+=======================+=======================+
| Path             | Encoded Path          | Path to Match         |
+==================+=======================+=======================+
| /foo/bar?baz=quz | /foo/bar?baz=quz      | /foo/bar?baz=quz      |
+------------------+-----------------------+-----------------------+
| /foo/bar?baz=    | /foo/bar?baz=         | /foo/bar?baz=         |
| https://foo.bar  | https%3A%2F%2Ffoo.bar | https%3A%2F%2Ffoo.bar |
+------------------+-----------------------+-----------------------+
| /foo/bar/        | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| U+30C4           |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| bar/%E3%83%84    |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%62%61%7A    | /foo/bar/baz          |
| bar/%62%61%7A    |                       |                       |
+------------------+-----------------------+-----------------------+
Notes:
The "Path" component of the third example appears to be intended as a Unicode code point rather than as UTF-8 encoded hexadecimal. If so, the correct code point for %E3%83%84 is U+30C4 (ツ).
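The code point claimed in this erratum can be checked with Python's standard library. The sketch below percent-decodes the sequence as UTF-8 and prints the resulting code point; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

encoded = "%E3%83%84"

# Percent-decode; unquote() interprets the bytes as UTF-8 by default.
char = unquote(encoded)
code_point = f"U+{ord(char):04X}"
print(char, code_point)  # ツ U+30C4

# Round trip: encoding the character gives back the same percent-encoded bytes.
round_trip = quote(char)

# The value 0xE38384 is the raw UTF-8 byte sequence read as one number.
# It exceeds the largest Unicode code point (U+10FFFF), so "U+E38384"
# cannot be a valid code point.
exceeds_unicode_range = int("E38384", 16) > 0x10FFFF

# The last table row: unreserved characters decode to plain ASCII.
decoded_ascii = unquote("/foo/bar/%62%61%7A")
print(decoded_ascii)  # /foo/bar/baz
```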
Errata ID: 7995
Status: Reported
Type: Technical
Publication Format(s) : TEXT
Reported By: Shawn Tice
Date Reported: 2024-06-18
Section 2.2 says:
path-pattern = "/" *UTF8-char-noctl ; valid URI path pattern
It should say:
path-pattern = ("/" / "*") *UTF8-char-noctl ; valid URI path pattern
Notes:
The `path-pattern` rule requires that `/` be the first character, but the Simple Example in section 5.1 has `Disallow: *.gif$`, where the path pattern starts with a `*`. The notes preceding the example explicitly say:
   The "*" character designates any character, including the otherwise-required forward slash; see Section 2.2.
This seems to indicate that either the formal syntax is wrong or the guidance in section 5.1 is wrong. I assume the formal syntax is wrong.
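The difference between the two rules can be sketched with regular expressions. This is an approximation, not a full ABNF implementation: it assumes `UTF8-char-noctl` means any character other than control characters, space, and "#", and the names `PUBLISHED`, `PROPOSED`, and `is_path_pattern` are hypothetical.

```python
import re

# Assumed approximation of UTF8-char-noctl: anything except
# U+0000-U+0020 (controls and space) and "#".
NOCTL = r"[^\x00-\x20#]"

# path-pattern = "/" *UTF8-char-noctl            (as published)
PUBLISHED = re.compile(rf"/{NOCTL}*")

# path-pattern = ("/" / "*") *UTF8-char-noctl    (as proposed in the erratum)
PROPOSED = re.compile(rf"[/*]{NOCTL}*")


def is_path_pattern(grammar, value):
    """Return True if the whole value matches the given grammar sketch."""
    return grammar.fullmatch(value) is not None


for value in ("/foo/bar", "*.gif$"):
    print(value, is_path_pattern(PUBLISHED, value), is_path_pattern(PROPOSED, value))
# /foo/bar True True
# *.gif$ False True
```

Under that approximation, the section 5.1 pattern `*.gif$` is rejected by the published rule and accepted by the proposed one, which is the inconsistency the erratum describes.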