RFC Errata
RFC 9309, "Robots Exclusion Protocol", September 2022
Source of RFC: IETF - NON WORKING GROUP
Area Assignment: art
Errata ID: 7128
Status: Reported
Type: Technical
Publication Format(s) : TEXT
Reported By: Yoshiro Yoneya
Date Reported: 2022-09-13
Section 2.2.2 says:
For example:

+==================+=======================+=======================+
| Path             | Encoded Path          | Path to Match         |
+==================+=======================+=======================+
| /foo/bar?baz=quz | /foo/bar?baz=quz      | /foo/bar?baz=quz      |
+------------------+-----------------------+-----------------------+
| /foo/bar?baz=    | /foo/bar?baz=         | /foo/bar?baz=         |
| https://foo.bar  | https%3A%2F%2Ffoo.bar | https%3A%2F%2Ffoo.bar |
+------------------+-----------------------+-----------------------+
| /foo/bar/        | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| U+E38384         |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| bar/%E3%83%84    |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%62%61%7A    | /foo/bar/baz          |
| bar/%62%61%7A    |                       |                       |
+------------------+-----------------------+-----------------------+
It should say:
For example:

+==================+=======================+=======================+
| Path             | Encoded Path          | Path to Match         |
+==================+=======================+=======================+
| /foo/bar?baz=quz | /foo/bar?baz=quz      | /foo/bar?baz=quz      |
+------------------+-----------------------+-----------------------+
| /foo/bar?baz=    | /foo/bar?baz=         | /foo/bar?baz=         |
| https://foo.bar  | https%3A%2F%2Ffoo.bar | https%3A%2F%2Ffoo.bar |
+------------------+-----------------------+-----------------------+
| /foo/bar/        | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| U+30C4           |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%E3%83%84    | /foo/bar/%E3%83%84    |
| bar/%E3%83%84    |                       |                       |
+------------------+-----------------------+-----------------------+
| /foo/            | /foo/bar/%62%61%7A    | /foo/bar/baz          |
| bar/%62%61%7A    |                       |                       |
+------------------+-----------------------+-----------------------+
Notes:
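The "Path" component of the third example appears to denote a Unicode code point rather than the hexadecimal form of the UTF-8 encoded octets. If so, the correct code point for %E3%83%84 is U+30C4 (ツ, KATAKANA LETTER TU); E3 83 84 is the UTF-8 octet sequence for that character, not its code point.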
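
The relationship between the percent-encoded octets and the code point can be checked with a short Python sketch. It is illustrative only and not part of the RFC or of the reported erratum; the variable names are hypothetical, and it assumes the percent-encoded octets are UTF-8, which is also the default of urllib.parse.unquote.

```python
from urllib.parse import quote, unquote

# Percent-encoded form taken from the "Encoded Path" column.
encoded = "%E3%83%84"

# unquote() decodes percent-escapes as UTF-8 by default.
char = unquote(encoded)

# Unicode code point of the decoded character, in U+XXXX notation.
codepoint = f"U+{ord(char):04X}"

# UTF-8 octets of the same character, as hexadecimal.
utf8_hex = char.encode("utf-8").hex().upper()

# Percent-encoding the character again yields the original escapes.
reencoded = quote(char)

# The last table row: percent-encoded unreserved ASCII decodes to "baz".
ascii_decoded = unquote("%62%61%7A")

print(codepoint, utf8_hex, reencoded, ascii_decoded)
```

Here the code point is U+30C4, while E38384 is only the hexadecimal rendering of the three UTF-8 octets, which is consistent with the correction proposed above.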