RFC Errata
RFC 5890, "Internationalized Domain Names for Applications (IDNA): Definitions and Document Framework", August 2010
Source of RFC: idnabis (app)
Errata ID: 4823
Status: Held for Document Update
Type: Technical
Publication Format(s) : TEXT
Reported By: Juan Altmayer Pizzorno
Date Reported: 2016-05-17
Held for Document Update by: Alexey Melnikov
Date Held: 2016-10-07
Section 2.3.2.1 says:
expansion of the A-label form to a U-label may produce strings that are much longer than the normal 63 octet DNS limit (potentially up to 252 characters)
It should say:
expansion of the A-label form to a U-label may produce strings that are much longer than the normal 63 octet DNS limit (potentially up to 59 Unicode code points or 236 octets)
Notes:
The contents of U-labels are encoded in the up to 59 ASCII characters (see 2.3.2.1 itself)
output by the Punycode algorithm in their corresponding A-labels. The Punycode
decoder (https://tools.ietf.org/html/rfc3492#section-6.2) consumes at least one
of those ASCII characters for each code point inserted into the U-label. A U-label
can thus contain at most 59 Unicode code points.
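The relationship between code points and Punycode characters can be spot-checked with a short sketch. It assumes Python's built-in "punycode" codec as a stand-in for the RFC 3492 algorithm. The sample labels and the helper name punycode_lengths are illustrative only and do not come from the erratum. The check demonstrates the claim on a few inputs and does not prove it.

```python
# Sketch only: uses Python's built-in "punycode" codec (RFC 3492).
# The names and sample labels below are illustrative assumptions.
ACE_PREFIX = "xn--"
MAX_LABEL_OCTETS = 63
MAX_PUNYCODE_CHARS = MAX_LABEL_OCTETS - len(ACE_PREFIX)  # 59


def punycode_lengths(u_label):
    """Return (code points in the label, ASCII characters in its Punycode form)."""
    encoded = u_label.encode("punycode")
    return len(u_label), len(encoded)


for label in ["bücher", "例え", "münchen"]:
    code_points, ascii_chars = punycode_lengths(label)
    # Each code point costs at least one ASCII character of Punycode output,
    # so a 59-character budget cannot yield more than 59 code points.
    assert code_points <= ascii_chars
```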
Since U-labels are defined (in 2.3.2.1) to be expressed in a standard Unicode Encoding
Form, and UTF-32, UTF-16 and UTF-8 (as revised by RFC 3629) can all encode a code
point in at most 4 octets, 236 octets is an upper bound for a U-label's length.
I think it should be possible to derive a tighter bound, but its rationale would likely be
less straightforward.
I imagine the number 252 was originally derived by multiplying 63, the maximum
length of an A-label (including the "xn--" prefix), by 4, the maximum number of
octets needed to represent a code point.
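The arithmetic behind both figures can be written out as a minimal sketch. The constant names are illustrative, and the derivation of 252 is the reporter's conjecture, not a confirmed account of how the original number was obtained.

```python
# Sketch of the arithmetic in these notes; constant names are illustrative.
MAX_LABEL_OCTETS = 63            # DNS label limit, which an A-label must respect
ACE_PREFIX = "xn--"              # prefix carried by every A-label
MAX_OCTETS_PER_CODE_POINT = 4    # worst case in UTF-8, UTF-16 and UTF-32

# ASCII characters left for Punycode output, hence the code point bound.
max_code_points = MAX_LABEL_OCTETS - len(ACE_PREFIX)

# Bound proposed by the erratum.
corrected_octets = max_code_points * MAX_OCTETS_PER_CODE_POINT

# Conjectured origin of the figure in the published text.
conjectured_original = MAX_LABEL_OCTETS * MAX_OCTETS_PER_CODE_POINT

# A supplementary-plane code point takes 4 octets in each encoding form
# (the "-le" variants are used so that no byte order mark is counted).
supplementary = chr(0x1F600)
octets_by_form = {
    form: len(supplementary.encode(form))
    for form in ("utf-8", "utf-16-le", "utf-32-le")
}
```

With these definitions, max_code_points is 59, corrected_octets is 236, and conjectured_original is 252.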