RFC Errata
RFC 5892, "The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)", August 2010
Note: This RFC has been updated by RFC 8753
Source of RFC: idnabis (app)
See Also: RFC 5892 w/ inline errata
Errata ID: 3312
Status: Verified
Type: Editorial
Publication Format(s): TEXT
Reported By: Patrik Fältström
Date Reported: 2012-08-09
Verifier Name: Barry Leiba
Date Verified: 2012-08-09
Sections A and A.1 say:
In A:

Code point: The code point, or code points, to which this rule is to be applied. Normally, this implies that if any of the code points in a label is as defined, then the rules should be applied. If evaluated to True, the code point is OK as used; if evaluated to False, it is not OK.

In A.1:

Rule Set:
   False;
   If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
   If RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*\u200C
      (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
It should say:
In A:

Code point: The code point, or code points, to which this rule is to be applied. Normally, this implies that if any of the code points in a label is as defined, then the rules should be applied. If evaluated to True, the code point is OK as used; if evaluated to False, it is not OK. For the rule to be evaluated to True for the label, it MUST be evaluated separately for every occurrence of the Code point in the label; each of those evaluations must result in True.

In A.1:

Rule Set:
   False;
   If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
   If cp .eq. \u200C And RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
      (Joining_Type:T)*(Joining_Type:{R,D})) Then True;
Notes:
The original text did not make it clear whether the rule is to be applied once for every occurrence of the code point in the label. In addition, the regular expression in A.1 could be interpreted in multiple ways, and it did not take into account the case where more than one U+200C exists in a label.
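
The per-occurrence evaluation described in the corrected text can be sketched in Python as follows. This is an illustrative sketch only, not part of the erratum or of RFC 5892. The function names and the three-entry JOINING_TYPE table are assumptions made for the example; a real implementation would load Joining_Type from the Unicode Character Database, since Python's unicodedata module does not expose that property. The sketch also anchors the regular expression at each occurrence of U+200C, which is one reading of the corrected rule.

```python
import unicodedata

ZWNJ = "\u200c"   # ZERO WIDTH NON-JOINER, the code point governed by A.1
VIRAMA_CCC = 9    # numeric Canonical_Combining_Class value named Virama

# Hypothetical, deliberately tiny Joining_Type table (assumption for this
# example); real data comes from the Unicode Character Database.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH (dual-joining)
    "\u0627": "R",  # ARABIC LETTER ALEF (right-joining)
    "\u064e": "T",  # ARABIC FATHA (transparent)
}


def joining_type(ch):
    # Anything missing from the toy table is treated as non-joining (U).
    return JOINING_TYPE.get(ch, "U")


def zwnj_ok_at(label, i):
    """Evaluate the A.1 rule set for the single ZWNJ at index i."""
    # If Canonical_Combining_Class(Before(cp)) .eq. Virama Then True;
    if i > 0 and unicodedata.combining(label[i - 1]) == VIRAMA_CCC:
        return True
    # (Joining_Type:{L,D})(Joining_Type:T)* must immediately precede cp.
    j = i - 1
    while j >= 0 and joining_type(label[j]) == "T":
        j -= 1
    if j < 0 or joining_type(label[j]) not in ("L", "D"):
        return False
    # (Joining_Type:T)*(Joining_Type:{R,D}) must immediately follow cp.
    k = i + 1
    while k < len(label) and joining_type(label[k]) == "T":
        k += 1
    return k < len(label) and joining_type(label[k]) in ("R", "D")


def label_zwnj_ok(label):
    """True only if every occurrence of ZWNJ in the label passes the rule.

    A label without any ZWNJ yields True vacuously, because the rule
    does not apply to it.
    """
    return all(zwnj_ok_at(label, i)
               for i, ch in enumerate(label) if ch == ZWNJ)
```

With this sketch, the label BEH, ZWNJ, ALEF passes, while BEH, ZWNJ, ALEF, ZWNJ fails: the first U+200C satisfies the rule, but the second does not, which is the multiple-occurrence case the original text left unaddressed.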