RFC 5322, "Internet Message Format", October 2008
Source of RFC: IETF - NON WORKING GROUP
Area Assignment: app
Errata ID: 2104
Reported By: Wolf Lammen
Date Reported: 2010-04-02
Rejected by: Pete Resnick
Date Rejected: 2011-06-11
Section 4.1 says:
obs-unstruct = *((*LF *CR *(obs-utext *LF *CR)) / FWS)
It should say:
obs-unstruct = *(*LF *CR (obs-utext / FWS)) *LF *CR ;The CR of a following CRLF MUST NOT be consumed.
[I was notified that errata reports 1766 and 1908 have been verified, but report 1905 was apparently left undecided. I am a bit concerned, because this report points to a true defect in RFC 5322, so a clearer line of reasoning may help. This report is meant to replace my former posting, 1905.]
The <obs-unstruct> rule covers field bodies of unstructured fields (as declared in section 2.2.1) that conform to the obsoleted RFC 2822 or RFC 822. It is only to be used for backward compatibility when reading messages. It corresponds to <text> in RFC 822 and <obs-utext> in RFC 2822. In a conforming message, an unstructured field body is always followed by a field-delimiting, non-folding CRLF.
As written, <obs-unstruct> is equivalent to *(%d0-127), that is, to any sequence of ASCII characters. The CRLF delimiting the field is therefore made part of the field body, as is the rest of the header section.
So my proposal refines the rule: Use any, possibly empty, sequence of ASCII characters, as long as it does not contain a CRLF outside of a folding white space.
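The difference between the two rules can be illustrated with a small Python sketch. It is not part of the erratum: the regular expressions are a hand transliteration of the ABNF, the names OBS_UTEXT, FWS, ORIGINAL, PROPOSED and matches are hypothetical, and the original rule is represented by the equivalent claimed above rather than transliterated literally.

```python
import re

# obs-utext = %d0 / obs-NO-WS-CTL / VCHAR:
# every ASCII octet except HTAB, LF, CR and SP.
OBS_UTEXT = r"[\x00-\x08\x0b\x0c\x0e-\x1f\x21-\x7f]"
# FWS = ([*WSP CRLF] 1*WSP) / obs-FWS, with obs-FWS = 1*WSP *(CRLF 1*WSP)
FWS = r"(?:(?:[ \t]*\r\n)?[ \t]+|[ \t]+(?:\r\n[ \t]+)*)"

# Original rule, represented by the equivalent claimed in this report:
# any sequence of ASCII characters.
ORIGINAL = re.compile(r"[\x00-\x7f]*")
# Proposed rule: *(*LF *CR (obs-utext / FWS)) *LF *CR
PROPOSED = re.compile(rf"(?:\n*\r*(?:{OBS_UTEXT}|{FWS}))*\n*\r*")


def matches(rule: re.Pattern, body: str) -> bool:
    """Return True if the whole string is generated by the rule."""
    return rule.fullmatch(body) is not None


folded = "folded\r\n value"          # CRLF inside folding white space
two_fields = "value\r\nSubject: x"   # CRLF that delimits a field

print(matches(ORIGINAL, two_fields))  # the original rule runs into the next field
print(matches(PROPOSED, folded))      # folding is still accepted
print(matches(PROPOSED, two_fields))  # a delimiting CRLF is no longer accepted
```

Bare CR and bare LF characters remain acceptable under the proposed rule; only a CR immediately followed by LF outside folding white space is excluded. The sketch is meant for short inputs only, since a backtracking regex of this shape can become slow on long runs of white space.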
There is a nuisance left: when applied possessively (greedily, without backtracking), the rule also consumes the CR of the adjacent field-delimiting CRLF. This cannot be overcome, no matter how the rule is rewritten, because the ABNF of RFC 5234 is not rich enough to express something like "consume all but one CR". A higher-level construct such as "obs-unstruct CRLF" must force <obs-unstruct> to backtrack one step in order to match. That is why I added the comment.
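The backtracking step can be observed with Python's re module, which backtracks by default. This is only a sketch under the assumption that the regular expression below is a faithful transliteration of the proposed rule; OBS_UTEXT, FWS, OBS_UNSTRUCT and the sample line are hypothetical names and data chosen for illustration.

```python
import re

# obs-utext: every ASCII octet except HTAB, LF, CR and SP.
OBS_UTEXT = r"[\x00-\x08\x0b\x0c\x0e-\x1f\x21-\x7f]"
# FWS = ([*WSP CRLF] 1*WSP) / obs-FWS
FWS = r"(?:(?:[ \t]*\r\n)?[ \t]+|[ \t]+(?:\r\n[ \t]+)*)"
# Proposed rule: *(*LF *CR (obs-utext / FWS)) *LF *CR
OBS_UNSTRUCT = rf"(?:\n*\r*(?:{OBS_UTEXT}|{FWS}))*\n*\r*"

line = "some text\r\n"  # a field body followed by its delimiting CRLF

# Matching the rule on its own: the trailing *CR greedily swallows
# the CR that belongs to the field delimiter.
greedy_end = re.match(OBS_UNSTRUCT, line).end()
print(greedy_end, len("some text"))

# Matching "obs-unstruct CRLF": the engine has to give the CR back
# (one backtracking step) before the delimiter can match.
field = re.fullmatch(rf"({OBS_UNSTRUCT})\r\n", line)
print(repr(field.group(1)))
```

A matcher that never gives back consumed input would stop after the first step and fail to find the delimiter, which is the situation the comment in the proposed rule warns about.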
Duplicate of erratum 1905, which I believe is correct as stated.