RFC Errata
RFC 2046, "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types", November 1996
Note: This RFC has been updated by RFC 2646, RFC 3798, RFC 5147, RFC 6657, RFC 8098
Source of RFC: 822ext (app)
Errata ID: 8667
Status: Reported
Type: Technical
Publication Format(s): TEXT
Reported By: Jason Yundt
Date Reported: 2025-12-02
Section 4.1.2 says:
The complete US-ASCII character set is listed in ANSI X3.4- 1986.
It should say:
In US-ASCII octet sequences, the highest order bit (most significant bit) in each octet is always zero. The correct interpretation of the remaining 7 bits in each octet is listed in ANSI X3.4-1986.
Notes:
Section 2.2 of RFC 2045 defines the term “character set”:
> The term "character set" is used in MIME to refer to a method of
> converting a sequence of octets into a sequence of characters.
RFC 2046 says that US-ASCII is a character set and points to ANSI X3.4-1986 for the definition of that character set. Unfortunately, ANSI X3.4-1986 does not give a complete definition of a character set because ANSI X3.4-1986 is not about sequences of octets. Section 4 of ANSI X3.4-1986 says:
> The bits of the bit combinations of the 7-bit code are
> identified by b₇, b₆, b₅, b₄, b₃, b₂, and b₁, where
> b₇ is the highest order bit (most significant bit), and
> b₁ is the lowest order bit (least significant bit).
ANSI X3.4-1986 never mentions a b₈ bit, nor does it say how to decode a stream of octets. It only alludes to octets in passing. Section 2.1.1 of ANSI X3.4-1986 says:
> 2.1.1 Conforming Interchange. The data stream
> comprising the exchange of information using the
> coding of this standard is subject to the following rules.
> Conforming interchange:
> (1) Shall be a sequence of 7-bit or 8-bit bit combin-
> ations.
> (2) Shall use the names of all control characters as
> specified in this standard.
> (3) Shall not include any bit combinations allo-
> cated by this standard for any purpose other than that
> defined in this standard, unless they represent func-
> tions or elements of other coded character sets that
> have been invoked by code extension in accordance
> with ANSI X3.41-1974, subject to mutual agreement
> between the interchange parties.
(An 8-bit bit combination is an octet that contains character data. See section 3.2 of ANSI X3.4-1986.)
Unfortunately, the rest of ANSI X3.4-1986 doesn’t tell us how to handle sequences of 8-bit bit combinations. It only tells us how to handle 7-bit bit combinations. Section 9.2 of ANSI X3.41-1974 specifies that the highest order bit should be zero when storing ASCII in 8-bit bit combinations, but you don’t have to implement anything from ANSI X3.41-1974 in order to properly implement ANSI X3.4-1986. In fact, ANSI X3.41-1974 is mostly about shift and escape sequences so most of it must not be implemented for US-ASCII. As section 4.1.2 of RFC 2046 says:
> All of these character sets are used as pure 7bit or 8bit sets
> without any shift or escape functions.
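As an illustration only, the proposed wording could be read as the following decoding rule. This is a minimal sketch, not text from any of the cited standards: the function name decode_us_ascii is hypothetical, rejecting octets with the highest order bit set (rather than, say, masking the bit off) is an assumption about how a strict decoder might behave, and the sketch relies on the first 128 Unicode code points coinciding with the ANSI X3.4-1986 code table.

```python
def decode_us_ascii(octets: bytes) -> str:
    """Decode a sequence of octets as US-ASCII under the proposed wording.

    Hypothetical helper for illustration; the strict-rejection policy is an
    assumption, not something RFC 2046 or ANSI X3.4-1986 mandates.
    """
    for index, octet in enumerate(octets):
        # The highest order bit of each octet must be zero.
        if octet & 0x80:
            raise ValueError(
                f"octet {octet:#04x} at index {index} has its highest order bit set"
            )
    # The remaining 7 bits (b7..b1) select a position in the 7-bit code table.
    # Code points 0..127 of Unicode match that table, so chr() suffices here.
    return "".join(chr(octet & 0x7F) for octet in octets)


if __name__ == "__main__":
    print(decode_us_ascii(b"MIME"))
    try:
        decode_us_ascii(b"caf\xe9")
    except ValueError as error:
        print("rejected:", error)
```

No shift or escape processing is attempted, consistent with the RFC 2046 statement quoted above.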
Additionally, there shouldn’t have been a space after the hyphen in “ANSI X3.4- 1986”.
See also: RFC 20.
