
STD 63

RFC 3629

"UTF-8, a transformation format of ISO 10646", November 2003

Canonical URL:
http://www.rfc-editor.org/rfc/rfc3629.txt
This document is also available in a non-normative PDF version.
Status:
INTERNET STANDARD
Obsoletes:
RFC 2279
Author:
F. Yergeau
Stream:
INDEPENDENT


Abstract

ISO/IEC 10646-1 defines a large character set called the Universal Character Set (UCS) which encompasses most of the world's writing systems. The originally proposed encodings of the UCS, however, were not compatible with many current applications and protocols, and this has led to the development of UTF-8, the object of this memo. UTF-8 has the characteristic of preserving the full US-ASCII range, providing compatibility with file systems, parsers and other software that rely on US-ASCII values but are transparent to other values. This memo obsoletes and replaces RFC 2279.
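The US-ASCII preservation property described in the abstract can be illustrated with a minimal Python sketch. It relies only on Python's built-in "utf-8" codec. The helper name is_ascii_transparent and the sample string are hypothetical and chosen for illustration; they do not come from the RFC.

```python
def is_ascii_transparent(text: str) -> bool:
    """Check the US-ASCII preservation property for every character in text.

    Hypothetical helper for illustration only; not part of RFC 3629.
    """
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # A US-ASCII character must encode to the single identical byte.
            if encoded != bytes([ord(ch)]):
                return False
        else:
            # A non-ASCII character must never produce a byte in the ASCII range.
            if any(b < 0x80 for b in encoded):
                return False
    return True


# Assumed sample: a file path mixing ASCII with two non-ASCII letters (U+00E9).
sample = "path/to/r\u00e9sum\u00e9.txt"
encoded_sample = sample.encode("utf-8")

# Software that only looks for the ASCII byte "/" still finds exactly the
# two real separators, because no multi-byte sequence contains that byte.
separator_count = encoded_sample.count(b"/")
```

Because every byte of a multi-byte sequence has its high bit set, a parser that splits on ASCII delimiters such as "/" can usually pass UTF-8 data through unchanged, which is the compatibility the abstract refers to.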


For the definition of Status, see RFC 2026.

For the definition of Stream, see RFC 4844.

