[rfc-i] comments on "Format FAQ"

Heather Flanagan (RFC Series Editor) rse at rfc-editor.org
Wed Feb 19 10:23:08 PST 2014


On 2/19/14 9:53 AM, Julian Reschke wrote:
> On 2014-02-19 18:42, Andrew Sullivan wrote:
>> On Wed, Feb 19, 2014 at 03:44:30PM +0100, Julian Reschke wrote:
>>> Not really? Why does the encoding matter, as long as it is a common
>>> one and declared properly?
>>
>> Not every format has a mechanism to declare the encoding.
>>
>> I think it was primarily the simplicity of the tools that led to the
>> selection of only UTF-8 for cases where the file can declare.  Also,
>> if we don't insist on UTF-8 then we'll need to have an argument about
>> BOM and UTF-16, and that would be boring.
> 
> I'd say the Production Center can accept anything they want. If somebody
> submits a UTF-16 plain text, or an XML file using ISO-8859-1, where's
> the problem as long as the Production Center can use it?
> 
> The only interesting question is encodings in the canonical and the
> derived publication formats.

In the broadest sense, yes, you are correct: the RFC Production Center
(RPC) is capable of accepting all sorts of things.  However, there are
several constraints that reality puts on this situation.  First, as
Andrew points out in a later post, we are working with an archival set
of documents and making the first major change in decades.  A
conservative approach is the most reasonable at this time.  Second, we
need to consider what we want the RPC to be focused on - understanding
different character encodings sufficiently to know what to look for in
case of problems, or focusing on editing the documents for clarity,
consistency, and readability.  The RPC can do more, but it comes with
costs that should not be ignored.

So, for now, yes, we will restrict the canonical and publication formats
to UTF-8.
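
As a rough illustration of what a UTF-8-only restriction means for
tooling, here is a minimal sketch of a strict check on a file's raw
bytes.  This is not RPC tooling; the function name check_utf8 is
hypothetical, and whether a leading BOM should be accepted is left as
an open assumption, so the sketch only reports it.

```python
import codecs
from typing import Tuple


def check_utf8(data: bytes) -> Tuple[bool, bool]:
    """Return (is_valid_utf8, has_utf8_bom) for the raw bytes of a file.

    Hypothetical helper for illustration only.
    """
    # A UTF-8 BOM is the three bytes EF BB BF at the start of the data.
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # Strict decoding rejects any byte sequence that is not well-formed UTF-8.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False, has_bom
    return True, has_bom
```

A UTF-16 or ISO-8859-1 file containing non-ASCII characters will
generally fail the strict decode, which is the sort of single check
that stays simple when only one encoding is allowed.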

-Heather


