[rfc-i] comments on "Format FAQ"
"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Mon Feb 24 02:53:55 PST 2014
On 2014/02/20 4:32, Bjoern Hoehrmann wrote:
> * Julian Reschke wrote:
>> I'd say the Production Center can accept anything they want. If somebody
>> submits a UTF-16 plain text, or an XML file using ISO-8859-1, where's
>> the problem as long as the Production Center can use it?
> It is not necessarily easy to identify mislabeled content and to ensure
> documents are properly transcoded, so accepting more legacy encodings is
> more work and allows more things to go wrong, for no real benefit. "Send
> UTF-8" is a simple, clear, and easily satisfiable policy, and it would
> be quite fine to accept no other encodings.
I clearly agree with Björn. Outside of UTF-8, there are some easy cases,
but there are also quite difficult ones. The RFC Editor and the
Production Center are well advised to restrict input to UTF-8,
particularly because these days nobody should have problems producing
UTF-8. On top of that, it may help to note that IDs and input for RFC
production were always in a *single* encoding (US-ASCII, not, e.g.,
EBCDIC), and there's no good reason to change the number of encodings
accepted (UTF-8 simply subsumes US-ASCII).
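
To make the "UTF-8 only" policy concrete, here is a minimal sketch in
Python of what a strict intake check could look like. The function name
and the sample strings are purely illustrative assumptions; nothing here
describes tooling the RFC Editor or the Production Center actually uses.

```python
def is_acceptable_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical intake check)."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative samples only (assumed content, not real submissions).
ascii_draft = "Network Working Group".encode("ascii")
utf8_draft = "Martin J. Dürst".encode("utf-8")
latin1_draft = "Martin J. Dürst".encode("iso-8859-1")
utf16_draft = "Martin J. Dürst".encode("utf-16")

# US-ASCII input passes unchanged, because every ASCII byte sequence is
# also valid UTF-8 with the same meaning.
print(is_acceptable_utf8(ascii_draft))
print(is_acceptable_utf8(utf8_draft))

# Legacy encodings of non-ASCII text are rejected rather than guessed at.
print(is_acceptable_utf8(latin1_draft))
print(is_acceptable_utf8(utf16_draft))
```

The point of rejecting instead of transcoding is the one made above:
with a single accepted encoding, there is no need to detect or trust
encoding labels at all. Note that passing this check only shows the
bytes are well-formed UTF-8; it cannot prove the author did not intend
some other encoding.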