[rfc-i] Unicode or UTF-8

Phillip Hallam-Baker hallam at gmail.com
Tue Mar 27 20:21:32 PDT 2012


+1

Though I suggest that any upload tool be lenient in what it accepts
and conservative in what it emits, i.e. instead of rejecting non-UTF-8
input outright, convert it to canonical form.
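
A minimal sketch of what such a lenient tool might do, taking "canonical
form" to mean UTF-8. The function name to_utf8 and the declared parameter
are hypothetical, chosen only for illustration; UTF-32 input and encoding
guessing are deliberately left out.

```python
import codecs
from typing import Optional


def to_utf8(data: bytes, declared: Optional[str] = None) -> bytes:
    """Accept several encodings, always emit UTF-8 (hypothetical helper)."""
    # Honour a byte order mark if one is present. The "utf-8-sig" and
    # "utf-16" codecs strip the mark while decoding.
    for bom, name in ((codecs.BOM_UTF8, "utf-8-sig"),
                      (codecs.BOM_UTF16_LE, "utf-16"),
                      (codecs.BOM_UTF16_BE, "utf-16")):
        if data.startswith(bom):
            return data.decode(name).encode("utf-8")
    # Otherwise try strict UTF-8 first, then whatever encoding the
    # uploader declared (assumption: the tool lets them declare one).
    candidates = ["utf-8"] + ([declared] if declared else [])
    for name in candidates:
        try:
            return data.decode(name).encode("utf-8")
        except (UnicodeDecodeError, LookupError):
            continue
    # Lenient, but not guessing: refuse input that cannot be decoded.
    raise ValueError("cannot decode upload; encoding unknown")
```

Input that is already valid UTF-8 passes through unchanged, so the
conversion step costs nothing for uploads that follow the rule.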

On Tue, Mar 27, 2012 at 11:08 PM, "Martin J. Dürst"
<duerst at it.aoyama.ac.jp> wrote:
> This is a followup to
> http://www.ietf.org/proceedings/83/minutes/minutes-83-rfcform.txt and
> http://www.ietf.org/jabber/logs/rfcform (which currently doesn't exist; hope
> this gets fixed).
>
> The minutes note:
>
> 8:00 - 18:10 = Questions/Comments
> -------------
>
> 13) ? from Jabber: let's not talk about encodings. Unicode not UTF-8.
>
>
> Having a fixed encoding is *extremely* helpful (if you don't believe that,
> just think about how easy US-ASCII is in comparison with the mess of
> encodings we have for non-ASCII text).
>
> For the IETF, that would be UTF-8, because of RFC 2277
> (http://tools.ietf.org/html/rfc2277) and everything after. That should apply
> to xml2rfc source, the equivalent of the current .txt, and HTML at least, as
> far as these are going to be used.
>
> This may not apply to all formats, though. I'm not sure what .docx uses
> internally. I'm not sure what PDF uses internally if you tell it to keep all
> the text in Unicode. [Leonard?] For these cases, there's of course
> absolutely no point in forcing them to use UTF-8 if they use another
> encoding of Unicode.
>
> So I'd think the conclusion should be:
>
> UTF-8 where there's an (unnecessary!) choice, whatever Unicode encoding was
> chosen for formats where a single choice already has been made.
>
> Regards,    Martin.
> _______________________________________________
> rfc-interest mailing list
> rfc-interest at rfc-editor.org
> https://www.rfc-editor.org/mailman/listinfo/rfc-interest



-- 
Website: http://hallambaker.com/

