[rfc-i] character sets, Bug in RFC Search page...

Leonard Rosenthol lrosenth at adobe.com
Sun Oct 1 10:23:06 PDT 2017

There is a standard BOM for UTF-8, just as there is for UCS-2. Seems like a good idea, though most systems don't use it...
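
For illustration, here is a minimal sketch of how a reader might check for that BOM. The UTF-8 BOM is the three bytes EF BB BF; the UTF-16 marks (which UCS-2 shares) are FE FF and FF FE. The function name detect_bom is hypothetical, and the sketch deliberately ignores UTF-32, whose little-endian BOM begins with the same two bytes as UTF-16 LE.

```python
import codecs

# Known byte order marks, from Python's standard codecs module.
# UTF-32 is intentionally left out of this sketch (see note above).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def decode_utf8(data: bytes) -> str:
    """Decode UTF-8 text, dropping a leading BOM if one is present."""
    # The 'utf-8-sig' codec accepts input with or without the BOM.
    return data.decode("utf-8-sig")
```

Since most systems don't write the BOM, a None result here says nothing about whether the data is UTF-8; it only means the file didn't announce itself.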


-----Original Message-----
From: rfc-interest [mailto:rfc-interest-bounces at rfc-editor.org] On Behalf Of John Levine
Sent: Wednesday, September 27, 2017 1:54 PM
To: rfc-interest at rfc-editor.org
Cc: tte at cs.fau.de
Subject: Re: [rfc-i] character sets, Bug in RFC Search page...

In article <20170926215743.GM31007 at faui40p.informatik.uni-erlangen.de> you write:
>Given how browsers are quite inconsistent in the character sets they 
>load, it would be nice to have explicit indication of the character 
>sets required to render a document. I've seen PDF documents where I had 
>to go to page 100 before finding that some crucial text was not rendered 
>because it used some character set not available to me.

Those are buggy PDFs.  A PDF is supposed to contain all the fonts it uses beyond a small standard set, but there is a lot of PDF crudware that wrongly assumes that whatever fonts it finds on the machine where it's running will exist everywhere.  

>There are a lot of tool chains out there that will not render UTF-8 
>correctly, therefore it is IMHO extremely viable to have a clear 
>indication of the minimum toolchain features required to render a 
>document correctly (in the first page header of an RFC would be great).

Hey, maybe we could put a couple of bytes at the beginning that say this is a UTF-8 document.
