[rfc-i] Another example of a draft with non-ASCII characters (draft-ietf-iri-3987bis-12.txt)

"Martin J. Dürst" duerst at it.aoyama.ac.jp
Wed Jul 18 00:53:51 PDT 2012

On 2012/07/17 23:04, Joe Hildebrand (jhildebr) wrote:
> On 7/17/12 6:53 AM, "Brian E Carpenter"<brian.e.carpenter at gmail.com>
> wrote:
>> It seems that Wordpad handles UTF16 correctly, but not UTF8. If you
>> do "Save As Unicode" from Notepad, Wordpad can read it, but there are some
>> unexpected changes of font.
> If you put a Byte Order Mark
> (http://en.wikipedia.org/wiki/Byte_order_mark) at the front, Wordpad will
> probably do just fine.

Yes, it does. But as I said, I'm reluctant to use a BOM because that
might cause hiccups somewhere down the line.

> It has no context to go on, so it's having to sniff
> out the encoding and guess based on the first bit of the file.

I'm not sure why the dumb Notepad gets it right but Wordpad doesn't. But
then I don't use either very much.

Regards,   Martin.

> As Julian has said, this is one of several reasons why just saying "make
> the current .txt format UTF8 and stop" is not an adequate solution to the
> problems at hand.  HTML and XML have ways of declaring their encoding
> definitively inside the file format, so processors don't have to guess.
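
For illustration, the BOM-based sniffing discussed in this thread can be
sketched roughly as follows. This is only a minimal sketch, not what
Notepad or Wordpad actually do: the function names sniff_bom and
decode_text are hypothetical, and falling back to UTF-8 when no BOM is
present is an assumption made for the example.

```python
import codecs

# BOM signatures, longest first, so that the UTF-32 LE mark (FF FE 00 00)
# is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found.

    Hypothetical helper: it looks only at the first bytes of the file.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_text(data: bytes, fallback: str = "utf-8") -> str:
    """Decode using the BOM if present, else the fallback (an assumption)."""
    encoding, skip = sniff_bom(data)
    # Strip the BOM bytes so they do not end up in the decoded text.
    return data[skip:].decode(encoding or fallback)
```

Without a BOM, the reader has to guess. The fallback above simply assumes
UTF-8 and raises UnicodeDecodeError if that guess is wrong.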
