[rfc-i] Data point [Re: Fwd:I-D ACTION:draft-hoffman-utf8-rfcs-03.txt]

Joe Touch touch at ISI.EDU
Tue Oct 7 08:37:09 PDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Julian Reschke wrote:
> Joe Touch wrote:
>>> A UTF-8 form feed *is* an ASCII form feed, so I'm really not sure what
>>> you are testing...
>>
>> Formfeed is 0x0c
>>
>> UTF-8 formfeed is U+000C (though prohibited by draft-hoffman as per
>> Section 2.2; let's assume it's allowed).
> 
> UTF-8 is an encoding of the Unicode character repertoire into an octet
> sequence. The UTF-8 representation of the Unicode form feed character
> *is* the octet 0x0c, just like in ASCII (and that applies to all Unicode
> code points < 128).
> 
>> It appears that Wordpad doesn't recognize U+000C as a page break.
>>
>>>> To type unicode, I typed the following:
>>>>
>>>>     0 1 2 <alt-x>
>>>>
>>>> Am I missing something? (I'm new to UTF)...
>>>> ...
>>> Hard to say. Can you produce an hex dump of the file? (od -cx)
>>
>> That's a Unix-ism, but yes - I can do that on Windows:
>>
>> 0000000 377 376   a  \0   b  \0   c  \0  \r  \0  \n  \0 022  \0  \r  \0
>>         65279    97    98    99    13    10    18    13
>> 0000020  \n  \0   d  \0   e  \0   f  \0  \r  \0  \n  \0
>>            10   100   101   102    13    10
>> 0000034
>>
>> Is that correct UTF-8 for what I typed?
> 
> First of all, it looks like UTF-16 (two octets per code point).

OK; that means that Wordpad doesn't generate UTF-8. That was the one
editor that appeared to support at least some of the cut/paste tests.

If Wordpad doesn't work, then we have not identified a viable UTF-8
editor for Windows.
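
For anyone who wants to check the dump above without od, here is a
minimal sketch in Python. The octets are transcribed by hand from the
quoted dump; the helper name sniff_bom is made up for illustration, and
the only assumption is that a leading byte-order mark identifies the
encoding.

```python
import codecs

# Octets transcribed from the quoted od dump (377 376 octal is FF FE).
dump = bytes([
    0o377, 0o376,                            # byte-order mark
    ord("a"), 0, ord("b"), 0, ord("c"), 0,   # "abc"
    0x0D, 0, 0x0A, 0,                        # CR LF
    0o022, 0,                                # the character that was typed
    0x0D, 0, 0x0A, 0,                        # CR LF
    ord("d"), 0, ord("e"), 0, ord("f"), 0,   # "def"
    0x0D, 0, 0x0A, 0,                        # CR LF
])


def sniff_bom(data: bytes) -> str:
    """Hypothetical helper: guess the encoding from a leading BOM."""
    # UTF-32 is checked first because its little-endian BOM (FF FE 00 00)
    # begins with the UTF-16 little-endian BOM (FF FE).
    for label, bom in (
        ("utf-32", codecs.BOM_UTF32_LE),
        ("utf-32", codecs.BOM_UTF32_BE),
        ("utf-8", codecs.BOM_UTF8),
        ("utf-16", codecs.BOM_UTF16_LE),
        ("utf-16", codecs.BOM_UTF16_BE),
    ):
        if data.startswith(bom):
            return label
    return "unknown"


encoding = sniff_bom(dump)
text = dump.decode(encoding)            # the utf-16 codec consumes the BOM
code_points = [ord(ch) for ch in text]
as_utf8 = text.encode("utf-8")          # same text, one octet per character

print(encoding, code_points, as_utf8)
```

The file is 28 octets as UTF-16 with a BOM; the same 13 characters take
13 octets as UTF-8, because every code point in it is below 128.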

> Between the two CR/LF pairs I see something with code point 18; so your
> method of entry apparently didn't produce the desired result.

Hmm. I did what the web pages for UTF-8 said:

http://www.fileformat.info/info/unicode/char/000c/index.htm
(see "how to type in Windows")

It appears that this is incorrect, which further points to the
immaturity of this format.
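
One possible explanation for the code point 18 in the dump, offered as
a guess rather than a diagnosis: if alt-x reads the preceding digits as
a hexadecimal code point, then "012" becomes U+0012 (decimal 18), not
form feed, which is U+000C. The function alt_x below is a hypothetical
model of that behavior, not Wordpad's actual code. The sketch also
checks the point made earlier in the thread that the UTF-8 encoding of
any code point below 128 is the same single octet as in ASCII.

```python
def alt_x(digits: str) -> str:
    """Hypothetical model of alt-x: treat the digits as a hex code point."""
    return chr(int(digits, 16))


typed = alt_x("012")       # what was entered, under the hex assumption
intended = alt_x("000c")   # form feed, U+000C

# Under this assumption the typed character is U+0012, decimal 18.
typed_code_point = ord(typed)

# Form feed is the single octet 0x0c in both ASCII and UTF-8.
ff_ascii = intended.encode("ascii")
ff_utf8 = intended.encode("utf-8")

# In little-endian UTF-16 the same character takes two octets.
ff_utf16 = intended.encode("utf-16-le")

# Every code point below 128 encodes to the identical single octet.
ascii_range_matches = all(
    chr(cp).encode("utf-8") == bytes([cp]) for cp in range(128)
)

print(typed_code_point, ff_ascii, ff_utf8, ff_utf16, ascii_range_matches)
```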

Joe
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjrgiUACgkQE5f5cImnZrvgcQCfXVbRQ7E1qWtuLyVkWQr2O++h
JnsAni8tRdBCNBKDCTb3qsr9gs4rlYiT
=dnIu
-----END PGP SIGNATURE-----

