[rfc-i] Data point [Re: Fwd:I-D ACTION:draft-hoffman-utf8-rfcs-03.txt]

Joe Touch touch at ISI.EDU
Tue Oct 7 08:20:14 PDT 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Julian Reschke wrote:
> Joe Touch wrote:
>> Above I'm referring to both in different places. Below is the specific
>> test.
>>
>>> We're talking about the former, not the latter.
>>
>> Here's what I did:
>>
>> 1. test ASCII FF; the following prints as two pages:
>>
>>     abc
>>     <FF>
>>     def
>>
>> 2. test Unicode FF; the following prints as one page:
>>
>>     abc
>>     [000c]
>>     def
> 
> A UTF-8 form feed *is* an ASCII form feed, so I'm really not sure what
> you are testing...

Formfeed is 0x0c

UTF-8 formfeed is U+000C (though prohibited by draft-hoffman as per
Section 2.2; let's assume it's allowed).

It appears that WordPad doesn't recognize U+000C as a page break.
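
For reference, the byte-level relationship between the two can be checked directly. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not from the draft.

```python
# U+000C (FORM FEED) encoded three ways.
ff = "\u000c"

utf8 = ff.encode("utf-8")         # one byte
ascii_ = ff.encode("ascii")       # one byte
utf16le = ff.encode("utf-16-le")  # two bytes, low byte first

print("UTF-8:   ", utf8.hex())
print("ASCII:   ", ascii_.hex())
print("UTF-16LE:", utf16le.hex())
```

In UTF-8 the form feed is the single byte 0x0c, the same as in ASCII; only in a 16-bit encoding does it take two bytes.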

>> To type Unicode, I typed the following:
>>
>>     0 1 2 <alt-x>
>>
>> Am I missing something? (I'm new to UTF)...
>> ...
> 
> Hard to say. Can you produce a hex dump of the file? (od -cx)

That's a Unix-ism, but yes - I can do that on Windows:

0000000 377 376   a  \0   b  \0   c  \0  \r  \0  \n  \0 022  \0  \r  \0
        65279    97    98    99    13    10    18    13
0000020  \n  \0   d  \0   e  \0   f  \0  \r  \0  \n  \0
           10   100   101   102    13    10
0000034

Is that correct UTF-8 for what I typed?
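
The dump can also be checked mechanically. The following is a minimal sketch, assuming the bytes are transcribed correctly from the od output above (octal 377 376 is 0xFF 0xFE, and octal 022 is 0x12); the variable names are illustrative. It suggests the file is UTF-16LE with a byte-order mark rather than UTF-8, and that the stored control character is U+0012 rather than U+000C.

```python
import codecs

# Bytes transcribed from the od dump (assumption: no transcription errors).
raw = (
    b"\xff\xfe"            # 377 376
    b"a\x00b\x00c\x00"     # a b c
    b"\r\x00\n\x00"        # CR LF
    b"\x12\x00"            # 022
    b"\r\x00\n\x00"        # CR LF
    b"d\x00e\x00f\x00"     # d e f
    b"\r\x00\n\x00"        # CR LF
)

# The dump ends at octal offset 0000034, i.e. 28 bytes.
total_bytes = len(raw)

# A leading FF FE is the UTF-16 little-endian byte-order mark.
has_utf16le_bom = raw.startswith(codecs.BOM_UTF16_LE)

# 0xFF can never appear in well-formed UTF-8, so this decode must fail.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# The "utf-16" codec reads the BOM, picks the byte order, and drops the BOM.
text = raw.decode("utf-16")
code_points = [f"U+{ord(ch):04X}" for ch in text]
contains_form_feed = "\u000c" in text

print("UTF-16LE BOM:", has_utf16le_bom)
print("valid UTF-8: ", valid_utf8)
print("code points: ", " ".join(code_points))
print("has U+000C:  ", contains_form_feed)
```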

Joe

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkjrfi4ACgkQE5f5cImnZrs5YQCg8G/RqtAPXNyHVVnjT0NZe9Pk
jXIAn3ZZU8zDNzyFHPTq3+Ig8zHNtH11
=y+8+
-----END PGP SIGNATURE-----

