[rfc-i] draft-hildebrand-html-rfc

Julian Reschke julian.reschke at gmx.de
Mon Jul 23 08:54:25 PDT 2012

On 2012-07-23 17:44, Joe Hildebrand (jhildebr) wrote:
> ...
>>> Which reminds me: are we ok with non-ASCII characters being represented
>>> by their UTF-8 encoding? For those stuck in the previous millennium we
>>> could simply require ASCII encoding, and use character references for
>>> everything non-ASCII.
>> This is not a question of previous millennium or not. I think we should
>> not disallow character references, because there are some characters for
>> which it's really helpful if they are explicitly visible, think e.g.
>> &nbsp; (non-breaking space). But in general, it's much better if the
>> characters are visible directly. It's also way, way closer to what you
>> get in the final product.
> Of course, I misunderstood Julian's point, which upon re-reading was
> clear.  The tooling that is used to pretty-print needs to know which
> character references to emit syntactically.  I think &nbsp; is probably
> nice, since some other processing tools silently switch U+00A0 to U+0020.
> &lt;, &gt;, &amp;, &quot;, and &apos; are probably important.  Everything
> else, I'd recommend against, but am open to suggestion.

&nbsp; is a bit tricky as it is not predefined, thus you'll need a more
complex doctype.
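
To illustrate the difference, here is a rough sketch using Python's standard
xml.etree.ElementTree (expat-based) parser. The element name "p" and the
one-line internal subset are only assumptions for the demo, not what the
draft's doctype would actually look like.

```python
import xml.etree.ElementTree as ET


def parses(doc: str) -> bool:
    """Return True if the expat-based parser accepts doc as well-formed XML."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# The five references predefined by XML need no declaration.
predefined = "<p>&lt; &gt; &amp; &quot; &apos;</p>"

# &nbsp; is not predefined, so a bare use is an undefined entity.
bare_nbsp = "<p>a&nbsp;b</p>"

# Declaring the entity in an internal subset (hypothetical minimal doctype)
# makes the named reference usable.
declared_nbsp = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'

# A numeric character reference works without any doctype.
numeric_nbsp = "<p>a&#160;b</p>"

for name, doc in [
    ("predefined", predefined),
    ("bare &nbsp;", bare_nbsp),
    ("declared &nbsp;", declared_nbsp),
    ("numeric &#160;", numeric_nbsp),
]:
    print(name, "->", "ok" if parses(doc) else "not well-formed")
```

So a tool that wants to emit the named form has to ship the declaration along
with it, while &#160; would keep the character visible without one.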

Best regards, Julian
