[rfc-i] draft-hildebrand-html-rfc
Julian Reschke
julian.reschke at gmx.de
Mon Jul 23 08:54:25 PDT 2012
On 2012-07-23 17:44, Joe Hildebrand (jhildebr) wrote:
> ...
>>> Which reminds me: are we ok with non-ASCII characters being represented
>>> by their UTF-8 encoding? For those stuck in the previous millennium we
>>> could simply require ASCII encoding, and use character references for
>>> everything non-ASCII.
>>
>> This is not a question of previous millennium or not. I think we should
>> not disallow character references, because there are some characters for
>> which it's really helpful if they are explicitly visible, think e.g.
>> &nbsp; (non-breaking space). But in general, it's much better if the
>> characters are visible directly. It's also way, way closer to what you
>> get in the final product.
>
> Of course, I misunderstood Julian's point, which upon re-reading was
> clear. The tooling that is used to pretty-print needs to know which
> character references to emit syntactically. I think &nbsp; is probably
> nice, since some other processing tools silently switch U+00A0 to U+0020.
> &lt;, &gt;, &amp;, &quot;, and &apos; are probably important. Everything
> else, I'd recommend against, but am open to suggestion.
&nbsp; is a bit tricky as it is not predefined, thus you'll need a more
complex doctype.
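
To make the pretty-printer rule concrete, here is a minimal sketch of the escaping step. It assumes the tool emits a numeric character reference for U+00A0 rather than the named one (numeric references need no entity declaration, so the doctype can stay simple), and passes all other non-ASCII characters through as-is for UTF-8 output. The names escape_text, PREDEFINED and EXPLICIT are hypothetical, not taken from the draft or any existing tool.

```python
# Hypothetical helper for a pretty-printer; all names are illustrative only.

# The five references that XML predefines; no doctype is needed for these.
PREDEFINED = {
    "<": "&lt;",
    ">": "&gt;",
    "&": "&amp;",
    '"': "&quot;",
    "'": "&apos;",
}

# Assumption: emit U+00A0 as a numeric reference so it stays visible in the
# source without having to declare a named entity in the doctype.
EXPLICIT = {"\u00a0": "&#xA0;"}


def escape_text(text: str) -> str:
    """Replace only the listed characters; leave all others untouched."""
    table = {**PREDEFINED, **EXPLICIT}
    return "".join(table.get(ch, ch) for ch in text)
```

With this, escape_text('a\u00a0<b> & "é"') keeps the é as a literal character and turns the non-breaking space into a visible reference.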
Best regards, Julian