"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Mon Jul 23 18:23:29 PDT 2012
On 2012/07/24 0:54, Julian Reschke wrote:
> On 2012-07-23 17:44, Joe Hildebrand (jhildebr) wrote:
>> Of course, I misunderstood Julian's point, which upon re-reading was
>> clear. The tooling that is used to pretty-print needs to know which
>> character references to emit syntactically. I think &nbsp; is probably
>> nice, since some other processing tools silently switch U+00A0 to U+0020.
>> &lt;, &gt;, &amp;, &quot;, and &apos; are probably important. Everything
>> else, I'd recommend against, but am open to suggestion.
> &nbsp; is a bit tricky as it is not predefined, thus you'll need a more
> complex doctype.
Where exactly is it not predefined? HTML4 includes a full set of
character entities for Latin-1, see
http://www.w3.org/TR/html4/sgml/entities.html. I couldn't imagine that
some of these were abandoned in HTML5. XML is a different story of
course, but here we are talking about HTML, no?
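
As a rough illustration of the HTML/XML difference discussed above, the sketch below checks where the name "nbsp" is defined, using the entity tables in Python's standard html.entities module. The XML_PREDEFINED set and the emit_nbsp_as_reference helper are hypothetical names introduced only for this example and are not part of any RFC tooling. The helper shows one possible way around the doctype question: a numeric character reference, which needs no entity declaration in either HTML or XML.

```python
from html.entities import html5, name2codepoint

# The five entity names that XML predefines; every other name
# has to be declared in a DTD before it can be used.
XML_PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}


def defined_in(name):
    """Report where an entity name is available without extra declarations."""
    return {
        "html4": name in name2codepoint,   # HTML 4 entity sets (incl. Latin-1)
        "html5": (name + ";") in html5,    # HTML5 named character references
        "xml": name in XML_PREDEFINED,
    }


def emit_nbsp_as_reference(text):
    """Hypothetical helper: make U+00A0 visible as a numeric reference,
    so that later tools cannot silently turn it into U+0020."""
    return text.replace("\u00a0", "&#160;")


if __name__ == "__main__":
    print(defined_in("nbsp"))
    print(emit_nbsp_as_reference("RFC\u00a02119"))
```

Under these assumptions, "nbsp" shows up in both HTML tables but not in the XML set, which matches the point that the question only arises for XML.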