[rfc-i] Bug in RFC Search page...

Toerless Eckert tte+ietf at cs.fau.de
Tue Sep 26 15:17:10 PDT 2017


On Tue, Sep 26, 2017 at 10:49:56PM +0200, Carsten Bormann wrote:
> 
> > On Sep 26, 2017, at 22:41, Toerless Eckert <tte+ietf at cs.fau.de> wrote:
> > 
> > What terminology do you propose
> > 
> > ASCII text -> plain-text
> > UTF-8 text -> plain-text unicode
> 
> Well, both are plain-text, and both are unicode.
> Plain-text-beyond-ascii would be more like it for the second, but why call that out.

There are a lot of toolchains out there that will not render UTF-8 correctly,
therefore it is IMHO extremely valuable to have a clear indication of the
minimum toolchain features required to render a document correctly (in the first
page header of an RFC would be great).

Also: As long as the "minimum rendering" features of documents are not explicitly
noted in human readable form anywhere, we do not need to define a terminology.

> (And no, we don't need a third category Plain-text-beyond-the-basic-multilingual-plane, or Plain-text-with-astral for short.)

Not sure what you are referring to. "More UTF-8 characters than possible with iso8859"?
Given that we should not publish text-only documents in anything other than ASCII or UTF-8,
I think we can happily ignore those legacy options.
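
For reference, the three categories discussed above (ASCII only, beyond ASCII
but within the Basic Multilingual Plane, and beyond the BMP, i.e. "astral")
can be told apart by the highest code point in a document. The following is
only a minimal sketch: the function name and the returned labels are
hypothetical, not terminology anyone in this thread has agreed on.

```python
def min_rendering_class(text: str) -> str:
    """Return a hypothetical label for the least capable toolchain
    that could still represent every character in `text`."""
    # Highest code point decides the category; empty text counts as ASCII.
    highest = max(map(ord, text), default=0)
    if highest < 0x80:
        return "ascii"
    if highest <= 0xFFFF:
        # Non-ASCII, but still inside the Basic Multilingual Plane.
        return "utf-8 (BMP only)"
    # At least one code point above U+FFFF (a supplementary plane).
    return "utf-8 (beyond BMP)"


print(min_rendering_class("Gruesse, Carsten"))   # ascii
print(min_rendering_class("Grüße, Carsten"))     # utf-8 (BMP only)
print(min_rendering_class("clef: \U0001D11E"))   # utf-8 (beyond BMP)
```

Note that this only says which code points occur; whether a given toolchain
has glyphs for them is a separate question.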

> > HTML ASCII -> HTML
> > HTML UTF-8 -> HTML unicode
> 
> I don't think that distinction ever needs to be made, because HTML embeds metadata about its charset, and there are no real interop problems when you do that.

Given how browsers are quite inconsistent in the character sets they load, it would
be nice to have an explicit indication of the character sets required to render a
document. I've seen PDF documents where I had to get to page 100 before finding that
some crucial text was not rendered because it used some character set not available to me.
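
As an illustration of what such an explicit indication could be derived from,
here is a small sketch that lists every non-ASCII character a text uses,
using Python's standard unicodedata module. The function name is made up for
this example, and nothing here describes how any RFC tooling actually works.

```python
import unicodedata


def non_ascii_inventory(text: str) -> list[tuple[str, str]]:
    """List each distinct non-ASCII character in `text` as
    (code point, Unicode name), sorted by code point."""
    chars = sorted({c for c in text if ord(c) > 0x7F})
    # unicodedata.name() takes a default for characters without a name.
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in chars]


for code_point, name in non_ascii_inventory("Grüße, Carsten"):
    print(code_point, name)
# U+00DF LATIN SMALL LETTER SHARP S
# U+00FC LATIN SMALL LETTER U WITH DIAERESIS
```

A reader (or a tool) could check such a list against the available fonts
before reaching page 100.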

> Grüße, Carsten
     ^
https://en.wikipedia.org/wiki/Capital_%E1%BA%9E

Oh, the innovation...

tte at cs.fau.de


