[rfc-i] tt and quotes-in-text : Re: What the text version is used for (was Re: The <tt> train wreck)

Carsten Bormann cabo at tzi.org
Thu Aug 26 11:24:40 PDT 2021


On 2021-08-25, at 22:56, Robert Sparks <rjsparks at nostrum.com> wrote:
> 
> Changing the subject again, because this is really now about tt, and the other thread I do want to see continue.
> 
> The principle of not breaking the plain text doesn't really help with this one - it's a new v3 artifact, and there are complaints about how <tt> has been rendering in v3 plain text to date.

<tt is the direct replacement of <spanx style=“verb”.
The only thing “new” about <tt is that it combines with <em and <strong, which together build a bitset of three bits that can be set or reset (<sub/<sup are also roughly in the same set, but are more than bits).
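
To make the "bitset of three bits" picture concrete, here is a minimal sketch in Python. The names (Style, enter, leave) are invented for illustration only and are not taken from xml2rfc or any other tool; the sketch only shows that the three tags toggle independent bits and so combine freely.

```python
from enum import Flag

class Style(Flag):
    """Three independent inline-style bits (hypothetical names)."""
    NONE = 0
    EM = 1
    STRONG = 2
    TT = 4

def enter(style: Style, tag: str) -> Style:
    """Set the bit for an opening tag such as 'tt', 'em' or 'strong'."""
    return style | Style[tag.upper()]

def leave(style: Style, tag: str) -> Style:
    """Reset the bit again when the matching closing tag is seen."""
    return style & ~Style[tag.upper()]

# <tt><em>...</em></tt>: both bits are set inside, then reset one by one.
inside = enter(enter(Style.NONE, "tt"), "em")
after_em = leave(inside, "em")
after_tt = leave(after_em, "tt")
```

<sub and <sup would not fit this model as plain bits, since they can nest and therefore carry more state than set/reset.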

<tt sailed through the v3 process because it is that direct replacement; it is otherwise hard to understand why this would go into the same bucket with <em and <strong, which are named in a more “semantic” way.
(The main application of <tt is called <code in html, but there are also <kbd, <samp, <var, <dfn, <cite, some of which are closer to how we use <em.)

Reviewers of my specs have repeatedly noted that the plaintext-fallback for <em or <strong is jarring.  Why aren’t we “fixing” (amputating the plaintext-fallback for) these as well?

Inventing new processing rules for <tt without even considering <em and <strong strikes me as another instance of the weird way v3 ripened.

I think the discussion we had in the other thread was useful to recognize that plaintext is here to stay, and that we will *not* fall for beautifying the plaintext.

Quoting John again:

>> > This tells me that we need to keep the text version and it needs to have the full
>> > contents of the document, but not that it has to be particularly beautiful.  The HTML
>> > is where you get the beautiful version.


So having ugly plaintext fallbacks is OK as long as these are consistent, so a reader can create a mental image of what was meant.  We have already damaged that consistency for <sub and <sup in certain “simple” cases, as well as for table cells that carry only a <tt.  This is a path we should *not* go further down.
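
As a rough sketch of what "consistent" means here: one marker per tag, applied the same way everywhere. The marker table below is an assumption made for illustration (underscores, asterisks and double quotes), and fallback and render_cell are hypothetical helpers, not functions of any existing renderer.

```python
# Hypothetical marker table: one plaintext fallback per tag, no exceptions.
MARKERS = {
    "em": ("_", "_"),
    "strong": ("*", "*"),
    "tt": ('"', '"'),
}

def fallback(tag: str, text: str) -> str:
    """Wrap text in the plaintext markers assumed for this tag."""
    opening, closing = MARKERS[tag]
    return f"{opening}{text}{closing}"

def render_cell(tag: str, text: str) -> str:
    """A table cell uses the same rule as running text.

    Deliberately no special case such as "cell holds only a <tt>, so drop
    the markers": that kind of exception is what breaks consistency.
    """
    return fallback(tag, text)

in_text = fallback("tt", "foo")
in_cell = render_cell("tt", "foo")
```

With a single rule like this, a reader who has seen one rendered <tt can recognize every other one, however ugly the markers are.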

Completely removing from the plaintext form the information that is embodied by the <tt tags is so outlandish that I indeed struggled to argue against it.

Of course, we could deprecate <tt entirely (and there are probably good arguments for that), but if we want to keep it, we can’t tell authors “you can use it, but the plaintext version won’t show the full contents of the document” (in particular not retroactively, for documents that actually discuss the plaintext rendering and will start to lie about themselves once the change is deployed).

(The proper fix is to give the authors a way to influence the fallback, but that seemed to be hard to get consensus for with the previously dominating view that plaintext will just vanish.)

Grüße, Carsten


