[rfc-i] Bring Back the BR! Re: typesetting poetry
dev+ietf at seantek.com
Mon Oct 12 07:40:47 PDT 2015
On 10/12/2015 1:15 AM, Miek Gieben wrote:
> [ Quoting <dev+ietf at seantek.com> in "Re: [rfc-i] Bring Back the BR!
> Re: ..." ]
>>> On 9/5/2015 7:56 AM, Donald Eastlake wrote:
>>>> Section 1.1 of RFC 6325 is also a poem.
>>> See: a modern reference!
>> To clarify and modulate my outburst:
>> There is in fact a <br> (in v3 xml2rfc-15 and beyond), but it only
>> appears as the child of a <th> or <td> element; it cannot appear in a
>> <t> element (aka paragraph).
>> I suppose that means that poetry, and pseudocode, can appear inside
>> tables (namely tables with one column and one row), but not inside
>> "text" per-se. Inside <th> and <td>, <t> is mutually exclusive with
>> the other elements including <br>. I suppose that if your poetry or
>> pseudocode has multiple paragraph-like divisions (e.g., multiple
>> stanzas, multiple verses, multiple function blocks), you need to put
>> them in multiple cells or rows. Not really sure about the wisdom of
> <br> only solves the line-break. Leading whitespace (for instance) is
> also a problem. Basically: for poetry you'll need a paragraph-type where
> *all* whitespace is significant.
I advocated for this very thing about "700 e-mails" ago—but in any
event, the grammar supports it. xml2rfc has always permitted
@xml:space="preserve" in various places; however, implementations have
not supported it consistently, so various detractors have thought it
wise to deprecate it.
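
To make the point concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree. It only shows that the parser hands both the attribute and the untouched white space to the application; what a formatter then does with them is a separate matter. The stanza and the helper name wants_preserve are made up for illustration, and <t> is simply borrowed from the xml2rfc vocabulary.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml:" prefix is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical stanza with significant leading white space on line two.
doc = '<t xml:space="preserve">Roses are red,\n    violets are blue.</t>'
t = ET.fromstring(doc)

def wants_preserve(elem):
    """True if the element itself carries xml:space="preserve"."""
    return elem.get("{%s}space" % XML_NS) == "preserve"

print(wants_preserve(t))   # the attribute reaches the application
print(repr(t.text))        # and so does every white space character
```

The parser does not act on the attribute; honoring it (or not) is entirely up to the tool that consumes the tree.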
The most normal and natural sense of "preserve" would be to preserve all
white space. Compare this with the CSS white-space property. In keeping
with that property, a discussion should also be had about line breaking
to fill line boxes. Controlling line breaking to fill line boxes might
be achieved with new markup, or with the many breaking format characters
I would like to quote from the XML 1.0 Fifth Edition Standard:
S <http://www.w3.org/TR/2008/PER-xml-20080205/#NT-S> (white space)
consists of one or more space (#x20) characters, carriage returns,
line feeds, or tabs.
 |S| ::= |(#x20 | #x9 | #xD | #xA)+|
The presence of #xD in the above production is maintained purely for
backward compatibility with the First Edition
<http://www.w3.org/TR/1998/REC-xml-19980210>. As explained in 2.11
<http://www.w3.org/TR/2008/PER-xml-20080205/#sec-line-ends>, all #xD
characters literally present in an XML document are either removed
or replaced by #xA characters before any other processing is done.
The only way to get a #xD character to match this production is to
use a character reference in an entity value literal.
2.10 White Space Handling
In editing XML documents, it is often convenient to use "white
space" (spaces, tabs, and blank lines) to set apart the markup for
greater readability. Such white space is typically not intended for
inclusion in the delivered version of the document. On the other
hand, "significant" white space that should be preserved in the
delivered version is common, *for example in poetry and source code*.
An XML processor MUST always pass all characters in a document that are
not markup through to the application. A validating XML processor MUST
also inform the application which of these characters constitute white
space appearing in element content.
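
The quoted production and the line-end rule can be checked with a short sketch. It assumes Python's standard re and xml.etree.ElementTree modules; the regular expression is my transcription of the S production, not something taken from any xml2rfc tool.

```python
import re
import xml.etree.ElementTree as ET

# The S production: one or more of SP (#x20), HT (#x9), CR (#xD), LF (#xA).
S = re.compile(r"[\x20\x09\x0D\x0A]+")

# A literal CRLF is normalized to LF before the application sees it;
# a CR written as a character reference survives.
elem = ET.fromstring("<t>a\r\nb&#xD;c</t>")
print(repr(elem.text))                     # 'a\nb\rc'

# Only the four characters above match S; U+2003 EM SPACE does not.
print(S.fullmatch(" \t\r\n") is not None)  # True
print(S.fullmatch("\u2003") is not None)   # False
```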
What we see here is that white[ ]space is only SP, HT, CR, and LF. At
the "application level", xml2rfc-23 says:
Tools interpreting the XML described here will collapse horizontal
whitespace and linebreaks to a single whitespace (except inside
<artwork> and <sourcecode>), and will trim leading and trailing
whitespace.
The natural conclusion of this is that only SP, HT, CR, and LF are going
to get smushed.
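
As a rough illustration of that reading, the sketch below collapses and trims only the four XML white space characters. It is an assumption about how a tool might apply the quoted rule, not the actual xml2rfc implementation; the function name collapse and the sample line are invented.

```python
import re

# Runs of the four XML white space characters: SP, HT, CR, LF.
XML_WS = re.compile(r"[\x20\x09\x0D\x0A]+")

def collapse(text):
    """Collapse runs of XML white space to one SP and trim both ends."""
    return XML_WS.sub(" ", text).strip(" \t\r\n")

# Hypothetical line mixing ASCII white space with EM SPACE and NO-BREAK SPACE.
line = "  Add\u2003\u2003to   pot\n\tto\u00A0taste.  "
print(repr(collapse(line)))
```

The explicit character class matters here: a shortcut such as " ".join(text.split()) in Python treats U+2003 and U+00A0 as white space too, so a tool built that way would smush them along with everything else.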
Therefore, you are welcome to use all of these delightful space
characters in the Unicode standard:
May I suggest a few U+2003 EM SPACEs for your recipes: nice and big. Add
to pot to taste. You may also consider a few dashes of U+00A0 NO-BREAK
SPACE, U+2007 FIGURE SPACE, or U+202F NARROW NO-BREAK SPACE, none of
which are wrappable.
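
For anyone who wants to inspect these characters, here is a small sketch using Python's standard unicodedata module. The helper name describe is made up. Note that the <noBreak> tag in the decomposition field is only a hint taken from the Unicode character database; actual wrapping behavior depends on the line-breaking algorithm of whatever renders the text.

```python
import unicodedata

# The space characters mentioned above.
SPACES = ["\u2003", "\u00A0", "\u2007", "\u202F"]

def describe(ch):
    """Return (code point, name, general category, decomposition) for ch."""
    return ("U+%04X" % ord(ch), unicodedata.name(ch),
            unicodedata.category(ch), unicodedata.decomposition(ch))

for ch in SPACES:
    print(*describe(ch))
```

All four are in general category Zs (space separator); the three non-wrappable ones carry a <noBreak> decomposition tag, while EM SPACE does not.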