[whatwg] Whitespace handling in ruby

Roland Steiner rolandsteiner at google.com
Wed Jul 29 20:22:56 PDT 2009


As I am currently writing an implementation for ruby rendering, I wondered
about the exact way white-space is supposed to be handled between runs of
ruby text.

As far as I can see, <ruby> is fundamentally an inline element, so
whitespace inside it would normally be collapsed, but not eliminated entirely.
For the examples given for the <ruby> element, however, this would leave
a single space between the ideographic characters:

<ruby> [ws]
漢<rp>(</rp><rt>かん</rt><rp>)</rp> [ws]
字<rp>(</rp><rt>じ</rt><rp>)</rp> [ws]
</ruby>

Rendered without ruby support (which is easier to show in e-mail), this would become:

漢(かん) [ws] 字(じ)

The whitespace would also be present with proper ruby rendering above the
base characters.
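
To make the collapsing concrete, here is a rough Python sketch of the fallback case (no ruby support). It is only an approximation of normal inline whitespace collapsing, not the actual CSS algorithm, and the function name fallback_text is made up for illustration:

```python
import re

def fallback_text(markup: str) -> str:
    """Approximate what a UA without ruby support would show.

    Assumption: unknown tags are simply ignored and every run of
    whitespace collapses to one space (a simplification of the
    default inline whitespace handling).
    """
    text = re.sub(r"<[^>]+>", "", markup)          # drop all tags, keep text
    return re.sub(r"[ \t\n\r\f]+", " ", text)      # collapse whitespace runs

sample = (
    "<ruby> \n"
    "漢<rp>(</rp><rt>かん</rt><rp>)</rp> \n"
    "字<rp>(</rp><rt>じ</rt><rp>)</rp> \n"
    "</ruby>"
)
print(repr(fallback_text(sample)))
```

The space between the two base characters survives the collapsing, which is the unwanted gap described above.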

OTOH, removing that whitespace may not be desirable if the bases are not
in an ideographic script, e.g.:

<ruby>
European<rp>(</rp><rt>E</rt><rp>)</rp>
Union<rp>(</rp><rt>U</rt><rp>)</rp>
</ruby>

(This example has yet another drawback: the white-space before "Union" would
become part of the base and thus shift the annotation "U" slightly left of
the center of the word "Union".)

For the time being I'm using a block-based rendering approach that
automatically eliminates leading and trailing whitespace in the base text,
but I wonder what the correct approach would be within the scope of HTML5
(aside: an XHTML-like explicit <rb> container for the ruby base side-steps
this problem, but is not a real option due to the need for legacy support).
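
For illustration, the trimming I mean could be sketched roughly like this in Python. This is not my actual rendering code, just a toy model over the markup string; ruby_segments is a hypothetical helper and the regex-based parsing is an assumption that only holds for simple markup like the examples above:

```python
import re

WS = " \t\n\r\f"  # whitespace characters considered here (an assumption)

def ruby_segments(inner: str) -> list[tuple[str, str]]:
    """Split the content of a <ruby> element into (base, annotation) pairs.

    Leading and trailing whitespace is removed from each base, so it
    neither shows up between bases nor shifts the annotation off center.
    """
    inner = re.sub(r"<rp>.*?</rp>", "", inner, flags=re.S)  # drop fallback parens
    segments = []
    for base, text in re.findall(r"([^<]*)<rt>(.*?)</rt>", inner, flags=re.S):
        base = re.sub(f"[{WS}]+", " ", base).strip(WS)      # collapse, then trim
        segments.append((base, text))
    return segments

inner = (
    "\nEuropean<rp>(</rp><rt>E</rt><rp>)</rp>"
    "\nUnion<rp>(</rp><rt>U</rt><rp>)</rp>\n"
)
print(ruby_segments(inner))
```

Note that this drops the whitespace between segments entirely. That suits the 漢字 case, but it also loses the space between "European" and "Union", which is exactly the open question.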


- Roland