[whatwg] Ruby markup - Furigana
Henri Sivonen
hsivonen at iki.fi
Fri Jan 5 02:05:30 PST 2007
On Jan 4, 2007, at 14:51, Michael(tm) Smith wrote:
> Henri Sivonen <hsivonen at iki.fi>, 2007-01-04 14:38 +0200:
>
>> On Jan 4, 2007, at 12:05, Karl Dubost wrote:
>>
>>> Or read the kanjis that are too difficult to be known when browsing.
>>
>> How does furigana map to aural rendering? Is only the annotation read
>> out loud and the base ignored?
>
> If by "base" you mean the kanji,
Yes. I meant the normal-sized text on the usual baseline.
> So one of the common uses of furigana (outside of just
> being used in texts for learners who haven't mastered reading yet)
> is to show readings for kanji combinations that are otherwise
> ambiguous. Or to show that a kanji combination should be read
> differently from the way it would otherwise normally be read.
That was my understanding. (Though one has to wonder why they don't
just use straight hiragana without kanji in cases where the usability
properties of kanji are so bad that furigana is needed.) So when
considering what would make sense as a default aural rendering, I
thought suppressing the ruby base and only reading the ruby text
might make sense.
http://www.w3.org/TR/ruby/#non-visual is basically a long-winded way
of saying that the spec writers couldn't come up with a default aural
rendering and, hence, ruby markup is not media-independent as
specified. The real problem seems to be that ruby is used both for
supplementary text and for alternative text.
If it were as simple as this, there should be a flag for indicating
whether the ruby text is alternative text (in which case aural
rendering would read it and suppress the base) or whether the ruby
text is supplementary text (in which case both the base and the ruby
text should be read with some indication about the ruby text being an
annotation).
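To make the flag idea concrete, here is a minimal sketch in Python of how an aural renderer might branch on such a flag. The names RubyRole and aural_rendering, and the "(annotation: ...)" wording, are made up for illustration; nothing here comes from the W3C spec or from any existing user agent.

```python
from enum import Enum


class RubyRole(Enum):
    """Hypothetical flag saying how the ruby text relates to its base."""
    ALTERNATIVE = "alternative"      # ruby text stands in for the base when spoken
    SUPPLEMENTARY = "supplementary"  # ruby text annotates the base


def aural_rendering(base: str, ruby_text: str, role: RubyRole) -> str:
    """Return the string a speech renderer would read for one ruby pair."""
    if role is RubyRole.ALTERNATIVE:
        # Read only the ruby text and suppress the base.
        return ruby_text
    # Read the base, then the ruby text marked as an annotation.
    return f"{base} (annotation: {ruby_text})"
```

With this sketch, a furigana reading would be passed as RubyRole.ALTERNATIVE and only the kana would be spoken, while an expansion of an abbreviation would be passed as RubyRole.SUPPLEMENTARY and both parts would be spoken.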
Next, the problem is what should be the default. The default aural
rendering should make sense in the common case for documents written
by visually-oriented authors without requiring the authors to jump
through semantic hoops, which most of them won't do anyway in
practice. The W3C spec suggests flagging the alternative text case
with class='reading', but the spec is woefully inadequate as it
doesn't normatively specify class='reading'. This makes sense, because
suppressing the aural rendering of the ruby base is potentially
dataloss. However, if giving the reading is the most common use case,
the default aural rendering should do the right thing for that use
case without a special flag and the case where the base needs to be
read should be the case that requires the author (or WYSIWYG editor!)
to use a special flag. (Moreover, the W3C samples place the class
attribute on the ruby text element instead of the wrapper ruby
element. This isn't nice considering CSS selectors for suppressing
the ruby base.)
For now, to provoke comments supporting or refuting my totally
uninformed hypothesis, I am assuming that the example of the reading not
really being the reading (は vs. わ) as explained at
http://www.w3.org/TR/ruby/#non-visual is not a real problem (see below). Or
at least that leaving the problem unsolved and the default aural
rendering slightly wrong in some cases is better than saying that
ruby cannot have a reasonable default aural rendering at all.
Here's what makes me suspect that alleged i18n problems may not be
real problems:
If you ask a random Finn on the street, what letters in addition to
a–z are needed for writing Finnish, chances are that he says ä and
ö, but the alphabet also includes å to cover Swedish as well. Now,
if you ask the people who represent Finland on international
standards bodies, they'll say you also need š and ž. If you tell
this to the random Finn from the street, he can come up with use
cases for š (although he doesn't know how to use the letter on a
computer and types sh instead) but he'll think claiming that Finnish
needs ž is just crazy. The sad part is that foreigners have no way of
knowing that š and ž are an esoteric thing the language people came
up with and that is of no concern to real end users. So then Red Hat
et al. actually believe that the sky falls unless they ship an
ISO-8859-15 Finnish locale instead of an ISO-8859-1 locale (or
upgrade straight to UTF-8). Trouble ensues.
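For anyone who wants to check the encoding side of this, here is a small Python sketch that tests which of the letters mentioned above can be encoded in each character set. The helper name encodable is made up for illustration; the codec names are the ones Python's standard library ships.

```python
def encodable(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Print, for each letter, whether ISO-8859-1 and ISO-8859-15 can represent it.
for ch in "äöåšž":
    print(ch, encodable(ch, "iso-8859-1"), encodable(ch, "iso-8859-15"))
```

The letters ä, ö and å are encodable in both character sets, while š and ž are encodable only in ISO-8859-15, which is the whole reason for the locale switch.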
After finding out what kind of things the Finnish language
representatives keep telling foreigners as Finnish “requirements”,
I can only assume that representatives from elsewhere may be giving
“requirements” for other languages that are equally detached from reality.
(And the mostly verifiable rumors I’ve heard about the committee
behavior of Danish and Dutch representatives support my hypothesis of
crazy things happening in international standards bodies over alleged
national requirements.) Since in the Finnish case the expert opinion
is detached from reality (and I know what the reality with Finnish is
with utmost confidence), I now doubt expert statements about the
requirements of other languages as well, or at least I doubt the seriousness
of the alleged problems. :-(
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/