[whatwg] HTML: A DOM attribute that returns the language of a node
Jukka K. Korpela
jkorpela at cs.tut.fi
Thu Aug 8 07:29:31 PDT 2013
2013-08-08 2:57, Ryosuke Niwa wrote:
> On Aug 2, 2013, at 6:10 AM, Jukka K. Korpela <jkorpela at cs.tut.fi>
> wrote:
[...]
>> But regarding the effect of language markup on fonts, the effect is
>> limited to situations where the font is not specified in a style
>> sheet. This is a rather uncommon scenario these days; authors are
>> more than eager to set fonts.
>
> Do you have actual statistics to support this point?
No, it’s just an impression from looking at numerous pages and their
coding as well as views presented in authors’ forums.
> As far as I
> checked, neither baidu.com nor yahoo.com.tw seems to explicitly
> specify a Chinese font.
They both have font-family settings, slightly different from each other,
but basically the most common setup (sorry, no statistics on this
either): Arial, possibly with Helvetica as a second option, which does
not change much. So, granted, they don’t specify a Chinese font in the
sense of including in the font-family list any specific font that
contains CJK characters.
Baidu doesn’t set lang either, so they seem to be accepting, for any
characters not covered by Arial, whatever happens to be in each
browser’s list of fallback fonts, when no information about content
language is available. Yahoo.com.tw sets lang="zh-tw", so they do care,
but only to the extent that the fallback font should be one intended for
Traditional Chinese.
So the lang markup may affect fonts, but only under some conditions. And
if you care about fonts, as an author, then an explicit list of font
alternatives has better chances of creating the desired result.
>> It is true that they might specify a font list where none of the
>> fonts supports some characters that might be entered, and then a
>> fallback font would be used. However, using “annotations”
>> (presumably, lang attributes, along with extra <span> elements when
>> needed) does not sound like a feasible approach to this.
>
> Whether it’s feasible or not, that’s what we have been doing due to
> the Han unification. If we could, we’ll undo the Han unification and
> use different glyphs for each character but we can’t do that at this
> point in time.
If a page contains texts to be rendered using different forms
(Traditional Chinese, Simplified Chinese, Japanese, Korean) for Han
characters, you will need to control the rendering somehow. Using lang
markup might be the most adequate approach in theory, but it’s indirect
and less effective than just setting different fonts (via font-family
lists that contain reasonably many alternatives).
But even if lang attributes are used, I don’t think the issue has much
relevance to the original question here. A DOM attribute that returns
the language of a node would be useful for this purpose only if you
intend to affect rendering via JavaScript.
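
To make concrete what such an attribute would compute, here is a minimal
sketch in TypeScript. It is not a proposed API: the names LangNode and
effectiveLang are hypothetical, and the lookup is simplified to "nearest
ancestor-or-self with a lang attribute", ignoring xml:lang and any
Content-Language fallback. The interface only mirrors the two DOM
members the lookup needs, so it can be tried without a browser.

```typescript
// Hypothetical minimal node shape: just the members the lookup needs.
interface LangNode {
  parentNode: LangNode | null;
  // Text nodes and the like have no attributes, hence optional.
  getAttribute?(name: string): string | null;
}

// Returns the lang value of the nearest ancestor-or-self that has a
// lang attribute, or "" when none is set (language unknown).
// Assumption: xml:lang and document-level defaults are ignored.
function effectiveLang(node: LangNode | null): string {
  for (let n = node; n !== null; n = n.parentNode) {
    const lang = n.getAttribute?.("lang") ?? null;
    if (lang !== null) {
      return lang; // lang="" is returned as-is: explicitly unknown
    }
  }
  return "";
}
```

A script that wanted to pick fonts per language would call something
like this on each node of interest and then set styles accordingly,
which is the only scenario where the attribute matters for rendering.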
Yucca