[whatwg] HTML: A DOM attribute that returns the language of a node

Ryosuke Niwa rniwa at apple.com
Wed Aug 7 16:57:35 PDT 2013

On Aug 2, 2013, at 6:10 AM, Jukka K. Korpela <jkorpela at cs.tut.fi> wrote:

> 2013-08-02 2:43, Ryosuke Niwa wrote:
>>> Are you saying that for HTML contenteditable-based editors that want to
>>> support drag-and-drop editing, they need to be able to annotate the
>>> outgoing HTML fragment with the effective language so that when it's
>>> embedded somewhere, the right fonts get used?
>> Yes, but not just for drag and drop.
> This would mean that the editor would have to guess the language from the text or ask the user to specify it. This is not as unrealistic as it may first seem. Microsoft Word does such things, sometimes getting things right, often messing things up. It typically detects change of language too late, and often infers language from keyboard settings, making it rather impossible to use a multilingual keyboard easily.
> But regarding the effect of language markup on fonts, the effect is limited to situations where the font is not specified in a style sheet. This is a rather uncommon scenario these days; authors are more than eager to set fonts.

Do you have actual statistics to support this point? As far as I can tell, neither baidu.com nor yahoo.com.tw explicitly specifies a Chinese font.

Also, I recently saw the typeface change in Gmail while I was conversing with a native Chinese speaker. Her mail client listed Chinese fonts before Japanese fonts, whereas mine listed Japanese fonts before Chinese fonts.

> It is true that they might specify a font list where none of the fonts supports some characters that might be entered, and then a fallback font would be used. However, using “annotations” (presumably, lang attributes, along with extra <span> elements when needed) does not sound like a feasible approach to this.

Whether it’s feasible or not, that’s what we have been doing because of Han unification. If we could, we would undo Han unification and use different glyphs for each character, but we can’t do that at this point in time.
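
To make the annotation step concrete, here is a minimal sketch of what an editor might do when it serializes a fragment for copy or drag-and-drop. It is only an illustration under stated assumptions: Node, effective_lang, and annotate_fragment are hypothetical names, and the tree is a plain stand-in for the DOM rather than any real browser API. The idea is to find the nearest ancestor-or-self carrying a lang attribute and wrap the outgoing markup in a <span> with that language.

```python
import html
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a DOM element (not a real browser API)."""
    tag: str
    lang: Optional[str] = None       # None means no lang attribute is present
    parent: Optional["Node"] = None


def effective_lang(node: Optional[Node]) -> str:
    """Return the lang of the nearest ancestor-or-self that sets one."""
    while node is not None:
        if node.lang is not None:
            return node.lang         # may be "" if the author declared it unknown
        node = node.parent
    return ""                        # no lang found anywhere: treat as unknown


def annotate_fragment(inner_html: str, source: Node) -> str:
    """Wrap outgoing markup in a span that carries the source's language."""
    lang = effective_lang(source)
    if not lang:
        return inner_html            # nothing to annotate
    return f'<span lang="{html.escape(lang, quote=True)}">{inner_html}</span>'


# Usage: a paragraph inside a document marked as Japanese.
root = Node("html", lang="ja")
para = Node("p", parent=root)
print(annotate_fragment("直", para))
```

With the annotation in place, the receiving document can pick a Japanese font for the pasted text even when its own language is Chinese.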

- R. Niwa
