[whatwg] HTML: A DOM attribute that returns the language of a node

Peter Occil poccil14 at gmail.com
Tue Apr 23 22:49:36 PDT 2013

What's my use case?

In my case, I have written an HTML parser in Java and C# [1][2] that
parses HTML documents and returns an object that, so far, implements
a subset of the DOM.  As far as possible, I included only methods
and attributes that are specified in the DOM or HTML specification,
such as the characterSet attribute (which is called getCharacterSet
on my DOM's IDocument interface), and more recently the innerHTML
attribute (which is called getInnerHTML on my DOM's IElement interface).

However, when I decided to implement an RDFa processor based on
my HTML parser, I needed to include a method that returns the
language of a node (see, for example, section 3.3 of reference [3]).
As a result, I included a method called getLanguage on my DOM's
INode interface (which may correspond to a possible--future--DOM
attribute called "language" on the Node interface).  I feel uneasy
having to include this extension to what ought to be a subset of

While a "language" attribute on Node may also be useful to
HTML+RDFa processors in JavaScript, I have no plans to implement
such a processor in JavaScript.
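
To make the idea concrete, here is a rough sketch in Java of how a
getLanguage-style lookup could work.  It is only an illustration under
stated assumptions, not the code from [1]: the SketchElement and
LanguageSketch classes are hypothetical, attributes are keyed by their
qualified name (so namespaces are ignored), and the fallbackLanguage
parameter stands in for whatever default the document supplies, such as
a single language tag from the HTTP Content-Language header or pragma.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal element type for illustration only; it is not
// one of the interfaces (INode, IElement) from the HtmlParser library.
final class SketchElement {
    private final SketchElement parent;
    private final Map<String, String> attributes = new HashMap<>();

    SketchElement(SketchElement parent) {
        this.parent = parent;
    }

    // Returns this element so that calls can be chained.
    SketchElement setAttribute(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    // Returns null when the attribute is absent.
    String getAttribute(String name) {
        return attributes.get(name);
    }

    SketchElement getParent() {
        return parent;
    }
}

final class LanguageSketch {
    private LanguageSketch() {
    }

    // Walks from the node up through its ancestors and returns the value
    // of the nearest "xml:lang" or "lang" attribute.  When both appear on
    // the same element, "xml:lang" wins.  If no element in the chain has
    // either attribute, the document-level fallback is returned, or the
    // empty string (meaning "unknown") when there is no fallback.
    static String getLanguage(SketchElement node, String fallbackLanguage) {
        for (SketchElement e = node; e != null; e = e.getParent()) {
            String lang = e.getAttribute("xml:lang");
            if (lang == null) {
                lang = e.getAttribute("lang");
            }
            if (lang != null) {
                return lang;
            }
        }
        return fallbackLanguage == null ? "" : fallbackLanguage;
    }
}
```

With this sketch, an element with no language attributes of its own
inherits the value from its nearest ancestor that has one, and the
fallback is consulted only when the whole ancestor chain is silent.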

[1] https://github.com/peteroupc/HtmlParser
[2] https://github.com/peteroupc/HtmlParserCSharp
[3] http://www.w3.org/TR/rdfa-in-html/

-----Original Message----- 
From: Kang-Hao (Kenny) Lu
Sent: Tuesday, April 23, 2013 11:08 PM
To: Peter Occil
Cc: WHAT Working Group
Subject: Re: [whatwg] HTML: A DOM attribute that returns the language of a node

(13/04/23 16:44), Peter Occil wrote:
> I believe there should be a DOM attribute that returns the language
> of a node, as defined in section "The lang and xml:lang
> attributes".

What's your use case? If you want to style a particular language then
there's the CSS :lang() pseudo-class.

Use cases are important because otherwise I think there are very few
pages with multiple lang attributes...

> While there is a "lang" DOM attribute, it's inadequate because it's
> only affected by the element's "lang" content attribute.

That's true. However, if the case isn't important, we can do tree
traversal (modulo HTTP Content-Language header and pragma) or exhaust

> Also, I don't see a way to get the "language of a node" otherwise,
> especially since it depends not only on "lang" and "xml:lang", but
> also on the HTTP Content-Language header, which may not be possible
> to retrieve with existing JavaScript methods, as far as I can tell.


Web Specialist, Opera Sphinx Game Force, Oupeng Browser, Beijing
Try Oupeng: http://www.oupeng.com/ 
