[whatwg] RDFa is to structured data, like canvas is to bitmap and SVG is to vector

Henri Sivonen hsivonen at iki.fi
Sun Jan 18 02:07:30 PST 2009

On Jan 18, 2009, at 02:02, Sam Ruby wrote:

> On Sat, Jan 17, 2009 at 5:51 PM, Henri Sivonen <hsivonen at iki.fi>  
> wrote:
>> On Jan 17, 2009, at 22:35, Shelley Powers wrote:
>>> Generally, though, RDFa is based on reusing a set of attributes  
>>> already
>>> existing in HTML5, and adding a few more.
>> Also, RDFa uses CURIEs which in turn use the XML namespace mapping  
>> context.
>>> I would assume no differences in the DOM based on XHTML or HTML.
>> The assumption is incorrect.
>> Please compare
>> http://hsivonen.iki.fi/test/moz/xmlns-dom.html
>> and
>> http://hsivonen.iki.fi/test/moz/xmlns-dom.xhtml
>> Same bytes, different media type.
> The W3C Recommendation for DOM also describes a readonly attribute on
> Attr named 'name'.  Discuss.

I have added this to the test cases.

In the DOM API, you can use the namespace-unaware DOM Level 1 view to  
make both cases look the same upon getting a parser-inserted value.  
(This is, of course, totally against namespace-aware programming
practices, and in non-browser apps the API might not even expose
qnames, or higher-level technologies like RELAX NG or XPath might be
unable to trigger on them.)
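
A minimal sketch of the two views, assuming Python's xml.dom.minidom as a
stand-in for a browser DOM (the original test cases ran in browsers, and
minidom has no HTML parser, so only the XML-parsed side is shown). The
element name and namespace URI below are illustrative, not taken from the
test pages.

```python
from xml.dom import minidom

# Namespace reserved for xmlns:* declarations in namespace-aware DOMs.
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Parse as XML; minidom's parser is namespace-aware by default.
doc = minidom.parseString('<div xmlns:foo="http://example.org/"/>')
div = doc.documentElement

# Level 1 view: address the attribute by its full name, colon included.
level1_value = div.getAttribute("xmlns:foo")

# Level 2 view: the same attribute is a (namespace, local name) pair.
attr = div.getAttributeNode("xmlns:foo")
level2_value = div.getAttributeNS(XMLNS_NS, "foo")

print(level1_value)
print(attr.namespaceURI, attr.prefix, attr.localName)
print(level2_value)
```

Both getters return the declared URI here; what the Level 1 getter hides
is the namespace, prefix and local name split that the parser recorded.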

But it's too early to declare victory. Surely we also want scripted
setters that mutate the DOM into a state that could have been the
result of a parse.

Now we have tentatively seen that DOM Level 1 APIs seem to do what we  
want. So let's try using setAttribute():
The result looks the same as the HTML case earlier:

But now, the XHTML side using the setter:
...gives a result that is different from the parser-inserted attribute.
Furthermore, the resulting DOM is no longer serializable as XML 1.0.
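
The mismatch between a Level 1 setter and a namespace-aware parse can be
sketched as follows, again assuming Python's xml.dom.minidom in place of a
browser DOM. The markup and URI are illustrative. The sketch only compares
the namespace recorded on the attribute; it does not attempt to reproduce
the serialization problem.

```python
from xml.dom import minidom

XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Attribute inserted by the namespace-aware XML parser.
parsed = minidom.parseString(
    '<div xmlns:foo="http://example.org/"/>'
).documentElement

# Same attribute name, but inserted by script with the Level 1 setter.
scripted = minidom.parseString("<div/>").documentElement
scripted.setAttribute("xmlns:foo", "http://example.org/")

parsed_attr = parsed.getAttributeNode("xmlns:foo")
scripted_attr = scripted.getAttributeNode("xmlns:foo")

# The Level 1 getter cannot tell the two apart...
same_level1 = (
    parsed.getAttribute("xmlns:foo") == scripted.getAttribute("xmlns:foo")
)
# ...but the namespace-aware view can.
same_namespace = parsed_attr.namespaceURI == scripted_attr.namespaceURI

print(same_level1, same_namespace)
print(repr(scripted.getAttributeNS(XMLNS_NS, "foo")))
```

In minidom the scripted attribute ends up in no namespace, so a
namespace-aware lookup for it comes back empty even though the Level 1
getter finds it.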

So let's move to a less intuitive case and use the namespace-aware
Level 2 setter while assuming the use of the namespace-unaware Level 1
getter.
Looks good compared to the parser-inserted XHTML case:
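
A sketch of the Level 2 setter paired with the Level 1 getter, under the
same assumption that Python's xml.dom.minidom stands in for a browser DOM.
Only the XML side can be shown this way, because minidom has no HTML
parser. The markup and URI are illustrative.

```python
from xml.dom import minidom

XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# Reference: attribute inserted by the namespace-aware XML parser.
parsed = minidom.parseString(
    '<div xmlns:foo="http://example.org/"/>'
).documentElement

# Scripted insertion with the namespace-aware Level 2 setter.
scripted = minidom.parseString("<div/>").documentElement
scripted.setAttributeNS(XMLNS_NS, "xmlns:foo", "http://example.org/")

parsed_attr = parsed.getAttributeNode("xmlns:foo")
scripted_attr = scripted.getAttributeNode("xmlns:foo")

# Read back with the namespace-unaware Level 1 getter.
level1_value = scripted.getAttribute("xmlns:foo")

# Compare the namespace-aware properties with the parsed attribute.
matches_parse = (
    scripted_attr.namespaceURI == parsed_attr.namespaceURI
    and scripted_attr.localName == parsed_attr.localName
    and scripted_attr.prefix == parsed_attr.prefix
)

print(level1_value, matches_parse)
```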

But now, the HTML side is broken:

>>> I put together a very crude demonstration of JavaScript access of a
>>> specific RDFa attribute, about. It's temporary, but if you go to  
>>> my main web
>>> page, http://realtech.burningbird.net, and look in the sidebar for  
>>> the click
>>> me text, it will traverse each div element looking for an "about"  
>>> attribute,
>>> and then pop up an alert with the value of the attribute. I would  
>>> use
>>> console rather than alert, but I don't believe all browsers  
>>> support console,
>>> yet.
>> This misses the point, because the inconsistency is with attributes  
>> named
>> xmlns:foo.
> There is a similar inconsistency in how xml:lang is handled.  Discuss.

The xml:lang DOM inconsistency has led to a situation where the
xml:lang/lang area in Validator.nu has the highest incidence of
validator bugs per spec sentence of all areas of HTML5. You've
reported at least one of those bugs. The amount of developer time
needed to get it right was ridiculously high.

fantasai recently wrote: “Unless you're working on a CSS layout engine  
yourself, the level of detail, complex interactions with the rest of  
CSS, and design and implementation constraints we need to deal with  
here are more complicated than you can imagine.” (Source: http://fantasai.inkedblade.net/weblog/2009/layout-is-expensive/)

From my experience with Validator.nu (that doesn't even have a DOM!)  
I think I can say: Unless you're working on a software product whose  
code reuse between HTML and XHTML depends on the DOM Consistency  
Design principle, the badness caused by violations of the DOM  
Consistency Design principle is more complicated than you can imagine.  
(Where 'you' is not you, Sam, but the generic English you.)

xml:lang was introduced by people who were designing for an XML  
universe when it seemed that would be the way the world would go, so  
they can be forgiven, and the WHATWG can clean up the mess. Likewise,  
the syntax that the SVG WG chose made sense given that they were  
designing for an XML world. It can be accepted as legacy, and HTML5  
parser writers can spend time optimizing the conditional camel casing.

RDFa, on the other hand, was created by people who fully expected it  
to be served as text/html, even though they called it something like  
XHTML 1.1 plus RDFa instead of calling it HTML5. Furthermore, when  
they saw they wanted to have RDFa in HTML5, too, instead of addressing  
HTML issues then, they just continued pushing towards REC. It
easily looks like this was done so that RDFa could be presented as a done
deal that HTML5 needs to deal with instead of something whose details  
are negotiable. Creating a new mess that would have been easily  
avoidable is not similarly forgivable. Also, it sets a very bad
precedent if we allow other groups to keep us on the treadmill by  
injecting new HTML-hostile features and expecting us to spend cycles  
to sort them out by "working the issues".

Henri Sivonen
hsivonen at iki.fi
