[whatwg] Recursion and loops of Microdata items

Philip Jägenstedt philipj at opera.com
Thu Jun 9 09:35:29 PDT 2011


On Wed, 08 Jun 2011 21:51:57 +0200, Ian Hickson <ian at hixie.ch> wrote:

> The goal of itemref="" was just to have a way to handle cases where you
> have an item's properties scattered around a document.
>
> It's caused us more difficulties than helped anything, as far as I can
> tell. Has anyone implemented it or used it and liked it? I'd be fine with
> removing it if it's not a lot of trouble...
>
> I haven't fixed the algorithm to be written more simply, nor fixed the
> loops in the JSON stuff, because if we remove itemref="" then those
> problems just go away. If we want to keep itemref="", though, I will fix
> them. Any opinions one way or the other?

Unfortunately Tomasz is home sick, so he's asked me to reply in his place.

I think that itemref, while certainly the most complicated part of the  
API, should remain. Not having it would be rather limiting in cases where  
you don't want to change the structure of your markup (and perhaps layout)  
to fit the nesting that the vocabulary requires.

The algorithm in the spec clearly has some bugs and is rather hard to  
digest, so instead of suggesting deltas to it, I'll describe what we think  
the overall behavior should be.

Tomasz has implemented loop elimination using the Cheriyan–Mehlhorn/Gabow  
algorithm [1] and this looks promising. The algorithm finds the strongly  
connected components [2] and then we remove all components that have more  
than one item. Another way of stating this (I think) is that any item  
which is part of a loop (not just leading into one) is removed.

As for the spec, I don't think it can or needs to define the algorithm on  
a form suitable form implementation. Something along these lines would be  
much clearer for reference:

1. create a (possibly disconnected) graph of all the items in the document  
(or subtree)
2. find the strongly connected components
3. create a list of "loopy" items: those that are in the same component as  
any other item

The traversal would remain mostly as before, but whenever an item is  
encountered, one checks if it is in the list of "loopy" items and if so  
ignores it. Since "loopy" is a global property, you'll see the same  
properties regardless of the path taken to reach it, which may or may not  
be the case with the current spec. (In any case, it's a nice feature.)

Would you be willing to spec something along these lines?

[1] http://en.wikipedia.org/wiki/Cheriyan–Mehlhorn/Gabow_algorithm
[2] http://en.wikipedia.org/wiki/Strongly_connected_component

-- 
Philip Jägenstedt
Core Developer
Opera Software


More information about the whatwg mailing list