[whatwg] Recursion and loops of Microdata items

Tomasz Jamroszczak toja at opera.com
Wed Jun 8 00:38:09 PDT 2011


	Hello.
	I'm implementing Microdata for Opera and I've got problems with loops in  
graphs of Microdata items.


	Summary:
	1. Is there a bug in the "crawl the properties" algorithm of Microdata?
	2. Is there a bug in the "get the object" algorithm for converting
Microdata to JSON?
	3. What is the true meaning of itemref?


	I've been looking into the Microdata specification, and it struck me
that the crawling algorithm
(http://dev.w3.org/html5/md/Overview.html#associating-names-with-items) is
very complex for the simple ideas it expresses.  I think the specification
should first explain what the algorithm is supposed to do, and only then
list the exact steps to be performed.
	Let's see what the properties are of the Microdata item created from
the HTML element with id=up in the following HTML:

<div itemscope id=up itemprop=prop0>
	<div itemscope id=down itemprop=prop1 itemref="up"></div>
</div>.


CRAWL
root = up
memory = {}
1. xxx
2. COLLECT
    1. results = {}
       pending = {}
    3. pending = {down}
    4. xxx
    5. pending = {}
       current = down
    7. xxx
    8. results = {down}
    results = {down}
3. xxx
4. new_memory = {up}
5. element = down
    CRAWL
    0. memory2 = {up}
       root2 = down
    1. xxx
    2. COLLECT
       1. results2 = {}
          pending2 = {}
       3. xxx
       4. pending2 = {up}
       5. pending2 = {}
          current2 = up
       7. xxx
       8. results2 = {up}
       results2 = {up}
    3. xxx
    4. new_memory2 = {up, down}
    5. element2 = up
       CRAWL
       0. memory3 = {up, down}
          root3 = up
       1. return FAIL
!!!   results2 = results2 - up = {}
    7. return results2 == {} (not FAIL).
7. return results == {down}
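
The trace above can be reproduced with a minimal Python sketch.  It models
only the steps as they are traced in this message, not the full text of the
specification; the names Element, collect, crawl and FAIL are hypothetical
and chosen for illustration.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Element:
    # Hypothetical, stripped-down stand-in for an HTML element.
    id: str
    itemscope: bool = False
    itemprop: bool = False
    itemref: tuple = ()
    children: list = field(default_factory=list)

FAIL = object()  # sentinel: "crawl failed", distinct from an empty result

def collect(root, by_id):
    """COLLECT: gather the property elements reachable from root."""
    results = []
    pending = list(root.children)
    pending += [by_id[ref] for ref in root.itemref if ref in by_id]
    while pending:
        current = pending.pop(0)
        if not current.itemscope:
            pending.extend(current.children)
        if current.itemprop:
            results.append(current)
    return results

def crawl(root, by_id, memory=frozenset()):
    """CRAWL: fail if root was already visited, else filter nested items."""
    if root.id in memory:
        return FAIL
    results = collect(root, by_id)
    new_memory = memory | {root.id}
    kept = []
    for element in results:
        if element.itemscope and crawl(element, by_id, new_memory) is FAIL:
            continue  # the line marked "!!!" in the trace
        kept.append(element)
    return kept

down = Element("down", itemscope=True, itemprop=True, itemref=("up",))
up = Element("up", itemscope=True, itemprop=True, children=[down])
by_id = {"up": up, "down": down}

up_props = [e.id for e in crawl(up, by_id)]
# The nested crawl of "down" while "up" is already in memory:
down_props_nested = [e.id for e in crawl(down, by_id, frozenset({"up"}))]
```

In this model the crawl of up keeps down, while the nested crawl of down
ends with an empty result rather than FAIL, as in the trace.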

	In the end, the properties collection of the Microdata item from the
HTML element with id=up has length=1.
	The troubling part is the line marked with three exclamation marks.
It means that step 5 of the algorithm should be simplified to "For each
element in results that has an itemscope attribute specified, if the
element is equal to /root/, then remove the element from results [and
increment errors]".  Further recursive crawling is not needed.
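
As a rough illustration of that simplified step 5, here is a small Python
sketch.  The flat dictionary model and the names doc, collect and
properties_simplified are assumptions made for this example only; they do
not come from the specification.

```python
# Hypothetical flat model of the example document: id -> element record.
doc = {
    "up": {"itemscope": True, "itemprop": True,
           "itemref": [], "children": ["down"]},
    "down": {"itemscope": True, "itemprop": True,
             "itemref": ["up"], "children": []},
}

def collect(doc, root):
    """Gather ids of property elements reachable from root."""
    results = []
    pending = doc[root]["children"] + [r for r in doc[root]["itemref"] if r in doc]
    while pending:
        current = pending.pop(0)
        if not doc[current]["itemscope"]:
            pending.extend(doc[current]["children"])
        if doc[current]["itemprop"]:
            results.append(current)
    return results

def properties_simplified(doc, root):
    """Simplified step 5: only drop an item that is root itself."""
    return [e for e in collect(doc, root)
            if not (doc[e]["itemscope"] and e == root)]

up_props = properties_simplified(doc, "up")
down_props = properties_simplified(doc, "down")
```

With this variant each of the two items lists the other as a property, so
the item graph itself contains the cycle up->down->up.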

	But then there is a problem with infinite recursion when running the
stringification algorithm of http://dev.w3.org/html5/md/Overview.html#json
on the HTML given above.  We can proceed in two ways:

a) allow loops of Microdata items and make JSONification of a Microdata
item behave just like JSONification of any JavaScript object, that is,
throw an exception when a loop is found.  Or

b) exclude loops of Microdata items (so in the above example the Microdata
item from the HTML element with id=up would have no Microdata properties).
This would cripple the functionality of a quite nice HTML API, but it would
also produce consistent results in HTMLPropertiesCollection and
stringification.  A third solution:

c) cut only the offending links.  This is not good, because for a graph of
Microdata items with the paths "A->B->C->D->B" and "E->D", stringification
of item A would result in item D having no properties, while
stringification of E would result in D having property B.  So the presence
of a property would depend on where the path starts.
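
The path dependence described in (c) can be seen in a short Python sketch.
The adjacency dictionary and the function to_tree are hypothetical; the
sketch simply cuts any link that points back to an item already on the
current path.

```python
# Item graph from the example: A->B->C->D->B and E->D.
graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["B"], "E": ["D"]}

def to_tree(graph, node, path=()):
    """Expand node into nested dicts, cutting links back into the path."""
    path = path + (node,)
    return {
        child: to_tree(graph, child, path)
        for child in graph.get(node, [])
        if child not in path  # the "offending link" is dropped here
    }

from_a = to_tree(graph, "A")
from_e = to_tree(graph, "E")
```

Starting from A, the entry for D is empty; starting from E, D still has
property B, and it is C that loses its link to D instead.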

	I can imagine good uses for loops of Microdata items, for example "John
knows Amy, Amy knows John":

<div itemscope id="john" itemprop>
	<div itemprop="friends" itemref="fred1 jenny2 amy1"></div>
</div>
<div itemscope id="amy1" itemprop>
	<div itemprop="friends" itemref="john"></div>
</div>

There's a loop:  john->amy1->john->... .
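
For comparison with option (a), this is how a generic serializer treats
such a loop.  The sketch uses Python's json module as a stand-in for the
JavaScript behaviour; the dictionaries are a simplified, assumed model of
the two items above (fred1 and jenny2 are left out).

```python
import json

# Two items that list each other as friends, as in the HTML above.
john = {"id": "john", "friends": []}
amy = {"id": "amy1", "friends": []}
john["friends"].append(amy)
amy["friends"].append(john)

try:
    json.dumps(john)
    outcome = "serialized"
except ValueError as exc:
    # json.dumps checks for circular references by default.
    outcome = "rejected: " + str(exc)
```

The serializer refuses the structure instead of recursing forever, which
is the behaviour option (a) would give to Microdata items.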

If loops are to be excluded, and recursion with them, the same data could
be written as:

<div itemscope>
	<div itemprop=addressbook_id>1</div>
	<div itemprop=name>John</div>
	<div itemprop=knows>2</div>
</div>
<div itemscope>
	<div itemprop=addressbook_id>2</div>
	<div itemprop=name>Amy</div>
	<div itemprop=knows>1</div>
</div>.

Maybe with some <meta> elements instead of <div>, or more verbosely:

<p itemscope itemid="#john" id="#john">John knows <a  
itemprop="http://xmlns.com/foaf/0.1/knows" href="#amy">Amy</a>.</p>
<p itemscope itemid="#amy" id="#amy">Amy knows <a  
itemprop="http://xmlns.com/foaf/0.1/knows" href="#john">John</a>.</p>
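
A consumer of the first, id-based rewrite would resolve the references
itself.  The following Python sketch assumes the two items have already
been extracted into plain records; the variable names are hypothetical.

```python
import json

# Flat records as they might be extracted from the addressbook_id markup.
records = [
    {"addressbook_id": "1", "name": "John", "knows": "2"},
    {"addressbook_id": "2", "name": "Amy", "knows": "1"},
]

# No item contains another item, so serialization cannot loop.
serialized = json.dumps(records)

# The "knows" links are resolved by lookup, outside the data model.
by_id = {r["addressbook_id"]: r for r in json.loads(serialized)}
john = by_id["1"]
friend_of_john = by_id[john["knows"]]["name"]
friend_of_friend = by_id[by_id[john["knows"]]["knows"]]["name"]
```

The loop still exists logically (John -> Amy -> John), but it lives in the
values and is followed only when the consumer chooses to follow it.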

	The problem I'm addressing revolves around the meaning of the link
between the itemref and id attributes.  Is it meant to be part of the
Microdata data model?  Or is it introduced to cope with the fact that the
Microdata graph is defined on top of existing data, namely the HTML tree,
which is something completely different and is meant to be rendered to the
user?  The meaning of the itemref attribute should therefore also inform
how it is interpreted inside the specification.

-- 
Best Regards,
Tomasz Jamroszczak

