[whatwg] Recursion and loops of Microdata items
Tomasz Jamroszczak
toja at opera.com
Wed Jun 8 00:38:09 PDT 2011
Hello.
I'm implementing Microdata for Opera and I've got problems with loops in
graphs of Microdata items.
Summary:
1. Is there a bug in the "crawl the properties" algorithm of Microdata?
2. Is there a bug in the "get the object" algorithm for converting Microdata
to JSON?
3. What is the true meaning of itemref?
I've been looking into the Microdata specification, and it struck me that
the crawling algorithm
(http://dev.w3.org/html5/md/Overview.html#associating-names-with-items) is
very complex for expressing such simple ideas. I think the specification
should first explain what the algorithm is supposed to do, and only then
list the exact steps to be performed.
Let's see what the properties of the Microdata item created from the HTML
element with id=up are, given the following HTML:
<div itemscope id=up itemprop=prop0>
<div itemscope id=down itemprop=prop1 itemref="up"></div>
</div>
CRAWL
  root = up
  memory = {}
  1. xxx
  2. COLLECT
       1. results = {}
          pending = {}
       3. pending = {down}
       4. xxx
       5. pending = {}
          current = down
       7. xxx
       8. results = {down}
     results = {down}
  3. xxx
  4. new_memory = {up}
  5. element = down
     CRAWL
       0. memory2 = {up}
          root2 = down
       1. xxx
       2. COLLECT
            1. results2 = {}
               pending2 = {}
            3. xxx
            4. pending2 = {up}
            5. pending2 = {}
               current2 = up
            7. xxx
            8. results2 = {up}
          results2 = {up}
       3. xxx
       4. new_memory2 = {up, down}
       5. element2 = up
          CRAWL
            0. memory3 = {up, down}
               root3 = up
            1. return FAIL
          !!! results2 = results2 - up = {}
       7. return results2 == {} (not FAIL).
  7. return results == {down}
In the end, the properties of the Microdata item from the HTML element with
id=up have length=1.
The troubling part is the line marked with triple exclamation marks. As far
as I can see, it changes only the nested results2, while the outer crawl
merely checks whether the nested crawl returned FAIL. It means that step 5
of the algorithm should be simplified to "For each
element in results that has an itemscope attribute specified, if the
element is equal to /root/, then remove the element from results [and
increment errors]". Further recursive crawling is not needed.
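To make the simplified step 5 concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not the specification's algorithm: the Element class and the collect and properties functions are names invented for illustration, every element is assumed to have an id, and only children and itemref targets are considered.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    """Toy stand-in for an HTML element; not a real DOM node."""
    id: str
    itemscope: bool = False
    itemprop: Optional[str] = None
    itemref: List[str] = field(default_factory=list)
    children: List["Element"] = field(default_factory=list)


def collect(root: Element, by_id: Dict[str, Element]) -> List[Element]:
    """Gather candidate property elements of root: its children plus its
    itemref targets, without descending into nested items."""
    results: List[Element] = []
    seen = set()
    pending = list(root.children) + [by_id[r] for r in root.itemref if r in by_id]
    while pending:
        current = pending.pop(0)
        if current.id in seen:
            continue
        seen.add(current.id)
        if not current.itemscope:
            # Only plain elements are searched further; items are opaque.
            pending.extend(current.children)
        if current.itemprop is not None:
            results.append(current)
    return results


def properties(root: Element, by_id: Dict[str, Element]) -> List[Element]:
    """Simplified step 5: drop root itself from results, no recursive crawl."""
    return [e for e in collect(root, by_id) if e is not root]


# The example markup: "down" is a child of "up" and refers back to it.
down = Element("down", itemscope=True, itemprop="prop1", itemref=["up"])
up = Element("up", itemscope=True, itemprop="prop0", children=[down])
by_id = {"up": up, "down": down}

print([e.id for e in properties(up, by_id)])  # ['down']
```

For the example HTML, the item with id=up ends up with one property, matching the trace above, without any recursive crawl.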
But then there is a problem with infinite recursion when running the
stringification algorithm of http://dev.w3.org/html5/md/Overview.html#json
on the HTML given above. We can proceed in two ways:
a) allow loops of Microdata items and make JSONification of a Microdata
item behave just like JSONification of any JavaScript object, that is,
throw an exception when a loop is found; or
b) exclude loops of Microdata items (so in the above example the Microdata
item from the HTML element with id=up would have no Microdata properties).
This would cripple the functionality of a quite nice HTML API, but it would
also produce consistent results in HTMLPropertiesCollection and
stringification.
A third solution,
c) cut only the offending links, is not good: in a graph of Microdata items
with the paths "A->B->C->D->B" and "E->D", stringification of item A would
result in item D having no properties, while stringification of E would
result in D having property B. The presence of a property would thus depend
on where the path starts.
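The path dependence of solution c) can be reproduced with a small Python sketch. The GRAPH dictionary and the to_tree and find helpers are hypothetical names made up for this illustration, and the nested dicts are only a stand-in for the real JSON output.

```python
# Hypothetical item graph: each item maps to the items it has as properties.
# It encodes the paths A->B->C->D->B and E->D.
GRAPH = {
    "A": ["B"],
    "B": ["C"],
    "C": ["D"],
    "D": ["B"],
    "E": ["D"],
}


def to_tree(item, path=()):
    """Convert an item to nested dicts, cutting any link that would lead
    back to an item already on the current path (solution c)."""
    path = path + (item,)
    return {
        child: to_tree(child, path)
        for child in GRAPH[item]
        if child not in path
    }


def find(tree, name):
    """Return the subtree stored under the first occurrence of name."""
    for key, sub in tree.items():
        if key == name:
            return sub
        found = find(sub, name)
        if found is not None:
            return found
    return None


print(find(to_tree("A"), "D"))  # {}
print(find(to_tree("E"), "D"))  # {'B': {'C': {}}}
```

Starting from A, the link D->B is the one that gets cut, so D is empty. Starting from E, the link C->D is cut instead, so D keeps B.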
I can imagine good uses for loops of Microdata items, for example "John
knows Amy, Amy knows John":
<div itemscope id="john" itemprop>
<div itemprop="friends" itemref="fred1 jenny2 amy1"></div>
</div>
<div itemscope id="amy1" itemprop>
<div itemprop="friends" itemref="john"></div>
</div>
There's a loop: john->amy1->john->... .
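As an illustration of what a serializer does with such a loop, here is a short Python sketch. The dict layout is my own assumption and not the output format defined by the specification, and only the john and amy1 items are modelled. Python's json module happens to behave like option a): it refuses a cyclic structure with an exception instead of recursing forever.

```python
import json

# Hypothetical in-memory form of the two items (assumed layout).
john = {"id": "john", "friends": []}
amy = {"id": "amy1", "friends": []}
john["friends"].append(amy)
amy["friends"].append(john)  # closes the loop john -> amy1 -> john

try:
    json.dumps(john)
    outcome = "serialized"
except ValueError as exc:
    # json.dumps checks for circular references by default.
    outcome = "refused: " + str(exc)

print(outcome)
```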
If loops, and thus recursion, are to be excluded, the same data could be
written as:
<div itemscope>
<div itemprop=addressbook_id>1</div>
<div itemprop=name>John</div>
<div itemprop=knows>2</div>
</div>
<div itemscope>
<div itemprop=addressbook_id>2</div>
<div itemprop=name>Amy</div>
<div itemprop=knows>1</div>
</div>.
perhaps with some <meta> elements instead of <div>, or, more verbosely:
<p itemscope itemid="#john" id="#john">John knows <a
itemprop="http://xmlns.com/foaf/0.1/knows" href="#amy">Amy</a>.</p>
<p itemscope itemid="#amy" id="#amy">Amy knows <a
itemprop="http://xmlns.com/foaf/0.1/knows" href="#john">John</a>.</p>
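A rough Python sketch of how a consumer could handle the first, id-based variant (addressbook_id, name, knows). The record layout and the by_id index are assumptions made for illustration. Since no item contains another item, serialization has nothing to loop over, and the "knows" link is resolved afterwards by lookup.

```python
import json

# The two address-book items as flat records (assumed layout).
items = [
    {"addressbook_id": "1", "name": "John", "knows": "2"},
    {"addressbook_id": "2", "name": "Amy", "knows": "1"},
]

# No nesting between items, so this cannot recurse forever.
text = json.dumps(items)

# A consumer resolves "knows" through an index keyed by addressbook_id.
by_id = {item["addressbook_id"]: item for item in json.loads(text)}
john = by_id["1"]
print(john["name"], "knows", by_id[john["knows"]]["name"])  # John knows Amy
```

The price is that the relationship is no longer part of the item graph itself; the consumer has to know that "knows" holds an identifier.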
The problem I'm raising revolves around the meaning of the link between the
itemref and id attributes. Is it meant to be part of the Microdata data
model? Or was it introduced to cope with the fact that the Microdata graph
is defined on top of existing data, namely the HTML tree, which is
something completely different and is meant to be rendered to the user?
The intended meaning of the itemref attribute should therefore guide how it
is interpreted inside the specification.
--
Best Regards,
Tomasz Jamroszczak