[whatwg] Possible bugs : Microdata Itemscope on <link/> and <meta/>

Tim van Oostrom tim at depulz.nl
Sun Nov 29 03:46:16 PST 2009


Philip Jägenstedt wrote:
> On Thu, 26 Nov 2009 22:30:41 +0100, Tim van Oostrom <tim at depulz.nl> 
> wrote:
>
>> Hi, I made a forumpost : 
>> http://forums.whatwg.org/viewtopic.php?t=4176, concerning a possible 
>> "microdata specification bug" and a bug in the james.html5.org 
>> microdata extractor.
>>
>> Comes down to <link/> and <meta/> elements possibly being unfit for 
>> use with the itemscope attribute.
>>
>> I made an example in the forum post with some nice ubb formatting .
>>
> There are some other issues with <link> and <meta> you might want to 
> review first: [1]
Ok
> Your second example was:
>
> <div itemtype="http://url.to/geoVocab#country" itemscope>
>    <span itemprop="http://xmlns.com/foaf/spec/index.rdf#name" 
> lang="cn">中華人民共和國</span>
>    <span itemprop="http://xmlns.com/foaf/spec/index.rdf#name" 
> lang="en">China</span>
>    <link itemprop="http://url.to/city" href="http://url.to/shanghai" 
> itemscope itemref="city-shanghai" />
>    <div id="city-shanghai">
>       <span 
> itemprop="http://xmlns.com/foaf/spec/index.rdf#name">Shanghai</span>
>       <span itemprop="http://url.to/demoVocab#population">14.61 
> million people</span>
>       <span itemprop="http://url.to/physicsVocab#time" 
> datetime="2009-11-26 11:43">11:43 pm (CT)</span>
>    </div>
> </div>
>
> By using itemprop+itemscope, you're saying that the property is itself 
> an item. Also specifying href="http://url.to/shanghai" does nothing.
I also pointed that out in my forum post.
> <link>, <meta> and any other void elements are usually the wrong 
> choice for itemprop+itemscope because they don't have child elements, 
> so itemref is the only way to add properties. 
Yes, see the forum post. Shouldn't this be noted in the spec, then? 
(Maybe I overlooked it.)
> What you've accidentally done above is add the 3 properties of 
> Shanghai to both the top-level item and the sub-item, see [2] for 
> details. 
Well, I did it in full awareness; that is how I interpreted the itemref 
attribute. But if it can't be used this way, isn't that a setback for 
the flexibility of itemref? Or was it intended this way?
If so, an "itemref" attribute can never be added to an item nested 
within the itemscope of another item without the crawled property/value 
pairs also applying to the ancestor item.

For example, when I rearrange the "Amanda" example from the spec like 
this:

<div itemscope itemtype="http://url.to/whatwgDiscussionVocab#example">
<div itemprop="http://url.to/whatwgDiscussionVocab#exampleSubject" 
id=amanda itemref="a b" itemscope="" 
itemtype="http://xmlns.com/foaf/0.1/Person" ></div>
<p id=a>Name: <span itemprop=name>Amanda</span></p>
<div id=b itemprop=band itemref=c itemscope=""></div>
<div id=c>
 <p>Band: <span itemprop=name>Jazz Band</span></p>
 <p>Size: <span itemprop=size>12</span> players</p>
</div>
</div>

In my opinion, the name "Jazz Band" and the size should still apply only 
to the item that is the value of band, like this:

<none> <http://www.w3.org/1999/xhtml/microdata#item> _:cMSiQpGf15 .

_:cMSiQpGf15 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> 
<http://url.to/whatwgDiscussionVocab#example> .
_:cMSiQpGf15 <http://url.to/whatwgDiscussionVocab#exampleSubject> 
_:cMSiQpGf16 .

_:cMSiQpGf16 "name" "Amanda" .
_:cMSiQpGf16 "band" _:cMSiQpGf17 .

_:cMSiQpGf17 "name" "Jazz Band" .
_:cMSiQpGf17 "size" "12" .

But instead the name "Jazz Band" (along with size, band and the name 
"Amanda") also applies to _:cMSiQpGf15.
This makes the itemref="" attribute somewhat restrictive to use.
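
To make the behaviour I am describing concrete, here is a minimal Python 
sketch of how I understand the property crawl. It is a simplified model 
under my own assumptions, not the spec's algorithm verbatim: El, 
index_ids and properties are hypothetical names, short property names 
stand in for the full URLs, and "<item>" is a placeholder for a nested 
item value.

```python
from dataclasses import dataclass, field


@dataclass
class El:
    """Hypothetical stand-in for an HTML element (not a real DOM API)."""
    id: str = ""
    itemprop: str = ""
    itemscope: bool = False
    itemref: tuple = ()
    text: str = ""
    children: list = field(default_factory=list)


def index_ids(el, table=None):
    """Map every id in the tree to its element."""
    table = {} if table is None else table
    if el.id:
        table[el.id] = el
    for child in el.children:
        index_ids(child, table)
    return table


def properties(item, ids):
    """Collect (name, value) pairs for one item, in crawl order."""
    # Start from the item's children plus every element named in itemref.
    pending = list(item.children) + [ids[r] for r in item.itemref if r in ids]
    seen, found = set(), []
    while pending:
        el = pending.pop(0)
        if id(el) in seen:
            continue
        seen.add(id(el))
        if el.itemprop:
            found.append((el.itemprop, "<item>" if el.itemscope else el.text))
        if not el.itemscope:
            # Nested items keep their own properties; do not descend into them.
            pending = list(el.children) + pending
    return found


# The rearranged "Amanda" markup, reduced to the parts that matter here.
amanda = El(id="amanda", itemprop="exampleSubject", itemscope=True,
            itemref=("a", "b"))
a = El(id="a", children=[El(itemprop="name", text="Amanda")])
b = El(id="b", itemprop="band", itemscope=True, itemref=("c",))
c = El(id="c", children=[El(itemprop="name", text="Jazz Band"),
                         El(itemprop="size", text="12")])
top = El(itemscope=True, children=[amanda, a, b, c])
ids = index_ids(top)

print(properties(top, ids))
print(properties(amanda, ids))
print(properties(b, ids))
```

In this model the top-level item picks up name "Jazz Band" and size 
simply because the element with id=c is one of its descendants, even 
though that element is only meant to be reached through itemref=c on the 
band item.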

> I would rewrite it to something like:
>
> <div itemtype="http://url.to/geoVocab#country" itemscope>
>    <span itemprop="http://xmlns.com/foaf/0.1/name" lang="zh-CN">中华人民共和国</span>
>    <span itemprop="http://xmlns.com/foaf/0.1/name" lang="en">China</span>
>    <div itemprop="http://url.to/whatShanghaiIsToChina" itemscope
>         itemtype="http://url.to/geoVocab#city">
>       <span itemprop="http://xmlns.com/foaf/0.1/name">Shanghai</span>
>       <span hidden itemprop="http://url.to/demoVocab#population" 
> content="14610000"></span>
>       <time itemprop="http://url.to/physicsVocab#time" 
> datetime="2009-11-26T11:43+08:00">11:43 pm (CT)</time>
>    </div>
> </div>
OK, that is clear, but:

_:itemA <type> <country>
_:itemA <name> "China"
_:itemA <city> _:itemB
_:itemB <name> "Shanghai"

This already tells me, as a human, that the thing named Shanghai is a 
city of the country China. The following would violate the DRY principle:

_:itemA <type> <country>
_:itemA <name> "China"
_:itemA <city> _:itemB
_:itemB <type> <city>
_:itemB <name> "Shanghai"

My case would function like a hypothetical itemproptype="city", as in: 
Country - City - X, X - name - Shanghai. But I imagine RDF does not want 
to be treated this way.

>
> I don't know how you meant for <http://url.to/city> to be used, 

I meant <http://url.to/geoVocab#city>

> the vocabulary at <http://url.to/geoVocab#country> has to define what 
> properties are valid and their semantics. The itemprop 
> <http://url.to/whatShanghaiIsToChina> could be lots of things, what 
> you want is maybe something that means "financial center of" or 
> "largest city", I don't know. 

What is Shanghai to China? My guess is that people would say "a city", 
and then maybe "the largest city" or "the financial center". But what 
would people say if you asked: what is Nanchang to China? The answer 
would probably be "a city", but I assume it is unlikely to be "the 20th 
largest city of China" or "another financial center".


<div itemtype="http://url.to/geoVocab#country" itemscope>
   <span itemprop="http://xmlns.com/foaf/0.1/name" lang="en">China</span>
   <div itemprop="http://url.to/geoVocab#LargestCity" itemscope
        itemtype="http://url.to/geoVocab#city">
      <span itemprop="http://xmlns.com/foaf/0.1/name">Shanghai</span>
   </div>
   <div itemprop="http://url.to/geoVocab#20ThLargestCity" itemscope
        itemtype="http://url.to/geoVocab#city">
      <span itemprop="http://xmlns.com/foaf/0.1/name">Nanchang</span>
   </div>
   <div itemprop="http://url.to/geoVocab#xxxthLargestCity" itemscope
        itemtype="http://url.to/geoVocab#city">
      <span itemprop="http://xmlns.com/foaf/0.1/name">Qufu</span>
   </div>
</div>


Assigning "unique" properties to subjects for RDF's sake doesn't seem 
like a good idea to me.
Of course I could write other, more sensible HTML markup, but the whole 
point of a solid annotation language is that I can apply it to my 
existing markup without changing it.

> If <http://url.to/shanghai> is a global identifier for Shanghai you 
> should use itemid.

Correct, but the href="" is ignored in that example. If it only 
concerned a property, I'd use:

<link itemprop="http://url.to/city" href="http://url.to/shanghai" />

and it would be valid. I used it to make my point about <link/> and 
itemscope; see the forum post.

>
> I don't know what <http://url.to/physicsVocab#time> is, but note that 
> an exact time isn't very useful without a timezone, so I added the PRC 
> timezone for you. I'll also note that using traditional Chinese for 
> the full name of the PRC is an odd choice, so I changed it to 
> simplified Chinese above.
>
> Marking up the population as "14.61 million people" isn't terribly 
> helpful if you want a computer to be able to find the city with the 
> biggest population among several cities, unless your vocabulary 
> defines how to parse "14.61 million people" into a number, which would 
> be strange. In any case this is hidden metadata unless you want 
> 14610000 or some other easily machine-parsable representation to be 
> visible in the page rendering.

OK, but is using content="" on a span valid? Your suggestion in 
[1] would be nicer. But I prefer using <link/> and <meta/> elements.

> Finally, I think <http://xmlns.com/foaf/spec/index.rdf#name> should be 
> <http://xmlns.com/foaf/0.1/name>. If you're going to use existing 
> vocabularies like FOAF and want your data to play nice with the RDF 
> world, make sure to check that the result of the RDF extraction 
> algorithm [3] is what you intended. 
This remains unclear to me. For example, http://xmlns.com/foaf/0.1/name 
redirects to an HTML page, but http://purl.org/dc/terms/title redirects 
to an .rdf file. For readability, wouldn't 
http://xmlns.com/foaf/spec/#term_name and 
http://dublincore.org/documents/dcmi-terms/#elements-title be better?
If I dereference these URLs, I find the information I want in one stop.
> In particular, you probably want to use itemid where possible and make 
> sure that all your URIs are exactly correct. Personally, though, 
> unless I could reuse existing vocabularies for every single item and 
> property, I would only use a full URI for itemtype and point that to a 
> vocabulary that defines what simpler property names like "name" and 
> "city" mean and how to convert the vocabulary to RDF.
>
> [1] 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-November/024116.html 
>
> [2] 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item 
>
> [3] 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/converting-html-to-other-formats.html#rdf 
>
>




