[whatwg] Microdata feedback

Philip Jägenstedt philipj at opera.com
Fri Jul 8 15:18:44 PDT 2011

On Fri, 08 Jul 2011 21:31:49 +0200, Ian Hickson <ian at hixie.ch> wrote:

> On Fri, 8 Jul 2011, Philip Jägenstedt wrote:
>> On Fri, 08 Jul 2011 00:33:14 +0200, Ian Hickson <ian at hixie.ch> wrote:
>> > On Wed, 8 Jun 2011, Tomasz Jamroszczak wrote:
>> > >
>> > > I've been looking into the Microdata specification and it struck me
>> > > that the crawling algorithm is so complex when it comes to
>> > > expressing simple ideas.  I think that foremost the algorithm should
>> > > be described in the specification with an explanation of what it's
>> > > supposed to do, before the steps of what exactly is to be done are
>> > > written.
>> >
>> > Yeah. Turns out the algorithms involved here are quite badly broken.
>> >
>> > It was intended to expose the microdata graph as completely as
>> > possible while dropping anything that would introduce a loop, at the
>> > point where the first repetition would start (so A->B->C=>A would
>> > break at the =), in the API, in the JSON, and in the conformance
>> > rules. I didn't do a good job speccing that, though!
>> >
>> > I've fixed the algorithms to make sense (I hope).
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/microdata.html#the-properties-of-an-item
>> I had a look at this to verify that it is black-box-equivalent to what
>> Opera has implemented, and only discovered one issue:
>> <div itemprop=""> should not be added to the .properties collection,
>> because it has no properties. My bad for suggesting that the criterion
>> should be the presence of an itemprop attribute; it should be an
>> itemprop attribute containing at least one token. Can you update the
>> spec to match?
> What needs updating? As far as I can tell, what you describe is what the
> spec requires.

Step 11 is "If current has an itemprop attribute specified, add it to  
results." but should be "If current has one or more property names, add it  
to results." Property names are defined in  

Why? If you start with <div itemprop="foo">, then  
div.itemProp.remove("foo") would give you <div itemprop="">. It'd be weird  
if the element still showed up in the properties collection after removing  
the only property name.
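
To make the difference between the two wordings concrete, here is a minimal
sketch in Python (for illustration only, not taken from the spec or from any
implementation). The helper names are hypothetical, and the set of separator
characters (space, tab, LF, FF, CR) is an assumption about how the itemprop
value is split into tokens. None stands for "attribute not specified".

```python
import re

# Assumed token separators: space, tab, LF, FF, CR.
_SPACE_CHARS = re.compile(r"[ \t\n\f\r]+")


def property_names(itemprop_value):
    """Return the unique tokens of an itemprop value, in first-seen order."""
    if itemprop_value is None:  # attribute not specified at all
        return []
    names = []
    for token in _SPACE_CHARS.split(itemprop_value):
        # Splitting can yield empty strings at the edges; skip those
        # and skip tokens that were already seen.
        if token and token not in names:
            names.append(token)
    return names


def add_to_results_current(itemprop_value):
    # Step 11 as quoted above: "has an itemprop attribute specified".
    return itemprop_value is not None


def add_to_results_proposed(itemprop_value):
    # Suggested wording: "has one or more property names".
    return len(property_names(itemprop_value)) > 0
```

Under these assumptions the two checks agree for itemprop="foo" and for a
missing attribute, and differ only for an empty (or whitespace-only) value
such as itemprop="", which is the case left behind by
div.itemProp.remove("foo").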

>> > On Wed, 29 Jun 2011, Philip Jägenstedt wrote:
>> > >
>> > > Indeed, multiple types doesn't work at all if you want to mix
>> > > different types. I was assuming that the use case was to extend
>> > > types, kind of like http://schema.org/Person/Governor. However, it
>> > > doesn't work all that well even in that case, since there's no way
>> > > to know which type is the extension of the other and which
>> > > properties exist only on the extended type.
>> >
>> > I don't really understand this use case. Can you elaborate on the
>> > problem that needs solving here?
>> It's whatever problem <http://schema.org/docs/extension.html> is trying
>> to solve, which is something like "allow people to geek out with more
>> specific vocabularies without interfering with search results".
> That doesn't seem to be a problem. I don't really understand what problem
> this is solving.

Neither do I.

> If the problem is just "I want to annotate data that isn't defined in
> this vocabulary", that's already possible using URL property names.
>> If I were schema.org, I would just encourage people to do this:
>> <div itemscope itemtype="http://schema.org/Person">
>>  <div id="wrapper">
>>    <div itemprop="name">Arnold</div>
>>    <div itemscope itemtype="http://example.com/Governor" itemref="wrapper">
>>      <div itemprop="state">California</div>
>>    </div>
>>  </div>
>> </div>
> That's a bit weird. Why not just this?
>  <div itemscope itemtype="http://schema.org/Person">
>   <div itemprop="name">Arnold</div>
>   <div itemprop="http://example.com/Governor/state">California</div>
>  </div>

Yeah, that's better, at least when the number of additional attributes is  

> It's hard to know without knowing what concrete user problem we're trying
> to solve here.

I'll leave this discussion to the schema.org sponsors and just hope that  
the method in <http://schema.org/docs/extension.html> doesn't catch on.

Philip Jägenstedt
Core Developer
Opera Software
