[whatwg] Microdata feedback

Wed Oct 14 04:53:46 PDT 2009

On Fri, 21 Aug 2009, Philip Jägenstedt wrote:
> 
> The spec says that "properties can also themselves be groups of 
> name-value pairs", but this isn't exposed in a very convenient way in 
> the DOM API. The 'properties' DOM-property is a HTMLPropertyCollection 
> of all associated elements. Discovering if the item-property value is a 
> plain string or an item seems to require item.hasAttribute('item'), 
> which seems out of place when everything else has been so neatly 
> reflected.

This is now reflected on item.itemScope.

> Also, the 'contents' DOM-property is always the item-property value 
> except in the case where the item-property is another item -- in that 
> case it is something random like .href or .textContent depending on the 
> element type. I think it would be better if the DOM-property were simply 
> called 'value' (the spec does talk about name-value pairs after all) and 
> corresponded more exactly to 'property value' [3]. Elements that have no 
> 'property names' [4] should return null and otherwise elements with an 
> 'item' attribute should return itself, although I don't think it should 
> be writable in that case. One might also/otherwise consider adding a 
> valueType DOM-property which could be 'string', 'item' or something 
> similar.

Interesting idea. I've renamed 'content' to 'itemValue', and made it 
return null if there's no itemprop="", and the element itself if there's 
an itemscope="".
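
As a rough illustration of that rule, here is a minimal sketch that uses 
plain objects instead of real DOM elements. The itemValueOf helper and the 
object shape (itemprop, itemscope, textContent) are assumptions made for 
this example, not the spec's API, and the fallback to text content 
simplifies the per-element rules (links would use their URL, for instance).

```javascript
// Hypothetical stand-in for an element: a plain object, not a DOM node.
function itemValueOf(el) {
  // No itemprop attribute: the element is not a property, so no value.
  if (el.itemprop === undefined) return null;
  // itemscope present: the value is the item (the element) itself.
  if (el.itemscope) return el;
  // Simplification: real elements would pick href, src, etc. by type.
  return el.textContent;
}

const plain = { itemprop: "org.example.name", textContent: "Apple" };
const nested = { itemprop: "org.example.fruit", itemscope: true, textContent: "Apple" };
const unrelated = { textContent: "hello" };

console.log(itemValueOf(plain));             // the text content
console.log(itemValueOf(nested) === nested); // the element itself
console.log(itemValueOf(unrelated));         // null
```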

> One example [5] uses document.items[item].names but document.items isn't 
> defined anywhere. I assume this is an oversight and that it is 
> equivalent to document.getItems(). Further, names is a member of 
> HTMLPropertyCollection, so document.items[item].properties.names is 
> probably intended instead of document.items[item].names. Assuming this, 
> the example actually produces the output it claims to.

Fixed.

> Shouldn't namedItem [6] be namedItems? Code like .namedItem().item(0) 
> would be quite confusing.
> [6] http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#dom-htmlpropertycollection-nameditem

I don't understand what this is referring to.

> Also, RadioNodeList should be PropertyNodeList.

Fixed.

> I think many will wonder why item and itemprop can't be given on a 
> single element for compactness:
> 
> <span item="org.example.fruit" itemprop="org.example.name">Apple</span>s and
> <span item="org.example.fruit" itemprop="org.example.name">Orange</span>s
> don't compare well.

Modulo the changes to the syntax (s/item=/itemscope itemtype=/g), this is 
allowed -- but it means the same as this:

   <span itemprop="org.example.name" itemscope itemtype="org.example.fruit">...

...which is to say, it's giving a property whose value is itself an item.
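
The substitution mentioned above can be tried out directly. This is only a 
sketch of the textual rewrite applied to the quoted markup; the variable 
names are made up for the example, and a bare regular expression like this 
is not a safe way to migrate real documents.

```javascript
// The old-syntax markup from the quoted message.
const oldMarkup =
  '<span item="org.example.fruit" itemprop="org.example.name">Apple</span>';

// s/item=/itemscope itemtype=/g, written as a JavaScript replace.
// "itemprop=" is left alone because it does not contain the text "item=".
const newMarkup = oldMarkup.replace(/item=/g, "itemscope itemtype=");

console.log(newMarkup);
// <span itemscope itemtype="org.example.fruit" itemprop="org.example.name">Apple</span>
```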

On Sun, 23 Aug 2009, Eduard Pascual wrote:
> On Sat, Aug 22, 2009 at 11:51 PM, Ian Hickson<ian at hixie.ch> wrote:
> >
> > Based on some of the feedback on Microdata recently, e.g.:
> >
> >   http://www.jenitennison.com/blog/node/124
> >
> > ...and a number of e-mails sent to this list and the W3C lists, I am 
> > going to try some tweaks to the Microdata syntax. Google has kindly 
> > offered to provide usability testing resources so that we can try a 
> > variety of different syntaxes and see which one is easiest for authors 
> > to understand.
> >
> > If anyone has any concrete syntax ideas that they would like me to 
> > consider, please let me know. There's a (pretty low) limit to how many 
> > syntaxes we can perform usability tests on, though, so I won't be able 
> > to test every idea.
> 
> This would be more than just tweaking the syntax, but I think 
> appropriate to bring forth my CRDF proposal as a suggestion for an 
> alternative to Microdata.

I considered testing this, as well as RDFa, but due to time constraints we 
ended up only being able to test a few changes, so I concentrated 
specifically on microdata variants.

On Tue, 25 Aug 2009, Philip Jägenstedt wrote:
> 
> There's something like an inverse relationship between simplicity of the 
> syntax and complexity of the resulting markup, the best balance point 
> isn't clear (to me at least). Perhaps option 3 is better, never allowing 
> item+itemprop on the same element.

That would preclude being able to make trees.

> > > Given that flat items like vcard/vevent are likely to be the most 
> > > common use case I think we should optimize for that. Child items can 
> > > be created by using a predefined item property: 
> > > itemprop="com.example.childtype item". The value of that property 
> > > would then be the first item in tree-order (or all items in the 
> > > subtree, not sure). This way, items would have better copy-paste 
> > > resilience as the whole item element could be made into a top-level 
> > > item simply by moving it, without meddling with the itemprop.
> > 
> > That sounds kinda confusing...
> 
> More confusing than item+itemprop on the same element? In many cases the 
> property value is the contained text, having it be the contained item 
> node(s) doesn't seem much stranger.

Based on the studies Google did, I'm not convinced that people will find 
the nesting that complicated. IMHO the proposal above is more confusing, 
too. I'm not sure this is solving a problem that needs solving.

> > > If the parent-item (com.example.blog) doesn't know what the 
> > > child-items are, it would simply use itemprop="item".
> > 
> > I don't understand this at all.
> 
> This was an attempt to have anonymous sub-items. Re-thinking this, 
> perhaps a better solution would be to have each item behave in much the 
> same way that the document itself does. That is, simply add items in the 
> subtree without using itemprop and access them with .getItems(itemType) 
> on the outer item.

How would you do things like "agent" in the vEvent vocabulary?

> Comparing the current model with a DOM tree, it seems odd in that a 
> property could be an item. It would be like an element attribute being 
> another element: <outer foo="<inner/>"/>. That kind of thing could just 
> as well be <outer><foo><inner/></foo></outer>, <outer><inner 
> type="foo"/></outer> or even <outer><inner/></outer> if the relationship 
> between the elements is clear just from the fact that they have a 
> parent-child relationship (usually the case).

Microdata's datamodel is more similar to JSON's than XML's.
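
To make the comparison concrete, here is a minimal sketch of an item whose 
property value is another item, written as plain objects. The shape (a type 
plus a map from property names to lists of values) and the type names are 
assumptions for illustration only, not a serialization defined by the spec.

```javascript
// Hypothetical item shape: a type plus a map from name to a list of values.
const inner = { type: "com.example.inner", properties: {} };
const outer = {
  type: "com.example.outer",
  properties: {
    foo: [inner],            // a property whose value is itself an item
    label: ["plain string"], // a property whose value is a string
  },
};

// The model is only nested objects, arrays and strings, so it survives a
// JSON round trip unchanged.
const json = JSON.stringify(outer);
const restored = JSON.parse(json);

console.log(restored.properties.foo[0].type);
console.log(typeof restored.properties.label[0]);
```

Nesting a value inside a property is ordinary in this model, which is 
what the XML attribute analogy makes look odd.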

> It's only in the case where both itemprop and item have a type that an 
> extra level of nesting will be needed and I expect that to be the 
> exception. Changing the model to something more DOM-tree-like is 
> probably going to be easier to understand for many web developers.

I dunno. People didn't seem to have much trouble getting it once we used 
itemscope="" rather than just item="". People understand the JSON 
datamodel pretty well, why would this be different?

On Wed, 26 Aug 2009, Brian Campbell wrote:
> 
> Why do we need separate items and item properties? They seem to confuse 
> people, when something can be both an item and an itemprop at the same 
> time. They also seem to duplicate a certain amount of information; items 
> can have "types", while itemprops can have "names", but they both seem 
> to serve about the same role, which is to indicate how to interpret them 
> in the context of page or larger item.
> 
> What if we just had "item", filling both of the roles? The value of the 
> item would be either an associative array of the descendant items (or 
> ones associated using "about") if those exist, or the text content of 
> the item (or URL, depending on the tag) if it has no items within it.

Thanks for this suggestion.

We tried this (it was variant 3). In practice, it didn't seem to lead to 
any significant improvement of understanding in the participants; people 
understood (to my surprise!) the difference between the concept of "type" 
and the concept of "name", and actually in several cases started trying to 
provide a type even when the examples didn't call for it.

On Tue, 6 Oct 2009, tjeddo wrote:
> On Mon, Oct 5, 2009 at 7:51 PM, Ian Hickson <ian at hixie.ch> wrote:
> > On Sun, 27 Sep 2009, tjeddo wrote:
> >>
> >> I am surprised at how little concern there seems to be over the lack 
> >> of bibliography markup in HTML5.
> >
> > There's a lot of concern, but it was deemed that microdata is a better 
> > way of addressing this than specific elements.
> 
> Thanks for your response. After reviewing the info on microdata, I 
> certainly agree that microdata would be a great fit for marking up 
> bibliographies and their entries. I do hope that a controlled vocabulary 
> is worked out and gets widely adopted... but I recall this issue was 
> already discussed at length.

I encourage anyone interested in this to write a vocabulary spec. There are 
some samples you can use to make new vocabularies:

   http://www.whatwg.org/specs/vocabs/current-work/

> In my understanding, microdata certainly seems like a sufficient way to 
> handle bibliography entries--once again, hoping that a standardized 
> vocabulary develops. The scheme I discussed about introducing a 
> 'bibliography' element and reusing the 'dt' and 'dd' elements within, I 
> simply felt was consistent with the introduction of other new HTML5 
> elements describing the pieces of a virtual document (e.g., article, 
> section, figure, aside, etc.).  Additionally, the scheme consistently 
> reused the elements 'dt' and 'dd' in the 'bibliography' context just as 
> they are reused in the new 'figure' and 'details' context.  Although, I 
> have to admit I'm not sure I'm a fan of this element overloading as 
> opposed to introducing explicit tags to cover these concepts when 
> appropriate.  But I do understand that HTML5 is constrained by legacy 
> HTML and also that microdata is another way to work around these 
> constraints.

I think for bibliography data, we're going to need much more detail than 
we can really get just with a section and dt/dd.

> I'm not arguing that microdata isn't the best approach here; but it 
> should be considered that first class elements are more legible than 
> microdata. And I'm sure this is why many of the new HTML5 elements are 
> not implemented as microdata.

Most of the new elements are more intended to make styling easier than to 
get data out of the page. The use cases intended to get data out of the 
page -- contact information, events, work licensing -- all use Microdata.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'