[whatwg] Annotating structured data that HTML has no semantics for

Shelley Powers shelleyp at burningbird.net
Tue May 12 05:30:28 PDT 2009


Philip Taylor wrote:
> On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual <herenvardo at gmail.com> wrote:
>   
>> [...]
>> (at least for now: many RDFa-aware agents vs. zero HTML5's
>> microdata -aware agents)
>>     
>
> HTML5 microdata parsers seem pretty trivial to write -
> http://philip.html5.org/demos/microdata/demo.html is only about two
> hundred lines to read all the data and to produce JSON and
> N3-serialised RDF. It shouldn't take more than a few hours to produce
> a similar library for other languages, including the time taken to
> read the spec, so the implementation cost for generic parser libraries
> doesn't seem like a significant problem.
>   

Writing something that will produce triples may be easy, but what's 
important is that you're producing an RDF model.

Philip, I've been looking at your application, and you're not producing 
the same model for Ian's microdata proposal that is produced using 
either eRDF or RDFa. I'll have more on this later.

> The cost of integration with backend RDF-based systems seems more
> significant - hopefully you could simply replace the frontend RDFa
> parser with a microdata parser and generate the same RDF triples and
> it would all work fine, but I don't know whether that's true in
> practice (because maybe the microdata syntax is too restrictive to
> represent the vocabularies people want to use, and so they'd have to
> go to lots of extra effort to create a new vocabulary).
>
>   
>> [...] there are other cases where
>> separate values might be needed: for example using a street address
>> for the human-readable representation of a location and the exact
>> geographic coordinates as the machine-readable (since not all
>> micro-data parsers can rely on Google Maps's database to resolve
>> street addresses, you know); or using a colored name (such as "lime
>> green" displayed on lime green color) as the human-readable
>> representation of a color, and the hexcode (like #00FF00) as the
>> machine-readable representation.
>>     
>
> You could replace
>   <span itemprop="color">lime green</span>
>   <span itemprop="location">1 High Street</span>
> with
>   <meta itemprop="color" content="#00FF00"><span>lime green</span>
>   <meta itemprop="location.lat" content="56.78"><meta
> itemprop="location.long" content="-12.34"><span>1 High Street</span>
> to get the desired output. (Not particularly elegant syntax, though.)
>
>   
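
For anyone following along, the difference between the two markup forms quoted above can be sketched in a few lines of Python. This is only a rough illustration, not Philip's demo and not the extraction algorithm from the draft: it reads nothing but the itemprop and content attributes, it ignores item scoping and nested elements, and it assumes that a dotted name such as "location.lat" is meant to be a sub-property of "location". The names ItemPropCollector, extract and nest are made up for this example.

```python
from html.parser import HTMLParser


class ItemPropCollector(HTMLParser):
    """Collect (name, value) pairs from itemprop attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # property name still waiting for its text content
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if tag == "meta":
            # <meta> carries the machine-readable value in its content attribute.
            self.pairs.append((name, attrs.get("content") or ""))
        else:
            # Any other element: the value is the text up to the next end tag.
            self._pending = name
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first end tag closes the pending property,
        # so child elements inside an itemprop element are not handled.
        if self._pending is not None:
            self.pairs.append((self._pending, "".join(self._text).strip()))
            self._pending = None


def extract(html):
    """Return the flat list of (name, value) pairs found in the markup."""
    collector = ItemPropCollector()
    collector.feed(html)
    collector.close()
    return collector.pairs


def nest(pairs):
    """Expand dotted names into nested dicts (assumed reading of 'a.b')."""
    root = {}
    for name, value in pairs:
        *parents, leaf = name.split(".")
        node = root
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return root
```

Fed the second form from the quoted message, extract() gives three flat pairs (color, location.lat, location.long), and nest() groups the two coordinates under a single "location" entry. Whether a consumer sees the flat pairs or the grouped structure is a choice the parser makes, which is the sort of thing that decides whether two parsers end up with the same model.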

Oddly enough, this discussion reminds me of when I 
started at Boeing, right after college. I started just when the great 
debate between SQL and QUEL was ending, in SQL's favor. Most folks still 
feel that QUEL was the "superior" option, but SQL won out in the end 
because it had widespread use, and was supported by more of the 
(powerful) database companies, and hence the companies using the databases.

The same could be said of Betamax versus VHS, and even the recent HD DVD 
versus Blu-ray debate: we can get caught up in issues of superiority and 
argue the fine points of (mostly) obscure markup until the cows come 
home, but at some point in time, you have to pick a standard to get 
behind, or no one will have any confidence in _any_ of the options being 
proposed--and the concept underlying the competing technologies (or 
standards) is hindered, perhaps for years.

Sorry, I digress. Eduard, looking forward to seeing your own 
interpretation of the best metadata annotation.

Shelley



