[whatwg] Annotating structured data that HTML has no semantics for

Leif Halvard Silli lhs at malform.no
Tue May 12 20:45:13 PDT 2009


Tab Atkins Jr. on Tue, 12 May 2009 12:30:27 -0500:

> On Tue, May 12, 2009 at 5:55 AM, Eduard Pascual:
>   
>> > [...] It would be preferable to be able
>> > to state something like "each (row) <tr> in the <table> describes an
>> > iguana: the <img>s are each iguana's picture, the contents of the
>> > <a>'s are the names, and the @href of the <a>'s are the URLs to their
>> > main pages" just once.

Indeed.

>> > If I only need to state the table headings once
>> > for the users to understand this concept, why should a micro-data
>> > consumer require me to state it 20 times, once for each row?
>> > Please note how such a page would be quite painful to maintain: any
>> > mistake in the micro-data mark-up would generate invalid data and
>> > require a manual harvest of the data on the page, thus killing the
>> > whole purpose of micro-data.

Indeed. (But of course, for "copy-paste" safety, the format has to be 
"wordy" and repetitive.)

>>  And repeating something 20 (or more)
>> > times brings a lot of chances to put a typo in, or to miss an
>> > attribute, or any minor but devastating mistake like these.
>>     
>
> Well, he didn't quite *ignore* it - he did explicitly call out that
> requirement to say that his solution didn't solve it at all.  He also
> laid down the reason why - it's unlikely that any reasonably simple
> in-place metadata solution would allow you to do that.  You either
> need significant complexity, some reliance on language semantics (like
> tables can rely on their headers), or moving to out-of-band
> specification, likely through a Selectors-based model.
>   
Indeed. And Ian's argument against a selector-based model (the claim 
that authors have problems understanding selectors) was one of the least 
convincing arguments he made, I think.  CSS and selectors appear to be 
among the best understood technologies of the web.

> The last is likely the best solution for that, and is even easier to
> implement within Ian's simplified proposal.  I don't see a good reason
> why that can't advance on a separate track, as (being out-of-band) it
> doesn't require changes to HTML to be usable.
>
> I floated a basic proposal for Cascading RDF[1] several months ago,
> and someone else (I think Eduard?  I'd have to check my archives) did
> something very similar.
>
> [1]: http://www.xanthir.com/rdfa-vs-crdf.php
>   

Hear hear.  Let's call it "Cascading RDF Sheets". It could be used for 
the following purposes:

1. The IRI of the Cascading RDF Sheet could serve the role of profile URI;
2. The Cascading RDF Sheet itself could serve the role of a profile 
document; (Finally we could get some kind of registered profile format.)
3. Just as with CSS style sheets today, a cRDFsheet could be used as 
authoring help when authoring with a microformat. HTML editing programs 
could offer the elements + classes in the Cascading RDF Sheet to 
authors, the same way that some editors today use the selectors in 
style sheets as a "vocabulary repository" for the current file or 
project. CSS selectors are already a well-known format. (One may then, 
of course, already use a CSS style sheet for this, kind of. But this 
soon becomes clumsy. Better to separate styling from semantics and 
structure.)
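
To make the "state it just once" idea concrete, here is a minimal 
sketch in Python of an out-of-band mapping applied to Eduard's iguana 
table. Everything in it is an assumption for illustration: the SHEET 
dictionary is a hypothetical stand-in for a sheet, not the syntax of 
Tab's Cascading RDF proposal, and it matches on bare tag names rather 
than on real selectors.

```python
from html.parser import HTMLParser

# Hypothetical out-of-band "sheet": stated once, applied to every <tr>.
# Keys are property names; values are (tag, source), where source is an
# attribute name, or "text" for the element's text content.
SHEET = {
    "picture": ("img", "src"),
    "name": ("a", "text"),
    "url": ("a", "href"),
}


class RowExtractor(HTMLParser):
    """Collects one dict per <tr>, filled in according to the sheet."""

    def __init__(self, sheet):
        super().__init__()
        self.sheet = sheet
        self.items = []       # one dict per data row
        self.current = None   # the row being filled, if any
        self.text_props = []  # properties currently collecting text

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.current = {}
            return
        if self.current is None:
            return
        attrs = dict(attrs)
        for prop, (sel_tag, source) in self.sheet.items():
            if tag != sel_tag:
                continue
            if source == "text":
                self.text_props.append(prop)
                self.current.setdefault(prop, "")
            elif source in attrs:
                self.current[prop] = attrs[source]

    def handle_data(self, data):
        if self.current is not None:
            for prop in self.text_props:
                self.current[prop] += data

    def handle_endtag(self, tag):
        if tag == "tr" and self.current is not None:
            # Rows that matched nothing (e.g. the header row) are skipped.
            if self.current:
                self.items.append(self.current)
            self.current = None
            self.text_props = []
        elif self.current is not None:
            # Stop collecting text for properties bound to this tag.
            self.text_props = [
                p for p in self.text_props if self.sheet[p][0] != tag
            ]


def extract(html, sheet=SHEET):
    """Return one dict of properties per table row in the given HTML."""
    parser = RowExtractor(sheet)
    parser.feed(html)
    parser.close()
    return parser.items
```

The table markup itself then carries no extra attributes at all; a 
typo can only occur in one place, the sheet.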

In fact, I myself began looking into creating something along these 
lines ... Though rather than a "Cascading RDF Sheet", I looked into 
creating a "Profile Style Sheet" which could be used to define a 
machine-readable microformat profile. My motivation was the authoring 
side of things, as I have been using a text editor which uses CSS 
selectors more or less the same way. (Instead of only offering me 
"<p>", it also offers me <p class="a"> etc.) Ian's proposal does not 
give much thought to the authoring side, I feel, except for the more 
casual author. For authors, it is helpful to have a "recipe" document 
and to avoid repetition and "data rot", as you mentioned in another 
message.
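
As a rough illustration of the "vocabulary repository" idea, the 
following Python sketch pulls element + class pairs out of the selectors 
of a sheet, which is roughly what an editor would need in order to 
offer <p class="a"> next to plain <p>. It is a sketch under stated 
assumptions: the function name is made up, and it only understands 
simple "element.class" selectors, not the full Selectors grammar.

```python
import re

# Matches simple "element.class" selectors such as "p.a" or "td.name.big".
# The lookbehind keeps us from mistaking part of ".foo.bar" or "#id.x" for
# an element name. IDs, attribute selectors etc. are out of scope here.
SIMPLE_SELECTOR = re.compile(
    r"(?<![\w.#-])([a-zA-Z][a-zA-Z0-9]*)((?:\.[\w-]+)+)"
)


def vocabulary(sheet_text):
    """Map each element name to the sorted class names used with it."""
    vocab = {}
    # Only look at the selector part of each rule: the text before "{".
    for selector_part in re.findall(r"([^{}]+)\{", sheet_text):
        for element, classes in SIMPLE_SELECTOR.findall(selector_part):
            names = classes.strip(".").split(".")
            vocab.setdefault(element.lower(), set()).update(names)
    return {element: sorted(names) for element, names in vocab.items()}
```

An editor could feed the result straight into its completion list for 
the current file or project.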

The inner logic of Ian's microdata format is easy to grasp. That is a 
good side of the proposal, and it could help it get used.  But when it 
comes to the ability of authors and author groups to define their own 
decentralised semantics, a decent profile format that could be 
integrated easily with authoring tools seems just as important an issue 
as a super simple microdata format.

The microformats.org community does not really have a machine-parsable 
profile format. If there were such a format, I believe we would see 
more, and more decentralized, microformats.
-- 
leif halvard silli

