[whatwg] A Selector-based metadata proposal (was: Annotating structured data that HTML has no semantics for)

Fri May 15 14:11:49 PDT 2009

On Thu, May 14, 2009 at 9:50 AM, Eduard Pascual <herenvardo at gmail.com> wrote:
> I have put online a document that describes my idea/proposal for a
> selector-based solution to metadata.
> The document can be found at http://herenvardo.googlepages.com/CRDF.pdf
> Feel free to copy and/or link the file wherever you deem appropriate.
>
> Needless to say, feedback and constructive criticism to the proposal
> is always welcome.
> (Note: if discussion about this proposal should take place somewhere
> else, please let me know.)

Ah, thanks Eduard.  Have you cleaned this up significantly since the
last time this discussion came up?  It seems to read much better now
than before, but it's possible that I was just stupider several months
ago.

As far as I can tell (I am a novice, so YMMV), it conveys everything
that RDFa does, and more specifically, matches RDF-EASE's features.  I
think it has a friendlier syntax than RDF-EASE, though, which I think
is tied too much to the exact structure of RDFa.  RDF-EASE's author
does acknowledge that he leans directly on RDFa, but I think that's a
mistake - RDFa is designed to deal with the limitations of the
attr/value pairs that you can place on elements.  When you're
designing a new language by itself, you can employ the magic of
syntactic sugar to tighten things up and make them easier and more
expressive.

For example, RDF-EASE uses -rdf-property to specify what property something
should be, and -rdf-content to specify whether a property should take
its value from the element's content or from an attribute.  This split
is necessary when embedding attributes in HTML, but your proposal
combines those two things into a single line, which I think is much
clearer, and makes it easier to use when specifying multiple
properties.  (Not to mention making inline specification even easier
than RDFa, as you point out.)

I recommend using 'self' as the value for @|subject that corresponds
to a blank node for each matched element.

How would you write the situation where you have two vocabs applying
to content in an intertwined way, with different subjects?  I can't
think of an explicit example right now, but say you had content like
<foo><bar><baz/></bar></foo>, where <foo> and <bar> are both subjects
using different vocabs, and <baz> has facts about both of them.  It
seems like you can handle this by specifying two separate blocks with
an identical selector but different @|subject rules.  Is this correct?

If so, it seems then that at least one of those @|subject rules would
require either a url(...) or blank(...) value, which limits one's
ability to use this technique on multiple elements on a page.
RDF-EASE uses the nearest-ancestor(selector) functional notation to
indicate these sorts of relationships.

(Ah, here we go, an example:
http://buzzword.org.uk/2008/rdf-ease/spec#ssec-properties--rdf-about
talks about mixing foaf and vcard together, with one scenario matching
what I outlined earlier.)
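
To make the nearest-ancestor idea concrete, here is a rough Python
sketch of how a subject could be resolved per matched element instead
of through one fixed url(...) or blank(...) value.  The Element class
and the nearest_ancestor() helper are hypothetical names of mine, not
part of either spec, and I'm assuming that only strict ancestors are
considered (the element itself is skipped).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element: a tag plus a parent link."""
    tag: str
    parent: Optional["Element"] = None


def nearest_ancestor(
    el: Element, matches: Callable[[Element], bool]
) -> Optional[Element]:
    """Walk up from el's parent and return the first ancestor that matches.

    Assumption: the element itself is never returned, only strict ancestors.
    """
    node = el.parent
    while node is not None:
        if matches(node):
            return node
        node = node.parent
    return None


# The <foo><bar><baz/></bar></foo> shape from earlier in this mail.
foo = Element("foo")
bar = Element("bar", parent=foo)
baz = Element("baz", parent=bar)

# One rule per vocabulary: same target element, different subject.
foo_subject = nearest_ancestor(baz, lambda e: e.tag == "foo")
bar_subject = nearest_ancestor(baz, lambda e: e.tag == "bar")
```

Because the subject is looked up relative to each matched element, the
same pair of rules keeps working however many <foo>/<bar> groups appear
on the page.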

Your proposal doesn't currently seem to have a way to specify the
datatype.  Since several people have brought up the lack of datatypes
as a weakness in Ian's microdata proposal, the same criticism may
apply here.

RDF-EASE allows you to 'reset' elements, *overriding* metadata given
by less-specific selectors rather than just augmenting it.  This does
seem like a nice ability, specifically when you need to provide a
general rule for a particular class, say, and give a slightly
different rule for one of those elements with a particular id.  On the
other hand, you can just write the general rule with :not() to avoid
the more specific element.  I'm not sure whether this is good enough,
or if it really is easier to use something like 'reset'.
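
For what it's worth, here is a toy Python model of the two approaches,
just to show that they can produce the same metadata.  It is not
RDF-EASE's actual cascade: the Rule class, the integer specificity, the
whole-rule 'reset' flag, and the property names are all simplifying
assumptions of mine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: a selector predicate plus the properties it sets."""
    matches: Callable[[dict], bool]
    props: Dict[str, str]
    specificity: int
    reset: bool = False  # assumption: discard everything less specific


def resolve(el: dict, rules: List[Rule]) -> Dict[str, str]:
    """Apply matching rules from least to most specific."""
    result: Dict[str, str] = {}
    matching = [r for r in rules if r.matches(el)]
    for rule in sorted(matching, key=lambda r: r.specificity):
        if rule.reset:
            result = {}  # override rather than augment
        result.update(rule.props)
    return result


special = {"id": "special", "classes": {"person"}}
ordinary = {"id": "other", "classes": {"person"}}

# Variant 1: general class rule, id rule with 'reset'.
with_reset = [
    Rule(lambda e: "person" in e["classes"], {"foaf:name": "content"}, 10),
    Rule(lambda e: e["id"] == "special", {"vcard:fn": "content"}, 100,
         reset=True),
]

# Variant 2: the general rule excludes the id, as with :not(#special).
with_not = [
    Rule(lambda e: "person" in e["classes"] and e["id"] != "special",
         {"foaf:name": "content"}, 110),
    Rule(lambda e: e["id"] == "special", {"vcard:fn": "content"}, 100),
]
```

In this model resolve() gives the same answer for both variants on both
elements; the difference is only in which rule has to know about the
exception.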

~TJ