[whatwg] Trying to work out the problems solved by RDFa

Fri Jan 2 04:01:34 PST 2009

On 2/1/09 10:38, Henri Sivonen wrote:
> More to the point, Microformats not only require per-format processing
> but the processing required for each Microformat isn't specified at all.
> That's bad.

Some do have processing specified (at least to some degree):

http://microformats.org/wiki/hcard-parsing

For the rest, this seems fixable, so I'm not sure why it is more to 
the point.

> That is, have
> there been attempts of defining unified parsing while retaining the feel
> of Microformats without relying on the namespace mapping context from
> the layer below?

I suppose the following are approaching this problem from three 
different angles:

* http://microformats.org/wiki/design-patterns (reusable microformat 
components)

* http://microformats.org/wiki/parsing-brainstorming (attempt to 
actually specify precise parsing rules for all microformats)

* 
http://microformats.org/discuss/mail/microformats-discuss/2008-August/012435.html 
(proposal for specifying generic mapping of microformats to RDF - I 
think there's been more detailed work by various parties in this regard, 
but I'm not sure where best to link to)

> Why hasn't the community fixed it?

I think the microformats community moves slowly, for better or worse, 
even when it agrees that there's a problem to solve. For example, 
progress on the problems with the abbr-design-pattern has been 
snail-like and has cost the community an important user (the BBC), 
although admittedly the problems are basically intractable in HTML4/XHTML1.

I'm not sure how far the community as a whole does or doesn't view the 
lack of unified parsing as one of its bigger problems; I'm no spokesman 
though.

> Is it a non-problem after all in practice?

It's an additional barrier to creating and using (especially new) 
microformats or other extractable patterns.

The microformats community isn't there to support the creation of new 
extractable patterns outside the microformats community, which is where 
an iguana database pattern would likely need to be.

It could of course be that the RDFa curie is worse than the disease.

An advantage of RDFa that is not related to curies and for which the 
three approaches towards unified extraction mentioned above are not a 
substitute is that RDFa provides a generic way to include hidden 
machine-friendly equivalents to human-readable information in the form 
of the (not especially well-named) "content" attribute.

http://www.w3.org/TR/rdfa-syntax/#rdfa-attributes
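
To make the "content" point concrete, here is a minimal sketch of how a 
consumer might prefer the hidden value over the visible text. It is not a 
conforming RDFa processor: it ignores curie/prefix resolution, subjects 
and everything else in the spec, and only looks at "property" and 
"content". The sample markup, the "dc:date" property and the class and 
function names are my own assumptions for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup: the property name and the date are illustrative only.
SAMPLE = (
    '<p>Published <span property="dc:date" content="2009-01-02">'
    "last Friday</span>.</p>"
)


class ContentAttributeExtractor(HTMLParser):
    """Collect (property, machine value, human text) for elements with @property.

    Simplification: properties nested inside another property-bearing
    element are not reported separately.
    """

    def __init__(self):
        super().__init__()
        self.results = []
        self._open = None  # [tag, property, content, text parts, nesting depth]

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if self._open is None:
            if "property" in attributes:
                self._open = [
                    tag,
                    attributes["property"],
                    attributes.get("content"),
                    [],
                    0,
                ]
        elif tag == self._open[0]:
            # Same tag nested inside the element we are reading.
            self._open[4] += 1

    def handle_data(self, data):
        if self._open is not None:
            self._open[3].append(data)

    def handle_endtag(self, tag):
        if self._open is None or tag != self._open[0]:
            return
        if self._open[4] > 0:
            self._open[4] -= 1
            return
        _, prop, content, parts, _ = self._open
        text = "".join(parts)
        # Prefer the hidden machine-friendly value; fall back to visible text.
        value = content if content is not None else text
        self.results.append((prop, value, text))
        self._open = None


def extract(html):
    """Return a list of (property, machine value, human text) tuples."""
    parser = ContentAttributeExtractor()
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    for prop, value, text in extract(SAMPLE):
        print(prop, value, repr(text))
```

The reader sees "last Friday" while the program gets "2009-01-02"; the 
risk microformats try to avoid is exactly that the two can drift apart 
without anyone noticing.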

In general, this is something microformats rightly try to avoid:

http://microformats.org/wiki/principles

But sometimes it's unavoidable:

http://microformats.org/wiki/machine-data

http://microformats.org/wiki/value-excerption-pattern-issues

I do not believe that HTML5 as currently specified would entirely remove 
the need to employ hacks similar to those mentioned on those pages, 
although it will remove the need in many cases (e.g. for datetimes 
within a given range), which is an improvement.

> Is the problem in the case of recipes that the provider of the page
> navigation around the recipe is unwilling to license the navigation bits
> under the same license as the content proper?

I thought Toby's example was that each recipe on the page needed a 
different licence, rather than a distinction between the main content 
area and the navigation.

> In the case of images, why should a program inferring something about
> licensing trust assertions made in a different HTTP resource (possibly
> even from a different Origin)?

Why should it trust assertions made in the same resource?

For example, presumably you could download an image, change its 
licensing metadata, and host it at your own Origin? Admittedly, that's a 
little more work than just hotlinking.

--
Benjamin Hawkes-Lewis