[whatwg] Annotating structured data that HTML has no semantics for

Mon May 18 02:18:10 PDT 2009

Henri Sivonen wrote:
> There's no indirection. A decade of Namespaces in XML shows that both 
> authors and implementors have trouble getting prefix-based indirection 
> right.

It's true that people get this wrong again and again. But it's also true 
that lots of developers understand it once and for all, and then 
consistently get it right.

The interesting question here is whether there's a better system.

>> I have been a Java programmer for some years, and
>> still find that convention absurd, horrible, and annoying. I'll agree
>> that CURIEs are ugly, and maybe hard to understand, but reversed
>> domains are equally ugly and hard to understand.
> 
> Problems shared by CURIEs, URIs and reverse DNS names:
>  * Long.
>  * Identifiers outlive organization charts.

That depends on the choice of the URI scheme.

> Problems that reverse DNS names don't have but CURIEs and URIs do have:
>  * "http://" adds 7 characters of extra length.
>  * Affordance of dereferenceability when mere identifier semantics are 
> meant.

Again, that depends on the URI scheme.

> Problems that reverse DNS names and URIs don't have but CURIEs have:
>  * Prefix-based indirection.

HTML developers regularly have to deal with a much more complicated 
indirection mechanism (CSS).
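
To make "prefix-based indirection" concrete, here is a minimal sketch in 
Python. It is not the RDFa processing model: the expand_curie helper and 
the example.org binding are made up for illustration, and real CURIE 
processing has more rules (default prefixes, scoping, safe CURIEs). The 
point is only that the same token can name different things depending on 
which prefix binding is in scope, whereas a full URI or a reverse DNS 
name carries its meaning with it.

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' using the prefix map currently in scope.

    Hypothetical helper for illustration only.
    """
    prefix, sep, reference = curie.partition(":")
    if not sep:
        raise ValueError("not a CURIE (no colon): %r" % curie)
    if prefix not in prefixes:
        raise KeyError("undeclared prefix: %r" % prefix)
    # The indirection: the identifier depends on a binding declared elsewhere.
    return prefixes[prefix] + reference


# Two documents binding the same prefix to different namespaces.
doc_a = {"dc": "http://purl.org/dc/elements/1.1/"}
doc_b = {"dc": "http://example.org/other#"}  # assumed, illustrative binding

print(expand_curie("dc:title", doc_a))
print(expand_curie("dc:title", doc_b))
```

Copying "dc:title" from one document into the other without its prefix 
declaration silently changes (or loses) its meaning, which is the class 
of mistake under discussion.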

>  * Violation of the DOM Consistency Design Principle if xmlns:foo is used.

I think there is consensus that this is a drawback, but not about how 
significant it is.

> The syntax is simpler for the use cases it was designed for. It uses a 
> simpler conceptual model (trees as opposed to graphs). It allows short 
> token identifiers. It doesn't use prefix-based indirection. It doesn't 
> violate the DOM Consistency Design Principle.

Playing devil's advocate: how does the syntax behave for the use cases 
it *hasn't* been designed for?

> Compared to microformats, microdata defines the processing model and 
> conformance criteria. The microformats community has failed to provide 
> a processing model and conformance criteria at a similar level of detail.

Indeed.

> The processing model side is perceived to be such a serious issue that 
> the lack of a unified microformats parsing spec is cited as a motivation 
> to use RDFa instead of microformats.

Indeed.

> RDFa uses a data model that is overkill for the use cases.

It would be interesting to understand which use cases that RDFa handles 
are not supported by "microdata" (I don't understand enough about the 
subject to try myself), and whether the potential advantage of having a 
simpler model outweighs the disadvantages of forgoing network effects 
and creating a competing syntax.

> ...

BR, Julian