[whatwg] RDFa Features (was: RDFa Problem Statement)

Thu Aug 28 01:58:42 PDT 2008

On Aug 27, 2008, at 16:33, Smylers wrote:

> So that is one disadvantage of URIs: they are long.  In fact they are so
> long that people have gone to the bother of inventing additional syntax
> to avoid having to write them out.

Moreover, having to look up the URIs is a major pain when writing  
software that processes namespaced XML. I can remember "xhtml",  
"xlink" or "svg", but I can't remember the namespace URIs. What random  
year do they contain? Is there a slash at the end?

The RDF community has minted a lot of namespace URIs over the past 10  
years. In addition to minting URIs, they have minted canonical  
prefixes for each one. It would be interesting to analyze how often  
those canonical prefixes actually collide, and how often the local  
property names within the namespaces also collide when the prefixes  
collide.

> That suggests that giving users the freedom to use either URIs or any
> other prefixes of their choice is superior to forcing them to use URIs,
> surely?

HTML5 isn't really giving language users the choice. It gives the  
choice to designers of language extensions.

It seems to me that if the RDF community isn't going to stop using  
URIs when they mint new property vocabularies, the only way to get a  
bidirectional, generic, registryless, open-ended mapping between the  
syntax going into text/html resources and RDF is using URIs for  
identifying properties.

So the cost of using URIs should be weighed against the benefit of  
having such a generic bidirectional mapping that is open-ended in the  
sense that the mapping algorithm doesn't need new out-of-band input  
from a registry (other than the URI scheme and domain name registries  
apparently...) when someone mints a new vocabulary.

. . .

If we didn't want the mapping to be bidirectional and only wanted it  
to be unidirectional from HTML identifiers onto URIs, we could specify  
that you can make a URI from any identifier you find in HTML by  
concatenating it to a common URI prefix. It doesn't really matter much  
if we call the prefix http://www.w3.org/1999/xhtml/vocab#,  
http://n.whatwg.org/rdf-compat# or something else. (To avoid collisions  
between e.g. rel and class, you could put the attribute name in the URI:  
http://n.whatwg.org/rdf-compat/rel#keyword and  
http://n.whatwg.org/rdf-compat/class#class-name.)
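
To make the concatenation idea concrete, here is a minimal sketch in Python. The function name is hypothetical, the base URI is just one of the candidates mentioned above, and the percent-encoding step is my own assumption (nothing above says how odd characters in class names should be handled).

```python
from urllib.parse import quote

# One of the candidate prefixes discussed above; the choice is arbitrary.
BASE = "http://n.whatwg.org/rdf-compat/"


def identifier_to_uri(attribute: str, value: str) -> str:
    """Map an HTML identifier to a URI by plain concatenation.

    The attribute name goes into the path so that, e.g., rel=foo and
    class=foo do not collide. Percent-encoding the value is an
    assumption of this sketch, not something the proposal specifies.
    """
    return f"{BASE}{attribute}#{quote(value, safe='')}"


print(identifier_to_uri("rel", "license"))
print(identifier_to_uri("class", "class-name"))
```

Note that this direction needs no registry at all, which is exactly why it cannot be inverted for URIs minted elsewhere.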

. . .

But with a scheme that maps HTML syntax to *some* URIs, can the Power  
of RDF solve the rest? After all, the Power of RDF already assumes  
that clients have hard-coded knowledge of URI schemes (at least http)  
and can consult DNS dynamically. Moreover, the Power of RDF already  
assumes a dynamically consulted mapping registry (like  
http://creativecommons.org/ns#) for each vocabulary. We could just say  
that HTML5 is one big vocabulary, so it can have one mapping registry.

The HTML5 draft already specifies a registry for rel values:
http://wiki.whatwg.org/wiki/RelExtensions
What if this registry were extended to contain machine-readable  
equivalence statements between rel values and RDF properties and the  
registry were then served at http://n.whatwg.org/rdf-compat/rel with  
n.whatwg.org served by a CDN? An RDF-in-HTML5 client seeing  
rel=license could dereference the URI  
http://n.whatwg.org/rdf-compat/rel and find that #license is the same  
as http://creativecommons.org/ns#license.

When going from RDF to HTML5, the converter would consult the registry  
to find equivalences in the other direction.
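
A rough sketch of what a client-side view of such a registry might look like, again in Python. Everything here is hypothetical: the in-memory table stands in for whatever machine-readable format the registry would actually serve, the function names are made up, and falling back to a registry-local URI for unregistered keywords is an assumption of the sketch.

```python
REGISTRY_URI = "http://n.whatwg.org/rdf-compat/rel"

# Stand-in for the equivalence statements a client would fetch from
# REGISTRY_URI. Only the license entry comes from the example above.
EQUIVALENCES = {
    "license": "http://creativecommons.org/ns#license",
}

# Inverted table for the RDF-to-HTML5 direction.
REVERSE = {uri: keyword for keyword, uri in EQUIVALENCES.items()}


def rel_to_property(keyword: str) -> str:
    """HTML5 -> RDF: use the registered equivalent if there is one,
    otherwise fall back to the registry-local URI (an assumption)."""
    return EQUIVALENCES.get(keyword, f"{REGISTRY_URI}#{keyword}")


def property_to_rel(uri: str):
    """RDF -> HTML5: return the rel keyword, or None if unregistered."""
    return REVERSE.get(uri)


print(rel_to_property("license"))
print(property_to_rel("http://creativecommons.org/ns#license"))
```

The point is that both directions need the same out-of-band table, so the mapping is bidirectional but no longer registryless.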

Now, one might argue that this introduces a single point of failure  
when n.whatwg.org goes down. However, if you are traversing a graph  
with more than one namespace you'd get distributed failure combined  
with AND. Usually, when you distribute something you want the failure  
to be combined with OR. Suppose that the servers serving namespace  
documents have a 1% probability of being down at a given moment. If  
your graph only depends on one such server your probability of success  
is 99%. However, if traversing your graph requires dereferencing  
namespace URIs from 10 servers, the probability of success is only  
about 90% (.99^10).
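
The arithmetic can be checked with a few lines of Python. This sketch assumes the servers fail independently of one another, which the figures above implicitly assume as well; the function name is made up for illustration.

```python
def success_probability(p_down: float, servers: int) -> float:
    """Probability that all required servers are up at once,
    assuming each is down independently with probability p_down."""
    return (1.0 - p_down) ** servers


for n in (1, 10):
    print(n, round(success_probability(0.01, n), 4))
```

Every additional namespace server multiplies in another factor of .99, so the success rate only goes down as the graph touches more vocabularies.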

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/