[whatwg] RDFa

Thu Aug 21 18:48:39 PDT 2008

(Back on WHATWG.)

On Thu, 21 Aug 2008, Ben Adida wrote:
> Ian Hickson wrote:
> > I've taken this off the WHATWG list since it isn't about HTML5.
> 
> That's unfortunate, since that's where it started regarding adding RDFa 
> support in HTML5. It would be nice if at least folks could see the 
> answer I sent.

I've whitelisted your e-mail address so that you can post to the WHATWG 
list without subscribing. However, if the e-mails on this thread were 
intended to be a request that the RDFa attributes be considered for HTML5, 
I must admit to having misunderstood the request.

I've addressed RDFa only in this message (as opposed to Creative Commons 
markup). It would be helpful if you could send a separate message that is 
specifically asking for the changes you desire, and explaining what 
problem it is they address, and what research shows that that is an 
important enough problem that we should address it. The messages sent so 
far seemed to be more about Creative Commons than about RDFa.

> Yes, there are lots of people who ask us (CC) if we have tools to 
> automatically embed license + attribution + title + more information and 
> automatically extract it.

That's weird. I wonder why these people are asking Creative Commons for 
these tools and not asking other communities (e.g. the WHATWG community). 
Usually I find that when there is a need, the people with that need 
approach multiple different groups trying to get their need met.

I'm also curious as to why, if this is so commonly requested, similar 
features such as hCard and hCalendar have seen limited uptake. I mean, 
sure, they are used on hundreds of sites, but on the cosmic scale of 
things that's pretty limited, especially given the wide applicability of 
those formats.

> Every time you say "copyright statement" as if that's the entire issue, 
> it shows that you haven't taken much time to think about our needs.

Who is the antecedent of "our"? It's the needs of the entire Web community 
that have to be taken into account for HTML. There are thousands of small 
communities with their own needs; we can't possibly address each one in 
HTML. Indeed, we have design principles that make addressing the needs of 
small communities an explicit non-goal.

> If the world always went by what the biggest player thinks is useful, 
> innovation would slow down pretty quickly. Some of your competitors are 
> quite a bit more interested in RDFa than you are.

There certainly is _some_ interest in RDFa, or we wouldn't be having this 
conversation. But I haven't seen the level of interest that, say, video or 
offline Web applications have had. I haven't even seen the level of 
interest that random HTML elements like <abbr> have received. The interest 
in technologies like RDF seems to be almost exclusively from people in the 
metadata processing space.

> > You can create your own vocabularies without clashing with the 
> > Microformats community and without introducing extensions to HTML.
> 
> How do you know you're not clashing?

Use a unique name, e.g. include a domain name in the name, as in 
"license.creativecommons.org" or "home.foaf.w3.org", or use a name you 
know isn't used because it's an unusual name, e.g. "cc:license".
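
A minimal sketch of the naming approach described above, assuming the names 
end up as tokens in a space-separated attribute such as class="" or rel="". 
The helper names (qualified_name, find_clashes) and the set of names already 
in use are hypothetical, chosen only for illustration.

```python
def qualified_name(term, domain):
    """Build a name that embeds a domain, e.g. 'license.creativecommons.org'."""
    name = f"{term}.{domain}".lower()
    # class="" and rel="" hold space-separated tokens, so a single name
    # must not contain whitespace.
    if any(ch.isspace() for ch in name):
        raise ValueError(f"name must not contain whitespace: {name!r}")
    return name


def find_clashes(proposed, in_use):
    """Return the proposed names that are already taken, in sorted order."""
    return sorted(set(proposed) & set(in_use))


# Hypothetical list of names some other community already uses.
in_use = {"license", "tag", "nofollow"}

bare = "license"
scoped = qualified_name("license", "creativecommons.org")
clashes = find_clashes([bare, scoped], in_use)
```

Here only the bare name "license" is reported as a clash; the name that 
embeds the domain is not in the in-use set.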

> > Fundamentally my opinion is that RDFa is solving a problem that people 
> > at large have no interest in addressing.
> 
> I'm not sure what "people at large" means, but I do think your opinion 
> is not very well informed. There is significant interest in a generic 
> syntax for adding metadata inside HTML. It's a lot of the same drive for 
> microformats, for all the folks who can't handle the centralized process 
> and who need the vocab modularity.

I honestly don't see significant interest in computer-readable metadata. 
Just look at the average user's media library; most users have terrible 
metadata hygiene.

But in any case HTML5 already has extension mechanisms, so the discussion 
should not be over whether RDFa is worth it, but over what extension 
mechanisms RDF needs that HTML5 doesn't provide.

> > That's not to say that I don't think computer-readable detailed 
> > metadata is a great idea and everything, I just don't think it'll work 
> > when your average human faces it.
> 
> Why don't we at least build a real mechanism for expressing web-based 
> data, with distributed innovation and such, and the good parts of 
> microformats (*in* the HTML, DRY, etc...) And then we'll see. But you 
> keep basing your opinion on a past that never attempted to do this 
> right.

The failures of the past have had little to do with the syntax or 
expression mechanisms. They have to do with users simply not caring. It 
doesn't matter how wonderful or overengineered the solution we give them, 
if they don't care, they aren't going to use it correctly (if at all).

> > With things like licensing metadata, where the person who benefits the 
> > most isn't the person who writes the data, users simply aren't going 
> > to bother doing a good job.
> 
> That's an incorrect assumption.

It's a verifiable fact! Just look at metadata like lang="", character 
encoding information, Content-Type headers, etc. It's so unreliable that 
any serious system that processes large amounts of data from multiple Web 
authors always ends up ignoring the metadata (or at best using it as a 
hint) and using heuristics to determine the real information.
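
To illustrate what "using it as a hint" can look like for character 
encodings, here is a rough sketch. It is not the HTML5 encoding sniffing 
algorithm or any particular system's implementation; the function name 
decode_with_hint, the order of the checks, and the windows-1252 fallback 
are all assumptions made for the example.

```python
import codecs
from typing import Optional, Tuple

# Byte order marks are evidence in the data itself, so they are checked
# before any declared label is consulted.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
)


def decode_with_hint(data: bytes, declared: Optional[str] = None) -> Tuple[str, str]:
    """Decode bytes, treating the declared encoding as a hint only.

    Returns (text, encoding_used).
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return data.decode(name), name

    # Heuristic: bytes that decode as strict UTF-8 are very likely UTF-8,
    # whatever the author declared.
    candidates = ["utf-8"]
    if declared:
        candidates.append(declared)

    for name in candidates:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown label: fall through to the next guess.
            continue

    # Last resort (an assumption for this sketch): a permissive legacy
    # encoding, replacing any bytes that still cannot be mapped.
    return data.decode("windows-1252", errors="replace"), "windows-1252"
```

With this ordering, UTF-8 bytes mislabelled as "latin-1" still decode as 
UTF-8, and a misspelled or unknown label is simply skipped rather than 
treated as an error.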

In controlled environments, e.g. on a single site, or in a single person's 
media library, or within a small coherent community where all the 
participants have compatible goals, it is possible to get enough 
discipline that metadata is both reliable and useful. And for such 
communities we have a raft of extension mechanisms, and clashes can be 
avoided easily by simply using names that nobody in the community is 
already using.

But as soon as this kind of thing is applied to people outside the 
tight-knit community, the metadata becomes an utter mess, misused, wrong, 
missing, syntactically incorrect, semantically incorrect, unusable. We 
have shown time and time again that when metadata mechanisms face the 
wider Web community, they fail. Ignoring this doesn't make it go away.

> We want, for example, to allow folks to express how people should give 
> them attribution. There's a very good reason to do so as a publisher, so 
> you get proper credit.

There are several very good reasons to provide accurate Content-Type and 
character encoding information, but people still widely get it wrong.

Note: I did read the ccREL paper before I wrote the previous message.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'