[whatwg] Trying to work out the problems solved by RDFa

Ben Adida ben at adida.net
Fri Jan 9 15:55:12 PST 2009

Ian Hickson wrote:
> We have to make sure that whatever we specify in HTML5 actually is going 
> to be useful for the purpose it is intended for. If a feature intended for 
> wide-scale automated data extraction is especially susceptible to spamming 
> attacks, then it is unlikely to be useful for wide-scale automated data 
> extraction.

It's no more susceptible to spam than existing HTML, as per my previous

> Nobody is suggesting that user agents derive any behavior from <title>, so 
> it doesn't matter if <title> is spammed or not.

And RDFa does not mandate any specific behavior, only the ability to
express structure. The power lies in products like SearchMonkey that
make use of this structure with innovative applications.

Can one imagine tools that make poor use of this structured data so that
they incentivize spam? Absolutely. Is this the bar for HTML5? If bad or
poorly conceived applications can be imagined, then it's not in the

> It is less likely for a user to intentionally visit a 
> spammy page than for a user to visit a page that happens to contain spammy 
> content embedded within it (e.g. in blog comments).

You've done plenty of web security work, and I suspect you know well
that spammy RDFa is the least of a large set of problems that come with
accepting arbitrary markup in blog comments. This is a strawman.

> However, browsers don't do this kind of processing -- 
> indeed, this kind of processing appears to be exactly what RDFa proponents 
> are trying to enable (though to what end, I'm still trying to find out, 
> since nobody has actually replied to all the questions I asked yet [1]).

While client-side processing is indeed an important use case (Ubiquity,
Fuzzbot, etc.), it's not the only one. SearchMonkey, which you
continue to ignore, is another important use case.

Before I invest significant time in responding to your barrage of
questions, I'm looking for a hint of objective evaluation on your end. I
thought I saw an opportunity for productive discussion based on common
ground with SearchMonkey, but this has led again into a new and
close-to-bogus reason for blocking consideration of RDFa.

> Note that search engines aren't the problem here

Actually, we were discussing SearchMonkey, so I think it's very much the
context for this sub-thread. You continue to ignore SearchMonkey, for
reasons which, as I've pointed out in a response earlier today, are
factually incorrect.

