[whatwg] Trying to work out the problems solved by RDFa
Ian Hickson
ian at hixie.ch
Fri Jan 9 15:37:30 PST 2009
On Fri, 9 Jan 2009, Ben Adida wrote:
>
> Is inherent resistance to spam a condition (even a consideration) for
> HTML5?
We have to make sure that whatever we specify in HTML5 is actually going
to be useful for its intended purpose. If a feature intended for
wide-scale automated data extraction is especially susceptible to spamming
attacks, then it is unlikely to be useful for wide-scale automated data
extraction.
> If so, where is the concern around <title>, which is clearly featured in
> search engine results?
Nobody is suggesting that user agents derive any behavior from <title>, so
it doesn't matter if <title> is spammed or not. The only effect would be
some spam in the user's session history. Furthermore, <title> is
page-wide, meaning that the actual page author would have to spam the page
for it to be spammed. It is less likely for a user to intentionally visit a
spammy page than for a user to visit a page that happens to contain spammy
content embedded within it (e.g. in blog comments).
If browsers were expected to crawl all pages for all links and then
populate the browser's interface with the most popular links, then one
would quickly expect everyone's browsers to be advertising Viagra, porn
sites, and the like. However, browsers don't do this kind of processing --
indeed, this kind of processing appears to be exactly what RDFa proponents
are trying to enable (though to what end, I'm still trying to find out,
since nobody has actually replied to all the questions I asked yet [1]).
Note that search engines aren't the problem here -- large operations like
search engines are quite capable of running the massive processing
required to filter spam. The problem is automated processing on the
client, where those resources aren't available.
[1] http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2008-December/018023.html
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'