[whatwg] Trying to work out the problems solved by RDFa
Calogero Alex Baldacchino
alex.baldacchino at email.it
Sat Jan 3 18:23:57 PST 2009
Toby A Inkster wrote:
> Calogero Alex Baldacchino wrote:
>
>> My concern is: is RDFa really suitable for everyone and for Web
>> automation? My own answer, at first glance, is no. That's because RDF(a)
>> can perhaps address nicely very niche needs, where determining how much
>> data can be trusted is not a problem, but in general misuses AND
>> deliberate abuses may harm automation heavily
>
> If your agent isn't going to trust the data gleaned from RDFa, then
> why should it trust the data gleaned from the web page's natural
> language? If the page has been authored by a reprobate that cannot be
> trusted to put honest and correct data in a few RDFa attributes, why
> should we trust their prose text?
>
If you sell computers but your site talks about cars, I'll never buy a
notebook from you; you're not cheating me, only yourself, and damaging
your business. But if you believe cars are searched for more often than
computers (just an example), you may use false metadata to cheat any UA
relying on metadata instead of prose, and take me to a store selling
computers instead of cars.
Reliability of metadata (with respect to the described data) is an issue
separate from reliability of content: it's not up to any UA to
understand AND filter content based on whether the author can be trusted
to be telling the truth (that would be a form of censorship), but if I
ask the UA to bring me a page about horses, I don't want it to bring me
a page about v.i.a.g.r.a. (that's spam); thus it is up to any UA relying
on metadata to understand AND filter it based on its reliability.
> An oft-quoted answer is that the prose text is "visible" whereas the
> RDFa is somehow "invisible". Apart from the fact that UIs which make
> use of data pulled in from RDFa will make this data visible, there is
> also the fact that RDFa, unlike an external RDF/XML file, or some
> metadata embedded in a <script> block, makes use of as much visible
> data as possible: visible links, visible text, etc.
>
> <p>My name is <span property="foaf:name"
> about="#me">Toby Inkster</span>.</p>
>
> If you can't trust someone to correctly mark up what their name is,
> then why trust them to mark up what deserves <em>phasis? Why believe
> the <address> they provide? What if the instance they marked up with
> <dfn> is not really the defining one? What if a <var> is really a
> constant?
>
I don't really need proper markup to understand that a name is a name, a
variable is a variable, a definition is a definition, and so on; you can
use plain text and I'll understand your content the same way. If someone
makes a mistake when combining a <dfn> with an anchor, the result may be
a broken link, perhaps making me look for a better site. If someone
misuses <var> or <em>, the worst possible consequence is a bad
presentation. A bad presentation can be an attempt to cheat a UA (as
when people put a lot of keywords in a page and style them with the
same color as the background to cheat search engines), but only when it
is a deliberate choice, not a misuse (and I'm mainly concerned with
abuses). Anyway, it is easier to cheat a UA by means of false metadata
than to cheat a human being by means of wrong markup.
If some markup is like,
<p>We sell <a href="www.cheatingcarseller.com" property="foaf:name"
content="Toby Inkster">cars</a></p>
in an advertisement, I'll notice it's about cars and I'll choose
whether to follow it or not, based on my interest at the moment, but if
I query "Toby Inkster" in a semantic UA blindly relying on metadata, I
might get a page of a car webstore instead of your homepage (for instance).
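To make the mismatch concrete, here is a minimal sketch in Python of what
a metadata-only agent would extract from that advertisement compared with
what a reader sees. The class and function names are hypothetical, the
href is written as an absolute URL for illustration, and this is not a
conforming RDFa processor: it only looks for a literal foaf:name property
carrying a content attribute.

```python
from html.parser import HTMLParser


class MetadataVsText(HTMLParser):
    """Hypothetical helper: separates claimed names from visible text."""

    def __init__(self):
        super().__init__()
        self.claimed_names = []  # values taken from content="..."
        self.visible_text = []   # character data a human reader sees

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # In RDFa, a content attribute overrides the element's text as the
        # literal value, so this is what a naive agent would index.
        if attributes.get("property") == "foaf:name" and attributes.get("content"):
            self.claimed_names.append(attributes["content"])

    def handle_data(self, data):
        self.visible_text.append(data)


def inspect(html):
    """Return (names claimed in metadata, text visible to a reader)."""
    parser = MetadataVsText()
    parser.feed(html)
    parser.close()
    return parser.claimed_names, "".join(parser.visible_text)


ad = ('<p>We sell <a href="http://www.cheatingcarseller.com" '
      'property="foaf:name" content="Toby Inkster">cars</a></p>')
names, text = inspect(ad)
print(names)  # what the metadata claims
print(text)   # what the page actually says
```

An agent that indexes only the claimed names would match a query for
"Toby Inkster", although that string never appears in the visible text.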
Furthermore, I started my replies from a mail by Charles McCathieNevile,
explicitly talking about trusted data and (mainly) small use cases, not
wide-scale web automation. If there's no agreement about what kind of
needs are best addressed by RDFa, maybe I have to agree with people
saying that the technology must grow and become more mature (or, at
least, better understood) before it is merged into the HTML5
specification (and 2023 is far enough away to accomplish such a goal
:-) ). And I renew my suggestion to map RDFa attributes to data-rdfa-*
attributes and build RDFa processor plugins for the most common
browsers, to test HTML5 and RDFa convergence on a wider scale before
browsers natively support RDFa in HTML5 documents (for the purpose of a
test - but not only - I don't think "data-rdfa-property" vs
"rdfa:property" vs "property" would be much of a problem).
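A minimal sketch of that mapping, in Python, under loudly stated
assumptions: the set of attributes to rename is my own choice (RDFa
attributes that plain HTML does not already define, with "content" left
alone on <meta>), the function and class names are hypothetical, and the
serializer drops comments and doctypes for brevity. A real plugin would
work on the DOM instead.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: only these RDFa attributes are renamed; href, src, rel and
# rev are left untouched because HTML already gives them a meaning.
RDFA_ATTRS = {"about", "property", "content", "datatype", "typeof", "resource"}


class RdfaToDataAttrs(HTMLParser):
    """Re-serializes markup, renaming RDFa attributes to data-rdfa-*."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _emit_start(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            # <meta content="..."> is ordinary HTML, so it is not renamed.
            if name in RDFA_ATTRS and not (tag == "meta" and name == "content"):
                name = "data-rdfa-" + name
            if value is None:
                parts.append(name)
            else:
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        self._emit_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._emit_start(tag, attrs, " /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def to_data_rdfa(html):
    """Return html with RDFa attributes mapped to data-rdfa-* names."""
    parser = RdfaToDataAttrs()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(to_data_rdfa('<p>My name is <span property="foaf:name" '
                   'about="#me">Toby Inkster</span>.</p>'))
```

A test plugin could then read the data-rdfa-* names and feed them to an
RDFa processor without the host document using any attribute outside
what HTML5 already allows.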
I'm not saying RDFa is a bad thing, or that it is useless; I just don't
think any kind of markup can perfectly fit the semantics of "random"
content for the purposes of a "global", wide-scale and automatic
classification of content.
Best regards,
Alex