[whatwg] Fuzzbot (Firefox RDFa semantics processor)
Martin Atkins
mart at degeneration.co.uk
Sun Jan 11 21:23:04 PST 2009
Ian Hickson wrote:
>
>> They have already solved some problems with RDF and wish only to adapt
>> this generalized solution to work in HTML, while you wish to re-solve
>> all of these problems from the ground up.
>
> I don't necessarily wish to resolve the problems -- if they have existing
> good solutions, I'm all in favour of reusing them. I just want to know
> what those problems are that we're solving, so that we can make sure that
> the solutions we're adopting are in fact solving the problems we want to
> solve. It would be irresponsible to add features without knowing why.
>
I would assume that our resident proponents are already satisfied that
their higher-level problems have been solved, and this is why they're
frustrated that you won't just let them map their existing solutions
into HTML all in one fell swoop.
I'm not sure I'd put myself into the "RDF proponent" bucket, but I do
know one use-case of RDF that I've encountered frequently so I'll post
it as a starting point.
The FOAF schema for RDF[0] addresses the problem of making personal
profile data machine-readable along with some of the relationships
between people. From the outside looking in, it seems that the goal they
set themselves was to make machine-readable the sort of information you
find on a social networking site.
One problem this can solve is that an agent can, given a URL that
represents a person, extract some basic profile information such as the
person's name along with references to other people that person knows.
This can further be applied to allow a user who provides his own URL
(for example, by signing in via OpenID) to bootstrap his account from
existing published data rather than having to re-enter it.
Google Social Graph API[1] apparently makes use of FOAF (when serialized
as XML) as one of the sources of data so that given a URL that
represents a person it can return a list of URLs that represent friends
of that person.
The Google Profiles application[2] makes use of the output of the Social
Graph API to suggest URLs that a user might want to list on his profile
page, so the user only needs to fill in a couple of URLs by hand.
So, to distill that into a list of requirements:
- Allow software agents to extract profile information for a person, of
the kind often exposed on social networking sites, from a page that
"represents" that person.
There are a number of existing solutions for this:
* FOAF in RDF serialized as XML, Turtle, RDFa, eRDF, etc.
* The vCard format
* The hCard microformat
* The PortableContacts protocol[3]
* Natural Language Processing of HTML documents
- Allow software agents to determine who a person lists as their friends
given a page that "represents" that person.
Again, there are competing solutions:
* FOAF in RDF serialized as XML, Turtle, RDFa, eRDF, etc.
* The XFN microformat[4]
* The PortableContacts protocol[3]
* Natural Language Processing of HTML documents
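
To make the second requirement concrete, here is a minimal sketch of how
an agent might pull friend links out of a page using XFN rel values. The
sample markup, the alice.example-style URLs, the extract_friends name and
the choice of which rel values count as "friends" are all assumptions
made for illustration; only the rel values themselves (friend,
acquaintance, contact, met) come from XFN.

```python
from html.parser import HTMLParser

# XFN rel values treated as "friendship" here; which values to count
# is a choice made for this example, not something XFN mandates.
FRIEND_RELS = {"friend", "acquaintance", "contact"}


class XFNFriendParser(HTMLParser):
    """Collect the href of every <a> whose rel carries one of FRIEND_RELS."""

    def __init__(self):
        super().__init__()
        self.friends = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel is a space-separated list of values; it may be missing or empty.
        rels = set((attr_map.get("rel") or "").split())
        href = attr_map.get("href")
        if href and rels & FRIEND_RELS:
            self.friends.append(href)


def extract_friends(html):
    """Return the friend URLs found in an HTML string (hypothetical helper)."""
    parser = XFNFriendParser()
    parser.feed(html)
    parser.close()
    return parser.friends


# Made-up page fragment standing in for a person's profile page.
SAMPLE_PAGE = """
<ul>
  <li><a href="http://alice.example/" rel="friend met">Alice</a></li>
  <li><a href="http://bob.example/" rel="acquaintance">Bob</a></li>
  <li><a href="http://example.com/about">About this site</a></li>
</ul>
"""

friends = extract_friends(SAMPLE_PAGE)
```

The link without a rel attribute is ignored, so the human-readable list
of friends doubles as the machine-readable one.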
-----------------------------------------------
Assuming that the above is a convincing problem domain, now let's add in
the following requirement:
- Allow the above to be encoded without duplicating the data in both
machine-readable and human-readable forms.
Now our solution list is reduced to (assuming we consider both
requirements together):
* FOAF in RDF serialized as RDFa or eRDF
* The hCard microformat + the XFN microformat
* Natural Language Processing of HTML documents
All three of the above options address the use-cases as I stated them --
the Social Graph API apparently uses all three if you're willing to
consider a MySpace-specific "screen-scraper" as Natural Language
Processing -- so what would be the advantages of the first solution?
* Existing RDF-based systems can use an off-the-shelf RDFa or eRDF
parser and get the same data model (RDF triples of FOAF predicates) that
they were already getting from the XML and Turtle RDF serializations,
reducing the amount of additional work that must be done to consume this
format.
* FOAF has an extensive vocabulary that's based on fields that have
been observed on social networking sites, while hCard is built on vCard
which has a more constrained scope intended for the sort of entries
you'd expect to find in an "address book".
* FOAF has been adopted -- usually in the RDF/XML serialization -- by
some number of social networking sites (e.g. LiveJournal) so they are
presumably already somewhat familiar with the FOAF vocabulary and may
therefore be able to adopt it more easily in the RDFa or eRDF
serializations.
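
The first advantage can be sketched roughly as follows: code written
against triples does not care which serialization produced them. The
triples here are hand-written stand-ins for parser output, and
profile_from_triples and the example.org URIs are hypothetical names made
up for this illustration; only the FOAF namespace and the name and knows
predicates are real.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def profile_from_triples(triples, person):
    """Summarise the foaf:name and foaf:knows statements about `person`.

    `triples` is any iterable of (subject, predicate, object) tuples,
    regardless of whether they came from RDF/XML, Turtle, RDFa or eRDF.
    """
    names = []
    knows = []
    for subject, predicate, obj in triples:
        if subject != person:
            continue
        if predicate == FOAF + "name":
            names.append(obj)
        elif predicate == FOAF + "knows":
            knows.append(obj)
    return {"name": names[0] if names else None, "knows": knows}


# Stand-in for the output of an off-the-shelf parser (assumed data).
ME = "http://example.org/mart#me"
triples = [
    (ME, FOAF + "name", "Mart"),
    (ME, FOAF + "knows", "http://example.org/alice#me"),
    (ME, FOAF + "knows", "http://example.org/bob#me"),
    ("http://example.org/alice#me", FOAF + "name", "Alice"),
]

profile = profile_from_triples(triples, ME)
```

Swapping the RDF/XML parser for an RDFa one would change how the triples
list is produced, but not this function.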
Though there are of course also some disadvantages:
* Some sites are already publishing XFN and/or hCard so consuming
software would need to continue to support these in addition to
FOAF-in-HTML-somehow, which is more work than supporting only XFN and
hCard. (In other words, "XFN/hCard already work today")
* RDFa requires extensions to the HTML language, while XFN, hCard and
NLP do not.
* Many existing FOAF parsers are not actually RDF parsers but are
rather using stock XML parsers and assuming a particular tree layout, so
they would not be able to reuse any code in processing triples from RDFa
or eRDF.
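
The last point can be illustrated with a sketch of the kind of
layout-dependent FOAF consumer described there. naive_foaf_names is a
hypothetical function and both documents are made-up examples; they
encode the same triples in RDF/XML, but only the first has the tree
layout the code expects.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def naive_foaf_names(rdf_xml):
    """Read names by assuming <foaf:Person> elements sit directly under
    <rdf:RDF>, each with a <foaf:name> child. This is XML tree walking,
    not RDF parsing."""
    root = ET.fromstring(rdf_xml)
    names = []
    for person in root.findall("{%s}Person" % FOAF_NS):
        name = person.find("{%s}name" % FOAF_NS)
        if name is not None:
            names.append(name.text)
    return names


# Layout the naive code expects: a typed node element.
TYPED_NODE = """
<rdf:RDF xmlns:rdf="%s" xmlns:foaf="%s">
  <foaf:Person rdf:about="http://example.org/alice#me">
    <foaf:name>Alice</foaf:name>
  </foaf:Person>
</rdf:RDF>
""".strip() % (RDF_NS, FOAF_NS)

# Same triples, written with rdf:Description and an explicit rdf:type.
DESCRIPTION_NODE = """
<rdf:RDF xmlns:rdf="%s" xmlns:foaf="%s">
  <rdf:Description rdf:about="http://example.org/alice#me">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
    <foaf:name>Alice</foaf:name>
  </rdf:Description>
</rdf:RDF>
""".strip() % (RDF_NS, FOAF_NS)

found_typed = naive_foaf_names(TYPED_NODE)
found_description = naive_foaf_names(DESCRIPTION_NODE)
```

The function finds the name in the first document and nothing in the
second. Code like this never builds triples, so it has nothing to reuse
when the same data arrives as RDFa or eRDF.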
-------------------------------------
Is this the sort of thing you're looking for, Ian?
Much of the above section could be applied to any other RDF vocabulary
with a bit of search and replace, but I'll leave that to others since
FOAF is the only RDF vocabulary with which I have any experience.
(If I've misrepresented any of the facts about FOAF or RDF, I'm happy
to be corrected. I'm writing this only in an attempt to move the
discussion forward; I'm currently neutral on whether RDFa should be
adopted into HTML5.)
[0] http://www.foaf-project.org/
[1] http://code.google.com/apis/socialgraph/
[2] http://www.google.com/support/accounts/bin/answer.py?answer=97703&hl=en
[3] http://portablecontacts.net/
[4] http://www.gmpg.org/xfn/