[whatwg] RDFa Problem Statement (was: Creative Commons Rights Expression Language)

Manu Sporny msporny at digitalbazaar.com
Mon Aug 25 18:50:09 PDT 2008


Ian Hickson wrote:
> I have no idea what problem RDFa is trying to solve. I have no idea what 
> the requirements are.

Ian, this is not an official response from the RDF in XHTML Task Force
or the Semantic Web Deployment Workgroup. It is a personal attempt to
outline some of the problems that RDFa is addressing.

Web browsers currently do not understand the meaning behind human
statements or concepts on a web page. While this may seem academic, it
has direct implications on website usability. If web browsers could
understand that a particular page was describing a piece of music, a
movie, an event, a person or a product, the browser could then help the
user find more information about the particular item in question. It
would help automate the browsing experience. Not only would the browsing
experience be improved, but search engine indexing quality would be
better due to a spider's ability to understand the data on the page with
more accuracy.

Currently, browsing is a very manual process, requiring a user to find
information about a particular subject and then copy-paste that
information from one page into another search page to explore
information about the subject. If we are to automate the browsing
experience and deliver a more usable web experience, we must provide a
mechanism for describing, detecting and processing semantics. If we are
to improve the search and indexing accuracy of web spiders, we must
provide a mechanism to describe, detect and process semantics.

Here's a really short intro video on why semantics are important (I'm
sure you already know all this stuff, but it outlines why online
communities are concerned about semantics and is meant to educate those
that aren't familiar with the concept of web semantics):

http://www.youtube.com/watch?v=OGg8A2zfWKg

The Microformats community has done a remarkable job of working on the
web semantics problem, creating several different methods of expressing
common human concepts (contact information (hCard), events (hCalendar),
and audio recordings (hAudio)). The method employed by the Microformats
community to embed semantics in web pages, using pre-existing HTML4 tags
and re-purposing them, was taken because none of the standards bodies
were effectively tackling the problem of embedded web semantics at the
time. In short, the community did the best that they could with what was
available to them at the time.

The results of the first set of Microformats efforts were some pretty
cool applications, like the following one demonstrating how a web
browser could forward event information from your PC web browser to your
phone via Bluetooth:

http://www.youtube.com/watch?v=azoNnLoJi-4

Here is another demonstration of how one could use music metadata
embedded in a web page to find more information about your favorite band:

http://www.youtube.com/watch?v=oPWNgZ4peuI

or how one could use movie metadata on a web page to find more
information about a movie:

http://www.youtube.com/watch?v=PVGD9HQloDI

The Mozilla Labs Aurora demos also show that semantic web markup is
necessary in order to execute upon some of the ideas demonstrated in
their future browsers project:

Aurora - Part 1 - Collaboration, History, Data Objects, Basic Navigation
http://www.vimeo.com/1450211

Aurora - Part 2 - Geo-location-based browsing
http://www.vimeo.com/1476338

Aurora - Part 3 - Integrating Web w/ Physical Environment
http://www.vimeo.com/1481810 (non-WebHD)

Aurora - Part 4 - Personal Data Portability
http://www.vimeo.com/1488633

Both RDFa and Microformats enable these user interaction scenarios and
make browsing the web a richer experience.

If one understands web semantics to be an important part of the web's
future, the question then becomes, why RDFa? Why not Microformats?

While there are a number of technical merits that speak in favor of RDFa
over Microformats (fully qualified vocabulary terms, prefix short-hand
via CURIEs, accessibility-friendly, unified processing rules, etc.),
this issue really boils down to one of centralized innovation vs.
distributed innovation.

The Microformats community, and all communities like it, require a group
of people to come together, collaborate and create a standard vocabulary
to express ALL semantics. A somewhat strained analogy would be bringing
in representatives from all of the cultures of the world and having them
agree on a universal vocabulary. It is an untenable prospect, there is
too much diversity in the world to agree on one master vocabulary. This
is, however, the approach that Microformats has taken, for better or worse.

When you do not scope vocabularies, like the Microformats community has
chosen to do, you force new vocabulary development through a design
bottleneck. This isn't a theoretical bottleneck, it is one that we deal
with each day in the Microformats community.

The RDFa approach is to remove this vocabulary development bottleneck by
addressing the problem of creating a method of semantics expression. The
web has always relied on distributed innovation and RDFa allows that
sort of innovation to continue by solving the tenable problem of a
semantics expression mechanism. Microformats has no such general purpose
solution.

In short, RDFa addresses the problem of a lack of a standardized
semantics expression mechanism in HTML family languages. RDFa not only
enables the use cases described in the videos listed above, but all use
cases that struggle with enabling web browsers and web spiders
understand the context of the current page.

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: Bitmunk 3.0 Website Launches
http://blog.digitalbazaar.com/2008/07/03/bitmunk-3-website-launches




More information about the whatwg mailing list