[whatwg] Trying to work out the problems solved by RDFa

Benjamin Hawkes-Lewis bhawkeslewis at googlemail.com
Mon Jan 12 14:14:14 PST 2009

On 12/1/09 20:26, Calogero Alex Baldacchino wrote:
> I just mean that, as far as I know, there is no official standard
> requiring UAs to support (parse and expose through the DOM) attributes
> and elements which are not part of the HTML language but are found in
> text/html documents.

Perhaps, but then prior to HTML5, much of what practical user agents 
must do with HTML has not been required by any official standard. ;)

RFC 2854 does say that "Due to the long and distributed development of 
HTML, current practice on the Internet includes a wide variety of HTML 
variants. Implementors of text/html interpreters must be prepared to be 
'bug-compatible' with popular browsers in order to work with many HTML 
documents available the Internet."


HTML 4.01 does recommend that "[i]f a user agent encounters an element 
it does not recognize, it should try to render the element's content" 
and "[i]f a user agent encounters an attribute it does not recognize, it 
should ignore the entire attribute specification (i.e., the attribute 
and its value)".


Clearly these suggestions are incompatible with respect to attributes; 
AFAIK all popular UAs insert unrecognized attributes into the DOM and 
plenty of web content depends on that behaviour.
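
To make the contrast concrete, here is a minimal sketch of a parser that keeps unrecognized attributes rather than dropping them. It uses Python's standard html.parser as a stand-in for a user agent, which is an assumption for illustration only: it is not a browser and builds no DOM. The names AttributeCollector, collect_attributes and the "frobnicate" attribute are hypothetical.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Records every attribute on every start tag, recognized or not."""

    def __init__(self):
        super().__init__()
        self.seen = []  # list of (tag, attribute name, attribute value)

    def handle_starttag(self, tag, attrs):
        # The parser hands over all attributes; nothing is filtered
        # against a list of attributes defined by HTML.
        for name, value in attrs:
            self.seen.append((tag, name, value))


def collect_attributes(markup):
    """Return all (tag, name, value) triples found in the markup."""
    parser = AttributeCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# "about" comes from RDFa; "frobnicate" is a made-up attribute.
sample = '<p about="#me" frobnicate="yes" class="x">Hello</p>'
print(collect_attributes(sample))
```

A parser following the HTML 4.01 suggestion literally would report only "class" here; content that reads "about" or "frobnicate" back from script relies on the other behaviour.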

>> Reuse of "data-*" by DHTML widgets would not impose any additional
>> requirements on user agents, so it would be fine from the perspective
>> elaborated above. It wouldn't change the language by the back door.
> Really? Is it so much different from the case of the pattern attribute
> (which addresses, at the UA and language level, a problem earlier solved
> by scripts -- e.g. getting elements by their ids)? I don't think it's
> very different. From this perspective, if data-* attributes existed
> before the pattern attribute, someone might have used them to declare a
> regex then used by a script implementing a generic checking, and such
> might have been a good reason to add the pattern attribute to form
> inputs, requiring UAs to contrast the input value to its relative
> regular expression (a solution which also works for UAs not supporting
> scripts, for instance).

Just like proprietary elements/attributes introduced with user agent 
behaviours (marquee, autocomplete, canvas), scripted uses of "data-*" 
might suggest new features to be added to HTML, which would then become 
requirements for UAs.

But unlike proprietary elements/attributes introduced with user agent 
behaviours, scripted uses of "data-*" do not impose new processing 
requirements on UAs.
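
As a rough sketch of what a scripted use looks like, the following checks form input values against a regex carried in a hypothetical "data-pattern" attribute. All the checking lives in the script; the parser only has to hand the attribute over. Python's html.parser stands in for the page script here (in a browser this would be JavaScript against the DOM), and PatternChecker and check_inputs are names assumed for illustration.

```python
import re
from html.parser import HTMLParser


class PatternChecker(HTMLParser):
    """Validates <input> values against a script-defined data-pattern."""

    def __init__(self):
        super().__init__()
        self.results = {}  # input name -> True if the value matches

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        pattern = attributes.get("data-pattern")  # hypothetical attribute
        if tag != "input" or pattern is None:
            return
        value = attributes.get("value") or ""
        # The whole value must match, as a form validator would require.
        matched = re.fullmatch(pattern, value) is not None
        self.results[attributes.get("name")] = matched


def check_inputs(markup):
    """Return a mapping of input names to validation results."""
    checker = PatternChecker()
    checker.feed(markup)
    checker.close()
    return checker.results


form = (
    '<input name="zip" value="90210" data-pattern="[0-9]{5}">'
    '<input name="code" value="abc" data-pattern="[0-9]{5}">'
)
print(check_inputs(form))
```

Nothing in this sketch asks the parser to understand "data-pattern"; remove the script and the attribute is inert.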

Therefore, unlike proprietary elements/attributes introduced with user 
agent behaviours, scripted uses of "data-*" impose _no_ design 
constraints on new features.

Establishing user agent behaviours with "data-*" attributes, on the 
other hand, imposes almost as many design constraints as establishing 
them with proprietary elements and attributes. (There's just less 
pollution of the primary HTML "namespace".)

If no RDFa were in deployment, you could argue it would be less wrong 
(from this perspective) to abuse "data-*" than to introduce new attributes.

But to the extent that these attributes are already in use in text/html 
and standardized within the "http://www.w3.org/1999/xhtml" namespace, 
processing requirements are effectively already being imposed on user 
agents (such as not introducing conflicting treatment of the "about" 
attribute). All that adding user agent behaviours with "data-rdfa*" 
attributes would do at this point is add _more_ requirements, without 
rescuing the polluted attributes.

> I also guess that,
> if microformats experience (or the "realworld semantics" they claim to
> be based on) had suggested the need to add a new element/attribute to
> the language, a new element/attribute would have been added.

I'm not really sure what you mean.

(It's watching the microformats community struggle with the problem of 
encoding machine data equivalents, for things like dates and telephone 
number types and measurements, that persuaded me HTML5 should include a 
generic machine data attribute, because it seems likely to me that the 
problem will be recurrent.)
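
For illustration only, here is a sketch of how a consumer might read a machine data equivalent for a date while the human-readable text stays in the element content. The attribute name "data-date" is an assumption made up for this example, not something HTML5 or any microformat defines, and MachineDateReader and read_dates are likewise hypothetical.

```python
from datetime import date
from html.parser import HTMLParser


class MachineDateReader(HTMLParser):
    """Collects ISO 8601 dates from a hypothetical data-date attribute."""

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name != "data-date" or not value:
                continue
            try:
                self.dates.append(date.fromisoformat(value))
            except ValueError:
                # Skip values that are not valid YYYY-MM-DD dates.
                pass


def read_dates(markup):
    """Return the machine-readable dates found in the markup."""
    reader = MachineDateReader()
    reader.feed(markup)
    reader.close()
    return reader.dates


snippet = '<span data-date="2009-01-12">12 January 2009</span>'
print(read_dates(snippet))
```

The same shape would apply to telephone number types or measurements: one attribute for the machine value, free text for people.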

Benjamin Hawkes-Lewis
