[whatwg] Micro-data/Microformats/RDFa Interoperability Requirement

Manu Sporny msporny at digitalbazaar.com
Thu May 7 07:56:46 PDT 2009


Ian Hickson wrote:
>> - Not triggering output in a Microformats/RDFa parser as a side-effect
>>   of WHATWG micro-data markup.
>> - Not creating an environment where WHATWG micro-data markup breaks or
>>   eliminates Microformats/RDFa markup.
> 
> This isn't possible. 

Things that are impossible just take longer, right :)

In all seriousness, I think there was a miscommunication here. The
things that I'm thinking about are possible; I'll elaborate and clarify
below.

> Even a regular HTML5 document with no microdata 
> annotations that links in a style sheet ends up triggering output in 
> Microformats and RDFa parsers 

This is fine, and it is what legacy markup already does in any RDFa
processor that purposefully doesn't throw an error when processing a
non-XHTML1.1+RDFa document. By design, RDFa generates triples for
certain rel/rev values that were semantic terms in HTML4.
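
To illustrate the kind of output I mean, here is a minimal Python sketch
of a processor generating triples for reserved rel values. The function
name rel_triples, the short list of reserved values, and the example.org
URLs are assumptions made for illustration; an actual RDFa processor
recognises a longer list and does a great deal more.

```python
# Sketch only: a small, illustrative subset of the rel values that
# XHTML+RDFa treats as reserved terms.
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"
RESERVED_REL = {"stylesheet", "next", "prev", "license", "alternate"}


def rel_triples(subject, rel, href):
    """Return (subject, predicate, object) triples for reserved rel values.

    Values that are not in RESERVED_REL produce no triple at all.
    """
    triples = []
    for value in rel.split():
        term = value.lower()
        if term in RESERVED_REL:
            triples.append((subject, XHTML_VOCAB + term, href))
    return triples


if __name__ == "__main__":
    # A plain stylesheet link yields one triple; an unknown value yields none.
    print(rel_triples("http://example.org/", "stylesheet",
                      "http://example.org/style.css"))
    print(rel_triples("http://example.org/", "nofollow",
                      "http://example.org/elsewhere"))
```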

> -- Microformats because any use of 'class' 
> can clash with a Microformats class name, 

Yes, and unintended clashes with Microformats are fine as long as
whatever micro-data solution this community adopts doesn't do
something like:

<div class="vcard"><span class="fn">Rupert Giles</span></div>

and then define meanings for 'fn' inside an hCard that conflict with
what the Microformats community has stated it means.
Since Microformats don't use named scopes for most of their vocabulary
terms, it's far easier to clash with those terms than it is to clash
with the RDFa terms.

The RDFa community was very careful to not negatively impact existing
Microformats markup. We purposefully did not re-use @class because in
the RDFa designs that did re-use it, it caused a number of
clashes/issues. Similarly, we did not re-use @id because authors were
already using it in a way that did not mesh well with semantic markup
concepts.

So, this argument isn't "Don't use @class at all", but rather "Don't
create ambiguity in @class where there is none currently."

Not redefining terms to mean something different from what the
Microformats community has already defined should be a design
requirement. For example, don't create an ambiguous situation where a
Microformats parser would parse a vcard/fn and understand that this is
an hCard Microformat with a "Formatted Name" of "Rupert Giles", while an
HTML5 micro-data parser would parse the same markup and determine that
you're talking about something that is semantically different.
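
To make the concern concrete, here is roughly what a Microformats-style
consumer does with the vcard/fn markup above. This is a minimal sketch
using only Python's standard html.parser; the class name HCardNames and
the helper formatted_names are hypothetical, and a real hCard parser
handles many more properties and edge cases.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be tracked.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class HCardNames(HTMLParser):
    """Collect text of class="fn" elements nested inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one set of class names per open element
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = set((dict(attrs).get("class") or "").split())
        self.stack.append(classes)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack:
            return
        # "fn" only counts when some ancestor element carries "vcard".
        inside_vcard = any("vcard" in c for c in self.stack[:-1])
        if inside_vcard and "fn" in self.stack[-1]:
            self.names.append(text)


def formatted_names(html):
    parser = HCardNames()
    parser.feed(html)
    return parser.names


if __name__ == "__main__":
    print(formatted_names(
        '<div class="vcard"><span class="fn">Rupert Giles</span></div>'))
```

Any micro-data parser that reads the same two class names but assigns
them a different meaning would disagree with this output for identical
markup, which is the ambiguity I'd like us to avoid.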

> and RDFa because any use of the "rel" attribute can do the same.

No, not /any/ use - /specific/ uses of rel, and then only if the HTML5
micro-data solution does something that is counter to how RDFa uses the
attribute or the value. Really, the @rel values are only defined for
XHTML1.1, so this is technically less of a concern for RDFa than it is
for Microformats.

The most important issue for RDFa is that attributes already defined by
XHTML1.1+RDFa (such as @about, @property, @datatype, @resource, @content
and @typeof) are not re-used in HTML5 unless they have exactly the same
use there.

> It's also not clear to me what RDFa's position in text/html is. As I 
> understand it, RDFa only applies to XHTML. Thus it seems that HTML5 has 
> already broken compatibility with RDFa, since it requires processors to 
> handle text/html content in a non-XML manner.

Not exactly. The processing rules for RDFa are more-or-less XML-agnostic
(by design) and can be made completely XML-agnostic with a minimal set
of known changes.

There is this lingering issue of xmlns in HTML5, but even if WHATWG
removes xmlns entirely from HTML5 (which we don't want you to do), there
is still the @prefix alternative[1] for RDFa.

RDFa also uses @rev extensively, so removing that attribute from HTML5
has some fairly strong negative consequences for RDFa, which is why we
would prefer that it not be removed.

> Similarly, the rules for handling CURIEs in RDFa, especially in rel="", 
> are already incompatible with HTML4 and HTML5 rules. For example, the way 
> that "n:next" and "next" can end up being equivalent in RDFa processors 
> despite being different per HTML rules (assuming an "n" namespace is 
> appropriately declared).

If they end up being equivalent in RDFa, the RDFa author made them so
explicitly by mapping the 'n' prefix to the same URI as the default
prefix mapping, and we should not second-guess the author's intentions.
Unless I'm missing something, "n:next" and "next" cannot accidentally
end up being equivalent: they cannot become equivalent due to language
ambiguity and they cannot become equivalent due to a bug in the current
RDFa processing rules.
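
A simplified sketch of that point in Python: "n:next" and "next" only
expand to the same URI when the author has declared 'n' to be the XHTML
vocabulary URI. The function name expand_rel, the reduced RESERVED set,
and the example.org namespace are assumptions for illustration, and the
real processing rules cover more cases than this.

```python
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"
RESERVED = {"next", "prev", "license"}  # illustrative subset only


def expand_rel(value, prefixes):
    """Expand a rel value to a full URI, or None if it yields no triple."""
    if ":" in value:
        prefix, reference = value.split(":", 1)
        if prefix in prefixes:
            return prefixes[prefix] + reference
        return None  # undeclared prefix: ignored in this sketch
    if value in RESERVED:
        return XHTML_VOCAB + value
    return None


if __name__ == "__main__":
    # Author explicitly maps 'n' to the default vocabulary: same URI.
    same = {"n": XHTML_VOCAB}
    print(expand_rel("n:next", same) == expand_rel("next", same))

    # Author maps 'n' to anything else: the two values stay distinct.
    other = {"n": "http://example.org/nav#"}
    print(expand_rel("n:next", other) == expand_rel("next", other))
```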

We were very careful about not accidentally creating the ambiguity you
describe. So, I'm asserting that the rules for handling CURIEs in RDFa
are not incompatible with HTML4 and HTML5 rules. Do you have any other
examples that counter this point? Please either confirm or deny so that
I can make note of it on the RDFa wiki.

In any case, this discussion is about HTML5 /not/ creating ambiguity for
values that Microformats/RDFa uses /if/ a different micro-data solution
is chosen. So, to solve this issue in a world where HTML5 adopts a
different micro-data solution from Microformats/RDFa: don't adopt xmlns
(which sounds like the direction you're going) and don't use @prefix to
declare prefix mappings (which it sounds like you don't want to do
anyway, so it shouldn't be a big deal).

> I don't think there's much that can be done about this (this isn't 
> something that we can change HTML5 rules for; browser vendors would not 
> accept having to resolve QNames in rel="" attributes as part of 
> processing, for one).

This has been explained many[2] times[3] now[4]: CURIEs are not QNames.
If you have an issue with CURIEs, please state the exact issue that you
have with CURIEs and don't use a false analogy.
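
One concrete difference, sketched in Python: the local part of a QName
must be a valid XML name, so it cannot start with a digit, whereas a
CURIE is just a prefix plus a reference that gets concatenated onto a
URI. The NCNAME pattern below is an ASCII-only approximation, and the
function names and the urn:isbn: mapping are assumptions chosen for
illustration.

```python
import re

# ASCII-only approximation of an XML NCName; the real production
# permits many more characters, but never a leading digit.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def is_qname(value):
    """Return True if value looks like a valid (prefixed) QName."""
    prefix, sep, local = value.partition(":")
    if not sep:
        return bool(NCNAME.match(value))
    return bool(NCNAME.match(prefix)) and bool(NCNAME.match(local))


def expand_curie(value, prefixes):
    """Concatenate the declared prefix URI with the reference part."""
    prefix, sep, reference = value.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + reference
    return None


if __name__ == "__main__":
    value = "isbn:0321154991"
    print(is_qname(value))                               # not a QName
    print(expand_curie(value, {"isbn": "urn:isbn:"}))    # still expands
```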

Browser vendors already resolve things like class="big-bold-letters"
using CSS stylesheets. They are in the business of resolving values in
attributes, so what makes this any different? Nobody has been able to
explain this argument effectively to me. Does a wiki page exist that
explains it in detail? If not, can somebody create one?

Again, this particular thread is about Micro-data/Microformats/RDFa
Interoperability. If HTML5 creates its own micro-data solution, don't
define any CURIE-like behavior in that solution that would conflict with
Microformats and RDFa.

>> I think these are implied since HTML5 has gone to great lengths to
>> provide backward compatibility.
> 
> Backwards compatibility in HTML5 is primarily concerned with being 
> compatible with legacy markup, of which there is very little when it comes 
> to either RDFa or Microformats (especially RDFa, since there's so little 
> XHTML content for it to be found in).

When is the cut-off date for this? What constitutes legacy markup for
HTML5? I ask because the finishing touches for HTML5 aren't supposed to
be done until 2022, so I would expect that all markup before that date
would constitute legacy markup. As you know, RDFa is being integrated
into Drupal core as we speak, which means that we can expect roughly 1.4
million sites to have RDFa integrated within 4 months of the next Drupal
release[5]. Is that enough? How many sites must adopt a particular web
technology for them to be considered valid legacy sites per HTML5's
metrics? Could this information be placed into a wiki page if it isn't
already?

-- manu

[1]http://rdfa.info/wiki/Alternate-prefix-declaration-mechanism
[2]http://lists.w3.org/Archives/Public/www-html-editor/2008OctDec/0000.html
[3]http://www.w3.org/TR/curie/#s_intro
[4]http://rdfa.info/wiki/Developer-faq#QNames_have_been_identified_as_a_known_anti-pattern.2C_does_RDFa_revive_QName_use.3F
[5]http://buytaert.net/drupal-download-statistics-2008

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/


