[whatwg] Micro-data/Microformats/RDFa Interoperability Requirement

Sat May 9 19:37:45 PDT 2009

Ian Hickson wrote:
> On Thu, 7 May 2009, Manu Sporny wrote:
>> That's certainly not what the WHATWG blog stated just 20 days ago for
>> rel="license" [...]
> 
> The WHATWG blog is an open platform on which anyone can post, and content 
> is not vetted for correctness. Mark can sometimes make mistakes. Feel free 
> to post a correction. :-)

Well, the problem is that I don't know whom to correct - you or Mark.
It's unclear to me whether it's the spec or the blog post that needs
correcting.

> For rel-license, the HTML5 spec defines the value to apply to the content 
> and not the page as a whole. This is a recent change to match actual 
> practice and I will be posting about this shortly.

Hmm, yes - after re-reading the definitions, they do differ...
especially in how the hAudio Microformat uses rel="license". I find the
HTML5 one to be very problematic. Microformats rel="license" is better,
and the RDFa use of rel="license" is even better (I can go into the
reasoning if those on the list are curious).

For example, in HTML5, how do you express 20 items on a page, each with
separate licenses? How do you differentiate a page that has 3 primary
topics, each with a separate license?

In short - what's the purpose of rel="license" if a machine can't use it
to help the person browsing identify important sections of a page?
After all, it's only machine readable, isn't it? What's the sense in
having rel="license" if a machine can't be sure of the section of the
page to which it applies?

>>> Surely this is what namespaces were intended for.
>> Uhh, what sort of namespaces are we talking about here? xmlns-style, 
>> namespaces?
> 
> The idea of XML Namespaces was to allow people to extend vocabularies
> with a new features without clashing with older features by putting the 
> new names in new namespaces. It seems odd that RDFa, a W3C technology for 
> an XML vocabulary, didn't use namespaces to do it.

As you are aware, the RDF in XHTML 1.1 Task Force was created to figure
out a way to express RDF via XHTML. The standard mechanism for extending
XHTML is XHTML Modularization[1]. From the XHTML modularization spec:

"""
This modularization provides a means for subsetting and extending
XHTML... this specification is intended for use by language designers as
they construct new XHTML Family Markup Languages.
"""

We used the standard mechanism, specifically designed to extend XHTML
vocabularies, approved by the XHTML working group, the W3C TAG and a
number of other W3C and public web groups, to extend the language.

Had we been tasked with expressing RDF in XML, we would have called it
RDF/XML[2]... :)

>>>>> For example, the way that "n:next" and "next" can end up being 
>>>>> equivalent in RDFa processors despite being different per HTML rules 
>>>>> (assuming an "n" namespace is appropriately declared).
>>>> If they end up being equivalent in RDFa, the RDFa author did so 
>>>> explicitly when declaring the 'n' prefix to the default prefix 
>>>> mapping and we should not second-guess the authors intentions.
>>> My only point is that it is not compatible with HTML4 and HTML5, 
>>> because they end up with different results in the same situation (one 
>>> can treat two different values as the same, while the other can treat 
>>> two different values as different).
>> It is only not compatible with HTML5 if this community chooses for it to 
>> not be compatible with HTML5. Do you agree or disagree that we shouldn't 
>> second guess the authors intentions if they go out of their way to 
>> declare a mapping for 'n'?
> 
> I don't think that's a relevant question. My point is that it is possible 
> in RDFa to put two strings that have different semantics in HTML4 and yet 
> have them have the same semantics in RDFa. This means RDFa is not 
> compatible with HTML4.

Of course it's relevant - the whole reason there are two strings with
the same semantics, in your rather contrived example, is because the
author went out of their way to make the statement. This doesn't happen
by accident - the web page author intended it to happen.

More importantly - you've just made the same statement twice in RDFa and
once in HTML4. I can't think of a single technically significant
negative repercussion for generating a duplicate triple in a corner
case. Why does one duplicated triple in a contrived example mean that
the entirety of RDFa isn't compatible with HTML4?
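
To make the duplicate-triple point concrete, here is a minimal sketch,
not code from librdfa or any other RDFa processor. It assumes that a
bare rel value such as "next" resolves against the XHTML vocabulary URI,
that the author has explicitly declared 'n' to map to that same URI, and
that a value with an undeclared prefix is ignored. The expandRel helper
and the example.org URIs are hypothetical:

```javascript
// Assumption: bare rel values resolve against the XHTML vocabulary.
const XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#";

// Hypothetical helper: expand a rel value to a full URI using the
// prefix mappings currently in scope.
function expandRel(value, prefixes) {
  const colon = value.indexOf(":");
  if (colon === -1) {
    // Bare value such as "next".
    return XHTML_VOCAB + value;
  }
  const prefix = value.slice(0, colon);
  const reference = value.slice(colon + 1);
  if (!prefixes.has(prefix)) {
    return null; // undeclared prefix: no triple in this sketch
  }
  return prefixes.get(prefix) + reference;
}

// The author went out of their way to declare 'n' like this.
const prefixes = new Map([["n", XHTML_VOCAB]]);

const subject = "http://example.org/page1"; // hypothetical
const object = "http://example.org/page2";  // hypothetical

// A set of triples: stating the same triple twice changes nothing.
const triples = new Set();
for (const rel of ["next", "n:next"]) {
  const predicate = expandRel(rel, prefixes);
  if (predicate !== null) {
    triples.add(`<${subject}> <${predicate}> <${object}> .`);
  }
}

console.log(triples.size); // 1
```

Both values expand to the same predicate only because of the explicit
mapping for 'n'; with a different mapping, "n:next" would yield a
different predicate and a second, distinct triple.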

Besides, if you see this as an issue, why don't you see the semantic
difference between rel="alternate"[3] in HTML4 and rel="alternate"[4]
in HTML5 as being an issue? That case is even worse: exactly the same
string, entirely different semantics.

If HTML4 validation is a concern, there's even a preliminary HTML4+RDFa
DTD that is available:

http://www.w3.org/MarkUp/DTD/html4-rdfa-1.dtd

I do get your point - but why should we be concerned about it?

>>> Browser vendors would not accept having to resolve prefixes in 
>>> attribute values as part of processing link relations.
>> Why not?
> 
> You would have to ask them. I tend not to argue with implementor feedback. 
> If they tell me they won't do something, I don't tell them to do it.

I would expect the primary editor for a web specification to understand
the reason that every single one of their implementors refuses to
implement a technique that they use elsewhere in their products. Since
I've successfully implemented this stuff[5], I know how easy it is to
do what we're proposing.

I'd love to ask each of them why it is perceived as difficult and
discuss possible solutions. A public e-mail would be best so that we can
discuss on this list, but a private e-mail would be fine as well. So, if
your organization has decided not to resolve prefixes in attribute
values, please send me an e-mail.

After two weeks, I'll check back in with the mailing list to report on
the number of responses I received and a summary of the reasoning
(anonymized, of course) for the benefit of this community.

> This by the way is the complaint regarding "QNames in content" and "QNames 
> in attribute values". CURIEs don't change this, which is why you'll find a 
> lot of people just say that CURIEs are no different to QNames.

We are having a technical discussion and therefore it behooves us to be
technically accurate when describing a particular technology. It is
technically inaccurate to refer to CURIEs and QNames interchangeably
just as it is technically inaccurate to refer to HTML5 and HTML4
interchangeably.

I should also point out that there has been no technical reason
expressed as to why "the complaint" is valid.

>> The only similarity they have is that they can be expanded to full URIs.
> 
> That's the complaint. (Specifically, that they can do so in a dynamic 
> fashion -- e.g. what happens if a user changes a prefix="" or xmlns:*="" 
> attribute value dynamically from script? Do all the RDFa triples change 
> too?)

If you run the RDFa processing rules right after the change, yes, of
course a subset of the RDFa triples would change. Why is this a
technical issue?

Since the author has decided, via code, to change the prefix mappings,
they are making a statement that they would like the semantics of the
page to change.
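
As a rough sketch of what "run the processing rules right after the
change" means, and not how any particular RDFa processor is written:
the extractPredicates helper, the 'cc' prefix and the example.org
namespace below are all hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper: expand each prefixed rel value against whatever
// prefix mappings are in scope at the moment it is called.
function extractPredicates(relValues, mappings) {
  const predicates = [];
  for (const value of relValues) {
    const colon = value.indexOf(":");
    if (colon === -1) continue; // bare values are out of scope here
    const base = mappings.get(value.slice(0, colon));
    if (base !== undefined) {
      predicates.push(base + value.slice(colon + 1));
    }
  }
  return predicates;
}

const relValues = ["cc:license"];
const mappings = new Map([["cc", "http://creativecommons.org/ns#"]]);

// First pass over the page.
const before = extractPredicates(relValues, mappings);

// A script rewrites the mapping, as in the xmlns:*="" question.
mappings.set("cc", "http://example.org/other#");

// Second pass: the same markup now yields a different predicate.
const after = extractPredicates(relValues, mappings);

console.log(before[0]); // http://creativecommons.org/ns#license
console.log(after[0]);  // http://example.org/other#license
```

Nothing changes until the rules are run again, and the new result
reflects exactly the mapping the author's code put in place.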

Put another way, if you do this:

var color="red";

and then use 'color' in your CSS styling via JavaScript for one
iteration of your page render (say, to set some red text), and then
change the color to 'green' and do a second iteration of your page
render - wouldn't you expect the text that was 'red' to change to
'green'?

We do this sort of thing all the time with CSS. What is the exact
technical issue with this approach?

>>> The problem with QNames in attributes is that they require the 
>>> attribute processor to have information from the namespace processor, 
>>> and as far as I can tell this continues to exist in RDFa.)
>> If that's really the problem, why don't you just have a prefix processor 
>> that the attribute processor relies on and drop the namespace processor 
>> entirely?
> 
> There are two fundamental problems, one is with having the prefix 
> processor be used by code that is in the XML parser and code that is very 
> separate from the XML parser (e.g. based on DOM code) -- implementors find 
> combining the two to be highly impractical 

I find that hard to believe. It may be valid, but we have a number of
entirely SAX-based and DOM-based implementations of RDFa in XHTML done
in a variety of languages (C, JavaScript, Python, Perl, Ruby, C#, etc.)
and not one of our implementors complained about this "problem".

> and the other is that the 
> prefix processor has to handle dynamic changes to prefixes.

Again, why is this a technical issue?

-- manu

[1]http://www.w3.org/TR/2008/REC-xhtml-modularization-20081008/
[2]http://www.w3.org/TR/rdf-syntax-grammar/
[3]http://www.w3.org/TR/html401/types.html#type-links
[4]http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#link-type-alternate
[5]http://rdfa.digitalbazaar.com/librdfa/trac/

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/