[whatwg] Annotating structured data that HTML has no semantics for

Mon May 18 07:44:21 PDT 2009

On May 18, 2009, at 16:05, Eduard Pascual wrote:

> On Mon, May 18, 2009 at 10:38 AM, Henri Sivonen <hsivonen at iki.fi> wrote:
>> (If we were limited to reasoning about something that we don't have
>> experience with yet, I might believe that people can't be too inept to
>> use prefix-based indirection. However, a decade of actual evidence
>> shows that actual behavior defies reasoning here and prefix-based
>> indirection is something that both authors and implementors get wrong
>> over and over again.)
> Curious: you refer to "a decade of actual evidence", but you fail to
> refer to any actual evidence. I'm eager to see that evidence; could
> you share it with us? Thank you.

I thought everyone had seen the confusion. There are pointers at
http://wiki.whatwg.org/wiki/Namespace_confusion
The wiki page is less than a decade old, so its length isn't quite
that impressive.

>>> I have been a Java programmer for some years, and
>>> still find that convention absurd, horrible, and annoying. I'll agree
>>> that CURIEs are ugly, and maybe hard to understand, but reversed
>>> domains are equally ugly and hard to understand.
>>
>> Problems shared by CURIEs, URIs and reverse DNS names:
>>  * Long.
>>  * Identifiers outlive organization charts.
> Ehm. CURIEs ain't really long: the main point of prefixes is to make
> them as short as reasonably possible.

You need to consider the length of the prefix declarations, too.

>> Problems that reverse DNS names and URIs don't have but CURIEs have:
>>  * Prefix-based indirection.
> Indirection can't be taken as a problem when most currently used RDFa
> tools don't use it at all (which proves that they can work without
> relying on it).

What do you mean? Current RDFa tools don't use prefixes?
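
For readers unfamiliar with the term, here is a minimal sketch of what
prefix-based indirection involves. It is not taken from any RDFa tool:
expand_curie and both prefix maps are hypothetical names made up for
illustration, and the second namespace URI is a deliberately bogus
example. The point is only that a CURIE has no meaning without the
declarations in scope, and that the declaration counts toward the
total length.

```python
# Hypothetical helper: expand a CURIE using the prefix declarations in scope.
def expand_curie(curie, prefixes):
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        # Without a declaration the token cannot be resolved at all.
        raise KeyError(f"no declaration in scope for prefix {prefix!r}")
    return prefixes[prefix] + reference

# The same token under two different sets of declarations (illustrative data).
doc_a = {"dc": "http://purl.org/dc/elements/1.1/"}
doc_b = {"dc": "http://example.org/not-dublin-core/"}

print(expand_curie("dc:title", doc_a))
print(expand_curie("dc:title", doc_b))

# Length accounting: the short token plus the declaration it depends on.
declaration = 'xmlns:dc="http://purl.org/dc/elements/1.1/"'
print(len("dc:title"), len("dc:title") + len(declaration))
```

A consumer that copies "dc:title" from one document to another, or that
compares the literal prefix instead of the expanded URI, gets a
different answer depending on which map happens to be in scope.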

>>>> (I understand that if the microdata syntax offered no advantages over
>>>> RDFa, then it would be a wasted effort to diverge.
>>>
>>> Which are the advantages it offers?
>>
>> The syntax is simpler for the use cases it was designed for. It uses a
>> simpler conceptual model (trees as opposed to graphs). It allows short
>> token identifiers. It doesn't use prefix-based indirection. It doesn't
>> violate the DOM Consistency Design Principle.
> Ok, the syntax is simpler for a subset of the use cases; but it leaves
> entirely out the rest of use cases.

What are the rest of the use cases? Why weren't they put forward when  
Hixie asked for use cases?

> The DOM Consistency again is not an advantage of the microdata syntax
> because this could have been fulfilled with other syntaxes as well.

It's an advantage over RDFa-in-XHTML-served-as-text/html. It's not an
advantage over microformats, and it may not be an advantage over a
speculative, as-yet-undefined variation of RDFa.

>>>> It seems to me that it avoids much of what microformats advocates find
>>>> objectionable
>>>
>>> Could you specify, please? Do you mean anything else than WHATWG's
>>> almost irrational hate toward CURIEs and everything that involves
>>> prefixes?
>>
>> RDFa uses a data model that is overkill for the use cases.
> Which use cases?

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-April/019374.html

>>> No, it *can't* represent a full RDF model: it has already been shown
>>> several times on this thread.
>>
>> That's a feature.
> What?? Being unable to deal with all the use cases is a feature??

Being simpler while addressing all the use cases is a feature.

>>> Wait. Are you referring to microdata as an incremental improvement over
>>> RDFa?? IMO, it's rather a decremental enworsement.
>>
>> That depends on the point of view. I'm sensing two major points of view:
>>
>> 1) Graphs are more general than trees. Hence, being able to serialize
>> graphs is better.
>>
>> 2) Graphs are more general than trees. Hence, graphs are harder to
>> design UIs for, harder to traverse and harder for authors to grasp.
>> Hence, if trees are enough to address use cases, we should only enable
>> trees to be serialized.
> ¬¬ Again, what's your basis to decide that "trees are enough to
> address use cases"?? Of course, they are enough to solve some use
> cases, but the convenience of dealing with just trees is not worth
> sacrificing the needs of those use cases you are arbitrarily deciding
> to ignore.

I don't see anything at
http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-April/019374.html
that doesn't boil down to trees or simple key-value pairs attached
to an item.

>> I subscribe to view #2, and it seems that trees are indeed enough for
>> the use cases (that were stipulated by the pro-graph people!).
>>
>>> - Microdata can't represent the full RDF data model (while RDFa can):
>>> some complex structures are just not expressible with microdata.
>>
>> That's not a use case. That's "theoretical purity".
> It's not "theoretical purity", it's something simpler:
> *extensibility*. And, with over two decades between versions of the
> specs, this is a strong requirement: if a problem is noticed after
> HTML5 becomes "the standard", it's essential to be able to solve it
> without waiting 10 or 20 years for HTML6 to come out.

Well, you have to commit to some bounds on extensibility. For example,  
the DOM is committed to being a tree.

> In addition,
> your alleged "simplified" data model is actually an over-complication,
> as it is defined in the form of restrictions and/or limitations over
> RDF's model. Try to explain what can be represented in RDF, and what
> can be represented with microdata, and you'll see what's simpler.

Microdata isn't defined in terms of RDF's model. It has its own model  
which is mappable to RDF (even though you cannot map RDF to microdata  
in the general case).
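
The one-way nature of that mapping can be sketched roughly as follows.
This is a simplified model under stated assumptions, not the microdata
algorithm from the spec: an item is modeled as a plain dict of property
names to string values or nested items, and tree_to_triples,
fits_in_tree and the sample data are hypothetical names and values made
up for illustration.

```python
from itertools import count

def tree_to_triples(item, triples, ids):
    """Flatten a nested item into (subject, property, object) triples."""
    subject = f"_:b{next(ids)}"          # fresh blank-node label per item
    for name, value in item.items():
        if isinstance(value, dict):      # nested item: recurse, link to it
            obj = tree_to_triples(value, triples, ids)
        else:                            # plain string value
            obj = value
        triples.append((subject, name, obj))
    return subject

def fits_in_tree(triples):
    """True if the node-to-node edges form trees: no sharing, no cycles.

    Simplification: an object counts as a node only if it also appears
    as a subject, so a literal equal to a node label would be misread.
    """
    subjects = {s for s, _, _ in triples}
    parent = {}
    for s, _, o in triples:
        if o in subjects:
            if o in parent:              # second incoming edge: shared node
                return False
            parent[o] = s
    for node in subjects:                # walk upward; a revisit is a cycle
        seen = set()
        while node in parent:
            if node in seen:
                return False
            seen.add(node)
            node = parent[node]
    return True

# Tree to triples always works (illustrative data).
item = {"name": "Example Person", "address": {"locality": "Example City"}}
triples = []
tree_to_triples(item, triples, count())
print(triples, fits_in_tree(triples))

# An arbitrary graph may not nest: two nodes pointing at each other.
cyclic = [("_:a", "knows", "_:b"), ("_:b", "knows", "_:a")]
print(fits_in_tree(cyclic))
```

Every nested item yields triples, so the tree-to-RDF direction is total
in this model; the reverse direction fails as soon as a node has two
parents or participates in a cycle.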

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/