[whatwg] Creative Commons Rights Expression Language
hsivonen at iki.fi
Fri Aug 22 00:50:59 PDT 2008
On Aug 21, 2008, at 21:53, Ben Adida wrote:
> Not to mention that our design approach was specifically tailored to
> be HTML5-friendly.
It really isn't HTML5-friendly, since it depends on the namespace
mapping context at a node.
> Henri Sivonen writes:
>> and those additions use a Namespace-dependent
>> anti-pattern, so they aren't portable to HTML.
> Namespaces are an anti-pattern, really? Says who?
The anti-pattern I was referring to was qnames-in-content. (But, I'm
not saying that Namespaces in XML were not themselves an anti-pattern.)
> The web is inherently
> namespaced. Everything you go to is scoped to a URL prefix. There
> one "Paris" or one "New York," there is wikipedia/paris, and
At least in the case of New York, the settlers had the good sense to
choose a short disambiguating prefix instead of thinking they were off
in a different default namespace like Texas and free to reuse local
names causing problems with global map search usability later.
> So is it the ":" that bothers you? Is that really relevant?
It's not the colon per se, although now that XML and HTML do DOM-wise
different things with the colon, the colon is trouble for element and
Here's what bothers me about namespaces:
1) I need to write namespace URIs several times a day, but the URIs
aren't memorable. Mistyping an NS URI would waste even more time as
bugs than looking URIs up for copying and pasting, so I look them up
for copying and pasting, and it's a huge waste of time.
2) The indirection layer from prefix to URI confuses people.
3) Namespaces not inheriting to attributes confuses people. (I have
had to give a crash course in how namespaces work on W3C telecons and
f2f meetings! Others have had to do it as well. This point is so
confusing that people whose job is working on Web specs get it wrong.
I've been told about a professor teaching a class about XML who got it
4) Instead of comparing names against a string literal, you have to
compare two datums against two literals. That is, instead of doing
"foo-bar".equals(name), you have to do
"http://www.example.com/2008/08/namespace#".equals(uri) && "bar".equals(localName).
5) Removing uri,local pairs from XML parsing context makes it hard
to write the full name in a compact form. Witness the NSResolver
complications with XPath and Selectors DOM APIs.
6) That the prefix is semantically not important confuses people who
go and write uninteroperable software thinking that they should be
comparing the prefix instead of the URI.
7) The design of namespaces considers parsing. It doesn't consider
serialization. Writing an XML serializer that doesn't suck isn't
trivial, and one will spend most of the development time on dealing
with Namespaces. (The prefixes aren't important but people still have
aesthetic opinions about how they should be generated...)
8) Namespaces dropped the HTML ball a decade ago letting the HTML
and XML DOMs diverge.
9) Namespaces stuff their syntax into attributes as opposed to
having syntax of their own, meaning that certain magic attribute names
need blacklisting both in parsing and in serialization.
10) Namespaces slow down parsing. (By over 20% with Xerces-J and the
Wikipedia front page!)
11) I've spent *a lot* of time writing code that is Namespace-wise
excruciatingly correct. Yet, Namespaces have never actually solved a
problem for me. My software developer friends complain to me about how
Namespaces cause them grief. No one can remember Namespaces solving a
real problem. It's like feeding a white elephant.
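To make points 3, 4 and 6 concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name, the element name "bar", the attribute "baz" and the prefixes are made up for illustration; the namespace URI is the example one from point 4.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NameComparison {
    // Hypothetical namespace URI, taken from the example in point 4.
    static final String NS = "http://www.example.com/2008/08/namespace#";

    // Parse a small XML string with namespace processing switched on.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Point 4: the correct check compares two datums against two literals.
    static boolean isBar(Element e) {
        return NS.equals(e.getNamespaceURI()) && "bar".equals(e.getLocalName());
    }

    // Point 6: the tempting but wrong check compares the prefix.
    static boolean isBarByPrefix(Element e) {
        return "foo".equals(e.getPrefix()) && "bar".equals(e.getLocalName());
    }

    // Fixed prefix inside the name token: one string comparison is enough.
    static boolean isFooBar(String name) {
        return "foo-bar".equals(name);
    }

    public static void main(String[] args) {
        // Same expanded name, two different prefixes.
        Element a = parseRoot("<foo:bar xmlns:foo='" + NS + "' baz='1'/>");
        Element b = parseRoot("<x:bar xmlns:x='" + NS + "' baz='1'/>");
        System.out.println(isBar(a) + " " + isBar(b));
        System.out.println(isBarByPrefix(a) + " " + isBarByPrefix(b));
        // Point 3: the unprefixed attribute is in no namespace at all.
        System.out.println(a.getAttributeNode("baz").getNamespaceURI());
    }
}
```

The two documents carry the same element as far as Namespaces are concerned, yet the prefix-based check accepts only the first one, and the attribute does not pick up the namespace of its element.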
Qnames in content have further problems: They complicate APIs and the
application layer when the mapping context needs to leak to the
application instead of being a parser-internal thing. Under scripted
DOM scenarios, there's the issue of the mapping context not getting
captured at node creation time thereby making the meaning of qnames
brittle under tree mutations. Finally, serializing XML that *may* have
qnames in content without the serializer knowing which values are
qnames (i.e. writing a generic serializer) is complex. (See also the
TAG finding about problems with digital signatures.)
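The tree-mutation problem can be sketched with the standard DOM API. This is only an illustration under assumptions: the element names, the "property" attribute carrying the qname and the second namespace URI are hypothetical, and resolve() is a helper I made up to stand in for what an application would have to do.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class QNameInContent {
    static final String CC_NS = "http://creativecommons.org/ns#";
    // Hypothetical second vocabulary that happens to reuse the "cc" prefix.
    static final String OTHER_NS = "http://www.example.com/other#";

    // Resolve a qname found in content against the namespace mapping
    // context of the node it currently sits on.
    static String resolve(Element context, String qname) {
        int colon = qname.indexOf(':');
        String prefix = colon < 0 ? null : qname.substring(0, colon);
        String local = qname.substring(colon + 1);
        return "{" + context.lookupNamespaceURI(prefix) + "}" + local;
    }

    // Build an element that declares xmlns:cc with the given URI.
    static Element section(Document doc, String ccUri) {
        Element e = doc.createElementNS(null, "section");
        e.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:cc", ccUri);
        return e;
    }

    static String[] demo() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        Element a = section(doc, CC_NS);
        Element b = section(doc, OTHER_NS);
        Element span = doc.createElementNS(null, "span");
        span.setAttribute("property", "cc:license");

        String[] seen = new String[3];
        a.appendChild(span);                 // under the intended declaration
        seen[0] = resolve(span, span.getAttribute("property"));
        b.appendChild(span);                 // moved: same string, new meaning
        seen[1] = resolve(span, span.getAttribute("property"));
        b.removeChild(span);                 // detached: no meaning at all
        seen[2] = resolve(span, span.getAttribute("property"));
        return seen;
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```

The attribute value never changes, but what it resolves to depends on where the node happens to be at lookup time, because nothing captured the mapping when the node was created.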
> Just look at what microformats are forced to do, which is effectively
> re-inventing ad-hoc namespaces with "-" separators.
That's different. When the prefixes are fixed and go inside a name
token without an indirection layer or without the name becoming a
tuple, that's fine. You can still do "foo-bar".equals(name).
> The "namespaces are bad" argument is the most mind-boggling web-tech
> meme I've seen in a while.
It's Namespaces in XML that are bad--not *necessarily* lower-case 'n'
namespaces. Also, qnames-in-content are even worse than just Namespaces
>> making them to identify which CC
>> license they mean, making them understand what permissions they are
>> giving irrevocably to others upon granting a license and making them
>> understand what licenses used by others mean (NonCommercial,
>> anyone?). Syntax doesn't solve any of these.
> I appreciate the strategy advice, but let's stick to the tech. I don't
> think it would be relevant to question Google's business plan when Ian
> makes a tech proposal :)
If Hixie made a proposal about HTML syntax citing Google's needs, but
there was something else going on at Google making the syntax moot, I
think it would be relevant. (I guess metadata aiding
translate.google.com is the recent example.)
>> Also note that even CC leadership omits the license URI.
> So you want a URI in the video content itself? What good would that
It's not me wanting it, it's the CC licenses:
"You must include a copy of, or the Uniform Resource Identifier (URI)
for, this License with every copy of the Work You Distribute or
> With ccREL (and specifically RDFa), the surrounding HTML can easily
> say "*this* video is licensed under *that* license."
I meant the license URIs of the photos used in the video.
Either way, putting RDFa in an HTML file means that the license data
doesn't travel with the video if I download it from Blip.tv or get it
via a podcast client.
HTML5 already has a way to express that the HTML document as a whole
is under a certain Creative Commons license: rel=license. This doesn't
allow you to say things about *another* resource, but that's OK,
because out-of-band metadata and data often travel their separate
ways. I think it would be better to develop simple ways of putting the
"license",license-URI key-value pair inside other popular file
formats. After all, you don't need triples in this case--just a key-
value pair and it's implied that it is *about* the file it is in.
Having to spec this for many formats isn't as appealing as speccing
one way for all formats, but the one way put forward isn't really that
great. (A *graph* in XMP is overkill when key-value pairs would do.)
For example, in PDF, do people *really* need all this cruft:
<?xpacket begin="" id=""?>
...instead of putting the key-value pair
"License","http://creativecommons.org/licenses/by-sa/2.5/" in the
document information dictionary of the PDF file?
>> Getting back to the comment thread on intertwingly.net, a later
>> comment contained this gem:
>> My sarcasm detector isn't quite working, so I can't tell if the
>> comment was *meant* to mock RDF, but the follow-up comment is spot
> I think your argument is "copyright is hard, so RDF sucks."
No, my argument is:
Copyright is hard. Sprinkling URIs and angle brackets doesn't make
people grok copyright. RDF adds even more hardness that normal people
> Lots of things about RDF are complicated, and lots of things about
> copyright are complicated.
Together they don't cancel each other out.
> I'd say that Creative Commons has helped make copyright *easier* to
> understand, not harder, though of course there are
> cases where we have failed and where we're trying to improve.
That may be, but I wouldn't attribute it to RDF.
> Now, what does that have to do with expressing user intent in
> machine-readable language, exactly? Is it harder to understand
> *because* of RDF and RDFa? I don't think so. I don't think those two
> things are even related.
No, RDF doesn't make copyright itself harder. It just adds something
else that's hard, so it's not helping.
> The point of ccREL and RDFa is to help express, in a machine-readable
> way, the act of copyright licensing, attribution, and such. It's meant
> to make machines helpful in expressing and interpreting these
I think trying to break complex licenses (especially ones that don't
originate from CC) into URI-identifiable components and letting
software interpret these for the user seems risky compared to doing
something simpler like having a finite catalog of licenses recognized
by software and mapping them to logos that the user can identify after
*actually reading* the licenses first without the software pretending
to relieve the user from finding out what the licenses mean.
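As a rough sketch of the simpler approach, assuming the file format exposes its metadata as plain string key-value pairs with a "License" key: the class name, the catalog contents and the display labels below are illustrative assumptions, not a proposal for a concrete list.

```java
import java.util.Map;
import java.util.Optional;

class LicenseCatalog {
    // Finite catalog of recognized license URIs mapped to a label
    // (standing in for a logo). Entries are illustrative only.
    static final Map<String, String> KNOWN = Map.of(
            "http://creativecommons.org/licenses/by-sa/2.5/", "CC BY-SA 2.5",
            "http://creativecommons.org/licenses/by/3.0/", "CC BY 3.0");

    // Look up the "License" value of a file's own metadata. An empty
    // result means the software recognizes nothing and claims nothing.
    static Optional<String> label(Map<String, String> fileMetadata) {
        return Optional.ofNullable(fileMetadata.get("License")).map(KNOWN::get);
    }

    public static void main(String[] args) {
        Map<String, String> pdfInfo =
                Map.of("License", "http://creativecommons.org/licenses/by-sa/2.5/");
        System.out.println(label(pdfInfo).orElse("unrecognized license"));
    }
}
```

There is no attempt here to decompose a license into machine-interpreted permissions; the software only maps a known URI to something the user can recognize after having read the license.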
For example, the CC licenses have a pretty significant component
lurking there that isn't covered by the RDF terms (or by the "human-
readable" deeds): the anti-TPM clause. What if a tool happily tells
someone that just giving me attribution for my photos is sufficient
for using a photo taken by me in a book without telling them that my
photos come with a poison pill that prohibits publishing the book on
> just like they don't need to understand the deep legal contract.
(I disagree, but that's off-topic for WHATWG, except to the extent of
pointing out that the RDF modeling doesn't cover significant aspects
of the licenses like the anti-TPM clause.)
> [.. a number of comments regarding the specifics of the RDFa
> syntax ...]
> We discussed the syntax in a public group, and we came to consensus. I
> don't see that you raised any issues or comments until 2 weeks ago,
> which was long past our deadline for comments.
If RDFa is considered immutable at this point, I guess HTML5 is put in
a "take it or leave it" situation. :-/ I'd choose leaving it if taking
it comes with the qnames-in-content and Namespaces in XML baggage.
> There could always be an alternate syntax, but the one we have was
> obtained through an open process of consensus. I suspect the same
> true for HTML5: lots of options, pick one that works and is relatively
> clean, and form consensus.
Actually, HTML5 hasn't been developed by consensus.
hsivonen at iki.fi