[whatwg] Helping people searching for content filtered by license

Eduard Pascual herenvardo at gmail.com
Wed Jun 10 01:46:08 PDT 2009


On Fri, May 8, 2009 at 9:57 PM, Ian Hickson <ian at hixie.ch> wrote:
> [...]
> This has some implications:
>
>  - Each unit of content (recipe in this case) must have its own
>   independent page at a distinct URL. This is actually good practice
>   anyway today for making content discoverable from search engines, and
>   it is compatible with what people already do, so this seems fine.

This is, in a wide range of cases, entirely impossible: while it might
work, and may even be good practice, for content that can be
represented on the web as an HTML document, it is not achievable for
many other formats. Here are some obvious cases:

Pictures (and other media) used on a page: An author might want to
keep most content protected while allowing re-use of some media under
certain licenses. A good example of this is online media libraries,
which offer a good deal of media for reuse but obviously protect the
resources that inherently belong to the site (such as the site's own
logo and design elements). Having a separate page to describe each
resource's licensing is not easily achievable, and may be completely
out of reach for small sites that handle all their content by hand
(most prominently, designers' portfolio sites that offer much of
their content under some "attribution" license to promote their
work).

Software: I have stated this previously, but here it goes again: just
like with media, it's impossible to simply put a "<link
rel=license..." on an MSI package or a tarball. Sure, the package
itself will normally include a file with the text of the corresponding
license(s), but this doesn't help make the licensing discoverable by
search engines and other web crawlers. It looks like I should make a
page for each of the products (or even each of the releases), so I can
put the <link> tag there and everybody's happy... This makes so much
sense that I already have such pages for each of my releases (even if
there aren't many as of now); but I *can't* put the <link> on them,
because my software is under more liberal licenses (mostly GPL) than
other elements of the page (such as the site's logo, which appears
everywhere on the page and is CC-BY-NC-ND), and I obviously don't want
such content to appear in searches for "images that I can modify and
use commercially", for example.

So far, the best approach to this need that I have seen is RDF's
"triple" concept: instead of saying "licensed under Y", I'm trying to
say "X is licensed under Y", and maybe also "and X2 is licensed under
Y2", and this is inherently a triple. I am, however, open to
alternatives (at least on this aspect), as long as they provide some
benefit other than mere validation (which I don't even care about
anymore, btw) over currently deployed and available solutions. I am
not sure whether Microdata can handle this case or not (after all, it
is capable of expressing some RDF triples), but the fact is that I can
make my content discoverable by Google and Yahoo using CCREL (quite
suboptimal, and it wouldn't validate as HTML5, but it would still
work), whereas I can't do so using Microdata (which is also
suboptimal and would validate as HTML5, but doesn't work anywhere
yet).
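
To make the "X is licensed under Y" idea concrete, here is a minimal
sketch in Python of how a crawler might hold per-resource license
statements as triples and filter on them. It is only an illustration
under stated assumptions: the example.org URLs, the "license"
predicate name, the short license labels, and the resources_under
helper are all hypothetical, and this is not how any real search
engine or RDF library works.

```python
# Each statement is a (subject, predicate, object) triple:
# "X is licensed under Y" instead of a page-wide "licensed under Y".
# All URLs and license labels below are hypothetical examples.
TRIPLES = [
    ("http://example.org/releases/tool-1.0.tar.gz", "license", "GPL"),
    ("http://example.org/img/logo.png", "license", "CC-BY-NC-ND"),
    ("http://example.org/img/photo.jpg", "license", "CC-BY"),
]


def resources_under(triples, allowed):
    """Return the subjects whose license is in `allowed`, in input order."""
    return [
        subject
        for (subject, predicate, obj) in triples
        if predicate == "license" and obj in allowed
    ]


# The caller decides which licenses satisfy the search; this set is an
# assumption made for the example.
reusable = resources_under(TRIPLES, {"GPL", "CC-BY"})
```

Because each statement names its own subject, the logo's
CC-BY-NC-ND license no longer leaks onto the tarball, and the tarball's
GPL license no longer leaks onto the logo, even though both are
described on the same page.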

Regards,
Eduard Pascual

