[whatwg] Allowing authors to keep track of where content originates
Ian Hickson
ian at hixie.ch
Wed May 6 16:07:38 PDT 2009
One of the use cases I collected from the e-mails sent in over the past
few months was the following:
USE CASE: Allow authors to keep track of where content originates.
SCENARIOS:
* A blog, say htmlfive.net, copies content wholesale from another, say
blog.whatwg.org (as permitted and encouraged by the license). The
author of the original content would like the reader of the reproduced
content to know the provenance of the content. The reader would like
to find the original blog post so he can leave comments for the
original author.
* Chaals could improve the Opera intranet if he had a mechanism for
identifying the original source of various parts of a page, as that
would let him contact the original author quickly to report problems
or request changes.
REQUIREMENTS:
* Parsing rules should be unambiguous.
* Should not require changes to HTML5 parsing rules.
The two scenarios are subtly different, so I'm going to handle them
separately.
First, the blog syndication scenario:
* A blog, say htmlfive.net, copies content wholesale from another, say
blog.whatwg.org (as permitted and encouraged by the license). The
author of the original content would like the reader of the reproduced
content to know the provenance of the content. The reader would like
to find the original blog post so he can leave comments for the
original author.
This case is relatively easy: the original author need only ask the
editor of the syndicating site to include a link to the original content.
If the editor isn't willing to do this, then there's nothing at the HTML
language level that we can do to force him. In practice, with htmlfive.net
syndicating blog.whatwg.org content, the editor of the former happily
agreed to include a link to the original blog, and does so. The current
setup doesn't link to the original article, but the titles aren't changed,
so a reader can relatively easily find the original content.
Similarly, "Planet"-style syndicators include links to the original
entries, so this is already possible.
The odds of syndicators including these links can be improved a little by
putting the link explicitly in the post markup in the feed, since
typically syndicators just display the feeds verbatim.
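As a rough illustration of putting the link in the post markup itself, the
sketch below appends a link back to the original post before the entry is
written to the feed. The function name with_source_link, the wording of the
link text, and the example URL are all assumptions made for illustration;
nothing here is taken from any particular blogging tool.

```python
from html import escape


def with_source_link(post_html, original_url, original_title):
    """Return post_html with a visible link to the original post appended.

    Hypothetical helper: a syndicator that displays the feed verbatim
    would then display this link too.
    """
    link = '<p>Originally published as <a href="{}">{}</a>.</p>'.format(
        escape(original_url, quote=True),  # safe inside the href attribute
        escape(original_title),            # safe as element text
    )
    return post_html.rstrip() + "\n" + link


if __name__ == "__main__":
    # Invented example entry; the URL is a placeholder.
    print(with_source_link(
        "<p>Hello</p>\n",
        "http://blog.example/post",
        "Hello & welcome",
    ))
```

Because the link is ordinary <a href=""> markup inside the entry, it needs no
cooperation from the syndicator beyond displaying the entry unchanged.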
This doesn't require any new parsing at all, so the requirements are met
too.
Next, the mashup page:
* Chaals could improve the Opera intranet if he had a mechanism for
identifying the original source of various parts of a page, as that
would let him contact the original author quickly to report problems
or request changes.
Since this is an intranet, I again assume that we can rely on the authors
and editors to cooperate.
HTML4 had a solution to this, the cite="" attribute on <blockquote> or
<q>. Within a controlled environment, this can be used quite well, as Mark
showed in late 2002. However, using <blockquote> for mashups is a bit
weird, and not really in the spirit of the <blockquote> tag (though
probably in the letter, admittedly). So I've added cite="" to the
<section> and <article> elements, so that mashup authors can more easily
keep track of where the sections come from.
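To make the mashup case concrete, here is a minimal sketch of how an
intranet tool might list where each part of a page came from, assuming the
page author has put cite="" on <section> and <article> as described above.
It uses Python's standard html.parser module; the names CiteCollector and
find_sources and the example page and URLs are hypothetical.

```python
from html.parser import HTMLParser

# Elements whose cite="" attribute is treated as a provenance marker.
# <section> and <article> follow the proposal in this e-mail; <blockquote>
# and <q> are the HTML4 cases.
CITE_ELEMENTS = {"section", "article", "blockquote", "q"}


class CiteCollector(HTMLParser):
    """Collects (tag, cite URL) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag not in CITE_ELEMENTS:
            return
        cite = dict(attrs).get("cite")
        if cite:  # skip elements with no cite or an empty one
            self.sources.append((tag, cite))


def find_sources(markup):
    """Return the (tag, cite URL) pairs found in markup."""
    collector = CiteCollector()
    collector.feed(markup)
    collector.close()
    return collector.sources


# Hypothetical intranet mashup page; the URLs are invented for illustration.
PAGE = """
<section cite="http://intranet.example/teams/qa/status">
  <h1>QA status</h1>
  <p>All tests pass.</p>
</section>
<article cite="http://intranet.example/news/42">
  <h1>Office move</h1>
</article>
<section>
  <h1>Local notes</h1>
</section>
"""

if __name__ == "__main__":
    for tag, url in find_sources(PAGE):
        print(tag, url)
```

The third <section> has no cite="" and is skipped, which is how locally
written content would look next to imported content.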
The requirements collected as part of this effort for these scenarios are:
* Parsing rules should be unambiguous.
The parsing rules here are the same as for <blockquote cite="">, which are
very well-defined at this point.
* Should not require changes to HTML5 parsing rules.
This doesn't affect any of the parsing rules.
In conclusion, this use case can be addressed with a combination of
discussion with editors, including explicit links using <a href="">, and
using the new cite="" attribute on <section> and <article>.
A number of further use cases remain to be examined. I will send further
e-mails, hopefully this week, as I address them.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'