[whatwg] the cite element
Brian Campbell
brian.p.campbell at Dartmouth.EDU
Mon Aug 17 05:16:58 PDT 2009
Oops. This has been sitting in my outbox for a while, so it's a
response to somewhat old messages, but I think it still has some
value, especially the examples taken from Philip Taylor's data and
elsewhere on the web.
On Jul 19, 2009, at 5:58 AM, Ian Hickson wrote:
> Certainly there are situation-specific cases where names might be
> styled, but I think it's mostly as a side-effect of location rather
> than because the text is a name. Consider:
>
> <aside class="testimonial">
> <q>Best value for the money!</q>
> J. Random User
> </aside>
>
> <aside class="bookquote">
> <q>Best value for the money!</q>
> A Random Book
> </aside>
>
> <aside class="review">
> <q>Best value for the money!</q>
> Newspaper
> </aside>
>
> <aside class="logfiles">
> <q>[23:02] evaluator: best value</q>
> filename.log
> </aside>
Hmm. Isn't the common theme here that those names are a source that is
being cited (either a work or person)? For many authors, when writing
stylesheets to apply to these types of uses, it makes more sense or is
easier to have a specific element to style, rather than simply a text
node that is a sibling of a <q> and/or a descendant of a particular
class of <aside>.
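As a minimal sketch of that point (an illustration added here, reusing the first quoted example; the particular style declarations are arbitrary assumptions), wrapping the name in an element gives the stylesheet something to select. CSS selectors match elements, so the bare text node in the original markup cannot be targeted on its own:

```html
<!-- Illustrative sketch: the quoted "testimonial" example with the
     name wrapped in <cite>. The declarations below are arbitrary. -->
<style>
  /* Matches the <cite> element inside the testimonial. */
  aside.testimonial cite { font-style: normal; font-weight: bold; }
  /* There is no equivalent selector for "the text node after the <q>". */
</style>
<aside class="testimonial">
  <q>Best value for the money!</q>
  <cite>J. Random User</cite>
</aside>
```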
Earlier, when justifying why you changed the definition of <cite> from
HTML 4.01, you said:
> I don't think it makes sense to use the <cite> element to refer to
> people, because typographically people aren't generally marked up
> anyway. I don't really see how you'd use it to refer to untitled works.
This usage is an example of people being typographically marked up, so
that argument should not apply. It seems fairly common, when doing
block-level quotations, to mark up the source of a quote, whether it
is the name of the author or the title of a work, usually in italics
(which is generally how browsers render a <cite> element in the
absence of CSS).
And there are numerous examples of this use, which seem to contradict
this argument:
> HTML4 actually defined <cite> more like what you describe above; we
> changed it to be a "title of work" element rather than a "citation"
> element because that's actually how people were using it.
Among them (selected from some I have run across myself, as well as
some from Philip Taylor's data):
* http://www.webporter.com (from Philip Taylor's data)
  <cite> is used to mark up the source of a testimonial.
* http://www.thesentencegame.com/ (from Philip Taylor's data)
  <cite> is used to mark up the user who wrote or drew a particular
  piece of content.
* http://en.wikipedia.org/wiki/RNA_interference (from Philip Taylor's data)
  <cite> is used to mark up a full bibliographic citation. Also used
  on other pages on Wikipedia.
* http://www.igofigure.com/page/testimonials/
  <cite> is used for the source of a testimonial.
* http://thelede.blogs.nytimes.com/2009/07/14/running-with-the-bulls-in-pamplona/
  (and other articles on the NY Times Blogs)
  <cite> is used to mark up the author of a comment.
* http://www.w3.org/TR/html401/struct/text.html#h-9.2.1
  In the very example given in HTML 4.01, <cite> is used to mark up
  the author of a quote.
* http://diveintomark.org/archives/2009/04/07/hhgregg-doa
  <cite> is used to mark up the author of a comment.
* http://diggingintowordpress.com/ThemePlayground/index.php?wptheme=H5%20Theme%20Template
  Even some folks who are trying to use HTML5 are using <cite> to
  mark up the author of a comment; take a look at the comments on one
  of the example articles.
* http://microformats.org/wiki/posh-patterns
  Another recommendation to use <cite> to mark up a person who is the
  source of a quote (as well as to use <cite> for a bibliographic
  citation).
By changing the definition of <cite> in HTML5, you are saying that
numerous users of the HTML4 definition of <cite> are no longer
conforming, and not really giving any alternative that does the same
job. I suppose ideally we would have <cite>, <title> and <author>
(among others) that could be nested in such a way as to express
exactly what the author means. In the absence of that, having <cite>
mean simply a source being cited, and allowing the author to determine
whether they want to use it for titles of works, authors, or entire
citations, seems to be both reasonable and compatible with existing
content. If the author wishes to be more specific, they can use a
class to specify which type of citation they are referring to (perhaps
"citation", "author", "title"), or microdata, a microformat, or RDFa.
For example:
<cite class="author">Aristotle</cite>
<cite class="title">The Meaning of Life</cite>
<cite class="citation"><span class="author">Mencken, H. L.</span>
  <span class="title">Prejudices: A Selection</span>
  <span class="publisher">Johns Hopkins University Press</span>
  <time>2006</time></cite>
Generally, though, I don't think that the class would be necessary for
these; you could instead simply select on the context of the citation:
- For marking up a person who is the source of a quotation:
.testimonial cite {}
.comment cite {}
- For marking up a full citation in a bibliography:
.bibliography cite {}
- And for general use of titles in text (which does seem to be the
default usage of <cite> if not in another context):
cite {}
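Putting those selectors together, a small page might look like the following. This is only a sketch under stated assumptions: the class names are the ones used in the selectors, the content is borrowed from the examples in this message, and the declarations inside each rule are arbitrary placeholders rather than recommended styling.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>cite in context (illustrative sketch)</title>
<style>
  /* Default: a title of a work in running text. */
  cite { font-style: italic; }
  /* A person who is the source of a quotation. */
  .testimonial cite, .comment cite { font-style: normal; font-weight: bold; }
  /* A full citation in a bibliography. */
  .bibliography cite { font-style: normal; display: block; }
</style>
</head>
<body>
<p>I have been rereading <cite>Prejudices: A Selection</cite>.</p>

<aside class="testimonial">
  <q>Best value for the money!</q>
  <cite>J. Random User</cite>
</aside>

<ul class="bibliography">
  <li><cite>Mencken, H. L. Prejudices: A Selection.
    Johns Hopkins University Press, 2006.</cite></li>
</ul>
</body>
</html>
```

The more specific contextual rules override the default `cite` rule, so no class is needed on the <cite> elements themselves.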
> What's the alternative? Just say "em, i, cite and dfn mean 'italics'"?
> That doesn't seem particularly useful either. Why not just drop all
> but <i> if that's what we do?
>
> No, it seems useful to have elements that people can use for specific
> purposes, so that style sheets can be shared, so that tools can make
> use of the elements, if only in limited circles.
No, I don't believe that you should remove from the spec all mention
of semantics that aren't machine-checkable; my point is just that the
tightening of the semantics in this case does not seem to gain anything
(what is actually going to change if people use <cite> only for
titles, and resort to spans to mark up authors or full bibliographic
citations?), while simultaneously ruling out usages that are currently
valid and don't seem to cause any harm.
> Backwards compatibility (with legacy documents, which uses it to mean
> "title of work") is the main reason.
> People who use <cite> seem to use it for titles
> In the 15 or more years that <cite> has supposedly been used for
> citations, I'm only aware of one actual use of that semantic, and that
> use has since been discontinued. Meanwhile, lots of people use <cite>
> for "title of work".
You claim repeatedly that people seem to use it for titles, but in
practice, while that is the most common use, it is also used to refer
to authors or speakers, and sometimes for full bibliographic
citations. How many sites using <cite> for other purposes, including
quite prominent ones, would it take to convince you that this is
indeed a common pattern?
-- Brian Campbell