[whatwg] the cite element

Erik Vorhes erik at textivism.com
Mon Aug 3 07:47:19 PDT 2009

On Mon, Aug 3, 2009 at 6:29 AM, Ian Hickson <ian at hixie.ch> wrote:
> >
> > See <http://www.four24.com/>; note near the top of the source:
> > <blockquote id="verse" cite="John 4:24">...
> My statement stands, on the aggregate:
> On Mon, 27 Jul 2009, Philip Taylor wrote:
> >
> > See http://philip.html5.org/data/cite-attribute-values.txt for some
> > data. (Looks like non-URI values are quite rare.)

I agree that @cite is rarely used as anything other than a URI; I was
attempting to demonstrate that even very recent uses of HTML don't
necessarily "get" that it is for URIs (the site I referenced launched
last month, as I recall).
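
To make the distinction concrete, here is a minimal sketch. The first
line is the markup quoted above; the second is my own illustration of a
URI value, and the example.org address is a placeholder, not something
taken from that site:

```html
<!-- Non-URI value, as quoted from the referenced site -->
<blockquote id="verse" cite="John 4:24">...</blockquote>

<!-- The cite attribute is defined as a URI pointing at the source;
     this URL is a hypothetical placeholder -->
<blockquote cite="http://example.org/john/4#v24">...</blockquote>
```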

> While we're at it, Philip had other data:
> > Also maybe relevant: see http://philip.html5.org/data/cite.txt for some
> > older data about <cite>. (Looks like non-title uses are very common.)
> This seems to support my point that <cite> is used for a whole variety of
> purposes, like <em>, <i>, <q>, HTML4's <cite>, and HTML5's <cite>. Very
> few, actually much fewer than I had remembered from my last look at the
> data, are names of people, citations or otherwise.

I actually took this information the other way, that there are indeed
other uses for <cite> out there beyond titles.

> On Mon, 27 Jul 2009, Erik Vorhes wrote:
> >
> > > A new element wouldn't work in legacy UAs, so it wouldn't be as
> > > compelling a solution. Also, <cite> is already being used for this
> > > purpose.
> >
> > My preference would be for <cite> to retain the flexibility it has in
> > pre-HTML5 specifications, which would include referencing titles.
> The flexibility doesn't seem as useful as limiting it to titles. What is
> the problem solved by allowing names to be marked up in the same manner as
> titles? The problem solved by allowing titles specifically to be marked up
> is that titles are usually typographically offset from the surrounding
> text in a distinctive fashion. This doesn't apply to names. Reusing the
> same element for both encourages authors to use <cite> for both which
> makes it harder for them to get the right typographic effect, leading to a
> lower quality of typography overall. I think this is a bad thing.

This is not just about names. It allows other (non-title) text to be
identified as a citation. If <cite> is identified as "title of work,"
you can't cite many major orchestral arrangements at all, nor can you
cite legal decisions. Unless by "title of work" you mean "standard
citation for an item, usually its title"; but then <cite> really means
what it is defined as in the HTML 4.01 specification.
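
For illustration, a rough sketch of the kinds of citation I have in
mind. Both examples are my own choices rather than anything taken from
the data discussed in this thread, and the surrounding sentences are
invented:

```html
<!-- A musical work identified by number, key, and catalogue entry
     rather than by a title -->
<p>The finale of <cite>Symphony No. 41 in C major, K. 551</cite>
   is a good example.</p>

<!-- A legal decision cited in its standard reporter form -->
<p>The court relied on
   <cite>Brown v. Board of Education, 347 U.S. 483 (1954)</cite>.</p>
```

Under a strict "title of work" reading, neither of these would clearly
qualify for <cite>, although both are standard citations.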

> > If backwards compatibility is that big a concern, why does HTML5 use
> > <legend> outside of <fieldset> elements?
> There were no existing elements that could be reused for many of the new
> semantics. When there were, we used them (e.g. <i>, <b>, <cite>, <menu>,
> <legend>, <h1>).

I agree that there aren't always existing elements for the new
semantics included in HTML5, but I don't believe that backwards
compatibility is as big a concern as you claim it is. HTML5's re-use
of <legend>, for example, is completely broken in every extant
browser. (See <http://html5doctor.com/legend-not-such-a-legend-anymore/>
for evidence).

Besides, there's already <tt>, which could be used to identify "title
text" or something like that.

> > > What is the pressing need for an element for citations, which would
> > > require that we overload <cite> with two uses?
> >
> > A title can be a citation, but not all citations are titles. What's the
> > pressing need for limiting <cite> only to titles?
> As described above, the need to have an element for titles is that there
> are typographic conventions that apply to titles. What is the pressing
> need for an element for citations, which would require that we overload
> <cite> with two uses?

As I have said previously, there aren't consistent typographic
conventions that apply to titles. The "pressing need" is that <cite>
is already used to define citations. There's no reason to limit it to
a subset of citation (more below).

> But why does that have value? How would you use this information?

To collect citation information. I don't see how that has any less
value than collecting titles of works, especially since not all works
have titles or means of reference that would constitute a conventional

> > >> > Note that HTML5 now has a more detailed way of marking up
> > >> > citations, using the Bibtex vocabulary. I think this removes the
> > >> > need for using the <cite> element in the manner you describe.
> > >>
> > >> Since this is supposed to be the case, why shouldn't HTML5 just ditch
> > >> <cite> altogether? (Aside from "backward compatibility," which is
> > >> beside the point of the question.)
> > >
> > > Backwards compatibility (with legacy documents, which uses it to mean
> > > "title of work") is the main reason.
> >
> > I'd beg to differ, regarding "legacy documents." See, for example the
> > automated citation generation at Wikipedia:
> > http://en.wikipedia.org/wiki/Wikipedia:Citation_templates
> What specifically am I looking for here? This doesn't seem to have any
> relevance to HTML.

Wikipedia automatically wraps citations in the <cite> element. View
source on any of the Example sections.

> > In addition, the comments at zeldman.com use <cite> to reference authors
> > of comments. While that specific example is younger than HTML5, this is
> > merely an example of a relatively common use-case for <cite> that does
> > not use it to signify "title of work."
> As I said, the most common use of <cite> is to mark up italics. I agree
> entirely that it's misused.

I haven't said that it's misused. I apologize that you have
misunderstood me. I have repeatedly and consistently contended that
<cite> should be used for more than just titles. I believe that
Zeldman's use is perfectly appropriate and correct.

> Blog commenters don't need to be marked up any differently than the number
> of the comment -- that's a stylistic issue that varies from blog to blog.
> I don't see the need for an element specifically for people commenting on
> blogs. In most blogs that I've seen, the name isn't even highlighted in
> any particular fashion.

Again, this isn't just about citing people or "blog commenters"; this
was just an example of a current, non-title, and correct use of <cite>
according to current specifications. (And why does it matter if
something is particularly highlighted? Is HTML supposed to be a
presentational language? Why limit <cite> to the place of a
presentational element?)

> > Existing tools that treat <cite> exclusively as "title of work" do so
> > against every HTML specification out there (i.e., HTML 4.01 and
> > earlier).
> Existing tools generally have had very few problems in finding ways to do
> things against every HTML specification out there. Over 90% of all content
> on the Web is syntactically invalid in some way, and I'm sure that more
> than 10% of content on the Web is generated by tools.

Yes, and one of those tools is Wikipedia, which wraps entire citations
in the <cite> element, not just titles. It correctly follows current
HTML specifications in using <cite> to identify a citation.

> Not all titles are citations, actually. For example, I've heard of the
> /Pirates of Penzance/, but I'm not citing it, just mentioning it in
> passing.

No, that actually is a citation, whether you realize it or not. You
are making reference to a musical and are therefore citing it, even in

> > > Indeed, there is a lot of misuse of the element -- as alternatives for
> > > <q>, <i>, <em>, and HTML5's meaning of <cite>, in particular.
> > >
> > > Expanding it to cover the meanings of <q>, <i>, and <em> doesn't seem as
> > > useful as expanding it just to cover works.
> >
> > I believe you mean "limiting it just to cover works" here.
> I meant expanding it, since not all titles of works are citations.

Any reference to a title of a work is by definition a citation.
Therefore you are limiting <cite> to a subset of citation.

> As a first approximation, titles are italics, and names are not. I think
> that's a far closer approximation of typographical conventions than
> lumping titles and names together into one default style.

This doesn't seem to be an issue for you with the reuse of <legend> in
another context, even though it is broken. So why is it an issue here?
(And again, titles are not always in italics.)

> I haven't changed the spec. I continue to hold the position that covering
> titles of works is more useful than covering titles of works and names of
> people, and more useful than covering only names of people or works that
> are explicitly cited.

You are misconstruing my argument. This isn't about including names of
people; that is just the most obvious non-title form of citation. This
is about properly understanding what a citation can be and writing the
specification for the <cite> element to account for those
possibilities. Citations are references to works, people, etc. By
limiting it to "title of work" you are actually limiting it to a
subset of a subset, as many objects worth citing don't have
conventional titles.

Erik Vorhes
