[whatwg] the cite element
ian at hixie.ch
Wed Aug 12 16:21:23 PDT 2009
On Mon, 3 Aug 2009, Erik Vorhes wrote:
> On Mon, Aug 3, 2009 at 6:29 AM, Ian Hickson <ian at hixie.ch> wrote:
> > Not all titles are citations, actually. For example, I've heard of the
> > /Pirates of Penzance/, but I'm not citing it, just mentioning it in
> > passing.
> No, that actually is a citation, whether you realize it or not. You are
> making reference to a musical and are therefore citing it, even in
Your definition of "citation" is far looser than my dictionary's ("a
quotation from or reference to"). In fact your definition seems to be
basically the same as HTML5's -- a title of a work. Unless you think that
this should be valid use of <cite>:
<p>I picked up <cite>my favourite book</cite>, and put it next to
<cite>the painting I got from my aunt</cite>.</p>
I don't think that those references to works should use <cite>. Doing so
has zero benefit, as far as I can tell.
> > > See <http://www.four24.com/>; note near the top of the source:
> > > <blockquote id="verse" cite="John 4:24">...
> > My statement stands, on the aggregate:
> > On Mon, 27 Jul 2009, Philip Taylor wrote:
> > >
> > > See http://philip.html5.org/data/cite-attribute-values.txt for some
> > > data. (Looks like non-URI values are quite rare.)
> I agree that @cite is rarely used as anything other than a URI; I was
> attempting to demonstrate that even very recent uses of HTML don't
> necessarily "get" that it is for URIs (the site I referenced launched
> last month, as I recall).
Mistakes are common with HTML, sure.
> > While we're at it, Philip had other data:
> > > Also maybe relevant: see http://philip.html5.org/data/cite.txt for
> > > some older data about <cite>. (Looks like non-title uses are very
> > > common.)
> > This seems to support my point that <cite> is used for a whole variety
> > of purposes, like <em>, <i>, <q>, HTML4's <cite>, and HTML5's <cite>.
> > Very few, actually much fewer than I had remembered from my last look
> > at the data, are names of people, citations or otherwise.
> I actually took this information the other way, that there are indeed
> other uses for <cite> out there beyond titles.
I don't think anyone has argued otherwise. I've only argued that of the
uses that <cite> is put to, the only ones that are common but have no
other more appropriate elements (i.e. aren't flat out mistakes) are
citations and titles, and not people's names.
> > On Mon, 27 Jul 2009, Erik Vorhes wrote:
> > >
> > > > A new element wouldn't work in legacy UAs, so it wouldn't be as
> > > > compelling a solution. Also, <cite> is already being used for this
> > > > purpose.
> > >
> > > My preference would be for <cite> to retain the flexibility it has
> > > in pre-HTML5 specifications, which would include referencing titles.
> > The flexibility doesn't seem as useful as limiting it to titles. What
> > is the problem solved by allowing names to be marked up in the same
> > manner as titles? The problem solved by allowing titles specifically
> > to be marked up is that titles are usually typographically offset from
> > the surrounding text in a distinctive fashion. This doesn't apply to
> > names. Reusing the same element for both encourages authors to use
> > <cite> for both which makes it harder for them to get the right
> > typographic effect, leading to a lower quality of typography overall.
> > I think this is a bad thing.
> This is not just about names. It allows other (non-title) text to be
> identified as a citation. If <cite> is identified as "title of work,"
> you can't cite many major orchestral arrangements at all, nor can you
> cite legal decisions.
Why not? An orchestral arrangement is a work, and has a title -- the spec
explicitly lists "score", "song", and "opera" as possible works, for
example.
I've added "legal case report" to the list, to clarify that you can use
<cite> to name such reports.
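As a rough illustration of the point above (the specific titles are chosen
here as examples and do not come from the spec or from this thread), an
orchestral work and a legal case report could each be marked up as a titled
work:

```html
<!-- An orchestral work referred to by its conventional title -->
<p>The programme opened with <cite>Symphony No. 5 in C minor</cite>.</p>

<!-- A legal case report referred to by its name -->
<p>The argument relies on <cite>Brown v. Board of Education</cite>.</p>
```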
> Unless by "title of work" you mean "standard citation for an item,
> usually its title"; but then <cite> really means what it is defined as
> in the HTML 4.01 specification.
Unless you have a very loose definition of "citation", or unless you
consider a person to be a possible "source", <cite> in HTML5 is a strict
superset of HTML4's definition.
For example, the following is valid HTML5 but wouldn't be valid HTML4,
since it's not a citation or reference to another source, but merely
something mentioned in passing:
<p>Today, as I was moving my copy of <cite>Dreamer's Void</cite>, I
hurt my back.</p>
> > > If backwards compatibility is that big a concern, why does HTML5 use
> > > <legend> outside of <fieldset> elements?
> > There were no existing elements that could be reused for many of the
> > new semantics. When there were, we used them (e.g. <i>, <b>, <cite>,
> > <menu>, <legend>, <h1>).
> I agree that there aren't always existing elements for the new semantics
> included in HTML5, but I don't believe that backwards compatibility is
> as big a concern as you claim it is.
> HTML5's re-use of <legend>, for example, is completely broken in every
> extant browser.
Yeah, <legend> is a complicated case where a number of factors have
prevented an ideal solution. (The alternative, introducing yet another
element that means the same as <legend>/<label>/<caption>/<h1>/<th>/etc,
is worse, in the long run, than simply waiting a few years to introduce
<figure> and <details>.)
> Besides, there's already <tt>, which could be used to identify "title
> text" or something like that.
It has the wrong default styles.
> > > > What is the pressing need for an element for citations, which
> > > > would require that we overload <cite> with two uses?
> > >
> > > A title can be a citation, but not all citations are titles. What's
> > > the pressing need for limiting <cite> only to titles?
> > As described above, the need to have an element for titles is that
> > there are typographic conventions that apply to titles. What is the
> > pressing need for an element for citations, which would require that
> > we overload <cite> with two uses?
> As I have said previously, there aren't consistent typographic
> conventions that apply to titles.
There are widely used conventions, though, for which <cite> has
appropriate default styles.
> The "pressing need" is that <cite> is already used to define citations.
<cite> is also used to mark up titles that aren't citations, as shown by
> There's no reason to limit it to a subset of citation (more below).
I honestly don't understand how HTML5 is a subset of HTML4 here, unless
you mean people's names, which as far as I can tell aren't commonly used
with <cite>, and for which there is no benefit to using <cite>.
> > But why does that have value? How would you use this information?
> To collect citation information. I don't see how that has any less value
> than collecting titles of works, especially since not all works have
> titles or means of reference that would constitute a conventional
Virtually nobody either collects citation information _or_ collects titles
of works. If that is the use case that we have to deal with, then please
provide evidence that there is actually a significant need for this. So
far I'm not aware of anyone actually doing this other than Mark Pilgrim,
and he stopped doing it years ago.
Currently, <cite> in HTML5 isn't for collecting anything; it's purely to
provide a hook for styling.
> > > >> > Note that HTML5 now has a more detailed way of marking up
> > > >> > citations, using the Bibtex vocabulary. I think this removes
> > > >> > the need for using the <cite> element in the manner you
> > > >> > describe.
> > > >>
> > > >> Since this is supposed to be the case, why shouldn't HTML5 just
> > > >> ditch <cite> altogether? (Aside from "backward compatibility,"
> > > >> which is beside the point of the question.)
> > > >
> > > > Backwards compatibility (with legacy documents, which use it to
> > > > mean "title of work") is the main reason.
> > >
> > > I'd beg to differ, regarding "legacy documents." See, for example
> > > the automated citation generation at Wikipedia:
> > > http://en.wikipedia.org/wiki/Wikipedia:Citation_templates
> > What specifically am I looking for here? This doesn't seem to have any
> > relevance to HTML.
> Wikipedia automatically wraps citations in the <cite> element. View
> source on any of the Example sections.
Wikipedia's output is not an argument for consuming <cite>. In fact, what
they're doing is an argument against keeping <cite> for that purpose: they
are explicitly overriding the only behaviour <cite> gives them (italics)
and then going out of their way to reintroduce that effect on a <span>! If
that's not an argument for changing the meaning of <cite> to something
more convenient, I don't know what is.
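A sketch of the pattern described above, reconstructed from that description
rather than copied from Wikipedia's actual output (the author, title,
publisher, class name, and inline styles are all assumptions for
illustration):

```html
<!-- Pattern described above: italics switched off on <cite>,
     then restored on an inner <span> around the title -->
<cite style="font-style: normal">Doe, J.
  <span style="font-style: italic">An Example Title</span>.
  Example Press.</cite>

<!-- The same reference with <cite> limited to the title -->
<span class="citation">Doe, J.
  <cite>An Example Title</cite>.
  Example Press.</span>
```

In the second form the default rendering of <cite> already supplies the
italics, so no style overrides are needed.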
> > > In addition, the comments at zeldman.com use <cite> to reference
> > > authors of comments. While that specific example is younger than
> > > HTML5, this is merely an example of a relatively common use-case for
> > > <cite> that does not use it to signify "title of work."
> > As I said, the most common use of <cite> is to mark up italics. I
> > agree entirely that it's misused.
> I haven't said that it's misused. I apologize that you have
> misunderstood me. I have repeatedly and consistently contended that
> <cite> should be used for more than just titles. I believe that
> Zeldman's use is perfectly appropriate and correctly used.
I disagree. I view it as an example of semantic markup for the sake of it.
We can be more helpful to authors.
> > Blog commenters don't need to be marked up any differently than the
> > number of the comment -- that's a stylistic issue that varies from
> > blog to blog. I don't see the need for an element specifically for
> > people commenting on blogs. In most blogs that I've seen, the name
> > isn't even highlighted in any particular fashion.
> Again, this isn't just about citing people or "blog commenters"; this
> was just an example of a current, non-title, and correct use of <cite>
> according to current specifications. (And why does it matter if
> something is particularly highlighted? Is HTML supposed to be a
> presentational language? Why limit <cite> to the place of a
> presentational element?)
The only use case I'm aware of for <cite> is as a media-independent
presentation hook, yes.
> > > Existing tools that treat <cite> exclusively as "title of work" do
> > > so against every HTML specification out there (i.e., HTML 4.01 and
> > > earlier).
> > Existing tools generally have had very few problems in finding ways to
> > do things against every HTML specification out there. Over 90% of all
> > content on the Web is syntactically invalid in some way, and I'm sure
> > that more than 10% of content on the Web is generated by tools.
> Yes, and one of those tools is Wikipedia, which wraps entire citations
> in the <cite> element, not just titles. It correctly follows current
> HTML specifications in using <cite> to identify a citation.
Upgrading Wikipedia to HTML5's definitions will simplify Wikipedia. This
seems like a net win.
> > > > Indeed, there is a lot of misuse of the element -- as alternatives
> > > > for <q>, <i>, <em>, and HTML5's meaning of <cite>, in particular.
> > > >
> > > > Expanding it to cover the meanings of <q>, <i>, and <em> doesn't
> > > > seem as useful as expanding it just to cover works.
> > >
> > > I believe you mean "limiting it just to cover works" here.
> > I meant expanding it, since not all titles of works are citations.
> Any reference to a title of a work is by definition a citation.
> Therefore you are limiting <cite> to a subset of citation.
I disagree with your definition of "citation".
> > As a first approximation, titles are italics, and names are not. I
> > think that's a far closer approximation of typographical conventions
> > than lumping titles and names together into one default style.
> This doesn't seem to be an issue for you with the reuse of <legend> in
> another context, even though it is broken. So why is it an issue here?
> (And again, titles are not always in italics.)
<legend> is an example of the worst possible end result. It's not an
example of best practice. It's an issue with <legend> also; there are just
other factors at work there.
> > I haven't changed the spec. I continue to hold the position that
> > covering titles of works is more useful than covering titles of works
> > and names of people, and more useful than covering only names of
> > people or works that are explicitly cited.
> You are misconstruing my argument. This isn't about including names of
> people; that is just the most obvious non-title form of citation. This
> is about properly understanding what a citation can be and writing the
> specification for the <cite> element to account for those possibilities.
> Citations are references to works, people, etc. By limiting it to "title
> of work" you are actually limiting it to a subset of a subset, as many
> objects worth citing don't have conventional titles.
Unless you can demonstrate that there is a concrete benefit to doing what
you describe, I do not think it is a good idea. There are concrete
benefits to the definition currently in HTML5, namely it provides a good
first approximation of common typographic effects at a very low cost.
On Mon, 3 Aug 2009, Jeremy Keith wrote:
> Hixie asked:
> > What is the problem solved by allowing names to be marked up in the
> > same manner as titles?
> They are both entities being referenced (cited). It seems arbitrary to
> me to forbid referencing names with the <cite> element. HTML 4 already
> allows it, authors would have to change their existing behaviour
> (something to be avoided wherever possible) and when the meanings of
> other existing elements -- <i>, <b>, <small> -- are being *expanded*, I can't
> follow the logic in *restricting* the meaning of an element already
> being used broadly.
As noted above, I believe that this is an expansion as well (I don't think
HTML4's use of "source" was meant to include people). But in any case,
what you describe here isn't a problem.
What is the _problem_ solved by allowing names to be marked up in the same
manner as titles?
> > The problem solved by allowing titles specifically to be marked up is
> > that titles are usually typographically offset from the surrounding
> > text in a distinctive fashion. This doesn't apply to names.
> That's what CSS is for.
CSS is optional. We need the media-independent layer to make sure that we
get a reasonable rendering even without CSS. (Otherwise, why wouldn't we
just be using <span> for everything?)
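A minimal sketch of that difference, assuming no author stylesheet is
applied (the class name "title" is hypothetical):

```html
<!-- Browsers' default styles typically render <cite> in italics,
     so the title is set off even when no CSS is supplied -->
<p>I saw <cite>The Pirates of Penzance</cite> last night.</p>

<!-- A <span> has no default rendering, so without CSS the title
     is indistinguishable from the surrounding text -->
<p>I saw <span class="title">The Pirates of Penzance</span> last night.</p>
```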
> Okay, but it won't make any difference to authors like myself who will
> continue to use <cite> to mark up names.
> We can do this either by applying a Kenobian interpretation of the spec
> (e.g. a person is the work of their parents/peers/society and a person's
> name is therefore a "title of work")
The spec explicitly says people's names aren't titles of works.
> When it comes to language features, the browser makers don't have to do
> much -- just make sure the element shows up in the DOM. However, if authors
> refuse to implement a language feature as described in the spec, then
> the spec becomes fiction.
Agreed; that's why I base a lot of the spec on research about what authors
are doing. In practice, most authors aren't marking up names with <cite>.
> Authors use the <cite> element to mark up names.
Only a small minority do. Certainly not enough to make this a language
> It is often the most semantically appropriate element for marking up a
There is no need to mark up a name at all.
> (and then in itself is a good enough reason to use it
No, that's a cargo-cult approach to semantic markup.
> I don't think it makes sense to ignore the existing behaviour of
Existing behaviour of authors is not to mark up names with <cite>.
> Authors such as myself will continue to use the <cite> element to mark
> up names; our markup will still be conforming; validators won't flag up
> our choices as errors.
Your markup won't be conforming, though you are correct that the validator
won't catch this error.
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'