[whatwg] the cite element

Jim Jewett jimjjewett at gmail.com
Wed Oct 7 22:00:48 PDT 2009

On Mon, Oct 5, 2009 at 10:13 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Tue, 22 Sep 2009, Jim Jewett wrote:
>> On Tue, Sep 22, 2009 at 8:46 PM, Ian Hickson <ian at hixie.ch> wrote:
>> > On Wed, 16 Sep 2009, Erik Vorhes wrote:
>> >> On Wed, Sep 16, 2009 at 4:16 AM, Ian Hickson <ian at hixie.ch> wrote:

>> <cite> points to a primary source of the statement,
>> as opposed to someone merely named by the
>> statement.

> I hate to be so repetitive, but why is that beneficial?
> What is the semantic value of this?

You are welcome to say that argument by authority is so weak as to be
invalid, but it still happens.

Similarly, you are welcome to say that the academic habit of crediting
other authors (sometimes but not always for specific publications) is
silly, but it still happens.

> Is there as much semantic value in pointing to the primary source of a
> statement as there is in knowing that the word "earth" refers to the
> planet and not the dirt, for example? If so, what is that extra value?

I recently saw a .sig (where, by who?) with a quotation of one
character asking whether another character had said something.  I
could link to the archived email by title, but it has nothing to do
with .sig.  I could fake up a title, such as "Steven Bethard's .sig".
But that can get really awkward when referring to something informal.
"The Hiphopopotamus, in something that I couldn't identify even if I
saw it, but which I am titling as the original source of the .sig
quote".  The .sig itself (if the message weren't in plaintext) could
refer to an episode title, but ... that would be a little too pedantic
for a .sig quote.

"<cite>The Hiphopopotamus</cite>" seems a much more reasonable solution.

>> dialogues and transcripts and credits and theatrical scripts are all
>> arguably too fine-grained for a "citation", as opposed to a "label" or
>> "attribution", but they are certainly real use cases where the
>> attribution is important.

> Why? This is not a rhetorical question, I'm trying to get to the use case
> that means that there is an actual benefit to what you are asking for.

They are all cases where "who said it" or "who did it" is important --
sometimes far more important than what they actually said or did.
Reversing the characters in a dialogue can change the meaning.
Changing the attribution of a statement containing "I" in a criminal
trial can have important consequences.

> What does <cite> do that you want?

It says who to praise/blame/question for the original thought and/or
expression, as opposed to the decision to repeat (and possibly
ridicule) it.

That may not matter much in a technical discussion, but it matters in
lawsuits and it matters (for different reasons) in academia.

>> These three are even cases where print sources will typically shift
>> font in some way between the attribution (<b>Mephistopheles</b>) and
>> the actual statement, though not always in the same manner.  Of the
>> three that I found first,


> I'm not sure what you're saying here.

I was pointing out that attribution (to a person by name, not to a
work by title) was important enough that print sources distinguished
the way they presented the name from the way they presented the
actual statement.

>> >> On October 31, 2006, Michael Fortin suggested the following pattern:
>> >> <p><cite>Me:</cite> <q>Can I say something?</q>

>> ...
>> >> Aside from the current definition of <cite>, I think this would be a
>> >> good use of the element, ...

>> > I don't understand why we need an element here at all, and I don't
>> > understand why we would want to reuse <cite>, of all elements, if we did
>> > in fact need one.

>> That "Me:" isn't pronounced; it is metadata so important that it gets
>> written (in an odd style) in printed form.

> I don't buy that at all. It's just one way that people write dialogs, but
> as far as I can tell this is perfectly adequate:

>   <p>Me: Can I say something?</p>

> ...and you need neither <q> nor <cite>.

You *never* need <q> -- you could just use quotation marks.  And you
*never* need <li> -- you could just use the entity for a bullet.  But
being explicit is often judged worthwhile.

>> The punctuation (followed by a new sentence, complete with initial
>> capitals) is the closest a typewriter can come to markup, and scripts
>> will typically make the difference more emphatic.

> If it's _important_, then use <strong>. If it's just a keyword, then <b>
> is fine. If you're saying that the name is something that is in a
> different voice, then either the name or the text could be in <i>.

Typically, the name would be entirely silent; in a proper audio
rendition, it would be inferred from the change in voice.  Alas, those
of us reading (as opposed to hearing) the dialogue need some hints.  A
cite (or a hypothetical <attrib>) element is the right semantic hook
from which to hang this styling.

> If you need even more fine-grained styling, <span> with class="" seems
> fine here.

A convention of <p class="attrib_to Michael"> would work, though
people would (correctly) tend to see it as a compound class rather
than two unrelated class values.

But if attribution requires hoops like that, then there is really no
justification for an element like <cite> that would really just mean
<i class="title">.

> I don't really see the need for more than that though. It's not like there
> is a style so common that a new element would be useful.

It is very common for the distinction to be made obvious through
styling.  I agree that the precise format of that styling is not
standardized -- which is all the more reason to make a semantic
element and let authors use CSS to achieve their preferred styling.

>> I'll agree that it seems odd to have that many <cite> elements in such
>> close proximity, but it is the closest match I can find in the spec, and
>> it doesn't seem to be actually wrong.  Searching for lines by a
>> particular character is a fairly common use case.

> Doesn't "find in page" handle that fine?

Not in my opinion.
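
A plain text search for a name also hits every line in which some
other character merely says that name.  As a rough sketch of the
difference (Python standard library only; the first line of dialogue
is Michael Fortin's pattern from above, while the second line, the
class name SpeakerLines, and the helper lines_by are made up for
illustration), a tool could select lines by the <cite> label instead:

```python
from html.parser import HTMLParser


class SpeakerLines(HTMLParser):
    """Collect (speaker, line) pairs from <p><cite>Name:</cite> <q>text</q></p>."""

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (speaker, line) tuples
        self._in = None      # "cite", "q", or None
        self._speaker = ""
        self._line = ""

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new paragraph starts a new line of dialogue.
            self._speaker, self._line = "", ""
        elif tag in ("cite", "q"):
            self._in = tag

    def handle_endtag(self, tag):
        if tag in ("cite", "q"):
            self._in = None
        elif tag == "p" and self._speaker:
            # Drop the trailing colon that is part of the printed label.
            name = self._speaker.strip().rstrip(":")
            self.pairs.append((name, self._line.strip()))

    def handle_data(self, data):
        if self._in == "cite":
            self._speaker += data
        elif self._in == "q":
            self._line += data


def lines_by(html, speaker):
    """Return only the lines attributed to `speaker` via <cite>."""
    parser = SpeakerLines()
    parser.feed(html)
    parser.close()
    return [line for who, line in parser.pairs if who == speaker]


# Second paragraph is an invented reply, added only to show the false hit.
SCRIPT = """
<p><cite>Me:</cite> <q>Can I say something?</q></p>
<p><cite>You:</cite> <q>Me first.</q></p>
"""

print(lines_by(SCRIPT, "Me"))
```

Here a substring search for "Me" matches both paragraphs, while
lines_by(SCRIPT, "Me") returns only the line attributed to that
speaker.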

But as long as you're minimizing the markup, that suggestion does
bring up another question:

Should the character names be invisible, because they aren't spoken
aloud?  And does this mean they'll need an element (perhaps only span
with a specific class) anyhow?  And that this element-class
combination should trigger very different behavior depending on the
output device and the user's preference?

>> > That seems like a really strange and eclectic variety of uses.

>> All boil down to "says who?".  A title of a work indicates something
>> about when they said it, and how (formally enough to have a title), but
>> ... so does a hyperlink to the author.

> "title of work" doesn't boil down "says who":

Yes it does, in the citation context.  In academic articles, it really
means "somebody other than me said it, so I don't have to justify it".
The full citation also indicates when and where they said it, which
may help a reader to judge how likely it is to be true.

In less formal environments, the title is what someone would use to
get a copy of the original (I want to watch the hiphopopotamus!).  In
some cases, you want to be more specific (*which* Bond movie), in
others, it isn't worth the effort.

>   <p>My favourite book is <cite>Pandora's Star</cite>.</p>

> ...so if that is bundled with the others, I stand by my statement that
> this is a really strange and eclectic variety of uses.

The only reason to mark it up at all (in this case) would be that the
page author is singling it out.  There may not be much detail on *why*
it is worthy of special attention, but it is worthy, and the author
went to some effort to say so.

>> > For example, it seems odd ...

>> The difference is that John Adams and Fred Fox were the ones saying
>> something -- the cite was attributing something to them.  They were
>> "actors" as opposed to "objects" in the linguistic sense.  Ian was
>> simply an "object" (a direct object, in this case) that happens to be
>> human.

> I've started asking people what they think the errors are in the following
> snippet:
>  <article>
>   <h1>Welcome to my home page</h1>
>   <p>My name is <cite>Bob Smith</cite>.</p>

Wrong, but probably not harmful in practice.  Sort of like messing up
a rev=made.

>   <p>I like the book <cite>Pandora's Star</cite>.</p>

I wouldn't personally use it, unless I were also using the cite
element to provide ISBN information or some such, but I consider it

If I were only using it for styling, I would write either <i>Pandora's
Star</i> or <i class="title">Pandora's Star</i>, depending on how
careful I was feeling.

>   <p>What do you think?</p>
>   <article>
>    <cite>James Smith</cite>


>    <p>I'm with you <cite>Bob</cite>!</p>

Invalid.  Since you haven't said what Bob suggested, Bob is just a
name, not an actual source.  If earlier text had explained what the
idea (with which James Smith agrees) actually was, then that *earlier*
text could reasonably be wrapped in a cite.

>   </article>
>   <article>
>    <cite>Fred</cite>

Fine.  Fred is the author/instigator of the next portion.

>    <p><cite>James</cite> wrote:</p>

Fine -- Fred is himself crediting (citing) James.

>    <blockquote><p>I'm with you <cite>Bob</cite>!</p></blockquote>

Still wrong, for the same reasons.

>    <p>But I disagree, I think <cite>Pat</cite>'s blog post is better.

I would change that to <cite>Pat's blog post</cite>, because the
citation really is to that specific work, and the detailed information
is available.

>   </article>
>  </article>

> ...but frankly I'm having trouble working out which you are proposing to
> have valid and not, which is not a good sign.

It doesn't matter that something is a proper noun; it matters that
something is the (linguistic) Agent responsible for whatever is being

>> Are you seriously saying that there is no need to attribute to "names
>> and other sources of quote attribution (including identifying speakers
>> in dialog)", or to markup the user name of "names of blog post
>> commenters and authors (in the context of their comments, posts, etc.)"

> As far as I can tell, there is no need, no. What is the need?

Because readers often do care who said it, or who said it first.

> I _really_ don't see why we'd want to use <cite> here, given that as you
> say, it doesn't even give the right styling.

Because we don't have an <attrib> or even a <credit> element, and so
<cite> is the closest match.  Defining it as a synonym for <i
class="title"> seems wrong in both directions -- both promoting
something that shouldn't be an element, *and* preventing sensible use
of an appropriately named element.

>> The original purpose of a citation was so that readers could, if they
>> wished, go back to the original.  That is much easier when the original
>> is only a click away, and so even more important.

> That's what <a> is for. No need for <cite> for that purpose.

If you want to say it should be <a class="cite"> then I'll mostly
agree -- except that the need for credits does sometimes appear even
when hyperlinks are not available.

>> >> (1)  Examples of citing a person, arguably the creator.

>> >> (1a)  http://www.hiddenmickeys.org/Movies/MaryPoppins.html

>> >> The cite element is used to give credit to the person who
>> >> found/verified each "Hidden Mickey":
>> >>     <CITE>REPORTED: <A HREF="mailto:...">Beverly O'Dell</A> 12 MAR 98</CITE>
>> >>     <CITE>UPDATE: Greg Bevier 29 JUL 98</CITE>

>> > I don't think that's a usage anyone is actually arguing for though, is
>> > it?

>> Yes, I do think so.  The person in the cite element is the source of the
>> information.  This is similar to using cite for the author of a comment
>> at a blog.

> But with the word "REPORTED:" inside it? With the date inside it? Surely
> that isn't what you are requesting. It doesn't match any of the
> definitions you gave earlier, as far as I can tell.

I see them as a "full citation" variant.  They not only say who, they
also say when and in what manner/with what certainty.

[regarding bylines] ...

>> I agree that they would be better off with a <credit> element.  I also
>> believe that <credit> would be better for some of the use cases that
>> seem to be contentious, like blog-comments-author.  (1a, 1c, and 1d
>> would also be better off with <credit>, in my opinion.)  An <attrib>
>> element might be better still, as that would also work sensibly in
>> dialogues.
>> But <cite> is clearly the best option unless/until the more specialized
>> <credit> (or attrib) is added.

> No, <p> is "clearly" the best option: it has the right styling, and
> doesn't require us to make the definition of an element more complicated.

> Why would <cite> be a better option than <p>?

For the same reason that <aside> or <footer> is better than <div>.  A
byline may technically be a paragraph (or a span), but it is a very
specialized and odd type of paragraph.

>> That almost sounds as though the real specification were:
>>    "Book Title, even if you aren't quoting or
>>     paraphrasing anything -- this isn't really about
>>     citations; we just call it cite for historical reasons."

> That's exactly what HTML5 says, yes.

If, for some bizarre reason, it was deemed appropriate for HTML5 to
continue saying this, then the element should be deprecated in favor
of <i>, the same way that <acronym> is deprecated in favor of <abbr>.

> On Wed, 23 Sep 2009, Jim Jewett wrote:
>> Smylers wrote:
>> > If authors are spending time on using an element which has no effect
>> > on users (and Hixie's pointed out that in many cases where <cite> is
>> > used other than for titles of works authors use CSS to remove the
>> > default italics, to ensure that users don't actually have the presence
>> > of the <cite> conveyed to them) then there's no reason for HTML5 to
>> > continue to support it.

>> If they are merely changing the styling to some other distinctive form,
>> there is still reason to support it.  If they are truly going to the
>> effort of adding it, then working to make it indistinguishable, that
>> tells me the element is *very* important (if perhaps only for
>> bureaucratic reasons), and the problem is with the default styling and
>> UI.

> You have an odd use of the word "important". To me, it seems like if
> authors are going out of their way to use an element which has zero effect
> on anything, then they are in fact wasting their time, not doing something
> important. Then again, I also think that "bureaucratic reasons" and "very
> important" are contradictory.

"Zero visual effect in most browsers" is very different from "zero
effect on anything."

<cite> -- particularly when restyled to not be visually apparent --
may be one of the few aspects of HTML which is more important to other
classes of products than to browsers.
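
For instance, an indexer or archiving tool never sees the stylesheet
at all.  A minimal sketch, assuming nothing beyond the Python standard
library (CiteCollector and collect_cites are hypothetical names; the
sample markup is the Hidden Mickeys snippet quoted above, with an
inline style added by me to stand in for an author who removed the
italics):

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Gather the text content of every <cite>, ignoring presentation."""

    def __init__(self):
        super().__init__()
        self.cites = []
        self._depth = 0   # >0 while inside a <cite>
        self._buf = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <CITE> arrives as "cite".
        if tag == "cite":
            if self._depth == 0:
                self._buf = []
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "cite" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                self.cites.append(" ".join("".join(self._buf).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._buf.append(data)


def collect_cites(html):
    collector = CiteCollector()
    collector.feed(html)
    collector.close()
    return collector.cites


# The style attribute on the second element is my addition, not the site's.
PAGE = """
<CITE>REPORTED: <A HREF="mailto:...">Beverly O'Dell</A> 12 MAR 98</CITE>
<CITE style="font-style: normal">UPDATE: Greg Bevier 29 JUL 98</CITE>
"""

for credit in collect_cites(PAGE):
    print(credit)
```

Both credits come back as text, whether or not the page styled them
to look like ordinary prose.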

