[whatwg] the cite element

Smylers Smylers at stripey.com
Tue Sep 15 10:40:33 PDT 2009


Erik Vorhes writes:

> On Thu, Aug 27, 2009 at 7:08 PM, Ian Hickson <ian at hixie.ch> wrote:
> 
> > > Earlier, when justifying why you changed the definition of <cite>
> > > from HTML 4.01, you said:
> > >
> > > > I don't think it makes sense to use the <cite> element to refer
> > > > to people, because typographically people aren't generally
> > > > marked up anyway. I don't really see how you'd use it to refer
> > > > to untitled works.
> > >
> > > This usage is an example of when people are typographically marked
> > > up.
> > 
> > It's a minor case. The semantic here wouldn't be "name of person",
> > it would be "name of person when immediately following a quote in a
> > pullquote", which is far too specific to deserve a whole element.
> > 
> 
> I don't think anyone is arguing that there should be a new element
> exclusively for the above use or that <cite> should be limited only to
> that definition ("name of person when immediately following a quote in
> a pullquote" or the more forgiving "person to whom the quote is
> attributed"). Still, it would be nice to be able to use <cite> to mark
> up people being cited (along with other citations that don't
> explicitly involve a work's title).

But what do those situations have in common?  Titles of works are
rendered in some way which makes them stand out (typically italics), so
they aren't mistaken for words the author is using with their normal
meaning.

That doesn't apply to blockquote attributions as a whole, where the
attribution is distinguished by dint of being the line just after the
quote.  And an attribution may include the author's name, the title of
the work, and a page number -- of which the title of the work needs
marking up in some way, ideally the same way as titles of works are
elsewhere in the document.

A publisher's house style may require that titles of works are
underlined instead of italicized, or in purple, or in roman text but
with single quotes around them ... but it's exceedingly unlikely they
would use exactly the same style for blockquote attributions.

Similarly a speaking browser would likely read titles of works
differently from the surrounding text, but doesn't need to give
blockquote attributions the same treatment.  The two are different.
They are conveyed differently to users.

I can appreciate that having to use:

  blockquote + div { text-align: right; }

(or whatever) isn't as nice as having a <blockquoteattribution> element.
But if you agree a <blockquoteattribution> is too niche to have its own
element, cramming it into an element which already has a specific
meaning, and conveys something different to users, isn't the answer.

Where a specific element for your needs doesn't exist the right course
of action is to use a generic one.
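
As a rough sketch of what I mean (the quote, name, title, and page
number are placeholders I've made up, and the <div> is just one choice
of generic element), to go with the CSS rule above:

  <blockquote>
    <p>Text of the quotation goes here.</p>
  </blockquote>
  <div>Author Name, <cite>Title of the Work</cite>, p. 12</div>

Only the title gets <cite>; the attribution as a whole is a plain <div>
which the stylesheet picks out by its position after the blockquote.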

> > ... more importantly, the element's style is made non-italics, thus
> > completely defeating the entire point of marking up the element in
> > the first place.
> 
> I'm not sure this is a reasonable argument against the use of <cite>.
> Following this line of reasoning, it is not worthwhile to mark up
> titles of works if they are *not* to be italicized;

It's only worthwhile marking up _anything_ if there's to be some benefit
to readers.  If you do not wish your users to get clues as to which
words are titles of works then indeed you should not mark them up at
all.

Indeed going further, if you do not wish text to be conveyed to users as
being the title of a work then you must not mark it up with <cite>.
Even if you use CSS to remove the italics, only users of graphical
browsers with CSS enabled and good vision will be aware of that; many
other users will still have the words in question italicized or
otherwise conveyed in the manner appropriate for titles of works.  It is
unfair on such minority users to rely on CSS for removing meaning from
elements.

(Note this isn't about whether it's italics or some other styling you've
chosen to convey which are titles of works to readers.  If you wish to
remove the italics and use something else instead, <cite> is still the
right element.)
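
As a sketch of that, a house style which underlines titles of works
instead of italicizing them could be something like:

  cite { font-style: normal; text-decoration: underline; }

That still tells readers which words are titles, and users without CSS
simply get their browser's default rendering of <cite>.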

> moreover, it is even pointless to mark up headings using <h1>-<h6> if
> you intend to remove the bold styling.

Not so long as you leave the larger font sizes, or replace it with some
other styling which conveys they are headings -- then all users will
have conveyed to them that the headings are in fact headings, regardless
of their browsing environment.

> The counter to this approach is that <h1>-<h6> provide semantic value
> even when styled differently from the default.

Quite.

> But the same can be said for <cite>, whether it is defined as "title
> of work" or as a more general "citation."

Nope, because those two need conveying differently to users, and the
semantics browsers convey by default are those appropriate for titles of
works.

> Even if titles are by far the most common use case, it doesn't make
> sense to exclude other semantically justifiable uses of what appear to
> be valid uses of the <cite> element, at least according to the English
> language usages associated with the word "cite."

At this point in HTML's life many elements have non-ideal names, from
<a> upwards.  Let's have the most useful language we can and put up with
some names being unfortunate -- the alternative is accepting the tag
names we've been lumbered with and trying to come up with the best
dictionary definitions, regardless of whether they match what it's
actually useful to mark up.

> Put another way, if you had no prior knowledge of the current HTML5
> definition of <cite> (and perhaps any other specification's definition
> of the element), what would seem to be logical and appropriate uses of
> the element?

That one I can answer, because I've been in that situation.  Many years
ago, well before HTML5, (and before I knew HTML4 well) I was writing a
document, came to a mention of a title of a work, and wondered how in
HTML one indicates that so as to have it rendered in italics (or
whatever) -- it's something that's needed, so HTML4 presumably had an
element for it.

On perusing a list of HTML4 elements, <cite> was clearly the one
intended for that purpose.  Its name was a little obscure, and the HTML4
spec's description of it seemed a touch odd, but since an element to
mark up titles is useful (and one for other citationy-type-stuff isn't),
I concluded that was what it's for.  That <title> already has a meaning
in HTML explains the obscurity of its name, and the default rendering of
<cite> being italics confirmed my guess.

In other words, the HTML5 definition of <cite> is the one that made
logical sense to me years before I saw HTML5.  I even managed to divine
that meaning from the HTML4 spec.

> > > By changing the definition of <cite> in HTML5, you are saying that
> > > numerous users of the HTML4 definition of <cite> are no longer
> > > conforming, and not really giving any alternative that does the
> > > same job.
> > 
> > <span> does the job fine, in the rare cases where someone really
> > wants to mark up someone's name.
> 
> Unless there is some semantic value to the name being more than "just"
> a name, yes.

Such as?

> > > In the absence of that, having <cite> mean simply a source being
> > > cited, and allowing the author to determine whether they want to
> > > use it for titles of works, authors, or entire citations, seems to
> > > be both reasonable and compatible with existing content.
> > 
> > I think having it mean "title of work" only is more useful. Having
> > it mean all three will mislead authors into using it for all three,
> > and then cause them undue pain as they work around the default
> > styling.
> 
> I'm not sure I buy the "undue pain" argument, especially since there
> are plenty of times authors may wish to deviate from the default
> italic style of <cite> (using either "title of work" or "citation" as
> the definition):
> 
> - A normally italicized title that is in a block of text that is also
> italicized (in which case the general use would be to remove italics
> from <cite>)
> - A title of a work that according to a style guide should not be
> italicized (in which case a class value would probably be added to the
> <cite> element, such as "<cite class="essay">The Freedom to
> Offend</cite>").

That's OK, because you're just using different house style to convey the
same semantics.  And because the re-styling can apply throughout a
document to all instances of <cite>.

Indeed you can even apply such a house style on documents you haven't
had control of the authoring on, if <cite> has been used in accordance
with the HTML5 definition.

Whereas if <cite> has been used with multiple meanings then you have to
use classes (or similar) to divine those -- and make agreements with all
parties involved as to which classes mean which.
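
To illustrate, with purely hypothetical class names that every author
and every restyler would have to agree on in advance:

  cite.title  { font-style: italic; }
  cite.person { font-style: normal; }

With a single meaning for <cite>, one rule on the bare element covers
every document, whoever wrote it.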

> Moreover, what kinds of difficulties do you suppose?  Nested <cite>
> elements? I don't think this would be any more a challenge than nested
> lists, <strong> in bolded text, or <em> in italicized text, in terms
> of dealing with default styles.

In all those cases the element's underlying purpose is still being
conveyed to users (all users), just with non-default styling.  Whereas
with non-title uses of <cite> people are typically _removing all
styling_ to make it look like the surrounding text.  That confuses, and
disadvantages, those without visual CSS user-agents.

It also puts authors to a lot of unnecessary work, marking up something
they didn't need to and then undoing the effect of that marking up.

> > People are actively overriding the styles <cite> because they think
> > it's the right element, but it has the wrong effect. I don't know
> > what more harm we could be causing here. The element is failing at
> > its only purpose, because people think they're being Semantically
> > Right.
> 
> I'm not sure I understand your reasoning here. People who are using
> <cite> according to the HTML 4.01 specification are wrong for doing
> so?

Regardless of what HTML4 says (it's woolly; people's interpretations
differ; the point of HTML5 is to improve on HTML4, not republish it
verbatim), people who are using <cite> to mark things up that aren't
titles of works and which they don't want conveying as such to users
(and because of that are using CSS to remove the italics) are wrong.

It's unfortunate if HTML4 misled them.  But that's no reason to make
the situation any worse by encouraging others to make the same mistake.

> Are you retroactively finding fault because you have redefined <cite>
> in the HTML5 specification?

Doing the above never made sense, notwithstanding any interpretations of
HTML4 which suggest otherwise.

> And as Jeremy Keith and others have pointed out, there's nothing wrong
> with overriding default presentational styles. I'm not sure why it
> should be such a cause for concern with <cite>.

Overriding is very different from trying to remove the effects of.

> I believe I understand why you have chosen to define <cite> as it
> appears in the current draft of the HTML5 specification; I just happen
> to believe that the current definition is not as useful as it could be
> and (more importantly) invalidates current reasonable uses of the
> element.

Why is that important?

Automated validators generally won't catch it, so it won't make
previously valid pages suddenly spew dozens of errors (a concern with
other changes from HTML4).

And if authors of such pages, on discovering that non-title uses of
<cite> aren't valid, then remove them, that's a win for users of non-CSS
browsers.

Smylers


