[whatwg] the cite element

tjeddo tjeddo at gmail.com
Fri Oct 9 10:23:21 PDT 2009


On Tue, Oct 6, 2009 at 12:52 PM, Gordon P. Hemsley <gphemsley at gmail.com> wrote:
> I also propose allowing parenthetical citations and footnote markers
> (as is used in the various W3C/WHATWG specifications) to also be
> marked up with <cite>, though I'm not sure if TabAtkins agrees with me
> on that point.
>

I agree.  In fact, I would argue that this should be the primary use case
for the cite element (i.e., acknowledging sources). The definition in
the current HTML5 draft is inconsistent with the HTML4 specification,
and it now prohibits the cite element from being used for actual
in-text citations, which is what it was primarily intended for in HTML4
(see below). I believe it would be most beneficial to provide a new
element that simply marks up the titles of works, or just settle for
the <i> or <span> tags, and reserve the cite element for true in-text
citations.

To support my point, here are some relevant quotes and characteristic
examples from the HTML4.01 specification [5].

Here is the definition provided for the cite element in HTML4.01.
"CITE: Contains a citation or a reference to other sources [5, p. 91]."

The HTML4.01 specification provides the following examples
demonstrating the uses of the cite element:

"As <CITE>Harry S. Truman</CITE> said, <Q lang="en-us">The buck stops
here.</Q> [5, p. 91]"

and more importantly

"More information can be found in <CITE>[ISO-0000]</CITE> [5, p. 91]."

Both of these examples are now illegal under the current HTML5 draft
definition of the cite element. While the usage of the cite element
certainly needs clarifying, it is the second example, the in-text
citation, that most closely matches the spirit and intention of the
HTML4 definition. HTML5 should focus on refining the specification to
handle this second example. I've taken a shot at formalizing the
emerging concepts people have been discussing on this mailing list to
support valid in-text citations using the cite element. For those
looking for the value proposition in all this, you can skim to the end
of the email.
Constructive criticism and corrections are appreciated.

A Proposed Markup Scheme for the CITE element in HTML5

I've sampled a variety of passages containing real citations and marked
them up in the citation scheme that is emerging from the discussion on
this mailing list. This way I don't have to overly contrive my examples.
My goal here is to illustrate how the cite element can be revised to
provide first-class citation support in HTML5.  Also, all these
examples are taken from sources about writing, so there is a good
chance we will all agree they are valid examples.

Example 1A [1]:
    Human beings have been described as "symbol-using
    animals" (Burke 3).

Candidate HTML5 Markup:
    <span id="symbols">Human beings have been described as
    <q>symbol-using animals</q></span>
    <cite for="symbols" href="#bib-burke">(Burke 3)</cite>.

    Note: The cite element is used here to make the citation
    relationship between the paraphrased/quoted content
    and the original source explicit. The 'for' attribute indicates
    the paraphrased/quoted content that, in this case, is the
    content of the span element with id="symbols". The href
    attribute provides a URI that resolves to a bibliography
    entry (in this case on the same page), or an actual online
    resource that contains the paraphrased/quoted content.
    HTML5-aware browsers would render the "(Burke 3)" text
    as a hyperlink that moves the browser's displayed area
    to the fragment "#bib-burke" on the same page
    (the bibliography entry). The href could alternatively be an
    explicit URI, with or without a fragment identifier appended,
    that navigates to a separate page.
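
As a rough illustration of the relationship described in the note above, here is a minimal Python sketch of how a tool might pair each cite element with the content its 'for' attribute points at. It uses only the standard library's html.parser module. The class and function names (CiteCollector, resolve_citations) are hypothetical, and the 'for' and 'href' attributes on cite are the ones proposed in this email, not part of any published specification.

```python
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Gathers the text of every element that has an id, plus every <cite>."""

    def __init__(self):
        super().__init__()
        self.open_elements = []   # stack of (tag, id-or-None) for open elements
        self.text_by_id = {}      # id -> text content collected so far
        self.citations = []       # one dict per <cite>: for, href, label
        self._cite = None         # the <cite> currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.open_elements.append((tag, attrs.get("id")))
        if attrs.get("id"):
            self.text_by_id.setdefault(attrs["id"], "")
        if tag == "cite":
            self._cite = {"for": attrs.get("for"),
                          "href": attrs.get("href"),
                          "label": ""}

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self.open_elements) - 1, -1, -1):
            if self.open_elements[i][0] == tag:
                del self.open_elements[i:]
                break
        if tag == "cite" and self._cite is not None:
            self.citations.append(self._cite)
            self._cite = None

    def handle_data(self, data):
        if self._cite is not None:
            # Text such as "(Burke 3)" is the marker, not cited content.
            self._cite["label"] += data
            return
        for _, element_id in self.open_elements:
            if element_id:
                self.text_by_id[element_id] += data


def resolve_citations(markup):
    """Return one record per <cite>: the cited text, its source, and its label."""
    parser = CiteCollector()
    parser.feed(markup)
    parser.close()

    def squeeze(text):
        return " ".join(text.split())   # collapse runs of whitespace

    return [
        {
            "cited_text": squeeze(parser.text_by_id.get(c["for"], "")),
            "source": c["href"],
            "label": squeeze(c["label"]),
        }
        for c in parser.citations
    ]


example_1a = """
<span id="symbols">Human beings have been described as
<q>symbol-using animals</q></span>
<cite for="symbols" href="#bib-burke">(Burke 3)</cite>.
"""

for citation in resolve_citations(example_1a):
    print(citation)
```

Resolution happens only after the whole fragment has been read, so a cite element that appears before the element it refers to (as in Option 1 of Example 1B) is handled the same way.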

Example 1B [1]:
    Human beings have been described by Kenneth Burke as
    "symbol-using animals" (3).

    Note: An MLA-valid variant of Example 1A

Candidate HTML5 Markup:
    [Option 1]

    Human beings have been described by
    <cite for="symbols" href="#bib-burke">Kenneth Burke</cite>
    as <q id="symbols">symbol-using animals</q> (3).

    Note: It is the content of the whole sentence and not just
    the part between <q> tags that needs to be attributed to
    the author, therefore something like option 2 may be more
    appropriate.

    [Option 2]

    <span id="human-trait">
    Human beings have been described by
    <a href="#bib-burke">Kenneth Burke</a>
    as <q>symbol-using animals</q> (3).
    <cite for="human-trait" href="#bib-burke"></cite>
    </span>

    Note: Here an empty cite element is provided with just
    attributes to make the citation relationship between
    the cited content and the original source explicit.
    <a> tags around the author's name can be optionally
    added to provide a hyperlink from the author's name to the
    bibliography entry.

Example 2 [4, p. 7]:
    For this reason, the American computer scientist Leslie Lamport
    has developed the LaTeX format (Lamport, 1985), which provides a
    set of higher-level commands for the production of complex
    documents.

Candidate HTML5 Markup:
    <span id="latex-fmt">For this reason, the American computer
    scientist Leslie Lamport has developed the LaTeX format
    <cite for="latex-fmt" href="#bib-lamp85">(Lamport, 1985)</cite>,
    which provides a set of higher-level commands for the production
    of complex documents.</span>

Example 3A [3, p. 95]:
    The rate of convergence is quadratic, as shown by Wilkinson [27].

Candidate HTML5 Markup:
    <span id="quad-conv">
    The rate of convergence is quadratic, as shown by Wilkinson
    <cite for="quad-conv">[27]</cite>.
    </span>

Example 3B [3, p. 95]:
    Several variations have been developed [2], [7], [13].

Candidate HTML5 Markup:
    <span id="variations">
    Several variations have been developed
    </span>
    <cite for="variations" href="#bib-2">[2]</cite>,
    <cite for="variations" href="#bib-7">[7]</cite>,
    <cite for="variations" href="#bib-13">[13]</cite>.

Example 3C [3, p. 108]:
    Knuth [164, p. 3] notes that "Many readers will skim over
    formulas on their first reading of your exposition. Therefore,
    your sentences should flow smoothly when all but the simplest
    formulas are replaced by 'blah' or some other grunting noise."

Candidate HTML5 Markup:
    Knuth <cite for="qt-knuth" href="#bib-164">[164, p. 3]</cite>
    notes that <q id="qt-knuth">Many readers will skim over
    formulas on their first reading of your exposition. Therefore,
    your sentences should flow smoothly when all but the simplest
    formulas are replaced by &lsquo;blah&rsquo; or some other
    grunting noise.</q>

Example 4 [2]:
    Students having a hard time finding databases isn't a new
    phenomenon. At the University of Washington, they have
    problems too.

        With the addition of so many new databases to the
        campus online system, many students were having
        difficulty locating the database they needed. At
        the same time, the role of Session manager had
        evolved. The increased importance of the Session
        Manager as a selection tool made it a part of the
        navigation process itself.
        (Eliasen, 1997, p. 510)

Candidate HTML5 Markup:
    <p>
    Students having a hard time finding databases isn't a new
    phenomenon. At the University of Washington, they have
    problems too.
    </p>
    <blockquote id="student-difficulty">
        <p>
        With the addition of so many new databases to the
        campus online system, many students were having
        difficulty locating the database they needed. At
        the same time, the role of Session manager had
        evolved. The increased importance of the Session
        Manager as a selection tool made it a part of the
        navigation process itself.
        <cite for="student-difficulty" href="#bib-Eli">
        (Eliasen, 1997, p. 510)
        </cite>
        </p>
    </blockquote>

Additional Thoughts on this Citation Markup Approach:

The 'for' attribute can optionally be dropped, in which case the
content requiring the citation is implicitly the content of the parent
element that contains the cite element.
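
To make the fallback rule above concrete, here is a minimal Python sketch, again using only the standard library's html.parser module. It builds a small element tree and picks the cited content from the 'for' target when one is given, or from the parent element otherwise. The names Node, TreeBuilder, parse, and cited_content are hypothetical, and the behaviour for a 'for' value that matches no id (returning an empty string) is my own assumption, since the proposal does not define it.

```python
from html.parser import HTMLParser


class Node:
    """A very small element node with a pointer back to its parent."""

    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.attrs = dict(attrs)
        self.parent = parent
        self.children = []   # child Nodes and text strings, in document order

    def raw_text(self):
        """Text content of this element, skipping any <cite> markers inside it."""
        parts = []
        for child in self.children:
            if isinstance(child, str):
                parts.append(child)
            elif child.tag != "cite":
                parts.append(child.raw_text())
        return "".join(parts)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root", [])
        self.current = self.root
        self.by_id = {}   # id -> Node
        self.cites = []   # every <cite> Node, in document order

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        self.current = node
        if node.attrs.get("id"):
            self.by_id[node.attrs["id"]] = node
        if tag == "cite":
            self.cites.append(node)

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def parse(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder


def cited_content(cite, by_id):
    """Explicit 'for' target if present, otherwise the cite element's parent."""
    target_id = cite.attrs.get("for")
    target = by_id.get(target_id) if target_id else cite.parent
    if target is None:
        return ""   # assumption: a dangling 'for' resolves to nothing
    return " ".join(target.raw_text().split())


example_4 = """
<blockquote id="student-difficulty">
    <p>
    The increased importance of the Session Manager as a selection
    tool made it a part of the navigation process itself.
    <cite href="#bib-Eli">(Eliasen, 1997, p. 510)</cite>
    </p>
</blockquote>
"""

tree = parse(example_4)
print(cited_content(tree.cites[0], tree.by_id))
```

Note that with 'for' dropped from the Example 4 markup, the implicit target is the p element that directly contains the cite element, not the enclosing blockquote.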

User agents can only assume these cite element semantics apply when
they have previously detected the HTML5 doctype.

As mentioned on this mailing list, the 'cite' attribute is a
semantically closer fit than the 'href' attribute within the cite
element; however, it has been noted that it looks redundant with the
element name. I am impartial on this one and just used 'href' in the
examples, although Hugh Guiney also mentioned that XHTML2 had planned
to use the cite element and attribute together (e.g., <cite
cite="...">).

In most of the proposed markup in the examples, the default italic
styling applied to the content of the cite element is not desirable,
but this is easily fixed with the CSS rule: cite {font-style:
normal}

I believe there is a significant value proposition for adding true
citation support to HTML5, for example:

* Search engines will have structured citation content to index.
Algorithms can be developed to better associate content with authors
and specific quotes with their speakers. This ultimately means more
relevant searches for the Internet community.
* If a standardized microdata vocabulary emerges for marking up
bibliography entries to complement this citation approach, crawlers
can be updated to traverse these citation structures and extract
specific information more readily.
* Professors might ask their students to write their papers in
wiki-like content management systems that encode citations this way,
thereby making it possible to use tools that check for plagiarism.
* Using CSS, authors can readily highlight all their content that
contains citations to do a double check before publishing.
* Dialogs can be marked up to make explicit who a statement belongs
to. Once again this structure can be exploited by search engines to
provide more relevant searches.
* Overall, we have a chance to standardize how authors encode citations
in HTML, which should further encourage Web authors to adopt the
recommended practice of supporting their claims with sources.
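
As one small example of the tooling these points have in mind, here is a minimal Python sketch of a pre-publication check that flags citations whose references do not resolve within the page. It relies only on the standard library's html.parser module. The names CitationLinter and check_citations, the wording of the messages, and the bibliography list in the sample page are all hypothetical; nothing here is defined by the proposal itself.

```python
from html.parser import HTMLParser


class CitationLinter(HTMLParser):
    """Records every id in the page and the 'for'/'href' values of every <cite>."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.cites = []   # (for_value, href_value) per <cite>, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        if tag == "cite":
            self.cites.append((attrs.get("for"), attrs.get("href")))


def check_citations(markup):
    """Return a list of human-readable problems found in the citation markup."""
    linter = CitationLinter()
    linter.feed(markup)
    linter.close()

    problems = []
    for target, href in linter.cites:
        # The cited content must exist somewhere in the page.
        if target and target not in linter.ids:
            problems.append(f"cite for='{target}' points at a missing id")
        # Same-page sources (href starting with '#') must exist too.
        if href and href.startswith("#") and href[1:] not in linter.ids:
            problems.append(f"cite href='{href}' points at a missing id")
    return problems


# Example 3B plus a made-up bibliography that lacks an entry for [13].
page = """
<span id="variations">Several variations have been developed</span>
<cite for="variations" href="#bib-2">[2]</cite>,
<cite for="variations" href="#bib-7">[7]</cite>,
<cite for="variations" href="#bib-13">[13]</cite>.
<ol>
  <li id="bib-2">First source</li>
  <li id="bib-7">Second source</li>
</ol>
"""

for problem in check_citations(page):
    print(problem)
```

Only same-page fragments are checked; hrefs that point at separate pages are left alone, since verifying them would require fetching those pages.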

Regards,
Tim Eddo

References for the Examples: (...not the sources cited in the examples)

[1] Purdue OWL. "MLA 2009 In-Text Citation: The Basics." The Purdue
OWL. Purdue U Writing Lab, 10 May 2008. Web. 08 Oct 2009.
<http://owl.english.purdue.edu/owl/resource/747/02/>

[2] "Mathematics Research Tutorial: In." University of North Carolina
at Chapel Hill Libraries, Web. 9 Oct. 2009.
<www.lib.unc.edu/instruct/math/citing/intext.htm/>

[3] Higham, Nicholas J. Handbook of Writing for the Mathematical
Sciences. Philadelphia: SIAM: Society for Industrial and Applied
Mathematics, 1998.

[4] Daly, Patrick W., and Helmut Kopka. Guide to LaTeX (4th Edition)
(Tools and Techniques for Computer Typesetting). New York:
Addison-Wesley Professional, 2003.

[5] "Paragraphs, Lines, and Phrases." World Wide Web Consortium - Web
Standards. 9 Oct. 2009.
<http://www.w3.org/TR/html401/struct/text.html#h-9.2.1>.


