[whatwg] Various threads with feedback on HTML elements

Tue Oct 15 13:00:08 PDT 2013

On Thu, 13 Dec 2012, Tantek Çelik wrote:
> 
> If anything I think I've grown more conservative regarding new elements 
> in this regard based on experience teaching authors. I used to support 
> <hgroup>, and though while I personally find it useful in content, I no 
> longer find its addition useful enough for authors in general to 
> overcome the confusion it adds.

Could you elaborate on what confusion it adds? Pointers to Web developer 
discussion fora where this confusion is evident would be particularly 
helpful, so I can see what "in the wild" discussions look like. (I've seen 
confusion, but only in the context of changes that our esteemed colleagues 
at the W3C have made to their version of the language.)

> Similarly with <section> (which appears to be turning into an alias for 
> <div>). IMO the outline algorithm is dead and we could simplify HTML by 
> dropping these two.

I'm not sure what it means for the outline algorithm to be dead. Documents 
do have headings of different levels; that's all the outline algorithm is, 
a way of defining what headings of different levels mean. Dropping that 
would mean having only one heading per page, which is probably not 
something that could really fly.

The difference between <section> and <div> seems pretty clear. I'm not 
sure how they can be aliases for each other. Again, any documentation on 
what you mean here would be most welcome.

> I've written up a wiki page documenting what I believe to be sufficient 
> arguments to add the <main> element, along with arguments against that 
> I've heard and rebuttals, as well as counter-proposals made along with 
> flaws in counter-proposals:
> 
> http://wiki.whatwg.org/wiki/Main_element

I normally at this point would say "well, it got implemented and that made 
it get added to the spec", but since you then added:

> However, I still think adding main is fully supported on principle 
> (rather than just on browser-implementation-Hixie-veto-override) and 
> thus I'm interested in capturing that on the wiki page so that hopefully 
> we can learn from this analysis about adding a new element and use those 
> lessons when considering new elements in the future.

...I guess you want more detail in my response. So:

> HTML5 has several new "block like" semantic elements (article, aside, 
> section), including header and footer elements. There is no element to 
> represent (and provide a hook for) ''the'' "main content".

There doesn't need to be; all content is "the main content" unless 
otherwise indicated.

> The only landmark ARIA role that lacks an equivalent semantic HTML 
> element is also called "main", and it is being used on real world 
> websites.

There's lots of ARIA roles that don't have dedicated elements, and 
shouldn't have dedicated elements, e.g. role=application.

> One of the motivators was to get rid of skip links 
> (http://webaim.org/techniques/skipnav/).

This use case is already handled by <nav>.

> A <main> element paves an already researched cowpath.

That isn't clear.

> The introduction of <header>, <footer>, <aside>, <nav>, and <article> is 
> based on careful consideration based on data gathered (in 2005).

Actually, as much as I'd like to claim that was true, it's not. They were 
added based on intuition, and then I did the research, and found that my 
intuition had happened to be spot on.

> Even according to Hixie’s research, the class “content” ranked higher 
> than “nav” or “header”. It would be even more obvious if you took the 
> sum of “content”, “main” and “body” (as class names or ids).

That's what <article> is for.

> That Hixie left <main> out in the first place is not justified by the 
> research presented.

Actually it was. The justification was: based on a closer inspection of 
how the popular classes are actually used, people want a way to mark up 
their content, their headings, their footers, and sometimes the body part 
of their content for stylistic reasons. Thus we provide <article> to mark 
up the content, <header> and <footer> for the headings and footers, and 
that leaves <div> as an eminently suitable element to mark up the <body>. 
There's no need for a dedicated element for the body semantically, since 
the body is, by definition, everything that's left after everything else 
has been marked up accordingly.

> "Use of ARIA (a stop gap) should be a smell. Using it generally implies 
> a weakness in native semantic, or, an abuse of native semantics." -David 
> Bolter
> 
> Additionally, role=main is the most commonly used ARIA landmark per: * 
> http://www.paciellogroup.com/blog/2012/04/html5-accessibility-chops-real-world-aria-landmark-use/ 
> And it’s the only landmark that doesn’t have a one-to-one mapping to an 
> HTML element name.
> 
> A main element would avoid the need to use role="main".

role=main isn't necessary in the first place if you use the other HTML 
elements correctly. ATs can programmatically determine what is not main 
content, and what is left is, by definition, main content.
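
As a rough illustration of "what is left is main content", here is a 
minimal sketch in Python using the standard library's html.parser. The 
NON_MAIN set and the names MainContentExtractor and main_content are my own 
assumptions for this example, not anything an AT or the spec defines, and a 
real tool would work on the parsed DOM and handle far more cases (for 
instance, a <header> inside an <article> belongs to that article).

```python
from html.parser import HTMLParser

# Elements assumed, for this sketch only, to mark content that is not main content.
NON_MAIN = {"header", "footer", "nav", "aside"}


class MainContentExtractor(HTMLParser):
    """Collects text that is not inside any NON_MAIN element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many NON_MAIN elements we are currently inside
        self.chunks = []  # text found outside all NON_MAIN elements

    def handle_starttag(self, tag, attrs):
        if tag in NON_MAIN:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in NON_MAIN and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.depth == 0:
            self.chunks.append(text)


def main_content(html):
    """Return the text chunks left over once NON_MAIN elements are skipped."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Feeding it a page with a <header>, a <nav>, one paragraph, an <aside> and a 
<footer> should leave just the paragraph text.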

> On various development forums and sites, there are many questions of the 
> nature: what HTML5 element do I use for the main content of my page? 
> E.g.:
> * Sitepoint: 
> [http://www.sitepoint.com/forums/showthread.php?855981-Main-content-area-in-HTML5-what-is-correct 
> Main content area in HTML5 - what is correct ????]
> * Stack Overflow: 
> [http://stackoverflow.com/questions/12438300/html-5-element-for-content 
> HTML 5 element for content?]
> 
> Given that web developers are already asking these questions (and 
> getting wildly inconsistent answers), and that they're going to author 
> *something* for what they consider the main content, a main element 
> would naturally provide this functionality, and alleviate a common 
> source of confusion.

I think that having this remain as the only use case for <main> should 
have led us to write better documentation to answer the question. Adding 
an element was unnecessary for that purpose.

But in any case, <main> got added since it was implemented, and once 
implemented, could be made use of. (It's similar to why we have <i> and 
<cite> and <small> and <ins> and many other elements that I probably 
wouldn't think had sufficient justification if we were adding them now, 
but since they're implemented, we might as well make use of them.)

On Mon, 18 Feb 2013, Nils Dagsson Moskopp wrote:
> 
> When I tried to actually [make <head> visible] for a web page I found 
> that while Gecko (from conkeror) generates Hyperlinks for <link> 
> elements, WebKit (at least that from Chromium Version 22.0.1229.94) does 
> not. Therefore, I decided against using this styling pattern for my blog 
> software.
> 
> I suggest people test for themselves, as I suspect I may have done 
> something wrong with my user stylesheets in Chromium – it seems 
> counter-intuitive that a <link> Element does not create a hyperlink

I've made this clearer in the spec.

On Sat, 22 Dec 2012, Ian Yang wrote:
> 
> To sum up, the proposals are:
> 
> 1. Making <main> a sectioning element for a better and clearer document 
> outline. If unfortunately it were not accepted, personally I guess I 
> will continue to use <section role="main" /> at least it yields an ideal 
> document outline.

I don't think it makes much sense for <main> to be a sectioning element. 
It's used to mark the part of a sectioning element that's the "body", so 
pretty much by definition it can't be a section as well.

> 2. Making <main> being usable multiple times in a document, so we also 
> have a reasonable element to wrap the main content of a blog post.

The spec does not limit <main> to being used only once.

On Tue, 15 Jan 2013, Steve Faulkner wrote:
>
> Can anyone point me to or provide use cases for untitled article and 
> section elements?
> 
> as in who are the potential consumers of document outlines with untitled 
> sections?

An example of something that would be a legitimate untitled <article> 
would be the calculator widget in Apple's Dashboard product. You could 
easily imagine the widgets on UStart.org, Protopage.com, my msn, my 
Yahoo!, etc, not having captions -- actually I was somewhat surprised at 
how many of the widgets actually did have headings. A comic strip archive 
page showing many comics back to back might have each comic in a separate 
<article> without necessarily needing a heading for each one.

Basically, any time you have something syndicatable, it might make sense 
in an <article>. Not everything that's syndicated has an explicit heading.

An example of an untitled <section> would be a slide in a slide deck, or a 
chapter in a picture book.

On Tue, 15 Jan 2013, Jukka K. Korpela wrote:
> 
> The example that first comes into my mind is a discussion forum where 
> contributions (which would appear to match the <article> idea) can be 
> posted, and are usually posted, without a title of any kind. A 
> discussion has a title (subject), but individual contributions are 
> basically just text, though in advanced systems they may contain markup.

Yeah, that's a good example too. The comments on reddit could each be an 
<article>, for instance. Tweets on twitter, similarly. Cards in modern 
card UIs might be <article> or <section> elements without headings.

> > as in who are the potential consumers of document outlines with 
> > untitled sections?
> 
> Oh. That's a different issue. This whole "outline" thing does not look 
> very realistic. I have not seen much practical interest in it; the 
> "HTML5 Outliner" add-on in Firefox is one of the few signs of interest, 
> and it's fairly primitive.

The "outline" of a document is just the semantic relationship between 
sections, headings, and other content in a page.

On Tue, 15 Jan 2013, Jukka K. Korpela wrote:
> 
> A very different example is a novel. A novel is almost always divided 
> into sections, and sections may have subsections (visually separated 
> e.g. using extra empty space or maybe "***"). The sections may or may 
> not have title. Often they have just numbers, presented as titles like 
> "Chapter 1", so they are more or less pseudo-titles (and could be 
> replaced by CSS-generated content). Subsections almost never have 
> headings.

Indeed. (The extra empty space or "***" is usually better mapped to <hr> 
than to <section>, but it's certainly possible to use either in practice.)

> So what a browser could do, with a novel that uses <section>, is to 
> provide an outline of the structure, possibly so that along with 
> numbers, there are short excerpts from the start of each section or 
> subsection.

Yeah. Or the outline could never be shown but could still be used in 
deciding what "shift + page up" should do, for instance.

On Tue, 15 Jan 2013, Ian Yang wrote:
> 
> Imho, there is a reason for each sectioning element to have a heading. 
> If a content doesn't need a heading, then it should not be coded using 
> sectioning element.

Given the number of examples that are listed above that are legitimately 
sections or syndicatable articles, yet don't have headings, that seems 
unnecessarily strict.

> Because blog comments are coded using <article>s, at least their "author 
> name"s should be contained within <h1>s so that in the document outline 
> they are presented clearly. For example, <h1>Mike Smith says</h1>, or 
> <h1>Commenter: Mike Smith</h1>, or just <h1>Mike Smith</h1>.

It's not really clear to me that anyone is going to be navigating a 
comments section using a document outline. But in any case, it seems like 
it would be fine for a UA that does expose an outline to just replace the 
missing heading with a fragment of the content.

> Every section of a novel needs a heading, too. Otherwise in the document 
> outline we will see a bunch of "Untitled Section"s.

Or "Once upon a time...", "Meanwhile..." etc, if the UA takes the first 
few words to describe the section.
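
A minimal sketch of that fallback, in Python. The function name 
outline_label, the max_words parameter and the "Untitled Section" default 
are assumptions made for this example; nothing in the spec requires a UA to 
label untitled sections this way.

```python
def outline_label(heading, text, max_words=3):
    """Return the heading if there is one, else the first few words of the text."""
    if heading and heading.strip():
        return heading.strip()
    words = text.split()
    if not words:
        # Nothing to excerpt, so fall back to a generic label.
        return "Untitled Section"
    label = " ".join(words[:max_words])
    # Add an ellipsis only when the text was actually cut short.
    return label + "..." if len(words) > max_words else label
```

A section with a heading keeps it; a section without one gets an excerpt of 
its opening words.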

On Tue, 15 Jan 2013, Steve Faulkner wrote:
> > [blog comments]
>
> what does the use of article provide in this case over other markup?

The same as markup ever provides. It describes the meaning, provides 
styling hooks, and so forth.

> why not section or list?

<article> is a kind of section that is syndicatable and self-contained. 
That's what a comment is. <section> and <li> aren't as good a fit.

Having said that, if an author wanted to describe their blog comments 
using <section> and <ol>/<li>, it would still work.

> who are the consumers of the semantics?

Generally the authors themselves, for styling; maybe future ATs, search 
engines, and the like, could also make use of the semantics, if they are 
reliably used. But generally most semantics are actually only used by the 
authors themselves.

(I mean, you could ask the same questions of <ol>/<ul>, <p>, <em> vs 
<strong> vs <i> vs <b>, etc.)

On Tue, 15 Jan 2013, Jukka K. Korpela wrote:
> 
> When a contribution comments on another contribution, neither is 
> logically part of the other. They are related, not nested.

When article elements are nested, the inner article elements represent 
articles that are in principle related to the contents of the outer 
article. That's the meaning of the nesting, per the spec.

> It is difficult to see what the idea of the example is, but it says: 
> "The article element is used for each post, to mark up the threading." I 
> wonder if threads would deserve markup of their own, possible defined in 
> somewhat more abstract terms. But nested lists would be more natural 
> (and would create acceptable default rendering even in oldest browsers).

They do have markup of their own: nested <article>s.

On Sat, 26 Jan 2013, Steve Faulkner wrote:
> 
> Lists are appropriate for indicating nested tree structures. The use of 
> lists to markup comments is a common mark up pattern used in blogging 
> software such as wordpress. The code verbosity is not dissimilar to the 
> use of article, less so even option end </li> tags are omitted.

Indeed, that's fine too.

> Besides comments are generated code not hand authored so I don't see a 
> problem with code verbosity.

Generated code is hand-authored too, at some point.

> The useful information that is conveyed to users who actually consume 
> and benefit from it is provided by using lists , but not by using 
> article.

It's the exact same structure; the exact same information can be conveyed. 
That's just a matter for implementors. Since <ol> is decades old and 
<article> is a pretty recent addition, it makes sense that <ol> would be 
more reliably implemented and exposed.

On Sat, 26 Jan 2013, Bruce Lawson wrote:
> 
> In short, why should the spec suggest any specific method of marking up 
> comments?

In theory, to put an end to the very conversation we're having here. :-)

On Sun, 27 Jan 2013, Adrian Testa-Avila wrote:
> 
> So, maybe a better question is why should the spec suggest only one 
> specific method?

By and large it doesn't; for example, the section on footnotes has a bunch 
of different techniques.

On Sun, 17 Feb 2013, Nils Dagsson Moskopp wrote:
>
> As someone consuming HTML and XML content, I find it extremely unhelpful 
> if equivalent semantics are expressed in many different ways.

Yeah, that's a fair point.

On Sun, 17 Feb 2013, Nils Dagsson Moskopp wrote:
> 
> As someone who is interested in semantics and tired of scraping
> content and applying scrappy heuristics: If it is clear that an
> <article> within an <article> represents a comment, one can easily:
>   * programmatically find article comments in HTML
>   * write interoperable stylesheets for comments, using the selector
>     “article > article”
>   * use HTML fragments in a document store for content management (I
>     wrote a blog software with a git backend yesterday and plan to add
>     this feature)
> 
> Without having one interoperable way all that becomes a lot harder.

Right.
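
To make the first bullet concrete, here is a minimal sketch in Python using 
the standard library's html.parser. The names NestedArticleCounter and 
count_articles are assumptions for this example. It counts any <article> 
nested inside another <article>, which is looser than the child selector 
"article > article" quoted above; matching that selector exactly would need 
a real DOM and selector engine.

```python
from html.parser import HTMLParser


class NestedArticleCounter(HTMLParser):
    """Counts top-level articles and articles nested inside another article."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # current <article> nesting depth
        self.top_level = 0  # articles not inside another article
        self.nested = 0     # articles inside another article (e.g. comments)

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            if self.depth == 0:
                self.top_level += 1
            else:
                self.nested += 1
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1


def count_articles(html):
    """Return (top_level, nested) article counts for the given markup."""
    parser = NestedArticleCounter()
    parser.feed(html)
    parser.close()
    return parser.top_level, parser.nested
```

For a blog post with two comments, one of which has a reply, this reports 
one top-level article and three nested ones.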

On Mon, 18 Feb 2013, Silvia Pfeiffer wrote:
> 
> <article> in <article> could be a comment. Or it could be something else 
> entirely.

Not in a conforming document. A nested <article> "represent[s an] 
article[] that [is] in principle related to the contents of the outer 
article", i.e., a comment.

On Sat, 23 Feb 2013, Steve Faulkner wrote:
>
> Is there any rationale, uses cases or data available that supports the 
> current definition of the <main> element in the WHATWG spec? In 
> particular the author conformance requirements and advice.

It's based on how people mark up their documents. It's common for authors 
to mark up the body part of their articles or sections with a class=main 
or class=content or similar, for styling purposes. The <main> element is 
defined to be a suitable replacement for that.

On Fri, 1 Mar 2013, Mikko Rantalainen wrote:
> Ian Yang, 2013-02-14 03:21 (Europe/Helsinki):
> > <!DOCTYPE html>
> > <title>lorem ipsum</title>
> > <header>
> >   ...
> > </header>
> > <main id="main" role="main">
> >   ...
> > </main>
> > <footer>
> >   ...
> > </footer>
> 
> I find the logic to be that if you use <header> and/or <footer> you 
> should wrap the main content within <main>. Then use <section> and 
> <article> for the structure.

Right.

> One thing worth noting is that unlike id="main" or role="main", the 
> <main> is intended to be used (nested) multiple times on a page. So, 
> following markup does make sense:
> 
> <!DOCTYPE html>
> <html>
> ...
> <body>
> <header>...</header>
> <main>
>  <ul>
>   <li><article>
>    <header>...</header>
>    <main>...</main>
>    <footer>...</footer>
>   </article></li>
>   <li><article>
>    <header>...</header>
>    <main>...</main>
>    <footer>...</footer>
>   </article></li>
>  </ul>
> </main>
> <footer>...</footer>
> </body>
> </html>

Right.

(Actually, role=main is allowed to be repeated as well, according to the 
ARIA spec.)

> The first header (body > header) hopefully contains the page main header 
> (perhaps blog title and slogan, maybe site navigation in <nav>) and 
> within the first <main> (body > main) is a list of articles (perhaps 
> blog entries?) where each article has its own header (article > header), 
> main part (article > main) and the footer (article > footer). (Note that 
> the selectors that I used within the parentheses are generic and should 
> work equally well on any page that uses <main> element.)

Right.

> In the real world, the main part pretty much always requires some 
> container (usually for styling and scripting) anyway so better 
> standardize <main> for that, IMHO. I know Ian does/did not agree because 
> in theory that is not needed because the main part is everything minus 
> header minus footer. However, it turns out that neither CSS or JS can 
> handle that really well. In the end, the WHATWG was supposed to be about 
> real world usage vs. theoretical correctness and this is one example of 
> that.

Well, I don't think <main> really gets us anything <div class=main> didn't 
get us, since the semantics are redundant (the main content is everything 
not in an element defined to not be main content, like <header> or 
<aside>), but since browsers have implemented <main>, it makes sense to 
use it for the purpose you describe, right.

> If the content is authored this way, UA could provide a navigation aid 
> called "skip to the start of the next piece of content" instead of the 
> current "global skip to the content" implementation allowed by id="main" 
> or role="main" which would be usually the same as end of 
> html>body>header.

That's possible without <main> too, but yes.

On Thu, 20 Jun 2013, Steve Faulkner wrote:
>
> What are the use cases for a <figure> without a <figcaption> ?

On Thu, 20 Jun 2013, Xaxio Brandish wrote:
>
> An illustration of a font name, in its respective font?

Yeah, that's a good example. There's plenty of others, e.g. a headshot in 
a bio, or a table that has its own caption (or no caption). Basically any 
time that you have a figure that's part of the main content, but doesn't 
have an explicit caption. Typically, this would be anything that's 
referenced by the main content and yet floated to the side, but where it 
is uniquely identifiable, e.g. "the table", "the code listing", "the 
diagram", or whatnot, if there's just the one.

On Thu, 20 Jun 2013, Steve Faulkner wrote:
> 
> why is <figure> better in this case than <p> (for example) ?

On Thu, 20 Jun 2013, Xaxio Brandish wrote:
>
> The figures could be in a document talking about fonts, yet easily moved 
> to the side of the page and still maintain relevance if referenced 
> within the document.  I think something important about figures is 
> placement irrelevance as long as they can be referenced, whereas 
> paragraphs don't have the added semantic of "this will be referenced at 
> some point."

Right. Paragraphs are in-flow and come after their previous paragraph and 
before their next paragraph. Figures can be out of flow, and aren't 
necessarily read at the same time.

<figure> is more like <aside>, except that it's part of the main content.

On Thu, 20 Jun 2013, Steve Faulkner wrote:
>
> OK so how do you reference
> 
> <figure>
> arial
> </figure>
> 
> for example?

On Thu, 20 Jun 2013, Xaxio Brandish wrote:
>
> <p>Fonts come in many different varieties. The Arial font, for example,
> does not have serifs.</p> <figure>arial</figure>
> <p>However, font varieties go beyond simple serif and sans-serif
> distinctions. The Old English font is neither of these, instead being
> considered a "decorative" font.</p><figure>Old English</figure>
>
> The above example has meaning with or without the figures, and the
> placement of the figures doesn't matter. They could be in a font index at
> the end of the document, as long as the data consumer knows to look there
> if examples are needed.  The fact that they are enclosed in the <figure>
> elements means that they are referenced somewhere, I believe.

Right.

Or:

   <p>Sometimes, the robot follows a complicated chain of reasoning (see 
   diagram).</p>
   <figure><img src="diagram.png" alt="Diagram: The robot begins by 
   [...]"></figure>
   <p>Other times, ...

> When referring to multiple figures containing graphs or tables with 
> really long names such as "Number of Children With Orange Dreadlocks 
> With Respect to Decade" and "Periods of Time During Which Dreadlocks Are 
> Popular, Where Orange Is Popular, and Where They Overlap", it's so much 
> easier just to give them a <figcaption> and refer to "Table 1" and 
> "Table 2" in the document.

Yeah.

On Thu, 20 Jun 2013, Steve Faulkner wrote:
> 
> <p>Fonts come in many different varieties. The Arial font, for example,
> does not have serifs.</p> <div>arial</div>
> <p>However, font varieties go beyond simple serif and sans-serif
> distinctions. The Old English font is neither of these, instead being
> considered a "decorative" font.</p><div>Old English</div>
> 
> The above example has meaning with or without the divs, and the placement
> of the divs doesn't matter.

The above is semantically equivalent to the case where each <div> is 
replaced by a <p> (since <div> has no meaning, and there's an implied 
paragraph between the other paragraphs, per the spec). So it would be more 
dubious that the "arial" text can be moved around at will.

However, yes, you're right, you could do it this way if you wanted.

You could also replace all the <p>s with <div>s, and all the <section>s 
with <div>s, and style everything using classes instead. It'd be less 
pretty, might be harder to maintain, probably wouldn't be quite as easy 
for software to make heads or tails of it (though sprinkling some ARIA 
around might help with that), but it'd still be fine HTML.

> so if not referenced somewhere, they should not be in a figure?

It doesn't have to be referenced explicitly. It's just a figure. But they 
typically are.

Arguably, the font figures in the example above aren't really explicitly 
referenced -- you wouldn't know they were missing if you removed them 
entirely -- but they're still figures.

On Fri, 21 Jun 2013, Martin Janecke wrote:
> 
> Probably they should not, as figures are "typically referenced as a 
> single unit from the main flow of the document"^[1]. I'd like to add 
> that the reference can be implicit, though. A short car magazine article 
> about a particular model might be a good example. Readers who are likely 
> to have seen some cars in their lives will identify a car's front 
> section on a photograph by themselves and make the connection to what 
> the article writes about it.

Right.

> Here is such an article: 
> http://www.caranddriver.com/news/2014-bmw-4-series-photos-and-info-news
>
> Although the webpage does not actually use figure elements, it would be 
> appropriate for the photographs that are embedded in the main article. 
> The photographs illustrate and enhance the article's content by 
> providing more design details than the text, are self-contained, not 
> part of the main flow and implicitly referenced from it. (The photos 
> should have alt-texts though.)

Indeed, the photo galleries here are a fine candidate for <figure>.

On Fri, 21 Jun 2013, Xaxio Brandish wrote:
> 
> Consider a web page that is devoid of color or motion, and is thus less 
> interesting to people who *must* read it.  An example of this can be an 
> online driving education course.  Now imagine that the author of the 
> page wanted to seem less boring, and so adds a piece of barely related 
> clip art to the page, and said clip art is not referenced anywhere in 
> the main document material.  The author wants to add a humorous comment 
> to the image to lighten the mood of the page, and considers using 
> <figure> and <figcaption>.
> 
> Would it be appropriate to caption the aforementioned clip art using 
> <figcaption> if it is contained in an <aside> element, claiming that the 
> figure is self-referential yet only tangentially related to the 
> document? If not, is there an element better suited to this purpose, or 
> can we redefine the <figcaption> element to encompass a purpose such 
> as this?

<aside>, <figure>, or indeed both, would be a fine way to do this.

I wouldn't try to overthink this all too much. There's an infinite number 
of possible documents that could be marked up, and only a finite number of 
elements and combinations thereof. We'll never describe every nuanced case 
perfectly, nor do we really need to.

On Fri, 21 Jun 2013, Steve Faulkner wrote:
> 
> the latest (single page) whatwg spec says:
> 
> "The figure element represents some flow content, optionally with a 
> caption, that is self-contained (like a complete sentence) and is 
> typically referenced as a single unit from the main flow of the 
> document."
> 
> There is no normative text that says it MUST be referenced, only a non 
> normative phrase "typically referenced"
> 
> so that suggests to me that it is OK to use figure/figcaption for the 
> use case i described and the one you described, but then there is a 
> lot of other descriptive text about figure that serves to befuddle my 
> understanding.

Can you elaborate on what contradictions you see or what is confusing 
your understanding?

On Fri, 21 Jun 2013, Xaxio Brandish wrote:
> 
> One part of the ambiguity in the WHATWG spec comes from the examples given:
> 
> 1) The first example uses <figure> as referenced from a document.
> 2) The second example is not referenced from a document.
> 3) The third example shows an image that is not a figure, followed by two
> pieces of media content that are within <figure> tags.  The non-figure
> image could not be removed from its position in the document flow without
> changing the meaning of the document, so it is not used as a <figure>
> element.
> 4) The fourth example is not referenced from a document.
> 5) The final two examples are implied to be referenced from a document, and
> are semantically equivalent.
> 
> Since we cannot know the surrounding document for examples 2 and 4, it 
> seems that those examples take advantage of the open-ended adaptability 
> of the unreferenced version of the <figure> element.

Right. The idea with the examples here is to show a wide variety of valid 
ways to use the element, so as to not imply from the examples that any 
one specific technique is the only valid one.

At the end of the day, only the normative text (as defined by the word 
"must", and indirectly by "represents") matters.

> This leaves us with the question at hand: if we see a <figure> element, 
> can we expect to find a part of the document from which it is 
> referenced? Consider the following scenario:
> 
> One is reading an online newspaper article.  The article references 
> Figure 1, located at the end of the article (and near the bottom of the 
> page) due to readability constraints.  We look at the end of the 
> article, and see a figure with a caption "Figure 1".  The article then 
> references "Figure 2", so we look at the end of the article and see a 
> figure with a caption, "Figure 2".  We arrive at the end of the article 
> and see another figure with a caption, "Figure 3".
> 
> In the above scenario, Figure 3 is unreferenced.  The first instinct 
> when looking at an unreferenced figure (as used in the scenario) is to 
> examine the figure to attempt to establish a context for it.  Whether or 
> not context is established, the second instinct is almost invariably to 
> go back to the part of the article after Figure 2 was referenced in 
> order to find out where we missed the reference to Figure 3.  A third, 
> slightly lesser instinct may even prompt a review of the entire article 
> in an effort to find the missing reference.
> 
> It is possible that the author of the fabled online newspaper article 
> needed to use a visible caption, and could not find a better element for 
> the job than <figure> and <figcaption>.  It is not obvious whether the 
> article was edited incorrectly, whether there was a printing error, or 
> whether the unreferenced figure was intended to stand alone.

This problem can occur with or without <figure>, of course.

Nothing in HTML forces the author to give a caption.

> I propose that unreferenced figures set unreasonable expectation, as 
> just described, and that either
> 
> 1) more generic grouping content should be used to group unreferenced data
> with captions, or
> 2) a new element be created similar to <label> with an attribute similar to
> the "for" attribute that is not required to be located within a user
> interface such as form, or
> 3) a new set of elements similar to <figure> and <figcaption> be created to
> group unreferenced data.

I don't really see why <figure> doesn't already handle this case. Nothing 
is forcing the author to label the figure "Figure 3" or to not say "Figure 
3. This figure (not referenced in the article) shows...".

On Mon, 24 Jun 2013, Steve Faulkner wrote:
> 
> OK so 'typically' infers that <figure> is used in this way, from a 
> recent review of data (June 2013 data set from http://webdevdata.org) on 
> usage of <figure> it appears that it is typically not used in this way 
> by authors. There are typically no explicit references to figure 
> content.
> 
> Here are some examples of pages using figure/figcaption. (also appears 
> that figure is often used without figcaption: figcaption instances in 
> sample of 53000 pages = 4603 , figure usage = 14609, indicating approx 1 
> in 3 uses of figure includes a figcaption)
> 
>    - Mirror Online <http://www.mirror.co.uk/news/>
>    - Christian News on Christian Today <http://www.christiantoday.com/>
>    - Infonews <http://www.infonews.com/>
>    - Peru.com,  <http://peru.com/>
>    - Computer Arts magazine <http://www.computerarts.co.uk/>
>    - Elle <http://www.elle.it/>
>    - NASCAR.com <http://www.nascar.com/en_us/sprint-cup-series.html>
>    - Indiatimes: <http://www.indiatimes.com/>
>    - Bollywood Mantra <http://www.bollywoodmantra.com/>
>    - Teen Vogue <http://teenvogue.com/>
>    - Irish Independent <http://www.independent.ie/>
>    - bitbucket <https://bitbucket.org/>
>    - HELLO! Online <http://www.hellomagazine.com/>
>    - Mobile App Tracking <http://mobileapptracking.com/>
>    - Consumer Complaint Database <http://www.consumerfinance.gov/complaintdatabase/>
>    - AS.com <http://as.com/>

These are biased towards home pages, which are likely to use implicit 
references, not explicit references. You'd have to look at deep pages to 
see how <figure> was used on those, too.

In any case, the spec doesn't say it has to be an explicit reference. Take 
the last example above. Each <figure> is actually implicitly referenced by 
the text immediately following it. That's exactly the kind of thing the 
element is intended for.
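
A minimal sketch of that pattern, using invented content rather than markup 
copied from the page in question:

```html
<article>
 <figure>
  <!-- Hypothetical photo; nothing in the text names it explicitly. -->
  <img src="match-photo.jpg" alt="Players celebrating after the final whistle.">
 </figure>
 <!-- The headline and summary that follow are what the photo illustrates,
      which is the implicit reference. -->
 <h2><a href="/match-report">Late goal decides the derby</a></h2>
 <p>The home side won with a goal in stoppage time.</p>
</article>
```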

(Also, note that the spec is encouraging best-practice behaviour, not 
describing actual behaviour. If we described actual authoring behaviour, 
the spec could do away with pretty much any content model or syntax 
restrictions, for example...)

On Mon, 24 Jun 2013, Xaxio Brandish wrote:
> 
> (had to snip the message and resend, it went over the mailing list size 
> limit)

I recommend that y'all always trim the e-mail you are replying to, so that 
it only includes what you are replying to. In particular, you should never 
leave any quotes after the last thing you add to your e-mail (please don't 
bottom-quote, thanks!).

> These pages seem to use <figure> inside of an <article> (or equivalent) 
> to place images related to the article, often linked to the extended 
> text of an article on another page, but none of the figures are 
> specifically referenced.

Right.

> I think the question at that point becomes, "*What value does the 
> <figure> element add to its content if not referenced?*", especially 
> since that seems to be the case a majority of the time. All of the 
> images in the <article> are by default related to that article, since 
> they are placed there.  Even if real world data does not dictate it, we 
> still need to maintain a level of reasonable expectation: one would 
> *not* put an image or figure inside of an article that is not related to 
> that article. Some of the pages you listed use <figure> and <figcaption> 
> as a way to caption an image, but several of the pages don't even have 
> captions (as you indicated, 1 in 3).
>
> The answer to the above question seems to be that the <figure> element 
> doesn't add meaning at that point.  One could encapsulate every <img> 
> element in an article inside of a <figure> element, but what would be 
> the point?  We already know they're images, and we already know they're 
> related to the article.

Sure. The main thing that it adds is an easy hook for styling. (Notice 
that the pages cited above tend to apply styles to the <figure>.)
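
For instance, something along these lines (a sketch only; the style rules 
and content are invented, not taken from any of the cited pages):

```html
<!DOCTYPE html>
<title>Figure styling sketch</title>
<style>
 /* The element name is the hook: no class attribute is needed. */
 figure { float: right; margin: 0 0 1em 1em; padding: 0.5em; border: 1px solid; }
 figcaption { font-style: italic; }
</style>
<article>
 <h1>Example article</h1>
 <figure>
  <img src="photo.jpg" alt="A placeholder photo for the example.">
  <figcaption>An optional caption.</figcaption>
 </figure>
 <p>Body text flows around the floated figure.</p>
</article>
```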

> The WHATWG HTML specification [1] currently says
> 
> > If a figure element is referenced by its relative position, e.g. "in 
> > the photograph above" or "as the next figure shows", then moving the 
> > figure would disrupt the page's meaning. Authors are encouraged to 
> > consider using labels to refer to figures, rather than using such 
> > relative references, so that the page can easily be restyled without 
> > affecting the page's meaning.
> 
> It seems that <figure> elements are often simply not referenced at all, 
> not even relatively, which *seems* to be a misuse of the element as 
> currently defined.  <figure> elements are not required to be part of an 
> <article> element, though that seems to be the largest use.

Well, we have to be careful about assuming that the uses above are 
representative, since they're so strongly biased towards home pages. But 
it's true that the spec had no example of an implicitly referenced <figure> 
in an <article>. I tweaked the examples accordingly.

> One idea is that rather than a figure being referenced explicitly, 
> perhaps it should be assumed that if it is in a grouping element, it is 
> referenced implicitly as being related to that grouping element's 
> content.

The spec doesn't actually say that there has to be an explicit reference. 
Only that there can be a reference (which could be implicit).

> Unfortunately, using <figure> as a container for an unreferenced image 
> thumbnail related to <article> content makes us lose the definition that 
> the <figure> can be removed from the visual flow and still retain 
> meaning in and of itself, since such usage then prevents the figure from 
> being referenced at all, even if desired.

I disagree with the premise here. The thumbnail still makes sense on its 
own, at least to the same extent as any diagram in a document. I'm not 
sure what you mean by "prevents the figure from being referenced at all".

> Another idea is that <figure> and <figcaption> should merely be used as 
> a way to caption images, but then we've lost an extremely convenient way 
> to express content relevance in the future.  I don't relish the idea of 
> this.
> 
> It seems absurd (perhaps just to me) that the <figure> element be 
> redefined as being "a grouping content element for any referenced or 
> unreferenced content that is not part of the main text of an article".  
> We already have <aside> for that.

There's a difference between a sidebar and a figure. If the spec's 
descriptions are failing to distinguish them, that's unfortunate. There's 
quite a bit of text in the spec that attempts to distinguish them. Can you 
elaborate on how that text is insufficient?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'