[whatwg] Should a <figure> element require a reference? (was: use cases for <figure> without <figcaption>?)

Mon Jun 24 10:40:33 PDT 2013

Good morning Steve,

(had to snip the message and resend, it went over the mailing list size
limit)

On Mon, Jun 24, 2013 at 2:17 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:

> OK so 'typically' infers that <figure> is used in this way, from a recent
> review of data (June 2013 data set from http://webdevdata.org) on usage
> of <figure> it appears that it is typically not used in this way by
> authors. There are typically no explicit references to figure content.
>
>  Here are some examples of pages using figure/figcaption. (also appears
> that figure is often used without figcaption: figcaption instances in
> sample of 53000 pages = 4603 , figure usage = 14609, indicating approx 1 in
> 3 uses of figure includes a figcaption)
>
>
>    - Mirror Online <http://www.mirror.co.uk/news/>
>    - Christian News on Christian Today <http://www.christiantoday.com/>
>    - Infonews <http://www.infonews.com/>
>    - Peru.com,  <http://peru.com/>
>    - Computer Arts magazine <http://www.computerarts.co.uk/>
>    -  Elle <http://www.elle.it/>
>    - NASCAR.com <http://www.nascar.com/en_us/sprint-cup-series.html>
>    - Indiatimes: <http://www.indiatimes.com/>
>    - Bollywood Mantra <http://www.bollywoodmantra.com/>
>    - Teen Vogue <http://teenvogue.com/>
>    - Irish Independent <http://www.independent.ie/>
>    - bitbucket <https://bitbucket.org/>
>    - HELLO! Online <http://www.hellomagazine.com/>
>    - Mobile App Tracking <http://mobileapptracking.com/>
>    - Consumer Complaint Database<http://www.consumerfinance.gov/complaintdatabase/>
>    - AS.com <http://as.com/>
>
I looked over the markup of several of the pages you listed here.  I'm
assuming that these are a reasonable representation of widespread
usage (no offense, please -- I didn't check the webdevdata data myself).
These pages seem to use <figure> inside of an <article> (or equivalent) to
place images related to the article, often linked to the extended text of
an article on another page, but none of the figures are specifically
referenced.

On Mon, Jun 24, 2013 at 2:17 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:

>> Part of the semantics of HTML come from *author intent* and *reasonable
>> expectation*.  If we see a <table> element, we can expect tabular data.
>> If we see an <li> element, we can expect that it is one of multiple.
>>
> see stats above, author intent in real world use does not appear to match
> expectations. for the majority of users the use of figure/figcaption makes
> no difference; they don't even know it's there.
>

I think the question at that point becomes, "*What value does the <figure>
element add to its content if not referenced?*", especially since that
seems to be the case a majority of the time. All of the images in the
<article> are by default related to that article, since they are placed
there.  Even if real world data does not dictate it, we still need to
maintain a level of reasonable expectation: one would *not* put an image or
figure inside of an article that is not related to that article. Some of
the pages you listed use <figure> and <figcaption> as a way to caption an
image, but several of the pages don't even have captions (as you indicated,
only about 1 in 3 uses of <figure> includes a <figcaption>).
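
For what it's worth, the "1 in 3" figure can be checked with a few lines of
Python.  This is only a sketch: the two counts are the ones you quoted from
the webdevdata.org sample, while the function name is my own and the
calculation assumes that each <figcaption> sits in a different <figure>
(the content model allows at most one per figure, but real pages may not
comply), so the result is an upper bound.

```python
# Counts quoted from the June 2013 webdevdata.org sample (53000 pages).
FIGURE_COUNT = 14609
FIGCAPTION_COUNT = 4603


def caption_share(figcaptions: int, figures: int) -> float:
    """Upper bound on the share of figures that carry a caption.

    Assumes every figcaption belongs to a distinct figure.
    """
    return figcaptions / figures


share = caption_share(FIGCAPTION_COUNT, FIGURE_COUNT)
print(f"At most {share:.1%} of figures include a figcaption")
```

That works out to a little under one third, which matches your estimate.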

The answer to the above question seems to be that the <figure> element
doesn't add meaning at that point.  One could encapsulate every <img>
element in an article inside of a <figure> element, but what would be the
point?  We already know they're images, and we already know they're related
to the article.

The WHATWG HTML specification [1] currently says:

> If a figure element is referenced by its relative position, e.g. "in the
> photograph above" or "as the next figure shows", then moving the figure
> would disrupt the page's meaning. Authors are encouraged to consider using
> labels to refer to figures, rather than using such relative references, so
> that the page can easily be restyled without affecting the page's meaning.
>

It seems that <figure> elements are often simply not referenced at all, not
even relatively, which *seems* to be a misuse of the element as currently
defined.  <figure> elements are not required to be part of an <article>
element, though that seems to be the most common use.
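
To make "referenced" a little more concrete, here is a rough Python sketch
of how one might audit a page.  It rests on a narrow assumption of mine: a
figure counts as referenced only if it has an id and some <a href="#id">
in the same document points at it.  Textual references such as "Figure 1"
are not detected, and the class name and sample markup are hypothetical.

```python
from html.parser import HTMLParser


class FigureReferenceAudit(HTMLParser):
    """Records <figure> ids and same-document link targets."""

    def __init__(self):
        super().__init__()
        self.figure_ids = []       # id of each <figure>, or None if it has none
        self.link_targets = set()  # fragment identifiers used by <a href="#...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "figure":
            self.figure_ids.append(attrs.get("id"))
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.startswith("#"):
                self.link_targets.add(href[1:])

    def unreferenced(self):
        """Figures with no id, or with an id that no link points at."""
        return [fid for fid in self.figure_ids if fid not in self.link_targets]


# Hypothetical markup: one labelled, linked figure and one bare thumbnail.
SAMPLE = """
<article>
  <p>Sales rose, as <a href="#fig-sales">Figure 1</a> shows.</p>
  <figure id="fig-sales">
    <img src="sales.png" alt="Sales chart">
    <figcaption>Figure 1</figcaption>
  </figure>
  <figure><img src="thumb.jpg" alt=""></figure>
</article>
"""

audit = FigureReferenceAudit()
audit.feed(SAMPLE)
audit.close()
print(audit.unreferenced())
```

The bare thumbnail is the pattern that seems to dominate the pages you
listed: it has no label, so nothing in the article could point at it even
if the author wanted to.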

On Mon, Jun 24, 2013 at 2:17 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:

> For users of assistive technology in combination with browsers that actually
> map the figure and figcaption elements to something useful, they are aware
> that the content of the figure is a distinct group and what the caption
> for the group is (if provided); the theoretical capability of figure to be
> moved away from its current position is just that.
>

It is indeed theoretical, but part of the reason for the specification is
provisioning for future and practical usage.  Consider a search engine
similar to Wolfram Alpha that would be happy to pull <figure> elements as
being distinct groups.  Alternatively, consider a page of stock indexes
inside related articles that is visually organized in a nonintuitive way,
where the markup contains the value of the stock inside <figure>, and the
stock name inside <figcaption>; an enterprising web author could write a
style that transforms the data into a side-by-side list of stocks and their
related articles.
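
As a rough illustration of the first case, a consumer that treats each
<figure> as a distinct group could be sketched in Python as follows.  The
class name, the sample markup, and the ticker names are all hypothetical,
and the sketch merges the text of nested figures into the outermost one
rather than handling them separately.

```python
from html.parser import HTMLParser


class FigureExtractor(HTMLParser):
    """Collects a (caption, content) text pair for each top-level <figure>."""

    def __init__(self):
        super().__init__()
        self.figures = []         # finished (caption, content) pairs
        self._depth = 0           # greater than 0 while inside a <figure>
        self._in_caption = False
        self._caption = []
        self._content = []

    def handle_starttag(self, tag, attrs):
        if tag == "figure":
            self._depth += 1
        elif tag == "figcaption" and self._depth:
            self._in_caption = True

    def handle_endtag(self, tag):
        if tag == "figcaption":
            self._in_caption = False
        elif tag == "figure" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.figures.append(
                    ("".join(self._caption).strip(),
                     "".join(self._content).strip())
                )
                self._caption, self._content = [], []

    def handle_data(self, data):
        if not self._depth:
            return  # text outside any figure is ignored
        target = self._caption if self._in_caption else self._content
        target.append(data)


# Hypothetical markup: stock name in <figcaption>, stock value in <figure>.
SAMPLE = """
<article>
  <p>Shares in Example Corp rose today.</p>
  <figure><figcaption>EXMP</figcaption>101.25</figure>
</article>
<article>
  <p>Sample Ltd reported flat earnings.</p>
  <figure><figcaption>SMPL</figcaption>42.10</figure>
</article>
"""

parser = FigureExtractor()
parser.feed(SAMPLE)
parser.close()
for name, value in parser.figures:
    print(name, value)
```

Nothing here depends on where the figures sit in the visual flow, which is
the property I would not want to lose.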

On Mon, Jun 24, 2013 at 2:17 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:

> ...but do think that figure/figcaption definitions and explanations of use
> need to be modified to take into account cow paths already trodden.
>

Since the problem here is half technical (an existing way to mark up
page-essential reference data) and half semantic (what meaning does
<figure> have in relation to its content and surrounding markup?), we
should consider carefully whether <figure> loses its meaning in the
real-world use-cases already in production.  If so, it may make more sense
to persuade authors to use the element correctly.  If not, it may make more
sense to redefine <figure>.

One idea is that rather than a figure being referenced explicitly, perhaps
it should be assumed that if it is in a grouping element, it is referenced
implicitly as being related to that grouping element's content.
Unfortunately, using <figure> as a container for an unreferenced image
thumbnail related to <article> content makes us lose the definition that
the <figure> can be removed from the visual flow and still retain meaning
in and of itself, since such usage then prevents the figure from being
referenced at all, even if desired.

Another idea is that <figure> and <figcaption> should merely be used as a
way to caption images, but then we've lost an extremely convenient way to
express content relevance in the future.  I don't relish the idea of this.

It seems absurd (perhaps just to me) that the <figure> element be redefined
as being "a grouping content element for any referenced or unreferenced
content that is not part of the main text of an article".  We already have
<aside> for that.  How would you redefine figure/figcaption to take into
account current usage scenarios?  Keep in mind that even though they do not
seem to be the majority, there must already be a subset of <figure> usages
out there which are using the element as intended by current definition.
--Xaxio

References:
[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/grouping-content.html#the-figure-element

On Mon, Jun 24, 2013 at 2:17 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:

> Hi Xaxio
>
>
> On 21 June 2013 15:59, Xaxio Brandish <xaxiobrandish at gmail.com> wrote:
>
>> Steve,
>>
>> Please permit me to change the subject line since the topic no longer
>> answers the subject question?
>>
>
> thanks!
>
>
>>
>> The next sentence in the WHATWG spec [1] states
>>
>>> The element can thus be used to annotate illustrations, diagrams, photos,
>>> code listings, etc, that are *referred to from the main content of the
>>> document*
>>>
>> (italics mine)
>>
>> It's true that there is no text saying that the <figure> element MUST be
>> used a certain way, but there are two sentences saying how it "typically"
>> or "can" be used, both implying a reference from a document.
>>
>
> OK so 'typically' infers that <figure> is used in this way, from a recent
> review of data (June 2013 data set from http://webdevdata.org) on usage
> of <figure> it appears that it is typically not used in this way by
> authors. There are typically no explicit references to figure content.
>
>  Here are some examples of pages using figure/figcaption. (also appears
> that figure is often used without figcaption: figcaption instances in
> sample of 53000 pages = 4603 , figure usage = 14609, indicating approx 1 in
> 3 uses of figure includes a figcaption)
>
>
>    - Mirror Online <http://www.mirror.co.uk/news/>
>    - Christian News on Christian Today <http://www.christiantoday.com/>
>    - Infonews <http://www.infonews.com/>
>    - Peru.com,  <http://peru.com/>
>    - Computer Arts magazine <http://www.computerarts.co.uk/>
>    -  Elle <http://www.elle.it/>
>    - NASCAR.com <http://www.nascar.com/en_us/sprint-cup-series.html>
>    - Indiatimes: <http://www.indiatimes.com/>
>    - Bollywood Mantra <http://www.bollywoodmantra.com/>
>    - Teen Vogue <http://teenvogue.com/>
>    - Irish Independent <http://www.independent.ie/>
>    - bitbucket <https://bitbucket.org/>
>    - HELLO! Online <http://www.hellomagazine.com/>
>    - Mobile App Tracking <http://mobileapptracking.com/>
>    - Consumer Complaint Database<http://www.consumerfinance.gov/complaintdatabase/>
>    - AS.com <http://as.com/>
>
>
>
>> One part of the ambiguity in the WHATWG spec comes from the examples
>> given:
>>
>> 1) The first example uses <figure> as referenced from a document.
>> 2) The second example is not referenced from a document.
>> 3) The third example shows an image that is not a figure, followed by two
>> pieces of media content that are within <figure> tags.  The non-figure
>> image could not be removed from its position in the document flow without
>> changing the meaning of the document, so it is not used as a <figure>
>> element.
>> 4) The fourth example is not referenced from a document.
>> 5) The final two examples are implied to be referenced from a document,
>> and are semantically equivalent.
>>
>> Since we cannot know the surrounding document for examples 2 and 4, it
>> seems that those examples take advantage of the open-ended adaptability of
>> the unreferenced version of the <figure> element.
>>
>
> agree that current examples are lacking and do not serve to illustrate
> intended use of figure/figcaption
>
>
>>
>> Part of the semantics of HTML come from *author intent* and *reasonable
>> expectation*.  If we see a <table> element, we can expect tabular data.
>> If we see an <li> element, we can expect that it is one of multiple.
>>
>
> see stats above, author intent in real world use does not appear to match
> expectations. for the majority of users the use of figure/figcaption makes
> no difference; they don't even know it's there. For users of assistive
> technology in combination with browsers that actually map the figure and
> figcaption elements to something useful, they are aware that the content
> of the figure is a distinct group and what the caption for the group is (if
> provided); the theoretical capability of figure to be moved away from its
> current position is just that.
>
>
>> This leaves us with the question at hand: if we see a <figure> element,
>> can we expect to find a part of the document from which it is referenced?
>> Consider the following scenario:
>>
>> One is reading an online newspaper article.  The article references
>> Figure 1, located at the end of the article (and near the bottom of the
>> page) due to readability constraints.  We look at the end of the article,
>> and see a figure with a caption "Figure 1".  The article then references
>> "Figure 2", so we look at the end of the article and see a figure with a
>> caption, "Figure 2".  We arrive at the end of the article and see another
>> figure with a caption, "Figure 3".
>>
>> In the above scenario, Figure 3 is unreferenced.  The first instinct when
>> looking at an unreferenced figure (as used in the scenario) is to examine
>> the figure to attempt to establish a context for it.  Whether or not
>> context is established, the second instinct is almost invariably to go back
>> to the part of the article after Figure 2 was referenced in order to find
>> out where we missed the reference to Figure 3.  A third, slightly lesser
>> instinct may even prompt a review of the entire article in an effort to
>> find the missing reference.
>>
>> It is possible that the author of the fabled online newspaper article
>> needed to use a visible caption, and could not find a better element for
>> the job than <figure> and <figcaption>.  It is not obvious whether the
>> article was edited incorrectly, whether there was a printing error, or
>> whether the unreferenced figure was intended to stand alone.
>>
>> I propose that unreferenced figures set an unreasonable expectation, as just
>> described, and that either
>>
>
> from looking at real world data (see above) i couldn't actually find any
> uses of figure/figcaption as you describe.
>
>
>>
>> 1) more generic grouping content should be used to group unreferenced
>> data with captions, or
>> 2) a new element be created similar to <label> with an attribute similar
>> to the "for" attribute that is not required to be located within a user
>> interface such as form, or
>> 3) a new set of elements similar to <figure> and <figcaption> be created
>> to group unreferenced data.
>>
>>
> I don't think there is any need to add new features, but do think that
> figure/figcaption definitions and explanations of use need to be modified
> to take into account cow paths already trodden.
>
>
>> --Xaxio
>>
>>  References:
>> [1]
>> http://www.whatwg.org/specs/web-apps/current-work/multipage/grouping-content.html#the-figure-element
>>
>> On Fri, Jun 21, 2013 at 3:00 AM, Steve Faulkner <faulkner.steve at gmail.com> wrote:
>>
>
-snip-