[whatwg] @aria-labelledby | Re: @generator-unable-to-provide-required-alt, figure with figcaption

Wed Jun 19 11:53:20 PDT 2013

On Wed, 19 Jun 2013, Martin Janecke wrote:
> Am 17.06.2013 um 22:58 schrieb Ian Hickson:
> > On Mon, 17 Jun 2013, Martin Janecke wrote:
> >> Am 17.06.2013 um 11:35 schrieb Steve Faulkner:
> >>> 
> >>> the restriction on figure/figcaption is only in the WHATWG spec, not 
> >>> the W3C HTML spec as it was not deemed a useful or practical 
> >>> restriction when reviewed by the HTML WG.
> >> 
> >> Sounds lovely, this would indeed solve my use case.
> > 
> > Could you elaborate on what your use case is and why it's not handled?
> 
> Yes. The use case begins with a markup generator that does not have a 
> suitable alt-text for images. In my case it's actually a converter 
> converting some light-weight markup to HTML, but I don't think the 
> discussion should dive too deeply into the details, as it applies to other 
> markup generators as well -- I named WYSIWYG editors and automatic 
> digitizers as examples. It is an established fact that there are markup 
> generators that don't have a suitable image description for the required 
> alt attribute.^[1]

Agreed.

> Without the required alt-attribute the generator's output is 
> non-conforming or "invalid" markup.

More importantly, it's not accessible markup. That is why it's not 
conforming.

> It seems that (or at least markup generator creators seem to think that) 
> a notable number of users prefer generators that produce output which 
> passes conformance checker tests over those which produce output that 
> gets big red error marks. This can pressure markup generator creators to 
> trick conformance checkers into thinking their output was conforming. 
> Methods to achieve this include using bogus alt-texts or empty 
> alt-texts, suggesting a purely presentational image when it's actually 
> not.

Indeed. That's why we added the non-conforming but validator-silencing 
attribute "generator-unable-to-provide-required-alt".

> These methods are in a way successful as conformance checkers today fall 
> for the tricks. However, these tricks are considered harmful for 
> accessibility.

Right.

> (a) "The img element has a title attribute with a value that is not the 
> empty string (also as described above)."^[3]
> 
> A title attribute text is usually not available to the light-weight 
> markup converter I maintain. This applies to other markup generators as 
> well: An OCR digitizer does not find something equivalent to the typical 
> implementation of the title attribute (i.e. a mouse-over text) in a book 
> scan.
> 
> While my markup generator usually has access to a caption for an image 
> that is immediately visible to anyone and which could theoretically be 
> included as title attribute, that would mean redundant text as in the 
> following example:
> 
> <div><img src="tree.jpg" title="Tree in Fantasia"> Tree in 
> Fantasia</div>
> 
> I've actually seen captions re-used for title or alt attributes like 
> this quite often in the wild. I do not consider this a desirable output. 
> Why should visually impaired persons have to read everything twice?

Indeed, this would not be good usage. That's what <figure>/<figcaption> is 
intended to avoid.
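
As a minimal sketch (the exact output is an assumption, not what any 
particular generator emits), the same fragment with <figure> carries the 
caption only once:

```html
<!-- Hypothetical generator output: the caption appears once, as visible text -->
<figure>
 <img src="tree.jpg">
 <figcaption>Tree in Fantasia</figcaption>
</figure>
```

The img still has no alt here, because the generator has none to offer; 
the point is only that the caption is no longer duplicated in a title 
attribute.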

> (b) "The img element is in a figure element that …"^[3] has a figcaption
> 
> So now to the question:
> 
> > I don't understand why <figure> as defined in the WHATWG spec doesn't 
> > work for your case. What does the page look like?
> 
> The problem for markup generators is that they do not know exactly what 
> the pages look like. My light-weight markup converter has to use 
> solutions that work in many cases, favorable and less favorable ones. 
> Again, this applies to other markup generators as well. They don't 
> understand the semantics implied in the text they handle. I'll provide 
> two examples in pseudo code -- the task for the markup generator is to 
> translate it into HTML. Example 1:
> 
> | The funny finch is a well known bird of Fantasia.
> |
> | [fig src="funny-finch.jpg"]Fig. 1: Funny finch on a fig twig[/fig]
> |
> | It frolics freely in Fantasia's famous forests.
> |
> | [fig src="feeding.jpg"]Fig. 2: Funny finch feeding a fledgling[/fig]
> |
> | The funny finch feeds on fruits and flies (fig. 2). Thanks to
> | reforestation, the funny finch population has flourished in the past
> | forty years (fig. 3).
> |
> | [fig src="demographics.png"]Fig. 3: Finch population 1970--2010[/fig]
> |
> | The funny finch is closely related to the freaky finches of Florida.
> 
> The figure element with a figcaption is perfectly suitable for the 
> images in example 1.

Indeed, this, apart from the lacking alternative text, is a good structure 
to use for the Web.

> Let's look at example 2:
> 
> | The funny finch is a well known bird of Fantasia.
> |
> | [fig src="funny-finch.jpg"]Funny finch on a fig twig[/fig]
> |
> | It frolics freely in Fantasia's famous forests.
> |
> | [fig src="feeding.jpg"]Funny finch feeding a fledgling[/fig]
> |
> | The funny finch feeds on fruits and flies, as shown in the photograph
> | above. Thanks to reforestation, the funny finch population has
> | flourished in the past forty years, which the following diagram
> | illustrates.
> |
> | [fig src="demographics.png"]Finch population 1970--2010[/fig]
> |
> | The funny finch is closely related to the freaky finches of Florida.
> 
> Example 2 is conveying the same message as example 1. They almost look 
> the same. However, while moving all the figures to the bottom of the 
> page won't break example 1, example 2 will suffer badly from it. In 
> example 2 figures are referenced by their location, whereas they are 
> referenced by tokens in example 1.

Indeed.

> As I understand the WHATWG HTML spec, the figure element is not suitable 
> in example 2. And hence it is not usable for any markup generators that 
> do not understand human written texts well enough to differentiate 
> between the two cases. Even some people coding HTML by hand will 
> probably have difficulty getting it right every time.

That's probably true, yes.

> The reason why I think example 2 does not conform with the spec is the 
> following paragraph from WHATWG HTML spec 4.5.11:
> 
> "The element can thus be used to annotate illustrations, diagrams, 
> photos, code listings, etc, that are referred to from the main content 
> of the document, but that could, without affecting the flow of the 
> document, be moved away from that primary content, e.g. to the side of 
> the page, to dedicated pages, or to an appendix."
> 
> The paragraph starts with a "can", indicating options. Then it continues 
> with a "but" clause, which negates the optional character making the 
> following words normative until "e.g." again switches to listing 
> options. The normative part is that figures must be movable away from 
> the primary content without affecting the flow of the document. They 
> don't have to be moved, hence the word "could", but it must be possible 
> without breaking anything. Did I understand that correctly?

Yes.

I've changed the spec to make <figure> applicable to your use case as 
well, and added more text to explain various use cases and whether they 
apply to <figure>. Let me know if the new text is still problematic for 
your use case. I agree that it would be overly restrictive to limit 
<figure> in the case you are presenting.
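
As a rough sketch of what a converter might emit for the second figure in 
your example 2 (the <p> wrapping and the indentation are assumptions about 
the converter's output, not anything the spec mandates):

```html
<!-- Hypothetical converter output for part of example 2 -->
<p>It frolics freely in Fantasia's famous forests.</p>
<figure>
 <img src="feeding.jpg">
 <figcaption>Funny finch feeding a fledgling</figcaption>
</figure>
<p>The funny finch feeds on fruits and flies, as shown in the photograph
above.</p>
```

The paragraph still refers to the photograph by its position, which is the 
case the earlier wording excluded.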

> (d) "The img element has a (non-conforming) 
> generator-unable-to-provide-required-alt attribute whose value is the 
> empty string."^[3]
> 
> Well, that is an option for any use case a markup generator runs into. 
> But it seems unattractive in all its verbosity to me.

It's supposed to be a little unattractive, to discourage authors from 
using it to silence validators complaining about their hand-written pages 
(where they should just provide the fricking replacement text).

> Unfortunately -- although its verbosity is there to prevent any 
> misunderstanding for its use -- it might leave the impression that a 
> generator writing
> 
> <img src="..." generator-unable-to-provide-required-alt="">
> 
> is not as good as a generator writing
> 
> <img src="..." alt="an image">

Indeed. I don't know of a way to fix that. It's always going to be the 
case that a generator doing the wrong thing in a way that is 
machine-readably indistinguishable from the right thing is more likely to 
look correct at a quick glance than a generator that is doing the wrong 
thing in a machine-detectable way. I don't know what we can do about that.

I'm open to suggestions.

> > If what you want is just an inline image followed by some text, 
> > though, you don't need <figure> or title="" -- you can just put in the 
> > image and the text, as in:
> > 
> >   <img src="the image"> <!-- FIXME: replacement text is missing -->
> >   <p>More text...
> 
> This causes big red error messages in conformance checkers which could 
> in theory motivate users of some markup generators to learn HTML and add 
> lots of alt-attributes to fix all these errors, but in practice leads to 
> markup generators using inappropriate attributes to silence conformance 
> checkers.

Yeah.

> In my case it is not applicable anyway: The converter generates markup 
> for instant display -- the output is not saved to be edited.

Doesn't mean that it's not still bad that it's inaccessible, of course. :-)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'