[whatwg] Conformance checking of missing alternative content for images

Ian Hickson ian at hixie.ch
Tue Aug 21 17:43:29 PDT 2012

On Fri, 27 Jul 2012, Steve Faulkner wrote:
> The spec currently allows img without alt if the title attribute is 
> present

That's a wild over-statement of the case.

To be precise, the specification requires that the alt attribute be 
present, with the exception of some very specific cases. The specific case 
where the presence of the title="" attribute is in any way relevant is the 
specific case where the image is obtained in some automated fashion 
without any associated alternative text (e.g. a Webcam), or because the 
page is being generated by a script using user-provided images where the 
user did not provide suitable or usable alternative text (e.g. photograph 
sharing sites), or because the author does not himself know what the 
images represent (e.g. a blind photographer sharing an image on his blog).

Only in those cases can the alt="" attribute be omitted if the title="" 
attribute is instead present with text. In all other cases, <img> elements 
in Web pages must have alt="" attributes.
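
For illustration only, the rule described above might be sketched in Python as follows. The function name and the content_unknown flag are hypothetical: a checker has no way of knowing whether an image falls into one of the specific cases listed above, so that knowledge is passed in as an assumption. This is not taken from the spec text or from any validator.

```python
def img_is_conforming(attrs, content_unknown=False):
    """Rough sketch of the alt/title rule described above.

    attrs: dict of attribute name -> value for one <img> (hypothetical shape).
    content_unknown: True only in the specific cases above (e.g. a Webcam,
    user-provided images without usable text, an author who cannot know
    what the image represents).
    """
    # An alt attribute that is present satisfies this check; whether its
    # text is *appropriate* is a separate question not modelled here.
    if "alt" in attrs:
        return True
    # Without alt, title only helps in the specific cases, and only if it
    # actually contains text.
    title = attrs.get("title") or ""
    return content_unknown and title.strip() != ""
```

With this sketch, an image carrying only title="" text is accepted when content_unknown is True and rejected otherwise.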

> This is problematic for a number of reasons:
> 1. One of the functions of alt as implemented is that the text is 
> displayed when images are disabled or not available. I ran some tests a 
> while back[1] and found that while webkit based browsers display title 
> attribute content if images are disabled or not available, IE, Firefox 
> and Opera do not. I did a quick recheck and found the implementations 
> have not changed in the 2.5 years since I ran those tests.
> 2. title attribute content is commonly displayed as a tooltip that 
> appears when a user moves their mouse over an element (in this case an 
> img). It is a long-running issue (14 years or so) that tooltips, and thus 
> title attribute content, are not displayed for keyboard-only users. 
> Browser vendors are fully aware of the issue, but as yet there have been 
> no moves to fix it*

These are violations of the UAAG, and affect far more than images. Any 
situation where there is content in a title="" attribute would be an 
accessibility problem, if title="" attributes aren't exposed. Either we 
should therefore drop the title="" attribute (unlikely to be a practical 
option), or we should fix the browsers to expose title="" attributes in 
cases where the user is not able to trigger the tooltip.

I've updated the spec to be clearer about this.

> It is suggested that due to the current and historical implementation of 
> title attribute display in browser, discouraging authors from using the 
> <img title="text"> markup pattern would result in more usable and 
> accessible content.

It is suggested by whom? I'm not sure I follow.

> We could address this problem by making changes along these lines:
> Remove the clause in the spec that makes the markup pattern conforming:
> "The title attribute is present and has a non-empty value"

This doesn't solve the problem in the general case, so it's not really a 
solution worth considering. It just papers over a minor case (where the 
page is in all likelihood not accessible anyway, since the author doesn't 
know what the image is) while ignoring the much larger issue of title="" 
not being exposed.

On Tue, 31 Jul 2012, Philip Jägenstedt wrote:
> AFAICT there's also no way to read the alt attribute on Opera Mobile. I 
> don't know what conclusions to draw, but if the situation is the same on 
> other mobile browsers and they are also unwilling to change, it seems 
> unwise to recommend using the title attribute to convey important 
> information. Of course, it would be equally unwise to use any other new 
> or existing attribute unless mobile browsers expose them in some way.

The spec encourages authors to use either title="" or <figcaption> in this 
situation. I've updated the spec to warn authors that contemporary user 
agents fail to expose the title="" attribute.

On Tue, 31 Jul 2012, Philip Jägenstedt wrote:
> I suppose that if mobile browsers fix Bug 3 *and* fall back to the title 
> attribute in the absence of an alt attribute then it would be OK to use 
> title instead of alt, but I'm confused -- is falling back to title a 
> Good Thing that people want browsers to implement, or is it just a quirk 
> that some legacy browser had?

When there's no alt attribute, if the image can't be rendered, the user 
agent should display some sort of indicator that there is an image that is 
not being rendered, and may, if requested by the user, or if so 
configured, or when required to provide contextual information in response 
to navigation, provide caption information for the image based on the 
title="" attribute or (if there's no title="") <figcaption>.
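
As a minimal sketch of the behaviour just described (not an actual user agent implementation), the fallback order could look like this in Python. The function name, the placeholder string and the caption_requested flag are all made up for the example; treating an empty title like a missing one is also an assumption.

```python
def describe_unrendered_image(title=None, figcaption=None, caption_requested=False):
    """Text a user agent might present for an <img> with no alt that
    cannot be rendered (hypothetical helper)."""
    # The presence of the image is always indicated.
    parts = ["[image not rendered]"]
    # Caption information is only offered on request, by configuration,
    # or when navigation requires context; one flag stands in for all three.
    if caption_requested:
        # title first, then <figcaption>; an empty title is treated as
        # absent here, which is an assumption of this sketch.
        caption = title if title else figcaption
        if caption:
            parts.append("caption: " + caption)
    return " ".join(parts)
```

Note that the title text is offered as caption information alongside the indicator, not as a replacement for the image.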

On Tue, 31 Jul 2012, Steve Faulkner wrote:
> *Note:* in terms of the accessible name calculation for an img element, 
> if the image does not have aria-label or an aria-labelledby or an alt 
> attribute, but does have a title attribute, then the title attribute is 
> used as the accessible name. From an accessibility API perspective, no 
> distinction is indicated as to the source of the accessible name (apart 
> from in the Mac AX API).

Note that the spec doesn't currently say to do this; it says what I said 
above, namely that if there's no alt="", the presence of the image should 
be indicated, and the caption information provided as caption information.
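
For comparison, the accessible name fallback Steve describes could be sketched roughly as below. The quoted note only says that title is used when aria-label, aria-labelledby and alt are all absent; the relative order of the first three in this sketch is an assumption, as are the function name and the labelledby_text parameter (standing in for text already resolved from the referenced elements). Empty values are simply skipped, which glosses over the special meaning of alt="".

```python
def accessible_name(attrs, labelledby_text=None):
    """Return (name, source) for an <img>, or ("", None) if nothing applies.

    attrs: dict of attribute name -> value (hypothetical input shape).
    labelledby_text: text already resolved from aria-labelledby targets.
    """
    candidates = [
        ("aria-labelledby", labelledby_text),
        ("aria-label", attrs.get("aria-label")),
        ("alt", attrs.get("alt")),
        ("title", attrs.get("title")),  # last resort, per the quoted note
    ]
    for source, value in candidates:
        if value and value.strip():
            return value.strip(), source
    return "", None
```

The source is returned here only to make the fallback visible; as the quoted note says, most accessibility APIs do not expose where the name came from.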

On Wed, 1 Aug 2012, Philip Jägenstedt wrote:
> To be very clear, you agree with the spec, think that WebKit is wrong 
> and would not offer any applause if Opera were to use the title 
> attribute to replace images when images are disabled and there is no alt 
> attribute?

I would applaud doing what the spec says, as I described above. :-)

The spec is pretty clear here, IMHO. (I've tried to make it even clearer.)

On Wed, 1 Aug 2012, Henri Sivonen wrote:
> In addition to image considerations, I think 
> http://www.whatwg.org/specs/web-apps/current-work/#footnotes is bad 
> advice.

On Wed, 1 Aug 2012, Philip Jägenstedt wrote:
> Yeah, that looks like a pretty bad idea, even for sighted users on 
> desktop browsers, unless you also add span[title] { border-bottom: 1px 
> dotted black; } or similar to your CSS to make it more discoverable. 
> Removing that advice seems like a good idea.

I've added a warning in that section and changed it from SHOULD to COULD 
for now. Since this is an accessibility problem, I remain hopeful that 
browser vendors will eventually fix their handling of title="".

I've also added a note about using CSS if title="" is used.

On Wed, 1 Aug 2012, Ian Hickson wrote:
> On Wed, 25 Jul 2012, Leif Halvard Silli wrote:
> > 
> > How about simply introducing a @generator attribute:
> > 
> >      <img generator='foo' alt='' src=bar.jpg />
> In the past I have argued against this, on the basis that it is highly 
> likely to be abused and actually make things worse. (The idea has been 
> brought up numerous times over the past few years, in many forms.)
> However, a few years of experience with the <meta> "generator" idea have 
> not led Henri and Mike (validator authors) to feel that the problem has 
> been suitably addressed, and Mike is now implementing alternatives, so 
> clearly it is time for me to revisit that. :-)
> We briefly brainstormed some ideas on #whatwg earlier tonight, and one 
> name in particular that I think could work is the absurdly long
>    <img src="..." generator-unable-to-provide-required-alt="">
> This has several key characteristics that I think are good:
>  - it's long, so people aren't going to want to type it out
>  - it's long, so it will stick out in copy-and-paste scenarios
>  - it's eminently searchable (long unique term) and so will likely lead 
>    to good documentation if it's adopted
>  - the "generator" part implies that it's for use by generators, and may 
>    discourage authors from using it
>  - the "unable" and "required" parts make it obvious that using this 
>    attribute is an act of last resort
> This attribute would be non-conforming except when provided in markup 
> generated by user agents that find themselves with an image and no 
> suitable alt="" text. It would be a third option in the "Images whose 
> contents are not known" section of the spec. It would be mentioned in 
> the "Guidance for markup generators" section, along with some text about 
> using one of the other two alternatives when the image in question is 
> the center of attention on the page (as in the Flickr case), rather than 
> using this new attribute. It would replace the "generator" exception in 
> the "Guidance for conformance checkers" section. The note in the 
> "generator" section would be removed.

I've now added this to the spec.
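
To illustrate how a conformance checker might treat the attribute, here is a small sketch built on Python's standard html.parser module. It is not how any real validator is implemented; the class and function names are invented, and the title="" and <figcaption> exceptions discussed earlier are deliberately left out.

```python
from html.parser import HTMLParser

GENERATOR_ATTR = "generator-unable-to-provide-required-alt"


class MissingAltChecker(HTMLParser):
    """Collects an error for each <img> lacking alt, unless the
    generator attribute is present (simplified sketch)."""

    def __init__(self):
        super().__init__()
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # HTMLParser lowercases attribute names for us.
        names = {name for name, _ in attrs}
        if "alt" in names or GENERATOR_ATTR in names:
            return
        line, _ = self.getpos()
        self.errors.append(f"line {line}: <img> without alt")


def check(markup):
    checker = MissingAltChecker()
    checker.feed(markup)
    checker.close()
    return checker.errors
```

Calling check() on markup containing <img src="..." generator-unable-to-provide-required-alt=""> returns no errors, while a bare <img src="..."> is reported.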

On Wed, 1 Aug 2012, Jukka K. Korpela wrote:
> 2012-08-01 10:56, Ian Hickson wrote:
> > 
> > Only generators are in a position where they might have to include 
> > images for which they lack the ability to provide alt texts.
> A simple counter-example to that: A human employee who has been told to 
> add some images to a web page, without having been told why and with no 
> instructions on alt texts.

Humans are incredibly adept at deducing context, and are quite able to 
refuse to comply with unethical requests. Such a human could, for instance, 
look at the image to determine what its meaning was, or could refuse to 
add the image without clear information on what the image was. Even if the 
human in this case was a mindless automaton, that is no reason to make the 
document conforming: the person asking the mindless automaton to add the 
image knows what the meaning of the image is, so there's no reason for the 
page to lack alternative text.

> So what about the poor human then?

The only "poor human" in this discussion is the AT user who can't work out 
what the page is saying.

> > It's unfortunate to force such vendors into a position of having to 
> > defend their one validation error when there's nothing they can do 
> > about it,
> Silencing the error does not make the markup any better.

Actually, the argument here is very much that it does. Specifically, the 
argument is that WYSIWYG editor implementors will be pressured into making 
their tools output conforming content by people who don't understand the 
subtleties of this thread, based purely on validator output. Thus if we 
make the validator not complain when this error happens, they won't be 
pressured to do this.

> This is an example of the "validation as quality assurance" fallacy that 
> we should fight against, not support. When a document has been converted 
> to HTML format without due attention to alt texts for images, it has not 
> been converted properly. There is no reason to try to please vendors of 
> converters by tweaking the rules and the checkers/validators to accept 
> automatic conversion results that just aren't good.

That only works when these implementors can in any way be expected to 
generate alternative texts. The point is they sometimes can't.

> And they *can* do a lot about it. They can initiate a user dialog, 
> prompting for a person to provide alt text.

A user converting 100,000 PDFs to HTML isn't going to be entering 
alternative texts for each image.

> Whether it is economically feasible is a different issue.

Not really...

> If you don't require generators to do that, why would you require the 
> poor human employee to write just something into the alt attribute? 
> (Making him type nonsense, mostly, of course.)

Because it's nowhere near as infeasible in this case.

> > > Even alt="unknown image" or alt="unknown image named foobar.jpg" is 
> > > better than lack of alt attribute (or alt="").
> > 
> > On the contrary; alt="unknown image" is equivalent to <span>unknown 
> > image</span> and would be fine alt="" text for an image of text that 
> > says "unknown image"
> That's a bit too theoretical, isn't it? On similar grounds, you might 
> argue that _any_ alt text is a fine text for an image containing that 
> text, and nothing else.

I'm not sure where the "and nothing else" comes from.

On Sat, 4 Aug 2012, Michael[tm] Smith wrote:
> Speaking as a validator contributor-implementor, I support the addition 
> of this attribute, with the "generator-unable-to-provide-required-alt" 
> name or at the very least with the characteristics of the name Hixie 
> outlines [earlier].

On Sun, 5 Aug 2012, Henri Sivonen wrote:
> On Sat, Aug 4, 2012 at 9:08 AM, Michael[tm] Smith <mike at w3.org> wrote:
> >
> > Agreed. I support having some kind of "trial period" like what 
> > you describe, of a year or two or 18 months. If we do that I would 
> > prefer that the spec include some kind of note/warning making it clear 
> > that the attribute is experimental and may be dropped or changed 
> > significantly within the next two years based on analysis we get back 
> > during that time.
> There's a non-trivial set of validator users who get very upset if the 
> validator says that the document that previously produced no validation 
> errors now produces validation errors--even if the new errors result 
> from a bug fix. In my experience, handing out badges makes people more 
> upset if the criteria behind the badge changes, but even without badges, 
> it seems to me that the sentiment is there.
> Therefore, if you tell people that if they use a particular syntax their 
> document might become invalid in the future, chances are that they will 
> steer clear of the syntax when an easier alternative is available--just 
> writing alt="". So adding a warning that the syntax is experimental is 
> an almost certain way to affect the outcome of the experiment. On the 
> other hand, not warning people and then changing what's valid is likely 
> to make people unhappy.
> It seems to me that running an experiment like this will either result 
> in a failed experiment, unhappy people or both.

I think there's also the possibility that people will be happy, blissfully 
unaware that we ran an experiment that showed the idea was a success. :-)

> If an experiment on this topic was to be run, what would you measure and 
> how would you interpret the measurements?

I would consider the experiment a success unless the following was found:

 - people "often" use the attribute in inappropriate ways.

 - all new WYSIWYG editors, when faced with an unknown image, give bogus 
   alt texts rather than no alt text.

If people don't abuse the attribute and some editors give no alt text 
rather than bogus alt text, then it's a success, IMHO.

Ideally, we'd have some way of comparing the situation now with the 
situation in the future, e.g. by coming up with a way to look at editors 
and determining whether they've overall moved to giving more or less bogus 
alt text.

I described this in the original e-mail:

> If we do this, I think we should commit to revisiting the issue in a 
> year or two, to examine what impact this is having on Web pages: is the 
> attribute used in inappropriate ways? Is it used badly more than 
> correctly? Are validator users more or less happy? Most importantly, are 
> alt="" texts overall better or worse? Have any generators started using 
> the attribute rather than outputting bogus alt="" values?


On Wed, 8 Aug 2012, Michael[tm] Smith wrote:
> So to avoid that, the alternative is to not state clearly and honestly 
> up front that we're adding it experimentally? And instead a year or two 
> from now when we look back at this and maybe find out the experiment has 
> not worked out the way we hoped, we drop the attribute anyway -- without 
> ever having been clear about the fact that it was experimental to begin 
> with?

That's more or less what I'd recommend. :-)

At the end of the day, pretty much everything we spec is an experiment. 
It's just that this one is something where we're even less confident than 
usual that it's going to work.

On Sun, 5 Aug 2012, Henri Sivonen wrote:
> >
> >    <img src="..." generator-unable-to-provide-required-alt="">
> While I agree with the sentiment the name of the attribute communicates, 
> its length is enough of a problem to probably make it fail:
> 1) Like a namespace URL, it's too long to memorize correctly, so it's 
> easier for the generator developer to type 'alt' than to copy and paste 
> the long attribute name from somewhere.

Isn't that a good thing? How often are you expecting people to type the 
attribute? I would expect each editor implementor to type it once. They 
can definitely do that; even namespaces manage to be typed far more often 
than that.

> 2) It takes so many more bytes than alt="", so it's easy to shy away 
> from using it on imagined efficiency grounds.

I'm skeptical that this is an argument people will make.

On Sun, 5 Aug 2012, Maciej Stachowiak wrote:
> On Aug 1, 2012, at 12:56 AM, Ian Hickson <ian at hixie.ch> wrote:
> Here's a review of other proposed names and a few new ideas:
> noalt
>     Pro: brief
>     Con: not very explanatory, so perhaps more likely to be misused
> relaxed [suggested by Ted]
>     Pro: correctly conveys "relaxed validation"
>     Con: not clear what is relaxed or why
> incomplete [suggested by Laura]
>     Pro: correctly conveys that a non-decorative content image is incomplete without a textual equivalent
>     Con: not clear what is incomplete or why
> unknown
>     Pro: correctly conveys the reason for omitting alt, i.e. that the name is unknown to the generator
>     Con: might not be clear that it is not for human authors
> unknown-to-generator
>     Pro: correctly conveys intended generator use
>     Con: not totally clear what it is that is unknown
> I don't have a strong opinion, but I think 
> generator-unable-to-provide-required-alt might be long to the point of 
> silliness.

I think clarity of the attribute's purpose is really important here, 
because the attribute does nothing at all other than exist. Its only 
purpose is to convey information, it doesn't actually _do_ anything. So 
IMHO having it be precise is critical.

For most attributes, I would say the name is an opaque detail, and we 
could just go with pretty much any name without really harming the feature 
that much. But here, the name _is_ the feature.

On Mon, 6 Aug 2012, Silvia Pfeiffer wrote:
> I'd think it should at least mention "alt". Shorter would e.g. be 
> "unable-to-provide-alt".

On Mon, 6 Aug 2012, Nils Dagsson Moskopp wrote:
> What about “alt-unknown” or “unknown-alt” ?

On Sun, 5 Aug 2012, Glenn Maynard wrote:
> validator-ignore=alt

All of these are similarly less clear, IMHO.

On Mon, 6 Aug 2012, Odin Hørthe Omdal wrote:
> IMHO generator-unable-to-provide-required-alt in all its ugliness is a 
> really nice feature, because how would anyone in their sane mind write 
> that. It's really made for a corner case, and if you really really want 
> that, you should be prepared to deal with the ugliness, because what you 
> are doing is ugly in the first place...
> It clearly describes what's going on, in clear text that even those 
> whose mother language is not English can easily understand. It 
> discourages usage by being ugly and long. The negative reaction you had 
> here is more or less what I believe the name is designed to provoke.


On Mon, 6 Aug 2012, Glenn Maynard wrote:
> Making things ugly on purpose is always a bad idea.  Either it has valid 
> use cases, and it should be a clean, well-designed feature, or it 
> doesn't, and it shouldn't be there at all.  Please don't go down this 
> path; we have more than enough ugliness by accident without doing it on 
> purpose.

I think this misses the point -- it's not _ugly_, just verbose.

On Sat, 4 Aug 2012, Benjamin Hawkes-Lewis wrote:
> Would it be possible to combine this with the linter complaining about 
> all controls (links, buttons, form fields) have markup that yield a 
> non-empty "accessible name" without invoking repair techniques such as 
> reading filenames without img @src attributes?
> http://www.w3.org/WAI.new/PF/aria/roles#namecalculation
> I realise the author requirements in the HTML spec seem to have 
> gradually become very forgiving here, not really sure why.

I'm not sure I understand the suggestion here. Can you elaborate?

> It would help catch the not uncommon antipattern where the "content" of 
> a link or button is provided only by a background image.
>    <a href="somewhere"></a>
>    <a href="somewhere-else"></a>
>    <button class="delete"></button>

This is should-level non-conforming and has no reason to be conforming, as 
far as I can tell ("elements whose content model allows any flow content 
or phrasing content SHOULD have at least one child node that is palpable 
content and that does not have the hidden attribute specified").

The only reason it's not entirely non-conforming ("must" rather than 
"should") is that there are some edge cases where it makes sense, e.g. 
when you have an empty paragraph that you're going to fill in later.

But maybe we should tighten this up again, e.g. for interactive content?
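
As a sketch of what such a check might look like for links and buttons, the following uses Python's standard html.parser module. It only approximates "palpable content": any non-whitespace text or any child element counts, the hidden attribute is ignored, and only <a> and <button> are tracked. The names are invented for the example and nothing here is taken from an existing linter.

```python
from html.parser import HTMLParser

INTERACTIVE = {"a", "button"}  # assumption: only these two are checked


class EmptyControlChecker(HTMLParser):
    """Warns about <a>/<button> elements with no text and no child elements."""

    def __init__(self):
        super().__init__()
        self.open_controls = []  # [tag, has_content] per open control
        self.warnings = []

    def _mark_content(self):
        for entry in self.open_controls:
            entry[1] = True

    def handle_starttag(self, tag, attrs):
        # A child element counts as content for every control it sits in.
        self._mark_content()
        if tag in INTERACTIVE:
            self.open_controls.append([tag, False])

    def handle_data(self, data):
        if data.strip():
            self._mark_content()

    def handle_endtag(self, tag):
        if self.open_controls and self.open_controls[-1][0] == tag:
            _, has_content = self.open_controls.pop()
            if not has_content:
                self.warnings.append(f"<{tag}> has no palpable content")


def find_empty_controls(markup):
    checker = EmptyControlChecker()
    checker.feed(markup)
    checker.close()
    return checker.warnings
```

The results are reported as warnings rather than errors, matching the should-level requirement quoted above.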

On Sun, 5 Aug 2012, Henri Sivonen wrote:
> Whether that should be a validity constraint or an optional additional 
> check is a bit tricky, for the same reason why we allow empty paragraphs 
> and empty lists: to let markup editors simultaneously guarantee the 
> validity of their output and to allow the user to save the document at 
> any stage of editing.

That's not really the reason behind allowing incomplete pages. Incomplete 
pages can be saved without having to be conforming.

> (Again, there's tension between different uses of validity: the sort of 
> validity constraints you want to hold before and after each discrete 
> editing operation and constraints you want to hold when the document is 
> "done".)

The spec only concerns itself with the latter.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
