[whatwg] alt text...

Charles McCathieNevile chaals at opera.com
Sun Apr 26 15:22:39 PDT 2009

Reading through the current spec:

The flowchart example in [1] seems to me to be good alt text. It
does violate the principle of visible metadata (if you change your
flowchart, you don't see the metadata that it illustrates), and it fails the
associated accessibility use case of people who have some functional
vision, but not a lot.

As described on this list somewhere in the past, such people typically  
look at the pictures, and read the text - sometimes the images help,  
sometimes they don't and the text description is necessary. Putting it in  
the alt, rather than somewhere in the page, or putting it in the page  
without a clear way to know that it refers to a given image, both increase
the required work to be sure that the user has understood what is there
(understanding alone is insufficient: a user needs to be clear that they
have in fact understood all of what is there - a subtle distinction which
only sometimes has practical application, but which is important when it  

A better approach, IMHO, would be to have the image illustrating the text,  
such as:

<p>In the common case, the data handled by the tokenization stage
comes from the network, but it can also come from script.</p>

<figure><img src="images/parsing-model-overview.png"
alt=""><legend>Figure 41</legend></figure>

<p>As shown in figure 41, the network
passes data to the Tokenizer stage, which passes data to the Tree
Construction stage. From there, data goes to both the DOM and to
Script Execution. Script Execution is linked to the DOM, and, using
document.write(), passes data to the Tokenizer.</p>

You may not want to do this, or you may want, for reasons of flow, to have
the detailed data elsewhere. This is where longdesc/describedBy can be
helpful.

The issue that arises is copy/paste coding: if you do that in a text
editor as opposed to an authoring tool (the latter are already
sometimes smart enough to carry the associated code, and we could clarify
which ones are helpful and which ones help authors break things), then a
relative description link to an internal reference needs to be made (more)
absolute. On the other hand, if I use wiki commons images, or things that  
Ian has described somewhere on the WHATWG site, and the references point  
to these external pages, I simply copy/paste and have automatically  
inherited a good description too.
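
To make the copy/paste point concrete, here is a minimal Python sketch of
what a tool might do when it carries a description link from one page to
another: resolve the link against the address of the page it was copied
from. The function name and the example.org/example.com URLs are made up
for illustration; urljoin is from the Python standard library.

```python
from urllib.parse import urljoin


def absolutize_description_link(source_page_url: str, description_link: str) -> str:
    """Resolve a description link against the page it was copied from.

    Hypothetical helper: relative links become absolute, already-absolute
    links are returned unchanged.
    """
    return urljoin(source_page_url, description_link)


# Hypothetical URLs, for illustration only.
source_page = "http://example.org/spec/parsing.html"

# A link to a description elsewhere in the same page.
print(absolutize_description_link(source_page, "#parsing-model-description"))
# A link to a description in a neighbouring document.
print(absolutize_description_link(source_page, "descriptions/parsing-model.html"))
# A link that already points to an external page.
print(absolutize_description_link(source_page, "http://example.com/desc.html"))
```

The last call shows the second case above: a reference that already points
to an external page survives copy/paste untouched.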

Such an example may look like

<p>In the common case, the data handled by the tokenization stage
comes from the network, but it can also come from script.</p>
<p><img src="images/parsing-model-overview.png" alt="document.write() can  
provide input too."

longdesc is effectively the same as aria-describedby, although
historically limited to a very few elements - it is here for better
backwards compatibility, not because I care much either way about the name.

In any event, what happens for an AT user (whether a screen reader user or
someone running a simple script extension such as [2] or [3]) is that you
see the image, and you can get a further description if you want. (In this
example it is an edge case, but this is still a relatively simple diagram.
For something more complex this becomes more relevant.)

Such an example might be the common "problem-solving" flowchart [4]. I have
seen this in most developer contexts I have been in - it is the one that
starts "Is it working? If so, leave it alone. If not, did *you* mess with
it? If you didn't, will things be OK if you leave it alone, or are you
unfortunate enough that you will get the blame? And if you did, you are a  
fool, but the question is whether anyone will find out..." (That's  
probably more description than necessary here. If someone had described  
the thing accurately once, and such descriptions were more readily  
available, I would simply look for the description and point to it :) )

Everything then looks OK until the third example of [4] where
the figure element is considered separate from the paragraph it follows, and
it is therefore suggested that text is required. This leads to a horrible
repetition. Again, having a way to clearly associate the paragraph with
the image would be helpful, but in any case the repetition of text because
it matches some theoretical pattern should be removed, since in real use
it will simply annoy users who rely on reading the alt text. (The
preceding sections and the general advice that follows rightly point out
that repetitively redundant multiple versions of the same thing are not  

The final example in that section is good, but it is impossible to know  
that there is in fact a clear description of the image - you have to guess  
it by context. It would be nice to have something like

<p id="results">According to a study covering several billion pages,
about 62% of documents on the Web in 2007 triggered the Quirks
rendering mode of Web browsers, about 30% triggered the Almost
Standards mode, and about 9% triggered the Standards mode.</p>
<p><img src="rendering-mode-pie-chart.png" alt=""  

Which would enable copying authors to point to the description in the  
original context, or to note that it existed and copy it - this can be  
done automagically by authoring tools (some do and some don't right now,  
but it isn't rocket science to maintain links when you move them from page  
to page).
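
As a rough illustration of the kind of link maintenance an authoring tool
could do, the sketch below (Python, standard library only) lists images
whose description reference points at an id that is not in the page, which
is what happens when the image is copied without its paragraph. It assumes
the association is written as aria-describedby="results" on the img; that
attribute choice, the class name and the function name are assumptions made
for this sketch, not something the spec example states.

```python
from html.parser import HTMLParser


class DescriptionLinkChecker(HTMLParser):
    """Collects element ids and the ids that images name as descriptions."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.refs = []  # (img src, referenced id) pairs

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.ids.add(attributes["id"])
        # Assumption: the description is linked with aria-describedby,
        # which may hold several space-separated ids.
        if tag == "img" and attributes.get("aria-describedby"):
            for ref in attributes["aria-describedby"].split():
                self.refs.append((attributes.get("src", ""), ref))


def dangling_descriptions(html: str):
    """Return (src, id) pairs whose referenced id is missing from the page."""
    checker = DescriptionLinkChecker()
    checker.feed(html)
    checker.close()
    return [(src, ref) for src, ref in checker.refs if ref not in checker.ids]


original = ('<p id="results">About 62% of documents triggered Quirks mode.</p>'
            '<p><img src="rendering-mode-pie-chart.png" alt="" '
            'aria-describedby="results"></p>')
# The image copied into another page without its describing paragraph.
copied = ('<p><img src="rendering-mode-pie-chart.png" alt="" '
          'aria-describedby="results"></p>')

print(dangling_descriptions(original))  # []
print(dangling_descriptions(copied))    # [('rendering-mode-pie-chart.png', 'results')]
```

A tool that finds a dangling reference like this could either copy the
described paragraph along with the image or rewrite the reference to point
back to the original page.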

The example in [5] where the image is a key part of the content  
shows how the current model is limited and why I raised ISSUE-30. A  
non-text user trying to skim through a set of images (the scenario  
described) quite possibly doesn't want to hear the complex descriptions of  
each photo in turn.

If you can determine the user's intent in looking through the list of  
images, you can decide dynamically whether the caption is enough. This  
would apply where the user is just trying to find out what is in the list,
or is looking to see if their OS is there and doesn't understand the
find-in-page function, at least. If the user is actually trying to
determine what each OS listed looks like, then you do need the big blocks  
of text available. Unfortunately, software that determines user intent is  
in its infancy (and often cursed by people for its inaccuracy), and a  
better solution for the next few years seems to be a method of encoding  
content that readily supports several use cases at once.

Having the ability to use a linked description (and that being clarified  
in this section of the spec as a good thing for certain use cases) would  
make a number of the examples in this section nicer. Being able to reuse a  
high-quality description of known images, instead of being asked to  
squeeze something into alt text, shows authors a path where for less work  
they can often achieve better results. Even in something as complex as the  
Rorschach inkblots, looking for a reference or two is easier than figuring  
out how to describe it (it took me five minutes to find something worth  
referring to for all ten images, and I would have been hard-pressed to  
describe a single one in that time).

Given that the WAI folks are finalising input on what to do in the case
where there is no known input, I will refrain from comment for now.

I wrote ages ago:
>> Section "An image in an e-mail or document intended for a  
>> specific person who is known to be able to view images" seems to me
>> mildly harmful.
>> Specifically, it appears to legitimate claims of the kind "all our  
>> customers can see, so we don't need to worry", something that causes
>> a huge number of problems in ensuring accessibility where it is in
>> fact necessary.

and Ian replied

> I've changed that section to make it clear it is only referring to  
> private documents and not anything on the Web.

That section [6] is not entirely clear on first reading (the double
negative isn't helpful). More to the point, this is a more-or-less
untestable statement of intent, and I am not sure why any tool would
bother having a setting for "I am writing to somebody I know, and I don't
intend to publish this later on the web, nor for either of us to use a text
editor" but would not have a setting for "I don't care about validation,
right now I want to do this". (I don't know of any validation service that  
makes such an exception possible, and I don't see the use case for  
building one - I certainly haven't seen it described anywhere).

It also seems to me that the guidance for markup generators [7] should
also take a lead from ATAG [8], which is produced by a
group that has worked on that topic for a decade or so, and who have  
maintained a consistent approach that markup generators "must not add alt  
attributes with dummy values". A reference to either the ATAG 1  
recommendation on that [8a], or the equivalent section of ATAG 2 [9] would  
probably be in order along with aligning closely with what is said in that  
spec (or convincing them to change it in future ATAG 2 drafts if they are  
wrong, which amounts to the same thing).

[2] http://userjs.org/scripts/browser/enhancements/frameset-links
[3] http://www.splintered.co.uk/experiments/55/
[5] http://dev.w3.org/html5/spec/Overview.html#a-key-part-of-the-content
[8] http://www.w3.org/TR/ATAG10
[8a] http://www.w3.org/TR/ATAG10/#check-no-default-alt
[9] http://www.w3.org/TR/2008/WD-ATAG20-20080310/#gl-manage-alts



Charles McCathieNevile  Opera Software, Standards Group
     je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals       Try Opera: http://www.opera.com
