[whatwg] The IMG element, proposing a CAPTION attribute

Mon Nov 27 07:15:17 PST 2006

Dave writes:

"To an indexing service, the caption is the single most important thing
about an image.  By separating the caption from the IMG element, you force
the search engine to apply a heuristic of some variety to infer the
connection...."

"... The indexing service user agent has to make sense of all of this, in
order to figure out what caption goes with what image, and it is just going
to be extremely difficult to get that with no actual structural
relationship between the caption and the image."

Well said, Dave! I think this is an understanding that has been (mostly)
missing from this discussion. We need a structural and semantic link
between caption and image, first and foremost. There will inevitably be
all kinds of implementations of this to suit an equal number of
purposes, but this is a core need to be met in any future iterations of
HTML/XHTML.

Without a consistent structural and semantic expression of whatever we
call this "caption," the cognitive link between image and caption will
be haphazard, at best -- in user agents and in the end user (the
audience). Thank you also, Dave, for reminding us that web browsers
aren't our only targets. As most of us are probably focused on GUI
browsers most of the time, the needs of other media can be swept aside
too easily. But they must be considered in the final specifications, so
we need to bear them in mind now.

Where to go from here? "Title" as currently specified won't do the job
for captions because, as Matthew Raymond points out, "a caption is not
necessarily 'advisory information'[1], which is what the |title|
attribute is defined as containing." But would there be support for
broadening the definition of "title" and encouraging its adoption for
image captioning? It seems to me that there would be advantages to
piggybacking our purpose on an element or attribute already specified,
implementing an evolutionary change as opposed to a revolutionary one.

So please consider this: What do we lose, semantically or cognitively,
even if we entirely discard the "advisory information" capabilities in
the "title" attribute? It seems to me we'd lose far less than we may
gain in having a proper structure for captions. Existing "titles" may
not inform as well as a proper caption, but they would probably not be
rendered meaningless if treated as captions.
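
To make the indexing argument a little more concrete, here is a minimal
sketch in Python of how an indexing service might pair images with
captions if the caption lived in the "title" attribute. The CaptionIndexer
class, the index_captions helper and the sample markup are all
hypothetical, and the sketch assumes the broadened definition of "title"
suggested above, which no current specification provides.

```python
from html.parser import HTMLParser


class CaptionIndexer(HTMLParser):
    """Hypothetical indexer: collects (src, caption) pairs from IMG elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Assumption: "title" carries the caption (the broadened definition).
        caption = attributes.get("title")
        if caption:
            self.pairs.append((attributes.get("src"), caption))


def index_captions(markup):
    """Return every (src, caption) pair found in the markup."""
    indexer = CaptionIndexer()
    indexer.feed(markup)
    indexer.close()
    return indexer.pairs


sample = (
    '<p>Some unrelated text.</p>'
    '<img src="heron.jpg" alt="A heron" title="Great blue heron at dawn">'
    '<p>More text that a heuristic might mistake for a caption.</p>'
)
print(index_captions(sample))
```

Because the caption sits on the image itself, the indexer never has to
guess which nearby paragraph belongs to which image.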

The ultimate solution should be as simple and direct as possible, I
think. I'm just tossing this out for further consideration.

I'm also mindful of the previous arguments in favor of allowing markup
within captions, which suggests that the caption ought to be an element
rather than an attribute. I guess that would be nice, but I'm not sure I
agree with the necessity of it. Form follows function, and all that.
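
The limitation behind that argument is easy to demonstrate: an attribute
value is plain text, so markup placed inside it reaches the user agent as
literal characters rather than as elements. A small sketch in Python
follows; the TagRecorder class, the parse_tags helper and the sample
markup are made up for illustration.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Hypothetical helper: records parsed element names and title values."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.titles = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        for name, value in attrs:
            if name == "title":
                self.titles.append(value)


def parse_tags(markup):
    """Return (element names seen, title attribute values seen)."""
    recorder = TagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.tags, recorder.titles


tags, titles = parse_tags(
    '<img src="heron.jpg" title="A &lt;em&gt;great&lt;/em&gt; blue heron">'
)
print(tags)    # only the img element is parsed as an element
print(titles)  # the "em" survives only as text inside the attribute value
```

Whether that loss of emphasis, links and the like matters enough to
justify a new element is exactly the question I'm unsure about.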

Jeff Seager