[whatwg] The IMG element, proposing a CAPTION attribute
davebacher at hotmail.com
Sun Nov 26 13:57:39 PST 2006
HTML is made up of 5 atoms:
The web browser converts the img element into:
Conversely, if you add a caption, it has to generate this:
Needless to say, it's really just setting some attributes or properties,
maybe a bitmask or two to indicate the atom, or if it is a more modern
implementation, it might be attaching some function pointers (delegates) to
handle the behavior of the atom.
As a result, to the web browser the caption is indeed presentational. If
the HTML standard only had to deal with web browsers, if it only had to
worry about what Gecko and Opera want, the story would end there.
However, the web browser is not the only user-agent.
Web browsers don't care about structural versus presentational markup. They
care about structure precisely to the point that it triggers CSS rules, and
no farther than that. Some might infer things based on errors in the
tag soup. (Most people on the list probably don't remember what the web
was like with Gopher and the first generation of HTTP browsers, where a
markup error would actually crash the browser, and sometimes the host OS
along with it.)
Indexing services, on the other hand, care only about the relationships
between data. They want to form cross reference tables that they can use to
implement features such as search engines.
To an indexing service, the caption is the single most important thing about
an image. By separating the caption from the IMG element, you force the
search engine to apply a heuristic of some variety to infer the connection.
Consider a page of thumbnails with captions, for example, being indexed by
Google. Google needs to know which caption belongs to which thumbnail. This
is trivial if the caption is an attribute, a child element, or has an IDREF
association with the image. In any other scenario, the markup that has to
be handled is diverse.
I mean, the images could be floated divs with the caption in the div. They
could be td elements, with a separate td element in the next row for the
caption; they could be position:absolute with another position:absolute
element somewhere else in the document positioned where some GUI tool put
The indexing service user agent has to make sense of all of this in order
to figure out which caption goes with which image, and that is going to be
extremely difficult with no actual structural relationship
between the caption and the image.
I don't think it matters if it is an attribute, a child element, or a
separate element associated via an IDREF, but one of those things must
happen in order to maintain the structural relationship, so that an indexing
service can leverage that to provide better cross references, and ultimately
better search engine results.
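
To illustrate why any of the three would do, here is a minimal sketch in
Python of what an indexer might look like. Everything in it is an assumption
made up for illustration: the caption attribute on img, a caption element as
a child of img, and a "for" attribute on caption acting as the IDREF are
hypothetical markup forms, and the CaptionIndexer class and index_captions
function are my own names, not anything an actual search engine uses.

```python
from html.parser import HTMLParser


class CaptionIndexer(HTMLParser):
    """Maps image src -> caption for three hypothetical markup forms."""

    def __init__(self):
        super().__init__()
        self.captions = {}      # image src -> caption text
        self._src_by_id = {}    # img id -> src, used to resolve IDREFs
        self._by_idref = {}     # referenced id -> caption text
        self._open_img = None   # src of the <img> currently open, if any
        self._target = None     # ("src" | "idref", key) while inside <caption>
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            src = attrs["src"]
            self._open_img = src
            if "id" in attrs:
                self._src_by_id[attrs["id"]] = src
            if "caption" in attrs:              # form 1: attribute (hypothetical)
                self.captions[src] = attrs["caption"]
        elif tag == "caption":
            self._text = []
            if "for" in attrs:                  # form 3: IDREF (hypothetical)
                self._target = ("idref", attrs["for"])
            elif self._open_img is not None:    # form 2: child of an open <img>
                self._target = ("src", self._open_img)

    def handle_data(self, data):
        if self._target is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "img":
            self._open_img = None
        elif tag == "caption" and self._target is not None:
            kind, key = self._target
            text = "".join(self._text).strip()
            if kind == "src":
                self.captions[key] = text
            else:
                self._by_idref[key] = text
            self._target = None

    def close(self):
        super().close()
        # Resolve IDREF captions last, so document order does not matter.
        for idref, text in self._by_idref.items():
            src = self._src_by_id.get(idref)
            if src is not None:
                self.captions[src] = text


def index_captions(markup):
    indexer = CaptionIndexer()
    indexer.feed(markup)
    indexer.close()
    return indexer.captions


page = """
<img src="a.png" caption="A cat" />
<img src="b.png"><caption>A dog</caption></img>
<img id="c" src="c.png" />
<p>Unrelated text</p>
<caption for="c">A bird</caption>
"""
print(index_captions(page))
```

No layout guessing is needed in any of the three cases. With floated divs,
table cells or absolutely positioned elements there is nothing comparable
for the indexer to key on.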
----- Original Message -----
From: "Michel Fortin" <michel.fortin at michelf.com>
To: "Alexey Feldgendler" <alexey at feldgendler.ru>
Cc: "WHATWG List" <whatwg at whatwg.org>
Sent: Thursday, November 23, 2006 7:43 AM
Subject: Re: [whatwg] The IMG element, proposing a CAPTION attribute
On 23 Nov 2006, at 3:32, Alexey Feldgendler wrote:
> Anyway, "caption" is presentational.
Oh, please. If "caption" is presentational, then "paragraph" and
"table" are as much, if not more. According to my dictionary:
paragraph:
a distinct section of a piece of writing, usually dealing
with a single theme and indicated by a new line,
indentation, or numbering.
table:
a set of facts or figures systematically displayed, esp.
caption:
a title or brief explanation appended to an article,
illustration, cartoon, or poster.
If there is a definition in this list which doesn't suggest some kind
of visual presentation, it's the caption. Surely you have a different
definition than me.
The semantic relation between a caption and its image, or figure,
should be exactly what is defined above: "a title or a brief
(Definitions from the New Oxford American Dictionary, 2nd edition)
michel.fortin at michelf.com