[whatwg] Markup for external content

Alexey Feldgendler alexey at feldgendler.ru
Sun Mar 18 10:53:39 PDT 2007

I hope it's not too late to add my opinion to the discussion about <video>.  
This posting expresses my view on the various kinds of markup for external  
content.

By external content I mean informative content presented out of line from  
the HTML document and referenced from it by means of a URI. Embedding  
content expressed in an XML-based language other than XHTML (SVG, MathML)  
into an XHTML document is equivalent to referencing external content by  
URI, and this case should be treated the same as the case of true  
out-of-line content.

There is a set of types which the user agent is capable of handling  
(either internally or with the help of external software, as long as it's  
done automatically). In the most common example, a traditional GUI  
browser, if you enter the URL of a resource having one of these types into  
the address bar, the browser will somehow present the content to the user  
(render the hypertext document, display the image, play the video or audio  
clip, etc.). Generally, external content can be of any of these types.

External content may or may not have each of the following properties,  
which are not mutually exclusive:

* Replaced: by replaced content I mean something that is presented in a  
rectangular area of a visual medium. Images and videos are examples of  
replaced content, though the list is not exhaustive.

* Timed: content that is presented over time. Examples are video and audio  
clips. Typically, such operations as "play", "pause", "seek" apply to  
them, though not necessarily (e.g. a live TV broadcast usually does not  
support seeking).

* Interactive: implements its own handling of user input (clicks,  
keypresses) not expressed by HTML markup. Flash controls are examples.

* Structured: the content is a document whose structure is represented in  
a DOM. Examples are HTML documents, SVG and MathML resources.

Based on these properties, it's possible to define some generic content  
types (not to be confused with data formats):

* Image: replaced.

* Audio: timed, NOT replaced.

* Video: replaced, timed.

* Subdocument: replaced, structured.

* Control: replaced, interactive.

Note that "video", "subdocument", and "control" are subsets of "image"  
(and they all effectively degrade to "image" when rendered on a  
non-interactive medium such as paper).
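
The classification above can be written down as sets of properties. This is only a minimal sketch for illustration, not part of any specification: the names GenericType, GENERIC_TYPES and is_subset_of are made up here, and "forbidden" is an assumed way to express the "NOT replaced" in the audio definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenericType:
    """Hypothetical model: properties a generic type must or must not have."""
    required: frozenset
    forbidden: frozenset = frozenset()


# The five generic content types, transcribed from the list above.
GENERIC_TYPES = {
    "image": GenericType(frozenset({"replaced"})),
    "audio": GenericType(frozenset({"timed"}), forbidden=frozenset({"replaced"})),
    "video": GenericType(frozenset({"replaced", "timed"})),
    "subdocument": GenericType(frozenset({"replaced", "structured"})),
    "control": GenericType(frozenset({"replaced", "interactive"})),
}


def is_subset_of(narrow: str, broad: str) -> bool:
    """True if anything of type `narrow` also qualifies as type `broad`."""
    n, b = GENERIC_TYPES[narrow], GENERIC_TYPES[broad]
    has_all_required = b.required <= n.required
    has_none_forbidden = not (b.forbidden & n.required)
    return has_all_required and has_none_forbidden
```

With this model, is_subset_of("video", "image") holds because video has the "replaced" property, while is_subset_of("audio", "image") does not, which matches audio having no visual fallback on paper.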

[X]HTML has, or is going to have, several elements to support external  
content:

* <object>: any external content. This is the most basic and meaningless  
markup for external content, similar to <div> being the most basic and  
meaningless markup for hypertext. <object> places no restrictions on the  
properties of the content. In addition to being semantically meaningless,  
<object> also has historical interoperability problems (as in "always was  
hopelessly broken").

* <img>: image. Note that "image" is a superset of "video", and indeed  
<img> has been supporting one video format for over a decade: animated  
GIF. I can see no reason why <img> shouldn't support e.g. Ogg Theora in  
user agents which support this format; however, for authors it would be  
more desirable to use <video> for this purpose (and, probably, for  
animated GIF, too). Likewise, I see no reason why <img> shouldn't support  
SVG or Flash (all right, there are security issues with Flash...). Even  
though <object> can also display images, <img> is preferable because it's  
specifically designed for images and provides image-specific features such  
as scaling to a fixed width and height while maintaining the aspect ratio,  
and possibly some visual controls such as IE's image toolbar. <img> is  
also more semantically precise.

* <video>: video. Should support any video format which can be decoded  
either by the UA internally or by plugins, external programs, etc., as  
long as it's automatic. Should also support animated GIF and SVG because these  
are essentially video. Even though <img> and even <object> can also play  
video, the <video> element is designed specifically for video and provides  
video-specific features: scaling logic most useful for video clips, a DOM  
API, and possibly some visual playback controls. <video> is also more  
semantically precise.

* <audio> and <bgsound>: audio. Should support any audio format which can  
be decoded either by the UA internally or by plugins, external programs,  
etc., as long as it's automatic. Similar benefits over <object> as above.  
I'm not sure what the relation between <audio> and <bgsound> is going to  
be. Probably <audio> should act like replaced content by rendering some  
playback controls?

* <embed>: the old way of activating plugins; references any external  
content for which a plugin is used. I believe that HTML should be  
independent of the implementation's architecture, so knowledge of whether  
or not a plugin is needed to render a specific format should not be  
encoded in HTML documents. <embed> must die.

* <applet>: the (old) way of activating Java. Probably must also die,  
though I'm unsure about this one.

* <iframe>: subdocument. Though <object> can do the same, <iframe> is  
specifically designed for structured content and provides a DOM API for  
access to the contained DOM. Also more semantically precise.

* Foreign namespaces in XHTML (<svg xmlns="http://www.w3.org/2000/svg">  
and such): subdocument. This method allows access to the contained DOM but  
doesn't expose any image-, audio-, or video-specific API. Probably it  

About plugins: in early browser development, it used to be the case that  
every new browser feature got its own new element. That is why we now  
have <object>, <embed>, and <applet>. So, <embed> was the way to mark up  
certain types of external content just because the first browser to  
implement it did so with a plugin. I believe such implementation details  
should not affect the design of HTML anymore. There shouldn't be any  
specific markup for "plugin content" because we never know whether support  
for a particular format is going to be implemented with a plugin. There  
are plugins which add PNG support to old browsers. On the other hand, it's  
possible to implement Flash support natively using an open-source player  
library. Especially on a mobile device it's unlikely that the browser is  
going to have any plugin system.

That's why I strongly disagree with the idea of having <video> only  
support open formats and leaving proprietary formats to <object>. Any  
video format which the browser can play, whether natively or through a  
plugin, should be supported with <video>, as long as it's technically  
possible to expose a working <video> DOM API to control the underlying  
implementation. Even Flash supports the notion of pausing and seeking to  
some extent, and is therefore a candidate for <video>.
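
To illustrate what "expose a working <video> DOM API to control the underlying implementation" could look like, here is a rough sketch. Everything in it is hypothetical: VideoBackend, NativeDecoder, PluginAdapter and VideoElement are names invented for this example and do not correspond to any real browser or plugin interface. The only point is that page scripts talk to one API regardless of how decoding is implemented.

```python
from abc import ABC, abstractmethod


class VideoBackend(ABC):
    """Hypothetical contract a decoder must meet to sit behind <video>."""

    @abstractmethod
    def play(self) -> None: ...

    @abstractmethod
    def pause(self) -> None: ...

    @abstractmethod
    def seek(self, seconds: float) -> None: ...


class NativeDecoder(VideoBackend):
    """Stand-in for a decoder built into the user agent."""

    def __init__(self) -> None:
        self.playing = False
        self.position = 0.0

    def play(self) -> None:
        self.playing = True

    def pause(self) -> None:
        self.playing = False

    def seek(self, seconds: float) -> None:
        self.position = max(0.0, seconds)  # clamp to the start of the clip


class PluginAdapter(VideoBackend):
    """Stand-in for a wrapper that forwards commands to a plugin."""

    def __init__(self) -> None:
        self.calls = []  # commands that would be sent to the plugin

    def play(self) -> None:
        self.calls.append("play")

    def pause(self) -> None:
        self.calls.append("pause")

    def seek(self, seconds: float) -> None:
        self.calls.append(f"seek:{seconds}")


class VideoElement:
    """What page scripts see; it never reveals which backend is in use."""

    def __init__(self, backend: VideoBackend) -> None:
        self._backend = backend

    def play(self) -> None:
        self._backend.play()

    def pause(self) -> None:
        self._backend.pause()

    def seek(self, seconds: float) -> None:
        self._backend.seek(seconds)
```

A script that calls VideoElement(NativeDecoder()).play() and one that calls VideoElement(PluginAdapter()).play() are written identically, which is the property argued for above.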

Alexey Feldgendler <alexey at feldgendler.ru>
[ICQ: 115226275] http://feldgendler.livejournal.com
