[whatwg] Resurrecting <video> a11y thread [was Re: Video, Closed Captions, and Audio Description Tracks]

Aaron Leventhal aaronlev at moonset.net
Fri Aug 22 03:59:10 PDT 2008

Has anyone given any further thought to what to do about captions for Ogg?

We've started to throw some thoughts together here:

We could use some help from individuals who understand the area of video 
and captions. The problem of deciding what to do for captions in <video> 
or specifically for Ogg does not appear to be simple. I'd appreciate it 
if someone could prove us wrong.

- Aaron

Silvia Pfeiffer wrote:
> Sorry to be getting back to this thread this late, but I am trying to
> catch up on email.
> I'd like to contribute some thoughts on Ogg, CMML and Captions and
> will cite selectively from emails in this thread.
> On Oct 9, 2007 5:22 PM, Henri Sivonen <hsivonen at iki.fi> wrote:
>> On Oct 8, 2007, at 22:12, Dave Singer wrote:
>>> I don't think we can or should 'climb inside' the content formats,
>>> merely have a standard way to ask them to do things (e.g. turn on
>>> captions).
>> I agree. However, in order for the HTML 5 spec to be able to
>> reasonably and pragmatically tell browsers to ask the video subsystem
>> to perform tasks like "turn on captions", we need to check that the
>> obviously foreseeable format families (Ogg in the case of Mozilla
>> and, apparently, Opera and MPEG-4 in the case of Apple) are able to
>> cater for such tasks. Moreover, browsers and content providers need
>> to have a shared understanding of how to do this concretely.
>>> This should all be out of scope, IMHO;  this is about the design of
>>> a captioning system, which I don't think we should try to do.
>> I think the captioning format should be specified by the video format
>> family. However, in this case it has become apparent that there
>> currently isn't One True Way of doing captioning in the Ogg family.
>> In principle, this is a problem that the specifiers of the Ogg family
>> should solve. In practice, though, this thread arises directly from
>> an issue hit by the Mozilla implementation effort. Since the WHATWG
>> is about interoperable implementations, it becomes a WHATWG problem
>> to make sure that browsers that implement Ogg for <video> and content
>> providers have the same understanding of what the One True Way of
>> doing captioning in Ogg is if the HTML 5 spec tosses the captioning
>> problem to the video format (which, I agree, is the right place to
>> toss it to). Hopefully, the HTML 5 spec text can be a one-sentence
>> informative reference to a spec by another group. But which spec?
> Ogg indeed currently has no preferred means of specifying captions.
> Usually it happens through a separate srt or ssa or similar file and
> the player makes sure to display the captions correctly.
> I just had a look at the W3C DFXP format
> (http://www.w3.org/TR/2006/CR-ttaf1-dfxp-20061116/). It looks rather
> similar to CMML, lacks the hyperlinking functionality, but has
> stylesheet and formatting support in it and more
> subtitle/karaoke-specific functionality. I believe it would be
> straightforward to define a media mapping for DFXP into Ogg should we
> decide that DFXP is the way forward. Similarly, it would be rather
> simple to define a media mapping for any of the anime subtitle formats
> mentioned above.
> Somewhat orthogonal to the discussion about subtitles is the use of
> CMML. Yes, it is possible to use CMML in its current specification as
> a superset of an srt-type subtitle format. However, the "description"
> element would then need to be interpreted as the caption, which is
> somewhat of a misuse. I actually see captions and CMML as orthogonal
> concepts - CMML provides hyperlinks and machine-readable textual
> annotations in a timed manner, while captions provide formatted text
> for users to read.
> =================
> On Oct 10, 2007 3:03 AM, Maik Merten <maikmerten at gmx.net> wrote:
>> Benjamin Hawkes-Lewis wrote:
>> Actually I wonder if it wouldn't make sense to have an attribute for
>> media elements specifying a URI for a file containing Timed Text. These
>> externally stored (not embedded in a media file) captions would be
>> codec-agnostic and could be used to reuse the very same set of captions
>> for e.g. differently encoded media (Ogg, MPEG,
>> Generic-Codec-Of-The-Season, ...).
> In the above-described cases of DFXP, srt, ssa or CMML, each of
> these is a text document that can potentially live independently of
> the video file on a server ("externally stored"). In fact, apart from
> CMML, none of them has a defined mapping into Ogg as yet.
>> As a side note I like the idea of captions which are more than just the
>> usual stream text. Imagine a newsreel with timed "Would you like to know
>> more?" links. Given that HTML5 is usually viewed in browsers that
>> implement at least a non-empty subset of HTML I imagine it should be
>> possible for the browser to layer something div-equivalent over the
>> media elements supporting captioning and pipe the HTML captions into it
>> (with caution, imagine a caption itself recursively embedding a video).
> That is exactly what CMML provides to Ogg: timed textual annotations,
> hyperlinks out of a video, and hyperlinks into a video (URI addressable
> offsets and sections in the file).
> I am wondering whether it might be a good idea to include some of the
> DFXP specifications into CMML to make it better suited to captions and
> thus not have to deal with multiple timed text formats. I haven't
> thought this through yet.
> ====
> On Oct 10, 2007 3:42 AM, Anne van Kesteren <annevk at opera.com> wrote:
>> On Tue, 09 Oct 2007 18:03:41 +0200, Maik Merten <maikmerten at gmx.net> wrote:
>>>> http://www.w3.org/TR/2006/CR-ttaf1-dfxp-20061116/
>>> Actually I wonder if it wouldn't make sense to have an attribute for
>>> media elements specifying a URI for a file containing Timed Text. These
>>> externally stored (not embedded in a media file) captions would be
>>> codec-agnostic and could be used to reuse the very same set of captions
>>> for e.g. differently encoded media (Ogg, MPEG,
>>> Generic-Codec-Of-The-Season, ...).
>> This would be problematic when downloading the video for offline use or
>> further distribution. This is also different from how this currently works
>> for DVDs, iPod, and the like as far as I can tell. It also makes authoring
>> more complicated in the cases where someone hands a video to you as you'd
>> have to separate the closed caption stream from it first and point to it
>> as a separate resource.
> Think it through: when you currently download a video from bittorrent,
> you download the subtitle file with it - mostly inside a zip file for
> simplicity even. Downloading a separate caption file is similar to
> how you currently have to download the images separately for a Web
> page. It's no big deal really as long as there is a connection that
> can be automatically identified (e.g. through a link to the other
> inside the one, or through a zip-file, or through a description file).
> Actually for the authoring, I completely disagree. Authoring a
> captioning file inside a text editor is much simpler than needing a
> special application to author the captions directly inside a video
> file.
> In any case: I don't think it's a matter of one or the other. I
> believe firmly that it should be both, no matter what caption format
> and video format is being used.
> =====
> On Oct 10, 2007 3:46 AM, Henri Sivonen <hsivonen at iki.fi> wrote:
>> On Oct 9, 2007, at 19:24, Dave Singer wrote:
>>> How the Ogg community designs intrinsic caption support is up to
>>> them, isn't it?
>> In theory, ideally, yes.
>> However, when HTML 5 says "User agents should support Ogg Theora
>> video and Ogg Vorbis audio, as well as the Ogg container format." and
>> "User agents should provide controls to enable or disable the display
>> of closed captions associated with the video stream, though such
>> features should, again, not interfere with the page's normal
>> rendering." it becomes a WHATWG issue to elicit a way to satisfy both
>> "should" requirements at the same time if implementors don't
>> otherwise have sufficient guidance on how to implement closed
>> captioning support for Ogg interoperably.
> Yes and no. Even if WHATWG decides that you should use Ogg with DFXP
> inside it for captioning - as long as the Ogg community does not
> provide a media mapping (i.e. a prescription on how to do the embedding
> into the Ogg container), there is no standard means for doing so.
> Thus, if there is a need for such a mapping, the Ogg community would
> indeed need to create such a specification, unless there is no need
> for encapsulating the caption files directly inside the Ogg container.
> I believe, however, that such a specification is necessary to enable
> ubiquitous usability and uptake.
> Regards,
> Silvia.
> ---
> Dr Silvia Pfeiffer
> Annodex Association
> Xiph Foundation Member
