[whatwg] Video, Closed Captions, and Audio Description Tracks

Henri Sivonen hsivonen at iki.fi
Mon Oct 8 23:22:20 PDT 2007

On Oct 8, 2007, at 22:12, Dave Singer wrote:

> At 12:22  +0300 8/10/07, Henri Sivonen wrote:

>> Could someone who knows more about the production of audio  
>> descriptions, please, comment if audio description can in practice  
>> be implemented as a supplementary sound track that plays  
>> concurrently with the main sound track (in that case Speex would  
>> be appropriate) or whether the main sound must be manually mixed  
>> differently when description is present?
> Sometimes;  but sometimes, for example:
> * background music needs to be reduced
> * other audio material needs to be 'moved' to make room for audio  
> description

In that case, an entire alternative soundtrack encoded using a  
general-purpose codec would be called for. Is it reasonable to expect  
content providers to take the bandwidth hit? Or should we expect  
content providers to provide an entire alternative video file?

>> When the problem is framed this way, the language of the text track  
>> doesn't need to be specified at all. In case #1 it is "same as  
>> audio". In case #2 it is "same as context site". This makes the  
>> text track selection mechanism super-simple.
> Yes, it can often fall through to the "what content did you select  
> based on language" and then the question of either selecting or  
> styling content for accessibility can follow the language.

I don't understand that comment. My point was that the two most  
obvious cases don't require a language preference-based selection  
mechanism at all.

>> Personally, I'd be fine with a format with these features:
>>  * Metadata flag that tells if the text track is captioning for  
>> the deaf or translation subtitles.
> I don't think we can or should 'climb inside' the content formats,  
> merely have a standard way to ask them to do things (e.g. turn on  
> captions).

I agree. However, in order for the HTML 5 spec to be able to  
reasonably and pragmatically tell browsers to ask the video subsystem  
to perform tasks like "turn on captions", we need to check that the  
obviously foreseeable format families (Ogg in the case of Mozilla  
and, apparently, Opera; MPEG-4 in the case of Apple) are able to  
cater for such tasks. Moreover, browsers and content providers need  
to have a shared understanding of how to do this concretely.

> This should all be out of scope, IMHO;  this is about the design of  
> a captioning system, which I don't think we should try to do.

I think the captioning format should be specified by the video format  
family. However, in this case it has become apparent that there  
currently isn't One True Way of doing captioning in the Ogg family.  
In principle, this is a problem that the specifiers of the Ogg family  
should solve. In practice, though, this thread arises directly from  
an issue hit by the Mozilla implementation effort. The WHATWG is  
about interoperable implementations. So if the HTML 5 spec tosses the  
captioning problem to the video format (which, I agree, is the right  
place to toss it to), it becomes a WHATWG problem to make sure that  
browsers that implement Ogg for <video> and content providers have  
the same understanding of what the One True Way of doing captioning  
in Ogg is. Hopefully, the HTML 5 spec text can be a one-sentence  
informative reference to a spec by another group. But which spec?

Henri Sivonen
hsivonen at iki.fi
