[whatwg] Introduction of media accessibility features

Tue Apr 13 00:50:27 PDT 2010

On Tue, 13 Apr 2010 15:28:54 +0800, Jonas Sicking <jonas at sicking.cc> wrote:

> On Sun, Apr 11, 2010 at 11:39 PM, Silvia Pfeiffer
> <silviapfeiffer1 at gmail.com> wrote:
>> On Mon, Apr 12, 2010 at 4:00 PM, Philip Jägenstedt <philipj at opera.com>  
>> wrote:
>>> On Mon, 12 Apr 2010 08:47:33 +0800, Silvia Pfeiffer
>>> <silviapfeiffer1 at gmail.com> wrote:
>>>
>>>> On Mon, Apr 12, 2010 at 7:59 AM, Jonas Sicking <jonas at sicking.cc>  
>>>> wrote:
>>>>>
>>>>> On Sun, Apr 11, 2010 at 5:30 AM, Silvia Pfeiffer
>>>>> <silviapfeiffer1 at gmail.com> wrote:
>>>>>>> Is it expected that all of TTML will be required? The proposal
>>>>>>> suggests 'starting with the simplest profile', being the
>>>>>>> transformation profile. Does this mean only the transformation
>>>>>>> profile is needed to provide subtitle features equivalent to SRT?
>>>>>>
>>>>>> That is also something that still has to be discussed further.  
>>>>>> Initial
>>>>>> feedback from browser vendors was that the full TTML spec is too
>>>>>> complicated and too much to support from the start. Thus, the
>>>>>> implementation path with the TTML profiles is being suggested.
>>>>>>
>>>>>> However, it is as yet unclear if there should be a native parsing
>>>>>> implementation of TTML implemented in browsers or simply a mapping  
>>>>>> of
>>>>>> TTML markup to HTML/CSS/JavaScript. My gut feeling is that the  
>>>>>> latter
>>>>>> would be easier, in particular since such a mapping has been started
>>>>>> already with Philippe's implementation, see
>>>>>> http://www.w3.org/2009/02/ThisIsCoffee.html . The mapping would need
>>>>>> to be documented.
>>>>>
>>>>> Personally I'm concerned that if we start heading down the TTML path,
>>>>> browsers are ultimately going to end up forced to implement the whole
>>>>> thing. Useful parts as well as parts less so. We see this time and
>>>>> again where if we implement part of a spec we end up forced to
>>>>> implement the whole thing.
>>>>>
>>>>> Things like test suites, blogging advocates, authoring tools, etc.
>>>>> often mean that for marketing reasons we're forced to implement much
>>>>> more than we'd like. And much more than is useful. This is why spec
>>>>> writing is a big responsibility: every feature has a large cost and
>>>>> means that implementors will be working on implementing that feature
>>>>> instead of something else.
>>>>
>>>> Understood. But what is actually the cost of implementing all of TTML?
>>>> The features in TTML all map onto existing Web technology, so all it
>>>> takes is a bit more parsing code over time. And if we choose not to
>>>> implement TTML, we will have to eventually support some other format
>>>> that provides formatting and positioning capabilities, seeing how the
>>>> legal landscape has evolved for traditional media (e.g. TV, set-top
>>>> box technology). Since TTML was originally developed to be the
>>>> exchange format for all such formats, it should have a sensible set of
>>>> features for this space. So, I personally think it's not a bad choice
>>>> for the purpose. Which other format did you have in mind to replace
>>>> it?
>>>>
>>>
>>> For the record, I am also not enthusiastic about TTML, specifically the
>>> styling mechanism which even makes creative use of XML namespaces. An
>>> example [1] for those that haven't seen it before:
>>>
>>> <region xml:id="r1">
>>>  <style tts:extent="306px 114px"/>
>>>  <style tts:backgroundColor="red"/>
>>>  <style tts:color="white"/>
>>>  <style tts:displayAlign="after"/>
>>>  <style tts:padding="3px 40px"/>
>>> </region>
>>> ...
>>> <p region="r1" tts:backgroundColor="purple" tts:textAlign="center">
>>>  Twinkle, twinkle, little bat!<br/>
>>>  How <span tts:backgroundColor="green">I wonder</span> where you're at!
>>> </p>
>>>
>>> While I don't have any suggestions about what to use instead, I'd much
>>> prefer something which just uses CSS with the same syntax we're all  
>>> used to.
>>>
>>> [1]
>>> http://www.w3.org/TR/ttaf1-dfxp/#style-attribute-backgroundColor-example-1
>>
>>
>> I have looked at alternative formats that provide styling and
>> positioning functionality. There is, for example, ASS/SSA
>> http://www.matroska.org/technical/specs/subtitles/ssa.html . We could
>> decide to support something like that instead, but it would be
>> essentially the same work as for TTML: define a mapping to Web
>> technologies (HTML, CSS, JavaScript) and implement that mapping, as I
>> don't think anyone would implement SSA natively either.
>>
>> I am myself not excited by the way that TTML turned out and would have
>> wished for a more Web-friendly format to have come out of the W3C, but
>> that's what it is now. Also, I am really missing hyperlinking
>> functionality in it, but believe it is possible to extend it in the
>> future.
>>
>> We could, of course, decide to develop a totally new specification,
>> but then we are left on our own to push that into the world. At least
>> TTML already has support by several existing vendors in the caption
>> space (see e.g.
>> http://www.adobe.com/accessibility/products/flash/captioning_tools.html,
>> http://www.ooyala.com/www3/support_closedcaptions,
>> http://broadcastengineering.com/automation/ninsight-unveils-dfxp-subtitling-mxf-ayoto-0901/).
>>
>> I just think of all the options available TTML is the least
>> problematic way to go. But if somebody has a better idea, I'm more
>> than open to it! (This is me personally saying it - it's not a
>> representative opinion of the W3C HTML5 accessibility task force).
>
> Henri Sivonen brought up an interesting point on the HTML WG list.
> Bringing it up here as I don't know if people are following both
> lists.
>
> Will implementations want to do the rendering of the subtitles off the
> main thread? I believe many browsers are, or are planning to, render
> the actual video graphics using a separate thread. If that is correct,
> do we want to support rendering of the subtitles on a separate thread
> too?
>
> Or is it enough to do the rendering on the main thread, but composite
> using a separate thread?
>
> If rendering is expected to happen on a separate thread, then CSS is
> possibly not the right solution as most CSS engines are
> main-thread-only today.

At least for us, the video (and audio) is decoded on another thread, but  
composited on the main thread. Even if this were to change in some way, I  
don't see that it makes any difference to captions. As long as it is possible  
to overlay arbitrary HTML on top of video without a performance loss  
(which will be the case since scripted controls depend on it) then  
rendering captions should be no more of a problem.
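
To make the overlay approach concrete, here is a minimal sketch of
captions driven entirely from the main thread. It is an illustration
under stated assumptions, not a description of any browser's
implementation: the Cue shape and the names activeCues, captionText and
demoCues are hypothetical, and the cue timings are made up.

```typescript
// Hypothetical cue shape; not taken from any format discussed in this thread.
interface Cue {
  start: number; // seconds, inclusive
  end: number;   // seconds, exclusive
  text: string;
}

// Return the cues that should be visible at playback time t (in seconds).
function activeCues(cues: Cue[], t: number): Cue[] {
  return cues.filter((c) => c.start <= t && t < c.end);
}

// Join the visible cues into the text shown in the overlay element.
function captionText(cues: Cue[], t: number): string {
  return activeCues(cues, t).map((c) => c.text).join("\n");
}

// In a page this could be wired up on the main thread, for example:
//   video.addEventListener("timeupdate", () => {
//     overlay.textContent = captionText(demoCues, video.currentTime);
//   });
// where `video` is an HTMLVideoElement and `overlay` is an ordinary
// element positioned over it with CSS (both names are assumptions).

// Made-up timings, for illustration only.
const demoCues: Cue[] = [
  { start: 0, end: 2, text: "Twinkle, twinkle, little bat!" },
  { start: 2, end: 4, text: "How I wonder where you're at!" },
];
```

The overlay here is just regular HTML updated from script, the same
mechanism scripted controls rely on. How tightly the text tracks the
video frames depends on how often the update runs, which is the sync
question below.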

The remaining issue is sync, but I don't think compositing captions with  
the video frames in the decoding pipeline (if that is what is being  
implied) is the solution, because that would require deep(er) integration  
with the media framework and a layout engine usable from a separate  
thread. Neither is impossible, but neither seems to be justified just to  
solve the sync problem.

-- 
Philip Jägenstedt
Core Developer
Opera Software