[whatwg] Implementation difficulties for MediaController

Tue Mar 29 21:26:31 PDT 2011

On Tue, Mar 29, 2011 at 9:05 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Tue, 29 Mar 2011, Jer Noble wrote:
>>
>> Contained is Eric's and my feedback on the difficulty of implementing
>> this proposal in Apple's port of WebKit:
>
> Thank you very much for your feedback. I'll look into it more tomorrow
> when I update the spec, but in the meantime I had some additional
> questions:
>
>
>> > * playing tracks synchronised at different offsets
>>
>> However, if the in-band tracks will be played at different time
>> offsets, or at different rates, playback becomes just as inefficient as
>> playing independent files.  To implement this we will have to open two
>> instances of a movie, enable different tracks on each, and then play the
>> two instances in sync.
>
> Is that acceptable? That is, are you ok with implementing multiple file
> (or two instances of the same file at different offsets) synchronisation?
>
>
>> > * playing tracks at different rates
>>
>> In addition to the limitation listed above, efficient playback of tracks
>> at different rates will require all tracks to be played in the same
>> direction.
>
> Ah, interesting.
>
> Is it acceptable to implement multiple playback at different rates if
> they're all in the same direction, or would you (at least for now) be
> significantly helped by forcing the playback rates to be the same for all
> slaved media tracks?
>
>
>> > * changing any of the above while media is playing vs when it is
>> > stopped
>>
>> Modifying the media groups while the media is playing is probably
>> impossible to do without stalling.  The media engine may have thrown out
>> unneeded data from disabled tracks and may have to rebuffer that data,
>> even in the case of in-band tracks.
>
> That makes sense. There are several ways to handle this; the simplest is
> probably to say that when the list of synchronised tracks is changed,
> or when the individual offsets of each track or the individual playback
> rates of each track are changed, the playback of the entire group should
> be automatically stopped. Is that sufficient?
>
> (In the future, if media frameworks optimise these cases, or if hardware
> advances sufficiently that even inefficient implementations of this are
> adequate, we could add a separate flag that controls whether or not this
> automatic pausing happens.)
>
>
>> From a user's point of view, your proposal seems more complicated than
>> the basic use cases merit.  For example, attempting to fix the
>> synchronization of improperly authored media with micro-adjustments of
>> the playback rate isn't likely to be very successful or accurate.  The
>> metronome case, while an interesting experiment, would be better served
>> through something like the proposed Audio API.
>
> Indeed. The use cases you mention here aren't the driving factor in this
> design, they're just enabled mostly as a side-effect. The driving factor
> is to avoid the symmetry problem described below:
>
>> Slaving multiple media elements' playback rate and current time to a
>> single master media element, Silvia and Eric's proposal, seems to
>> achieve the needs of the broadest use cases.
>
> The problem with this design is that it is highly asymmetric. The
> implementation of a media element needs to have basically two modes: slave
> and master, where the logic for both can be quite different. (Actually,
> three modes if you count the lone media element case as a separate mode.)
> This then also spills into the API, where the master is exposing both the
> network state of its own media, as well as the overall state of playback.
> We end up having to handle all kinds of special cases, such as what
> happens when the master track is shorter than a slaved track, or what
> happens when the master track is paused vs when a slaved track is paused.
> It's not impossible to do, but it is significantly more messy than simply
> having a distinct "master" object and having all the media elements only
> deal with one "mode" (two if a lone media element counts as separate),
> namely the "slave" mode. Any asymmetry is reflected as differences between
> the controller and the media element. Each media element only has to deal
> with its own networking state, etc.
>
> For an example of why this matters, consider the use case of a movie site
> with the option of playing movies with a director's commentary track. Some
> director's commentaries are shorter than the movie (most, in fact, since
> many directors stop commenting on the movie when the credits start!). Some
> are longer (e.g. some Futurama commentaries, where the commentary is
> basically a bunch of the cast and crew chatting away for a while and
> sometimes they don't really care that the show is over, they still have
> stuff to talk about). If we have to make a media element into the master,
> then how do we handle this case without the site having to determine ahead
> of time which is the longer track to decide which to use as the master?
>
> With just a controller, it doesn't matter.
>
> (Admittedly, as currently specified the controller object lacks a defined
> "current time" and min and max times, and a Web app would have to
> determine what the seek bar should display by looking at all the tracks.

Independent of the solution that we choose, we have to define what the
common timeline is for the combined resource. I think we should
probably go with the mental model of what it would be if it were
really all encapsulated in a single resource. Thus, if a slave
resource is longer than the main resource, it actually changes the
duration of the combined resource. Thus, we really should have a model
for that duration. Shorter is easier to deal with since you can just
pretend it is a transparent video or silent audio where it lacks
duration.
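
As a rough sketch of that mental model (all names here are hypothetical
and not taken from any spec or proposal; the per-track `offset` is an
assumption borrowed from the offsets discussed earlier in this thread),
the combined duration could be treated as the latest end point of any
track, with shorter tracks simply having no media past their own end:

```typescript
// Hypothetical shape for one synchronised track; not part of any spec.
interface SyncedTrack {
  label: string;
  duration: number; // seconds of media in this track
  offset: number;   // seconds into the combined timeline at which the track starts
}

// Duration of the combined resource: the latest end point of any track.
// A slave track that runs longer than the main track extends the result.
function combinedDuration(tracks: SyncedTrack[]): number {
  return tracks.reduce((end, t) => Math.max(end, t.offset + t.duration), 0);
}

// True when the track has media at combined time `t`. When false, a player
// could render the track as silent audio or a transparent video frame.
function hasMediaAt(track: SyncedTrack, t: number): boolean {
  return t >= track.offset && t < track.offset + track.duration;
}
```

With a 1320-second movie and a 1500-second commentary both starting at
zero, this model gives a combined duration of 1500 seconds, and the movie
track is treated as empty for the last 180 seconds.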

Also, independent of the model, we have to have a common understanding
of the currentTime and thus a combined transport bar. By default it
makes sense to display that combined transport bar so the user has a
means to interact with the multitrack resource.
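
To illustrate what a shared currentTime might mean for a transport bar,
here is a minimal sketch. The names are hypothetical, and the assumption
that each track has a fixed start offset on the combined timeline is mine,
not something either proposal defines. It maps a combined time to each
track's local time and to a position on the seek bar:

```typescript
// Hypothetical per-track timing info; not part of any spec.
interface TimedTrack {
  duration: number; // seconds of media in this track
  offset: number;   // seconds into the combined timeline at which the track starts
}

// Local playback position of a track for a given combined time, or null
// when the combined time falls outside the media the track actually has.
function trackLocalTime(track: TimedTrack, combinedTime: number): number | null {
  const local = combinedTime - track.offset;
  return local >= 0 && local < track.duration ? local : null;
}

// Position of the seek-bar thumb as a fraction in [0, 1].
function seekFraction(combinedTime: number, combinedDuration: number): number {
  if (combinedDuration <= 0) return 0;
  return Math.min(1, Math.max(0, combinedTime / combinedDuration));
}
```

A single transport bar would then drive every track from one combined
time, instead of the page inspecting each media element separately.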

> But we can fix that in a later version. It's much harder to fix in the
> case of one media element being promoted to a different state than the
> others, since we already have defined what the media element API does.)

One thing that I would really like to see is a common menu for turning
tracks on and off. This is particularly important if you have audio
description tracks, so a blind user can immediately find out if such a
track is available and activate it.

Cheers,
Silvia.