[whatwg] Proposal for a MediaSource API that allows sending media data to a HTMLMediaElement

Sun Aug 14 13:35:48 PDT 2011

Hi All,

Comments inline...

On Fri, Aug 12, 2011 at 1:01 PM, Aaron Colwell <acolwell at google.com> wrote:

> Hi Mark,
>
> comments inline...
>
> On Thu, Aug 11, 2011 at 9:46 AM, Mark Watson <watsonm at netflix.com> wrote:
>
> > I think it would be good if the API recognized the fact that the media
> > data may be coming from several different original files/streams (e.g.
> > different bitrates) as the player adapts to network or other conditions.
> >
>
> I agree. I intend to document this when I spec out the format of the byte
> stream that is passed into this API. Initially I'm focusing on WebM which
> requires this type of functionality if the Vorbis initialization data ever
> needs to change during playback. My intuition says that Ogg & MP4 will
> require similar solutions.
>
>
> >
> > The different files may have different initialization information (Info
> > and Tracks in WebM, Movie Box in mp4 etc.), which could be provided either
> > in the first append call for each stream or with a separate API call. But
> > subsequently you need to know which initialization information is
> > relevant for each appended block. An integer streamId in the append call
> > would be sufficient - the absolute value has no meaning - it would just
> > associate data from the same stream across calls.
> >
>
> Since I'm using WebM for the byte stream I don't need to add explicit
> streamIds to the API or data. StreamIDs are already in the byte stream. Ogg
> bitstream serial numbers, and MP4 track numbers should serve the same
> purpose.
>

A little background. I have taken what Aaron has written for the MediaChunk
API and I am currently trying to create an adaptive player that will switch
WebM video streams seamlessly. There is only one audio stream. All streams
are in separate files.

Even in the simple case of one video stream and one audio stream, the
problem I'm running into with the current API is that there is no way to
send the header info for the separate streams without re-muxing the separate
headers into a combined header. I can do this in JavaScript for WebM files
(provided the track numbers are different; otherwise I would also have to
rewrite the track numbers on every block in JavaScript), but I think it
would be easier on the person writing a player if they didn't have to worry
about that.

The easiest solution would be to add a stream id. That way the media engine
doesn't need to force the player or encoder to deal with track IDs that are
the same in different streams.
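
To make the stream id idea concrete, here is a rough sketch of what I have
in mind. It is only an illustration: StreamIdAppender, appendInit(), append()
and blockCount() are names I made up, not part of Aaron's API, and the header
bytes are placeholders. The point is that a track is identified by the pair
(stream id, track number), so two files that both use track 1 do not collide.

```typescript
// Hypothetical sketch: none of these names exist in the proposed API.
interface Block {
  trackNumber: number;   // track number as stored in the WebM block
  payload: Uint8Array;   // encoded frame data
}

class StreamIdAppender {
  // Header bytes (Info + Tracks for WebM) keyed by stream id.
  private headers: { [streamId: number]: Uint8Array } = {};
  // Number of blocks received per "streamId:trackNumber" pair.
  private counts: { [key: string]: number } = {};

  // Register the header for one stream. The id value carries no meaning;
  // it only ties later append() calls to this header.
  appendInit(streamId: number, header: Uint8Array): void {
    this.headers[streamId] = header;
  }

  // Append one block. The track is identified by stream id plus track
  // number, so two files that both use track 1 stay separate.
  append(streamId: number, block: Block): string {
    if (this.headers[streamId] === undefined) {
      throw new Error("no initialization data for stream " + streamId);
    }
    const key = streamId + ":" + block.trackNumber;
    this.counts[key] = (this.counts[key] || 0) + 1;
    return key;
  }

  // How many blocks have been appended for a given stream and track.
  blockCount(streamId: number, trackNumber: number): number {
    return this.counts[streamId + ":" + trackNumber] || 0;
  }
}

// Usage: video and audio come from separate files that both use track 1.
const appender = new StreamIdAppender();
appender.appendInit(0, new Uint8Array([0x1a, 0x45, 0xdf, 0xa3])); // placeholder video header
appender.appendInit(1, new Uint8Array([0x1a, 0x45, 0xdf, 0xa3])); // placeholder audio header
appender.append(0, { trackNumber: 1, payload: new Uint8Array(8) });
appender.append(1, { trackNumber: 1, payload: new Uint8Array(4) });
```

With something like this the player never has to rewrite track numbers; it
only has to hand out a distinct id per file.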

I think the next best solution is probably (b) from below. That way you
could send the header info for a video stream and the header info for
an audio stream to initialize the media engine. It is not a big deal, but
you would still have the restriction that different stream types cannot
have the same track number.
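
As a rough sketch of how a player might handle (b): keep each file's header
cached in JavaScript, push it ahead of the media data whenever the source
file for a stream type changes, and refuse a switch that would make audio
and video share a track number. SourceFile, HeaderCachingAppender and the
sink callback are assumptions of mine for illustration; the sink stands in
for whatever append call the API ends up with.

```typescript
// Hypothetical sketch: these types and names are illustrative only.
type Kind = "audio" | "video";

interface SourceFile {
  url: string;          // identifies the file the data came from
  kind: Kind;           // stream type carried by the file
  trackNumber: number;  // track number used by the file's blocks
  header: Uint8Array;   // cached Info + Tracks bytes, fetched once
}

class HeaderCachingAppender {
  // Stand-in for the real append call on the media element.
  private sink: (bytes: Uint8Array) => void;
  // File currently feeding each stream type.
  private active: { audio?: SourceFile; video?: SourceFile } = {};

  constructor(sink: (bytes: Uint8Array) => void) {
    this.sink = sink;
  }

  // Append media data, sending the cached header first if the source
  // file for this stream type has changed since the previous call.
  append(file: SourceFile, media: Uint8Array): void {
    const current = this.active[file.kind];
    if (current === undefined || current.url !== file.url) {
      const otherKind: Kind = file.kind === "audio" ? "video" : "audio";
      const other = this.active[otherKind];
      // Without stream ids, audio and video must not share a track number.
      if (other !== undefined && other.trackNumber === file.trackNumber) {
        throw new Error(
          "track number " + file.trackNumber + " is already used by " + otherKind
        );
      }
      this.sink(file.header);
      this.active[file.kind] = file;
    }
    this.sink(media);
  }
}
```

The header is re-sent on every switch, but it is only fetched over the
network once.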

>
> >
> > The alternatives are:
> > (a) to require that all streams have the same or compatible
> > initialization information or
> > (b) to pass the initialization information every time you change streams
> >
> > (a) has the disadvantage of constraining encoding, and making adding new
> > streams more dependent on the details of how the existing streams were
> > encoded/packaged
> > (b) is ok, except that it is nice for the player to know "this data is
> > from the same stream you were playing a while ago" - it can re-use some
> > previously established state - rather than every stream change being 'out
> > of the blue'.
> >
>
> I'm leaning toward (b) right now. Any time a change in stream parameters is
> needed new INFO & TRACKS elements will be appended before the media data
> from the new source. This is similar to how Ogg chaining works. I don't
> think we need unique IDs for marking this state. The media engine can look
> at the new codec config data and see if it matches anything it has seen
> before. If so then it can simply reuse whatever resources it sees fit.
> Another thing to note is that just because we append this data every time a
> stream switch occurs, it doesn't mean we have to transfer that data across
> the network each time. JavaScript can cache this data and simply append it
> when necessary.
>
>
> >
> > A separate comment is that practically we have found it very useful for
> > the media player to know the maximum resolution, frame rate and codec
> > level/profile that will be used, which may be different from the
> > resolution and codec/level/profile of the first stream.
> >
> >
> I agree that this info is useful, but it isn't clear to me that this API
> needs to support that. Existing APIs like canPlayType()
> <http://www.w3.org/TR/html5/video.html#dom-navigator-canplaytype> could
> be used to determine whether specific codec parameters are supported. Other
> DOM APIs could be used to determine max screen size. This could all be used
> to prune the candidate streams sent to the MediaSource API.
>
>
> Aaron
>
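
For what it's worth, the pruning Aaron describes could be done in the player
along these lines. This is only a sketch: Candidate and pruneCandidates() are
hypothetical names of mine, and the canPlayType argument is passed in as a
plain function so the logic does not depend on a particular element. It
relies only on canPlayType() returning an empty string for types the browser
cannot play.

```typescript
// Hypothetical sketch: Candidate and pruneCandidates are illustrative names.
interface Candidate {
  url: string;
  mimeType: string;  // e.g. 'video/webm; codecs="vp8, vorbis"'
  width: number;     // encoded width in pixels
  height: number;    // encoded height in pixels
}

// Keep only the streams the browser says it can play and that fit within
// the given maximum dimensions.
function pruneCandidates(
  candidates: Candidate[],
  canPlayType: (type: string) => string,
  maxWidth: number,
  maxHeight: number
): Candidate[] {
  return candidates.filter(function (c) {
    return (
      canPlayType(c.mimeType) !== "" &&
      c.width <= maxWidth &&
      c.height <= maxHeight
    );
  });
}
```

In a page, canPlayType would wrap the video element's canPlayType() method,
and the limits could come from screen.width and screen.height.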