[whatwg] A standard for adaptive HTTP streaming for media resources
ian at hixie.ch
Thu Aug 19 18:08:25 PDT 2010
On Tue, 25 May 2010, Silvia Pfeiffer wrote:
> We've in the past talked about how there is a need to adapt the bitrate
> version of an audio or video resource that is being delivered to a user
> agent based on the available bandwidth on the network, the available CPU
> cycles, and possibly other conditions.
> It has been discussed to do this using @media queries and providing
> links to alternative versions of a media resource through the <source>
> element inside it. But this is a very inflexible solution, since the
> side conditions for choosing a bitrate version may change over time and
> what is good at the beginning of video playback may not be good 2
> minutes later (in particular if you're on a mobile device driving
> through town).
> Further, we have discussed the need for supporting a live streaming
> approach such as RTP/RTSP - but RTP/RTSP has its own "non-Web" issues
> that will make it difficult to make it part of a Web application
> framework - in particular it requires a custom server and won't just work
> with an HTTP server.
> In recent times, vendors have indeed started moving away from custom
> protocols and custom servers and have moved towards more intelligence in
> the UA and special approaches to streaming over HTTP.
> Microsoft developed "Smooth Streaming", Apple developed "HTTP Live
> Streaming" and Adobe recently launched "HTTP Dynamic Streaming". (Also
> see a comparison at). As these vendors are working on it for MPEG files,
> so are some people for Ogg. I'm not aware of anyone looking at it for
> WebM yet.
> Standards bodies haven't held back either. The 3GPP organisation have
> defined 3GPP adaptive HTTP Streaming (AHS) in their March 2010 release 9
> of 3GPP. Now, MPEG has started consolidating approaches for adaptive
> bitrate streaming over HTTP for MPEG file formats.
> Adaptive bitrate streaming over HTTP is the correct approach towards
> solving the double issues of adapting to dynamic bandwidth availability,
> and of providing a live streaming approach that is reliable.
> Right now, no standard exists that has been proven to work in a
> format-independent way. This is particularly an issue for HTML5, where
> we want at least support for MPEG4, Ogg Theora/Vorbis, and WebM.
> I know that it is not difficult to solve this issue in a
> format-independent way, which is why solutions are jumping up
> everywhere. They are, however, not compatible and create a messy
> environment where people have to install solutions for multiple
> different approaches to make sure they are covered for different
> platforms, different devices, and different formats. It's a clear
> situation where a new standard is necessary.
> The standard basically needs to provide three different things:
> * authoring of content in a specific way
> * description of the alternative files on the server and their
> features for the UA to download and use for switching
> * a means to easily switch mid-way between these alternative files
On Mon, 24 May 2010, Chris Holland wrote:
> I don't have something decent to offer for the first and last bullets
> but I'd like to throw-in something for the middle bullet:
> The http protocol is vastly under-utilized today when it comes to URIs
> and the various Accept* headers.
> Today developers might embed an image in a document as chris.png. Web
> daemons know to find that resource and serve it, in this sense,
> chris.png is a resource locator.
> Technically one might reference the image as a resource identifier named
> "chris". The user's browser may send "image/gif" as the only value of an
> accept header, signaling the following to the server: "I'm supposed to
> download an image of chris here, but I only support gif, so don't bother
> sending me a .png". In a perhaps more useful scenario the user agent may
> tell the server "don't bother sending me an image, I'm a screen reader,
> do you have anything my user could listen to?". In this sense, the
> document's author didn't have to code against or account for every
> possible "context" out there, the author merely puts a reference to a
> higher-level representation that should remain forward-compatible with
> evolving servers and user-agents.
> By passing a list of accepted mimetypes, the accept http header provides
> this ability to serve context-aware resources, which starts to feel like
> a contender for catering to your middle bullet.
> To that end, new mime-types could be defined to encapsulate media
> type/bit rate combinations.
> Or the accept header might remain confined to media types and acceptable
> bit rate information might get encapsulated into a new header, such as:
> X-Accept-Bitrate .
> If you combined the above approach with existing standards for http byte
> range requests, there may be a mechanism there to cater to your 3rd
> bullet as well: when network conditions deteriorate, the client could
> interrupt the current stream and issue a new request "where it left off"
> to the server. Although this likely wouldn't work, because a byte range
> request would mean nothing on files of two different sizes. For
> played-back media, time codes would be needed to define the range.
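To make the header idea concrete, here is a minimal sketch of how a server
might pick a representation from the Accept header plus a bitrate ceiling.
It assumes the proposed X-Accept-Bitrate header carries a single maximum
bitrate in kbit/s, which the proposal above does not specify. The Variant
type, the file names and choose_variant() are hypothetical, and q-values
are ignored for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One encoding of a resource (hypothetical structure)."""
    url: str
    mime_type: str
    bitrate_kbps: int


def parse_accept(header: str) -> List[str]:
    """Return the media types listed in an Accept header, ignoring q-values."""
    return [p.split(";")[0].strip().lower() for p in header.split(",") if p.strip()]


def type_matches(mime_type: str, pattern: str) -> bool:
    """Match a concrete type against an Accept entry such as video/* or */*."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return mime_type.startswith(pattern[:-1])
    return mime_type == pattern


def choose_variant(variants: Sequence[Variant], accept: str,
                   max_bitrate_kbps: int) -> Optional[Variant]:
    """Pick the highest-bitrate acceptable variant under the ceiling, if any.

    max_bitrate_kbps stands in for the value of the assumed
    X-Accept-Bitrate header.
    """
    accepted = parse_accept(accept)
    candidates = [
        v for v in variants
        if v.bitrate_kbps <= max_bitrate_kbps
        and any(type_matches(v.mime_type, a) for a in accepted)
    ]
    return max(candidates, key=lambda v: v.bitrate_kbps, default=None)


# Hypothetical variants behind the single resource identifier "chris".
VARIANTS = [
    Variant("chris_100.mp4", "video/mp4", 100),
    Variant("chris_500.mp4", "video/mp4", 500),
    Variant("chris_500.ogv", "video/ogg", 500),
]
```

With these variants, a request with "Accept: video/mp4" and a ceiling of
300 would be served chris_100.mp4. This only selects a file for a new
request; it says nothing about where in that file to resume.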
On Tue, 25 May 2010, Silvia Pfeiffer wrote:
> That's not quite sufficient, actually. You need to know which byte range
> to retrieve or which file segment. Apple solved it by introducing an m3u8
> file format, Microsoft by introducing a SMIL-based server manifest file,
> Adobe by introducing an XML-based Flash Media Manifest file F4M. That
> kind of complexity cannot easily be transferred through HTTP headers.
> The idea of the manifest file is to provide matching transition points
> between the different files of different bitrate to segments or byte
> ranges. This information has to somehow come to the UA (amongst other
> information as available in typical manifest files). I don't think that
> can be achieved without a manifest file.
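As a rough illustration of what such a manifest has to carry, the sketch
below maps shared transition points to byte offsets in each file, so that a
client can turn "switch at the next transition point" into an HTTP Range
request. The layout is invented for this example and is not the m3u8, SMIL
or F4M format; the file names, offsets and sizes are made-up values.

```python
from bisect import bisect_right
from typing import Optional

# Hypothetical manifest: transition points (seconds) shared by all files,
# and the byte offset at which each segment starts in each file.
MANIFEST = {
    "switch_points": [0.0, 2.0, 4.0, 6.0],
    "renditions": {
        "video_100.mp4": {
            "bitrate_kbps": 100,
            "offsets": [0, 25_000, 50_000, 75_000],
            "size": 100_000,
        },
        "video_500.mp4": {
            "bitrate_kbps": 500,
            "offsets": [0, 125_000, 250_000, 375_000],
            "size": 500_000,
        },
    },
}


def next_switch_index(manifest: dict, position_s: float) -> Optional[int]:
    """Index of the first transition point strictly after position_s."""
    points = manifest["switch_points"]
    i = bisect_right(points, position_s)
    return i if i < len(points) else None


def range_header(manifest: dict, url: str, segment_index: int) -> str:
    """Value of the Range header that fetches one segment of one file."""
    rendition = manifest["renditions"][url]
    offsets = rendition["offsets"]
    start = offsets[segment_index]
    if segment_index + 1 < len(offsets):
        end = offsets[segment_index + 1] - 1
    else:
        end = rendition["size"] - 1  # last segment runs to end of file
    return f"bytes={start}-{end}"
```

A client playing video_100.mp4 at 2.5 seconds that wants to move up would
look up the next transition point (index 2, at 4.0 seconds) and request
that segment from video_500.mp4. The same byte offset would point to
different times in the two files, which is why the per-file offsets are
needed.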
On Fri, 28 May 2010, Jeroen Wijering wrote:
> Indeed, one such key condition is the current dimensions of the video
> window. Tracking this condition allows user-agents to:
> *) Not waste bandwidth, e.g. by pushing a 720p video in a 320x180 video
> window.
> *) Respond to changes in the video display, e.g. when the video is
> switched to fullscreen playback.
> Providing the different media options using <source> elements might
> still work out fine, if there's a clearly defined API that covers all
> scenarios. A rough example:
> <source bitrate="100" height="120" src="video_100.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="160">
> <source bitrate="500" height="240" src="video_500.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="320">
> <source bitrate="900" height="540" src="video_900.mp4" type="video/mp4; codecs='avc1.42E01E, mp4a.40.2'; keyframe-interval='00:02'" width="720">
> This example would tell the user-agent that the three MP4 files have a
> keyframe-interval of 2 seconds - which of course raises the issue that
> fixed keyframe-intervals would be required.
> The user-agent can subsequently use e.g. the Media Fragments API to
> request chunks, switching between sources as the conditions change.
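The selection logic a user agent would need for the three sources above
could look roughly like the following sketch. It assumes the bitrate
attribute is in kbit/s (the example does not give a unit) and that the
user agent has some bandwidth estimate available. pick_source() and
next_switch_time() are hypothetical names, not part of any existing API.

```python
import math
from typing import Dict, List

# The three sources from the example markup; bitrate assumed to be kbit/s.
SOURCES: List[Dict] = [
    {"src": "video_100.mp4", "bitrate": 100, "width": 160, "height": 120},
    {"src": "video_500.mp4", "bitrate": 500, "width": 320, "height": 240},
    {"src": "video_900.mp4", "bitrate": 900, "width": 720, "height": 540},
]


def pick_source(sources: List[Dict], bandwidth_kbps: float,
                window_width: int, window_height: int) -> Dict:
    """Pick the cheapest source that fills the window within the bandwidth.

    If no affordable source is large enough, take the best affordable one;
    if none is affordable at all, fall back to the lowest bitrate.
    """
    ordered = sorted(sources, key=lambda s: s["bitrate"])
    affordable = [s for s in ordered if s["bitrate"] <= bandwidth_kbps]
    if not affordable:
        return ordered[0]
    for s in affordable:
        if s["width"] >= window_width and s["height"] >= window_height:
            return s
    return affordable[-1]


def next_switch_time(position_s: float, keyframe_interval_s: float = 2.0) -> float:
    """First keyframe time strictly after the current playback position."""
    return (math.floor(position_s / keyframe_interval_s) + 1) * keyframe_interval_s
```

In a 320x180 window with plenty of bandwidth this picks video_500.mp4
rather than video_900.mp4, and going fullscreen at 1280x720 moves it up to
video_900.mp4 at the next keyframe. It relies on the fixed 2-second
keyframe interval mentioned above; with variable intervals the switch
points would have to come from somewhere else.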
It seems to me that we are not lacking in solutions in this space -- it
would behoove us to try to leverage the existing solutions rather than
making up new ones. Have the above solutions been tried in browsers?
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.