[whatwg] What is not possible with HTML5
marques at displague.com
Thu Jun 24 07:03:09 PDT 2010
I would like to see a standardization of HTTP streaming, not
necessarily "adaptive streaming" - but that could also be useful.
The HTTP spec is vague or incomplete on behaviors related to partial requests.
To implement a streaming audio / video site where the user can only
receive data if their account permits it (based on credits, or some
sort of pay-as-you-play service) it is necessary to control the
throughput of the media content.
Some streaming systems use non-HTTP protocols to handle this but the
ones that use HTTP end up splitting the original content into various
segments and then use scripts or redirects to send the user to the
next piece of the media.
I attempted to implement a streaming video service by using the uncut
video files that the browser could natively play. I used only HTTP
206 (partial data) responses to limit the user. Here are some of the
problems I ran into:
1. There is no way for a server to tell a client that Ranges must be
specified if the client did not include a Range header in its initial
request.
a. A server can ignore this and give the client a 206 response
anyway. Chrome handles this. It will play the content it was given
and then continue to make Range requests to pull additional media.
Other browsers may already handle this the way I would like but if so
they fail to play beyond the content delivered in the initial response
due to '2'.
b. A server can send a 307 redirect that provides an Accept-Ranges
header, and hope that the client will use a range on their next
request (This was recently added to Firefox, seems to work in Chrome,
Opera, Safari - but my test fails overall due to '2'). Text-based
clients (curl, wget, links) do not handle this.
c. There are headers that sound like they could be used for this, but
they can't because there is no definition saying how browsers should
deal with them.
Status 416: Range not satisfiable - by spec, this can only be sent if
the browser requested a range, so the server can't use it to tell the
browser to use ranges
Status 411: Length required - the description sounds right, but
this is used to force the client to send Content-Length for requests
Status 402: Payment required - this fits the bill, but there is no
definition of what the follow-up behavior should be for browsers
d. Perhaps an additional 4xx Status should be defined for "Range
required". With this, the browser behavior of following up with a
partial GET should be spelled out.
2. There is no spec assertion that browsers should accept a Range from
a server that is smaller than the range the browser requested.
a. When browsers begin making Range requests they tend to reach for
the stars: "Range: bytes=0-". Since I want to control the content I
limit the browser to X many bytes in my response: "Content-Range:
bytes 0-100000/9999999". Some browsers (Firefox, Opera) will play
this data but will not progress to the next block unless the user
intervenes by seeking. Opera's Ogg handler (but not webm or mp4) and
Chrome will continue to fetch data as it is needed to play the
3. Some browsers (Opera at least, but certainly not Firefox or
Chrome) ignore Pragma: no-cache on 206 responses for video content,
which allows a user to seek to an already played section of a video
without re-requesting the content from the server. The server is
trying to protect this content (and to restrict how much a viewer can
watch), so (as the HTTP spec regarding 206 already seems to
dictate) this data should not be cached.
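To make the server side of '1a', '2a' and '3' concrete, here is a
minimal sketch in Python of how the response headers could be computed.
It is not the code I ran: the function name, the MAX_CHUNK value and the
choice to also send Cache-Control: no-store are assumptions for
illustration. It only understands the simple "bytes=START-" and
"bytes=START-END" forms, and it answers with a 206 even when no Range
was sent, which is the workaround from '1a' rather than something the
spec invites.

```python
import re

# Assumed cap, matching the "bytes 0-100000/9999999" example: 100001 bytes.
MAX_CHUNK = 100001


def capped_partial_response(range_header, total_size, max_chunk=MAX_CHUNK):
    """Return (status, headers, start, end) for a throttled partial response.

    Hypothetical helper. Range forms this sketch does not understand
    (suffix ranges, multiple ranges) are treated as if no Range was sent.
    """
    start, end = 0, total_size - 1
    if range_header:
        match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
        if match:
            start = int(match.group(1))
            if match.group(2):
                end = min(int(match.group(2)), total_size - 1)

    if start >= total_size or start > end:
        # Nothing can be served for this range.
        return 416, {"Content-Range": f"bytes */{total_size}"}, None, None

    # Never hand out more than max_chunk bytes, whatever was asked for.
    end = min(end, start + max_chunk - 1)
    headers = {
        "Accept-Ranges": "bytes",
        "Content-Range": f"bytes {start}-{end}/{total_size}",
        "Content-Length": str(end - start + 1),
        # Ask clients not to keep the protected bytes around.
        "Cache-Control": "no-store",
        "Pragma": "no-cache",
    }
    return 206, headers, start, end
```

With this, a request carrying "Range: bytes=0-" for a 9999999 byte file
gets back "Content-Range: bytes 0-100000/9999999", and the client has to
come back for the rest.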
I think these problems can be addressed by:
1. Having all browser requests for <video> and <audio> include a Range
header - since this is the only invitation the server has to
respond with a 206 (or otherwise making it permissible for a server to
respond with a 206 to a request that had no ranges).
2. Defining a spec behavior that partial responses should be accepted
when they are smaller than what was requested and should be followed
up with additional partial gets until the media has been acquired
based on the user's intention (to continuously play a video in this case).
3. Opera should address partial video caching
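
The client behavior proposed in '2' could look roughly like the
following Python sketch. It is only an illustration of the idea, not how
any browser implements it: fetch_range is a hypothetical callable
standing in for the HTTP request, and the loop pulls the whole resource
in one go, whereas a real player would only ask for the next block as
playback needs it.

```python
import re


def parse_content_range(value):
    """Parse 'bytes START-END/TOTAL' into a tuple of three ints."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    start, end, total = (int(group) for group in match.groups())
    return start, end, total


def fetch_all(fetch_range):
    """Keep issuing partial GETs until the whole resource has arrived.

    fetch_range(range_header) is a hypothetical callable that performs the
    request and returns (content_range_header, body_bytes).
    """
    received = bytearray()
    position = 0
    total = None
    while total is None or position < total:
        # Ask for everything from the current position onwards ...
        content_range, body = fetch_range(f"bytes={position}-")
        start, end, total = parse_content_range(content_range)
        # ... but accept a shorter range than requested.
        if start != position or len(body) != end - start + 1:
            raise ValueError("server returned an unexpected range")
        received += body
        position = end + 1
    return bytes(received)
```

The point is the loop condition: a response that is smaller than what
was requested is not an error, it just means another request follows
from where the last one ended.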
See https://bugzilla.mozilla.org/show_bug.cgi?id=570755 for additional
> I think what you call "multi-format video" is being implemented as
> HTTP adaptive streaming, where you have multiple different
> bandwidth-versions of the same media resource on the server and they
> have synchronisation points (usually the keyframes of the video) at
> which the user agent can switch over from one resource to another
> using byte range requests depending on the user agent's situation.
> I still wonder whether we should standardise HTTP adaptive streaming
> across video formats here or somewhere else. At this moment, several
> different solutions exist for MPEG video and none for Ogg or WebM.
> On Tue, Jun 8, 2010 at 1:18 AM, Bruce de Graaf <WeBMartians at verizon.net> wrote:
> > 3D Video and Real-time Multi-format Video! Ooops... My "bad." It doesn't
> > seem feasible even outside of HTML!
> > Apologies for being facetious and trying to be a "wag," but that latter
> > point (multi-format video) is a good candidate for attention ... and that is
> > what the query (What is not possible with HTML5) is about, right? "To what
> > have we failed to pay due consideration?"
> > A Statement of the Problem (apologies aforehand for offensive terminology):
> > Real-time, multi-format video (there must be a better term for this) is the
> > acquisition of audio and video in one form (say, 1920 x 1080 with 7.1 sound)
> > and its propagation in native as well as other formats (say, 160 x 112 with
> > monaural sound). This is the "Fat Man's Trousers" problem - How do you stuff
> > that belly into such small pants? [and THAT is the most polite of the
> > various titles] Right now, with either H.264 or with Ogg, no matter which,
> > this task is Sisyphean. Simply put, it is the difficulty in cramming 8+ Mb/s
> > through a 300 kb/s pipe (to a display that cannot present, anyway, the high
> > resolution image) and then randomly "adjusting" the bandwidth. Think of
> > being in a car, watching an iPad (as a passenger, we hope and pray) and
> > trying to watch a video as the feed wobbles from several megabits to mere
> > kilobits per second and back again. ...and then being foolish enough to
> > recommend the video to an iPhone user...
> > Just recode ... on-the-fly? Oh, you have not begun to address the problems!
> > You see, when your feed collapses, your clients lose compression
> > synchronization: compressed audio and video that depend on a particular
> > client state must be halted until some form of re-attunement occurs ...
> > possibly several seconds (hopefully not minutes). ...not good in a
> > commercial environment (somebody's not getting that for which he/she is
> > paying...).
> > There have been related discussions (focused on reporting actual bandwidth)
> > on this forum about the challenges. Some have said, with good reason, that
> > it is outside of the HTML5 envelope. Yet, the current need is so great and
> > that need's growth is of such magnitude (it could reach, quickly, the major
> > portion of all traffic) as to warrant some cycles in its discussion.
> > ...and, not to complicate and confuse matters, deserving of attention is an
> > analogous problem with mere layout of HTML pages: view a commercial page
> > (say, Bloomberg or Der Spiegel) on a decently sized screen and then try to
> > view such pages on an iPhone or Droid ... not a pleasant experience. Whilst
> > not dynamic as the described media feed problem is, the difficulties are
> > analogous (again, even if "spatial rather than temporal"). If the
> > documentation could offer some guidance, it would be well received and yield
> > great returns.
> > What say ye?
> > ===
> > On 2010-06-07 10:01, Nikita Eelen wrote:
> > I believe video telephony is not possible due to security limitations in
> > the browser. Device access (i.e. access to something like a
> > microphone or camera) was in the spec but has not been acted upon yet.
> > However, I believe it is possible to work around this with individual
> > browser extensions (like Firefox extensions) or a native plugin (Flash
> > comes to mind, but it sucks for mobile devices in terms of battery
> > life), so I think for now all of those need to be written as either
> > A.) An application that acts as a plugin (ex: Flash)
> > B.) An extension to a browser (Firefox extension/active x control etc.)
> > C.) A native application that reads the camera/microphone for a given device
> > when the above is not possible (Ex: iPhone/iPad),
> > If this is possible in any other way, I would be very interested if
> > anyone could comment.
> > Thanks,
> > Nikita
> > On Mon, Jun 7, 2010 at 5:34 AM, narendra sisodiya
> > <narendra at narendrasisodiya.com> wrote:
> >> Could someone explore what is not possible with HTML5, in the spec and
> >> in implementations?
> >> For example, is video telephony in the browser possible? There is a
> >> draft DAP spec but no implementations, so it is not possible at this
> >> moment.