[whatwg] <video>/<audio> feedback

David Singer singer at apple.com
Fri May 8 09:25:43 PDT 2009

At 23:46  +1000 8/05/09, Silvia Pfeiffer wrote:
>On Fri, May 8, 2009 at 9:43 AM, David Singer <singer at apple.com> wrote:
>>  At 8:45  +1000 8/05/09, Silvia Pfeiffer wrote:
>>>  On Fri, May 8, 2009 at 5:04 AM, David Singer <singer at apple.com> wrote:
>>>>   At 8:39  +0200 5/05/09, Křištof Želechovski wrote:
>>>>>   If the author wants to show only a sample of a resource and not the
>>>>>   full resource, I think she does it on purpose.  It is not clear why it
>>>>>   is vital for the viewer to have an _obvious_ way to view the whole
>>>>>   resource instead; if it were the case, the author would provide for this.
>>>>>   IMHO,
>>>>>   Chris
>>>>   It depends critically on what you think the semantics of the fragment
>>>>   are.  In HTML (the best analogy I can think of), the web page is not
>>>>   trimmed or edited in any way -- you are merely directed to one section
>>>>   of it.
>>>  There are critical differences between HTML and video, such that this
>>>  analogy has never worked well.
>>  could you elaborate?
>At the risk of repeating myself ...
>HTML is text and therefore whether you download a snippet only or the
>full page and then do an offset does not make much of a difference.
>Even for a long page.

You might try loading, say, the one-page version 
of the HTML5 spec from the WHATWG site: it 
takes quite a while.  Happily, Ian also provides a 
multi-page version, but that is not always the case.

>In contrast, downloading a snippet of video compared to the full video
>will make a huge difference, in particular for long-form video.

There are short and long pages, and short and long videos.

But we're talking about a point of principle 
here, which should be informed by practical 
considerations, for sure, but not dominated by them.

The reason I want clarity is that this has 
ramifications.  For example, if a UA is asked to 
play a video with a fragment indication 
#time="10s-20s", and then a script seeks to 5s, 
does the user see the video at the 5s point of 
the total resource, or 15s?  I think it has to be 
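
To make the two readings concrete, here is a minimal sketch. The
function name resolve_seek and its parameters are hypothetical and
come from no spec or UA; the clamping in the "trimmed clip" reading
is likewise an assumption. Times are in seconds.

```python
def resolve_seek(fragment_start, fragment_end, seek_to, relative_to_fragment):
    """Return the resulting position on the full resource's timeline, in seconds.

    Hypothetical helper: it only illustrates the two possible semantics.
    """
    if relative_to_fragment:
        # Reading 2: the fragment is a trimmed clip with its own zero point,
        # so a seek to 5s means 5s after the fragment start.
        position = fragment_start + seek_to
        # Assumption: a seek cannot leave the clip, so clamp to its bounds.
        return max(fragment_start, min(position, fragment_end))
    # Reading 1: the fragment only says where playback starts;
    # seeks still address the whole resource.
    return seek_to


# The example from the text: fragment 10s-20s, script seeks to 5s.
whole = resolve_seek(10.0, 20.0, 5.0, relative_to_fragment=False)
clip = resolve_seek(10.0, 20.0, 5.0, relative_to_fragment=True)
print(whole, clip)  # prints "5.0 15.0"
```

Under the first reading the user sees the 5s point of the total
resource; under the second, the 15s point.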

>So, the difference is that in HTML the user agent will always have the
>context available within its download buffer, while for video this may
>not be the case.

I'm sorry, I am lost.  We could quite easily 
extend HTTP to allow for anchor-based retrieval 
of HTML (i.e. convert a 'please start at anchor 
X' into a pair of byte-range responses, for the 
global material, and then the document from that 
anchor onwards).
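
A rough sketch of that mapping, assuming the "global material" is
everything up to and including </head> and that the anchor is an
element with a matching id attribute. The helper anchor_ranges is
hypothetical; no such HTTP extension exists today, and the only
existing mechanism used is the multi-range form of the Range header.

```python
def anchor_ranges(document: bytes, anchor_id: str) -> str:
    """Build a Range header value: the head, then everything from the anchor on.

    Hypothetical helper; a naive byte search stands in for real HTML parsing.
    """
    head_close = b"</head>"
    head_end = document.find(head_close)
    if head_end == -1:
        raise ValueError("no </head> found")
    head_end += len(head_close)  # offset of the first byte after the head

    # Assumption: the anchor is written exactly as id="..." with double quotes.
    marker = b'id="' + anchor_id.encode("utf-8") + b'"'
    attr_pos = document.find(marker, head_end)
    if attr_pos == -1:
        raise ValueError("anchor not found")

    # Back up to the '<' that opens the element carrying the id.
    anchor_start = document.rfind(b"<", 0, attr_pos)

    # HTTP byte ranges are inclusive; an open-ended range runs to the end.
    return f"bytes=0-{head_end - 1},{anchor_start}-"


page = (b"<html><head><title>t</title></head><body>"
        b'<p>intro</p><h2 id="x">X</h2></body></html>')
print(anchor_ranges(page, "x"))
```

The UA would then receive the head and the tail of the document, and
could fetch the skipped middle later if the user scrolls back.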

>This admittedly technical difference also has an influence on the user
>If you have all the context available in the user agent, it is easy to
>just grab a scroll-bar and jump around in the full content manually to
>look for things. This is not possible in the video case without many
>further download actions, which will each incur a network delay. This
>difference opens the door to enable user agents with a choice in
>display to either provide the full context, or just the fragment

But we can optimize for the fragment without disallowing the seeking.

David Singer
Multimedia Standards, Apple Inc.
