[whatwg] <video>/<audio> feedback

Fri May 8 09:25:43 PDT 2009

At 23:46  +1000 8/05/09, Silvia Pfeiffer wrote:
>On Fri, May 8, 2009 at 9:43 AM, David Singer <singer at apple.com> wrote:
>>  At 8:45  +1000 8/05/09, Silvia Pfeiffer wrote:
>>>
>>>  On Fri, May 8, 2009 at 5:04 AM, David Singer <singer at apple.com> wrote:
>>>>
>>>>   At 8:39  +0200 5/05/09, Křištof Želechovski wrote:
>>>>>
>>>>>   If the author wants to show only a sample of a resource and not the
>>>>>  full
>>>>>   resource, I think she does it on purpose.  It is not clear why it is
>>>>>  vital
>>>>>   for the viewer to have an _obvious_ way to view the whole resource
>>>>>   instead;
>>>>>   if it were the case, the author would provide for this.
>>>>>   IMHO,
>>>>>   Chris
>>>>
>>>>   It depends critically on what you think the semantics of the fragment
>>>>  are.
>>>>   In HTML (the best analogy I can think of), the web page is not trimmed
>>>>  or
>>>>   edited in any way -- you are merely directed to one section of it.
>>>
>>>  There are critical differences between HTML and video, such that this
>>>  analogy has never worked well.
>>
>>  could you elaborate?
>
>At the risk of repeating myself ...
>
>HTML is text and therefore whether you download a snippet only or the
>full page and then do an offset does not make much of a difference.
>Even for a long page.

You might try loading, say, the one-page version 
of the HTML5 spec from the WHATWG site... it 
takes quite a while.  Happily Ian also provides a 
multi-page version, but this is not always the case.

>
>In contrast, downloading a snippet of video compared to the full video
>will make a huge difference, in particular for long-form video.

There are short and long pages, and short and long videos.

But we're talking about a point of principle 
here, which should be informed by practical 
concerns, for sure, but not dominated by them.

The reason I want clarity is that this has 
ramifications.  For example, if a UA is asked to 
play a video with a fragment indication 
#time="10s-20s", and then a script seeks to 5s, 
does the user see the video at the 5s point of 
the total resource, or 15s?  I think it has to be 
5s.
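
To make the two readings concrete, here is a rough sketch, not a 
proposal for any actual API.  The "10s-20s" format is just the one 
from my example above, and the function names are made up for 
illustration.

```python
import re


def parse_time_fragment(value):
    """Parse a '10s-20s' style range into (start, end) in seconds.

    The format is only the one used in the example; it is an
    assumption, not a defined fragment syntax.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)s-(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time fragment: {value!r}")
    start, end = float(match.group(1)), float(match.group(2))
    if end < start:
        raise ValueError("fragment end precedes its start")
    return start, end


def seek_in_resource_timeline(fragment, target):
    """Reading 1: the fragment only says where to begin; a script seek
    is expressed on the timeline of the whole resource."""
    return target


def seek_in_fragment_timeline(fragment, target):
    """Reading 2: the fragment defines a new, trimmed timeline, so a
    script seek is offset from the fragment start."""
    start, _end = fragment
    return start + target


fragment = parse_time_fragment("10s-20s")
print(seek_in_resource_timeline(fragment, 5.0))  # 5.0
print(seek_in_fragment_timeline(fragment, 5.0))  # 15.0
```

Under the first reading the fragment never changes what a time value 
means, which is why I think it is the right one.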

>
>So, the difference is that in HTML the user agent will always have the
>context available within its download buffer, while for video this may
>not be the case.

I'm sorry, I am lost.  We could quite easily 
extend HTTP to allow for anchor-based retrieval 
of HTML (i.e. convert a 'please start at anchor 
X' into a pair of byte-range responses, for the 
global material, and then the document from that 
anchor onwards).
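
As a rough illustration of what I mean (no such HTTP extension 
exists; this only shows the mapping a server could do), here is a 
naive sketch.  It assumes the "global material" is everything up to 
and including the opening body tag, and that the anchor is written as 
id="..." with double quotes.  The function name is hypothetical.

```python
def ranges_for_anchor(document: bytes, anchor_id: str):
    """Map 'start at anchor X' onto two inclusive byte ranges:
    the global material, then the document from the anchor onwards.

    Naive byte search, for illustration only; a real implementation
    would need an HTML parser.
    """
    body_open = document.find(b"<body")
    if body_open == -1:
        raise ValueError("no <body> tag found")
    # Last byte of the opening <body ...> tag (inclusive).
    global_end = document.index(b">", body_open)

    marker = b'id="' + anchor_id.encode("utf-8") + b'"'
    attr_pos = document.find(marker, global_end)
    if attr_pos == -1:
        raise ValueError(f"anchor {anchor_id!r} not found")
    # Back up to the '<' that opens the element carrying the id.
    element_start = document.rfind(b"<", 0, attr_pos)

    return (0, global_end), (element_start, len(document) - 1)


def range_header(document: bytes, anchor_id: str) -> str:
    """Express the same pair as a multi-range Range header value."""
    (first_start, first_end), (second_start, _) = ranges_for_anchor(
        document, anchor_id
    )
    return f"bytes={first_start}-{first_end},{second_start}-"


page = (
    b"<html><head><title>t</title></head><body>"
    b'<p>intro</p><h2 id="x">X</h2><p>rest</p></body></html>'
)
print(range_header(page, "x"))
```

The point is only that the HTML and video cases are not different in 
kind here: in both, "start here" can be turned into ranges.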

>
>This admittedly technical difference also has an influence on the user
>interface.
>
>If you have all the context available in the user agent, it is easy to
>just grab a scroll-bar and jump around in the full content manually to
>look for things. This is not possible in the video case without many
>further download actions, which will each incur a network delay. This
>difference opens the door to enable user agents with a choice in
>display to either provide the full context, or just the fragment
>focus.

But we can optimize for the fragment without disallowing the seeking.

-- 
David Singer
Multimedia Standards, Apple Inc.