[whatwg] media elements: Relative seeking

Ian Hickson ian at hixie.ch
Mon Dec 1 02:28:57 PST 2008

On Sun, 23 Nov 2008, Maik Merten wrote:
> currently seeking in the media elements is done by manipulating the 
> currentTime attribute. This expects an absolute time offset in seconds. 
> This works fine as long as the duration (in absolute time) of the media 
> file is known and doesn't work at all in other cases.

For non-infinite streams, the duration is defined to be known. That is, 
the user agent is required to find the duration and set the "duration" 
attribute accordingly.

> Some media formats don't store the duration of the media file anywhere. 
> A client can only determine the duration of the media file by 
> byte-seeking near the end of the file and finding a timestamp near/at 
> the end. This isn't a problem whatsoever on local files, but in remote 
> setups this puts additional load on the server and the connection. If 
> one would like to avoid this, meaning no duration is known, seeking in 
> absolute time cannot work.

One is not allowed to avoid this per spec today, short of using a codec 
that doesn't have this problem.

> While getting the absolute duration is often a problem, retrieving the 
> length of the media file is no problem. I propose seeking with 
> relative positions, e.g. values between zero and one. This way the 
> client can determine whether to seek in absolute time (if the duration 
> is known) or to just jump to a position in the bytestream (if the 
> length in bytes is known).

You can jump to a position that's a fraction of the whole clip by setting 
'currentTime' to a fractional multiple of 'duration'.
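
As a minimal sketch of that (the seekToFraction name and the SeekableMedia 
type are made up for illustration, and it assumes the timeline starts at 
zero and that 'duration' is NaN or +Infinity when no finite duration is 
available):

```typescript
// Minimal structural type: only the two media-element attributes used here.
interface SeekableMedia {
  currentTime: number;       // playback position, in seconds
  readonly duration: number; // NaN if unknown, +Infinity for unbounded streams
}

// Seek to a proportional position (0 = start, 1 = end).
// Returns false, without seeking, when the duration is not a finite number.
function seekToFraction(media: SeekableMedia, fraction: number): boolean {
  if (!isFinite(media.duration)) {
    return false;
  }
  // Clamp so that out-of-range input cannot seek before the start or past the end.
  const clamped = Math.min(1, Math.max(0, fraction));
  media.currentTime = clamped * media.duration;
  return true;
}
```

A video element could be passed in directly, since it exposes both 
attributes.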

>  - make currentTime readonly, still have it report playback position in 
> absolute time. This information should be available in all media formats 
> due to timestamps in the stream.
>  - introduce a seek() method, taking a relative value ranging from zero 
> to one. This allows both accurate seeking if the duration is known and 
> less precise seeking otherwise if only the length of the file is known 
> in storage units. This is still way better than not being able to seek 
> at all.

Why should currentTime be readonly?

>  - make duration report either the duration in absolute time (if known) 
> or the length of the file in storage units. This enables computation of 
> a relative playback position even when no duration is known, if the byte 
> position of the stream is known (low precision fallback - still better 
> than nothing at all).

The duration in absolute time is required to be determined before the 

>  - introduce a readonly storagePosition attribute. Meant to compute a 
> relative position if the duration is only known in storage units.

We just got rid of bufferingBytes and totalBytes because browser vendors 
want to return times only, not data based on bytes.

On Sun, 23 Nov 2008, Eric Carlson wrote:
> Reporting the absolute time of the current sample won't help when the 
> first sample of the file doesn't have a timestamp of zero. It will be 
> even more confusing for files with portions removed or added without 
> fixing time stamps - for example a movie created by concatenating 
> different files.

The spec is written with the assumption that all the offsets of the 
resource have unique timestamps, and that the timestamps are in monotonic 
increasing order without discontinuous jumps, but it doesn't assume that 
the initial position is zero.

Should I put a requirement in the spec to the effect that discontinuous or 
duplicate timelines must be normalised in some way?

> As I noted when this subject came up a few weeks ago, the right way to 
> deal with media formats that don't store duration is to estimate the 
> duration of the whole file by extrapolating from the known, exact, 
> duration of the portion(s) that have been processed. While the initial 
> estimate won't always be correct for variable bit-rate formats, the 
> estimate will become more and more accurate as it is iteratively refined 
> by processing more media data. The spec defines the "durationchange" 
> event for exactly this scenario.
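
The extrapolation described in the quote above could look roughly like 
this. It is only a sketch: the function and parameter names are 
hypothetical, and it assumes the bit rate seen so far is representative of 
the rest of the file, which is exactly why the estimate drifts for VBR 
content.

```typescript
// Extrapolate a total duration from the portion of the file processed so far.
// All names here are hypothetical; none of them are part of any API.
function estimateDuration(
  processedSeconds: number, // exact duration of the media processed so far
  processedBytes: number,   // bytes consumed to process that portion
  totalBytes: number        // total length of the resource, in bytes
): number {
  if (processedBytes <= 0 || processedSeconds <= 0) {
    return NaN; // nothing processed yet, so there is no estimate
  }
  // Scale the known duration by the ratio of total to processed bytes.
  return processedSeconds * (totalBytes / processedBytes);
}
```

Calling it again as more data arrives gives the iterative refinement, with 
each changed value being the point at which a UA would announce the new 
duration.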


On Mon, 24 Nov 2008, Silvia Pfeiffer wrote:
> I don't see addition of a duration attribute as much of a problem. We 
> have width and height for images, and sizes for fonts, too, and web 
> developers have learnt how to deal with these in various entities (px, 
> em, pt). I would not have a problem giving web developers the 
> opportunity to report the real duration of a video in an attribute in 
> either bytes or seconds (might be better called: length), which would 
> allow a renderer to display an accurate timeline. It is a help for a 
> display mechanism just as width and height are.
> In case of contradiction between the attribute and the actual decoded 
> length, a renderer can still override the length attribute at the time 
> the real length is known. In case of contradiction between the attribute 
> and the estimated length of a video, the renderer should make a call 
> based on the probability of the estimate being correct.
> It may in fact be rather confusing to users if a video player keeps 
> changing the duration of a video file as it plays it back.
> I think such an attribute can help get this right earlier.

If we had kept pixelratio="", duration="" might make some sense (maybe, 
though I am still a little skeptical). But with the removal of 
pixelratio="", there's really no parallel for duration="". It seems to me 
that the UAs can do a fine job estimating the duration in the cases where 
it's necessary.

Frankly, I'd rather the video data just included the duration. It 
seems a bit weird for a video resource to not know its own length. It 
would be a bit like an image file not self-describing its dimensions, and 
just having the decoder work it out based on some markers at the end of 
each line of pixels.

On Sun, 23 Nov 2008, Eric Carlson wrote:
> In the case of a file with video or VBR audio the true duration 
> isn't actually known until *every* frame has been examined.
> When would you have the UA decide to switch from the attribute to 
> the real duration? What would you have the UA do if the user seeks to 
> time 90 seconds when attribute says a file is 100 seconds long, but the 
> file actually has a duration of 80?

What would the UA do to know it has a duration of 80s? Would it have to 
read the whole file?

If so, why can't the UA just seek to the end (and read every frame), get 
the duration, and just be done with it?

On Tue, 25 Nov 2008, Silvia Pfeiffer wrote:
> The browser would need to have its own estimate of reliability of the 
> length attribute. If there is a high probability that the length 
> attribute is wrong, e.g. we have already played beyond the given length, 
> or we have obviously too much data to decode for the given length, it 
> would ignore the length attribute. I would however hope that web 
> developers generally test their pages and the length attribute to make 
> sure the display is correct. If that data is provided through a CMS, 
> then the information should be correct anyway.

My experience with content on the Web suggests you are optimistic. :-)

On Mon, 24 Nov 2008, Dave Singer wrote:
> I don't think you mean 'relative' here, which I would take to be "go 
> forward 10 seconds", but 'proportional', "please go to 60% of the way 
> through".
> IF we are to do this, I would have thought it would be by adding units 
> to the "where to seek to" argument:
> * go to this time in NPT (normal play time, which runs from 0 to media
> duration)
> * go to this SMPTE time code
> * go by this relative distance in NPT times
> * go to this proportional time
> * go to this proportional byte distance
> * go by this relative byte distance

Woah. Not in v1.

> Note that proportional distances are not well defined for some streams 
> (e.g. indefinite ones).
> We'd have to define what bytes are counted and what aren't, especially 
> if a URL offers a set of sub-streams only some of which a client would 
> normally choose to have sent to it for playing.

These are other good reasons not to have a byte-based seeking mechanism.

On Tue, 25 Nov 2008, Silvia Pfeiffer wrote:
> Live streams are somewhat bad to deal with anyway, because a timeline is 
> badly defined on such. All you could really do is show the past and have 
> a "continues" pointer at the end. Most live streams (e.g. the recent 
> YouTube live concert) simply don't show a timeline and don't allow 
> people to jump around in the presentation.

The API defined in the HTML5 spec seems to support such streams fine. It 
even allows the UA to start dropping data at the start of the buffer. 
Effectively, it works like a TiVo.
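
For illustration, a player script for such a stream might clamp a 
requested position to whatever the UA still has available. This is a 
sketch only: the TimeRange type and clampToSeekable function are 
hypothetical, with a plain array standing in for the time ranges a media 
element reports as seekable.

```typescript
// One contiguous span of the timeline that can still be seeked into.
interface TimeRange {
  start: number; // seconds
  end: number;   // seconds
}

// Return the seekable position nearest to `target`, or null if nothing
// is seekable at all (for example, before any data has arrived).
function clampToSeekable(target: number, ranges: TimeRange[]): number | null {
  let best: number | null = null;
  for (const r of ranges) {
    // Nearest point to the target inside this range.
    const candidate = Math.min(r.end, Math.max(r.start, target));
    if (best === null || Math.abs(candidate - target) < Math.abs(best - target)) {
      best = candidate;
    }
  }
  return best;
}
```

If the UA has dropped everything before the 30 second mark, a request for 
5 seconds lands at 30, much as rewinding a TiVo stops at the start of its 
buffer.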

On Tue, 25 Nov 2008, Maik Merten wrote:
> This applet does not seek to the end of the stream to retrieve a 
> timestamp there.

It should. :-)

On Wed, 26 Nov 2008, Chris Double wrote:
> I won't be estimating the duration - the user experience of a 
> fluctuating duration is terrible. For now for Ogg files, I'm seeking to 
> the end and getting the duration. I may check for X-Content-Duration 
> which I believe mod_annodex and soon oggz-chop support.


> For the few servers that don't support seeking, duration is not 
> available.

Note that that is non-conforming at the moment. You have to have a 
duration available (though it can be +Infinity if you think that the 
resource is a stream, and can be an estimate, so long as you keep updating 
it as your information gets better.)
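
On the page side, a timeline widget then has to cope with all three cases 
(unknown, +Infinity, and a finite value that may later change). A rough 
sketch, with a hypothetical progressFraction helper that a script might 
call again each time "durationchange" fires:

```typescript
// Fraction of the clip played so far, for drawing a progress bar.
// Returns null when no meaningful fraction exists: the duration is NaN
// (not yet known), +Infinity (unbounded stream), or not positive.
function progressFraction(currentTime: number, duration: number): number | null {
  if (!isFinite(duration) || duration <= 0) {
    return null;
  }
  // Clamp, since an estimated duration can briefly be shorter than currentTime.
  return Math.min(1, Math.max(0, currentTime / duration));
}
```

A null result would be the cue to hide the progress bar rather than draw a 
misleading one.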

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
