[whatwg] Start position of media resources

Ian Hickson ian at hixie.ch
Wed Apr 29 18:04:40 PDT 2009

On Mon, 6 Apr 2009, Chris Double wrote:
> Ogg based media resources can start from a time position that is not 
> zero. Examples of files that do this are those generated by the program 
> oggz-chop. For example:
> http://ia331342.us.archive.org/2/items/night_of_the_living_dead/night_of_the_living_dead.ogv?t=0:20:00/0:20:50
> If this is played in VLC the start time of the video is 0:20:00. When 
> seeking the time requested for the seek must be between 0:20:00 and 
> 0:20:50. Does the HTML5 spec allow media resources that don't start from 
> 0?

Yes. A typical example of a media resource that doesn't start from 0 is a 
live TV stream being buffered DVR-style, where the start position keeps 
moving. Other examples are as you give above, where the timeline of the 
clip doesn't start at zero.

In all these cases, the "earliest possible position" is non-zero.

> I see in the spec mention:
> "Media elements have a current playback position, which must initially 
> be zero. The current position is a time."

There are two other requirements of relevance here:

For the case of a non-zero origin, the following is relevant:

# Once enough of the media data has been fetched to determine the duration 
# of the media resource, its dimensions, and other metadata
# [...]
# 1. Set the current playback position to the earliest possible position

For the DVR case, this is relevant also:

# When the earliest possible position changes, if the current playback 
# position is before the earliest possible position, the user agent must 
# seek to the earliest possible position.
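
As a rough illustration of how the two requirements above interact, here is a minimal sketch. It is a toy model, not browser code: the `PlaybackState` type and both function names are hypothetical, and positions are plain numbers in seconds.

```typescript
// Hypothetical model of the two quoted requirements; not a real browser API.
interface PlaybackState {
  earliestPossiblePosition: number; // seconds
  currentPosition: number;          // seconds
}

// Non-zero origin case: once metadata is known, playback starts at the
// earliest possible position rather than at zero.
function onMetadataLoaded(earliest: number): PlaybackState {
  return { earliestPossiblePosition: earliest, currentPosition: earliest };
}

// DVR case: when the earliest possible position changes, seek forward only
// if the current position has fallen behind it.
function onEarliestPositionChanged(
  state: PlaybackState,
  newEarliest: number
): PlaybackState {
  const current = Math.max(state.currentPosition, newEarliest);
  return { earliestPossiblePosition: newEarliest, currentPosition: current };
}
```

With the oggz-chop example, `onMetadataLoaded(1200)` would leave the current position at 0:20:00 rather than zero.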

> Is this valid per the spec?  If so, would we need an attribute on the 
> media object so the web page author can retrieve the start time of the 
> video (in the same way they can get the duration). They would need this 
> to be able to display progress bars/scrubbers to position the thumb 
> correctly based on the currentTime. Detecting the first frame or 
> metadata loaded events and getting the position at that point won't work 
> as some of the video may have been played by the time that event is 
> handled by user code.

Good point. I've added .startTime to expose this, and made timeupdate fire 
if the value changes.
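
For the scrubber use case, a page could compute the thumb position along these lines. This is only a sketch: the `MediaTimes` type and `thumbFraction` are hypothetical names, and it assumes that `duration` is the length of the resource, so that the timeline runs from `startTime` to `startTime + duration`.

```typescript
// Hypothetical snapshot of the values a page would read from the media element.
interface MediaTimes {
  startTime: number;   // earliest possible position, in seconds
  duration: number;    // length of the resource, in seconds (assumption)
  currentTime: number; // current playback position, in seconds
}

// Fraction (0..1) along the seek bar at which to draw the thumb.
function thumbFraction(t: MediaTimes): number {
  // Unknown (NaN), zero, or unbounded duration: no meaningful position.
  if (!(t.duration > 0) || !isFinite(t.duration)) return 0;
  const fraction = (t.currentTime - t.startTime) / t.duration;
  // Clamp so a stale currentTime never draws the thumb off the bar.
  return Math.min(1, Math.max(0, fraction));
}
```

For the clip in the original question (start 0:20:00, 50 seconds long), a current time of 0:20:25 would put the thumb halfway along the bar.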

On Mon, 6 Apr 2009, Eric Carlson wrote:
>   A media file with a non-zero initial time stamp is not unique to 
> oggz-chopped files (e.g. an MPEG stream initial PTS can have any value, 
> SMPTE time-codes do not necessarily start at zero, etc), but I disagree 
> that we need a new attribute to handle it.
>   Media time values are expressed in normal play time (NPT), the 
> absolute position relative to the beginning of the presentation. It is 
> the responsibility of the UA to map time zero of the element to the 
> starting time of the media resource, whatever it may be.

On Tue, 7 Apr 2009, Robert O'Callahan wrote:
> What if a script wants to present UI that contains the time from the 
> media resource?

On Tue, 7 Apr 2009, Philip Jägenstedt wrote:
> Indeed clarification is needed. In my opinion time 0 should correspond 
> to the beginning of the media resource, no matter what the timeline of 
> the container format says. This means that currentTime doesn't jump when 
> playback begins and never goes beyond duration.

> Taking Ogg as an example, there's no requirement that the granulepos start at
> zero nor does a non-zero granulepos imply any special semantics such as "the
> beginning of the file has been chopped off". A tool like oggz-chop might
> retain the original granulepos of each packet or it could just as well adjust
> the stream to start at granulepos 0. Neither is more correct than the other,
> so I'd prefer we not try to infer anything from it, especially since such
> low-level timing information might be hidden deep inside the platform media
> framework (if it normalizes the time like XiphQT presumably does).
> Perhaps we would like to have some way to express where a resource is
> positioned on a timeline external to the resource itself, but could SMIL do
> this perhaps?
> I suppose that a UA could parse the media fragment in the URL and 
> adjust the timeline accordingly, but I'm not sure if it's a great 
> idea...

On Tue, 7 Apr 2009, Conrad Parker wrote:
> For Ogg, the start time of the original file (prior to chopping) is 
> recorded in the skeleton headers by oggz-chop, so this info is 
> intrinsically in the media format itself.

On Tue, 7 Apr 2009, Silvia Pfeiffer wrote:
> I humbly disagree. If a media file explicitly knows at what time offset 
> it starts, the timeline needs to represent that, otherwise there are 
> contradictory user experiences.
> For example, take a video that is a subpart of a larger video and has 
> been delivered through a media fragment URI 
> (http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/). 
> When a user watches both, the fragment and the full resource, and both 
> start at 0, he/she will assume they are different resources, when in 
> fact one is just a fragment of the other. Worse still: if he/she tries 
> to send a URI with a link to a specific time offset in the video to a 
> friend, he/she will come up with diverging URIs based on whether he/she 
> watched the fragment or the full resource. Representing the wrong 
> timeline for a media fragment will only cause confusion and 
> inconsistencies.

On Tue, 7 Apr 2009, Philip Jägenstedt wrote:
> If the media resource really does explicitly define an offset then I 
> might agree that it makes sense to adjust the timeline.
> However, for plain Ogg or any other container format that just happens 
> to have a non-zero timestamp at the beginning of the file I think we 
> should definitely align them to zero. You can get such files easily by 
> cutting streams and it would be confusing if the timeline was relative 
> to the original file. As an example, in MPEG the PTS (Presentation Time 
> Stamp) can start at non-zero, be discontinuous and wrap around 0 so 
> normalization is necessary. I'm not sure if anyone disagrees, but it 
> would be a very bad idea to infer any semantics from the container time 
> stamps in the absence of some explicit mapping like Ogg Skeleton.
> Not generally requiring low-level inspection of the container format 
> time stamps is important to us as that would virtually require the 
> UA itself to demux and inspect the time stamps of different container 
> formats. If a platform media framework is used, time is normalized in 
> some way, likely differently depending on framework and plugins used.

On Tue, 7 Apr 2009, Ralph Giles wrote:
> There is a ui problem here, in that the 'seek bar' control typically 
> displayed by web video players has finite resolution. It works great for 
> short-form clips a la YouTube, but a 30 second segment of a two hour 
> movie amounts to a few pixels. Displaying such a fragment in the context 
> of the complete work makes a linear progress bar useless for seeking 
> within the fragment itself, everything having been traded for showing 
> that it's part of a much larger resource. Never mind that a temporal url 
> can equally well reference a five minute section of a 20000 hour webcam 
> archive.
> Showing a fragment in context is helpful, not just for the time offset, 
> but cue points, related resources, and so on. The default controls the 
> browser provides can't encompass all possible context interfaces, so 
> perhaps the focus here should be on what is necessary to enable scripts 
> (or browser vendors) to build more complicated interfaces when they're 
> appropriate.

On Wed, 8 Apr 2009, Silvia Pfeiffer wrote:
> I think there is a misunderstanding here. The suggestion is not to 
> display the full timeline of the original resource and specially mark 
> the fragment, but to start and end the timeline with the in and out 
> times of the fragment. I mentioned "context" only in that the in and out 
> times imply that the media resource is a fragment of a larger resource. 
> Not that the full context is actually displayed in the timeline.

On Tue, 7 Apr 2009, David Singer wrote:
> I think that there is a very real difference between the 
> zero-to-duration 'seek bar' that the UI presents, and which users 
> understand, and the 'represented time' of the content.  That might be a 
> section of a movie, or indeed might be a section of real time-of-day 
> (think of one of the millions of British surveillance cameras..., or not 
> if you'd prefer not to).  Getting "what time is this media resource 
> talking about" is a metadata question...

I have left the spec as is (except for adding startTime), which means that 
currentTime can be greater than duration if startTime is not zero.

The UI is up to the UA, but the information is present now for all of the 
UIs described above to be rendered, I believe.
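
To make that concrete, here is a sketch of how a script might derive both kinds of label (the zero-based elapsed time and the "represented" time on the resource's own timeline) from the same three values. The function names are hypothetical, and it again assumes the timeline ends at `startTime + duration`.

```typescript
// Format a number of seconds as H:MM:SS (fractions of a second are dropped).
function formatClock(totalSeconds: number): string {
  const s = Math.floor(totalSeconds);
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  return hours + ":" + pad(minutes) + ":" + pad(seconds);
}

// Hypothetical helper producing the labels either style of UI would need.
function timeLabels(
  startTime: number,
  duration: number,
  currentTime: number
): { represented: string; elapsed: string; end: string } {
  return {
    represented: formatClock(currentTime),         // position on the resource's timeline
    elapsed: formatClock(currentTime - startTime), // zero-based position within the clip
    end: formatClock(startTime + duration),        // assumed end of the timeline
  };
}
```

Note that in this model the represented time (0:20:25, say) exceeds the 50-second duration, which is the situation described above.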

On Tue, 7 Apr 2009, David Singer wrote:
> If a URL asks for
> http://www.example.com/t.mov?time="10s-20s"
> it's clear that all I have at the UA is a 10s clip, so that's what I 
> present; the ? means the trimming is done at the server.
> However, if I am given
> http://www.example.com/t.mov#time="10s-20s"
> which means the UA does the selecting;  should the UA present a timeline
> representing all of t.mov, but start at 10s into it and stop at 20s, allowing
> the user (if they wish) to seek outside that range, or should the UA present
> (as in the query case) only a 10s clip?

The UA should allow seeking to all times (at least from the API; the UI 
could be limited I guess).

On Wed, 8 Apr 2009, Silvia Pfeiffer wrote:
> Note that in the Media Fragment working group even the specification of 
> http://www.example.com/t.mov#time="10s-20s" may mean that only the 
> requested 10s clip is delivered, especially if all the involved 
> instances in the exchange understand media fragment URIs.

That doesn't seem possible since fragments aren't sent to the server.
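
The difference between the two URL forms can be sketched as follows. The helper name `splitFragment` is hypothetical; the behaviour it models (a client drops everything from the first `#` before making the request) is how HTTP clients treat fragments.

```typescript
// Split a URL into the part a client would actually request and the
// fragment that stays on the client side.
function splitFragment(url: string): { requested: string; fragment: string | null } {
  const hash = url.indexOf("#");
  if (hash === -1) return { requested: url, fragment: null };
  return { requested: url.slice(0, hash), fragment: url.slice(hash + 1) };
}

// Query form: the time range is part of what the server sees.
const queryForm = splitFragment('http://www.example.com/t.mov?time="10s-20s"');

// Fragment form: the server only ever sees the URL for the whole of t.mov.
const fragmentForm = splitFragment('http://www.example.com/t.mov#time="10s-20s"');
```

Here `fragmentForm.requested` is just `http://www.example.com/t.mov`, so any trimming has to be done by the UA (or requested by some other means, such as a header).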

On Thu, 9 Apr 2009, Jonas Sicking wrote:
> If we look at how fragment identifiers work in web pages today, a link 
> such as
> http://example.com/page.html#target
> displays the 'target' part of the page, but lets the user scroll to 
> anywhere in the resource. This feels to me like it maps fairly well to
> http://example.com/video.ogg#t=5s
> displaying the selected frame, but displaying a timeline for the full 
> video and allowing the user to directly go to any position.

Agreed. This is how the spec works now.

# Once enough of the media data has been fetched to determine the duration 
# of the media resource, its dimensions, and other metadata
# [...]
# 6. If either the media resource or the address of the current media 
# resource indicate a particular start time, then seek to that time. 
# [...]
# || For example, a fragment identifier could be used to indicate a start 
# || position.
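
A sketch of that step, using the `#t=5s` form from the example above. The fragment syntax is only the one suggested in this thread, not a defined format, and both function names are hypothetical. Clamping the requested time to the earliest possible position is also an assumption of this sketch.

```typescript
// Returns the start offset in seconds, or null if the fragment does not
// name a start time in the illustrative "t=<seconds>s" form.
function startTimeFromFragment(fragment: string): number | null {
  const match = /^t=(\d+(?:\.\d+)?)s$/.exec(fragment);
  return match ? parseFloat(match[1]) : null;
}

// Step 1 sets the position to the earliest possible position; step 6 then
// seeks if the address indicates a start time.
function initialPosition(earliest: number, fragment: string | null): number {
  const requested = fragment === null ? null : startTimeFromFragment(fragment);
  if (requested === null) return earliest;
  return Math.max(earliest, requested); // assumption: never seek before the earliest position
}
```

The timeline itself is unaffected: the user can still seek anywhere in the resource after the initial seek.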

> But I also agree that there is a use case for directing the user to a 
> specific range of the video, such as your 30 second clip out of 5 hour 
> video example. Maybe this could be done with syntax like
> http://example.com/video.ogg#r=3600s-3630s

Currently the spec has no way to indicate a stop time from the fragment 
identifier or other out-of-band information, but I agree that we might 
need to add something like that (e.g. implying a default cue range with 
autopause-on-exit enabled) at some point.
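
One way such a default cue range could behave is sketched below. Everything here is hypothetical: the `r=3600s-3630s` syntax is only the one suggested above, and `CueRange`, `rangeFromFragment` and `shouldPause` are illustrative names, not spec features.

```typescript
// Hypothetical cue range implied by a fragment identifier.
interface CueRange {
  start: number;        // seconds
  end: number;          // seconds
  pauseOnExit: boolean; // pause when playback leaves the range
}

// Parse the illustrative "r=<start>s-<end>s" form; null if it does not match
// or the range is empty.
function rangeFromFragment(fragment: string): CueRange | null {
  const m = /^r=(\d+)s-(\d+)s$/.exec(fragment);
  if (!m) return null;
  const start = parseInt(m[1], 10);
  const end = parseInt(m[2], 10);
  return end > start ? { start, end, pauseOnExit: true } : null;
}

// True when playback advancing from `previous` to `current` crosses the end
// of the range, which is when an autopause would trigger.
function shouldPause(range: CueRange, previous: number, current: number): boolean {
  return range.pauseOnExit && previous < range.end && current >= range.end;
}
```

Because only the crossing of the end point pauses playback, the user would remain free to seek outside the range afterwards.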

On Wed, 8 Apr 2009, Silvia Pfeiffer wrote:
> Most videos nowadays are sent using HTTP - in particular YouTube videos. 
> But we (in the media fragments WG) do indeed include RTP/RTSP as another 
> protocol that already includes mechanisms for requesting time fragments.

It certainly would be nice for videos to be sent using a more appropriate 
protocol than HTTP.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
