[whatwg] Start position of media resources

Philip Jägenstedt philipj at opera.com
Tue Apr 7 04:26:13 PDT 2009

On Tue, 07 Apr 2009 10:26:15 +0200, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:

> On Tue, Apr 7, 2009 at 5:12 PM, Philip Jägenstedt <philipj at opera.com>  
> wrote:
>> On Tue, 07 Apr 2009 06:11:51 +0200, Chris Double  
>> <chris.double at double.co.nz>
>> wrote:
>>> On Tue, Apr 7, 2009 at 3:40 AM, Eric Carlson <eric.carlson at apple.com>
>>> wrote:
>>>>  Media time values are expressed in normal play time (NPT), the  
>>>> absolute
>>>> position relative to the beginning of the presentation.
>>> I don't see mention of this in the spec which is one of the reasons I
>>> raised the question. Have I missed it? If not I'd like to see the spec
>>> clarified here.
>>> Chris.
>> Indeed clarification is needed. In my opinion time 0 should correspond  
>> to
>> the beginning of the media resource, no matter what the timeline of the
>> container format says. This means that currentTime doesn't jump when
>> playback begins and never goes beyond duration.
> I humbly disagree. If a media file explicitly knows at what time
> offset it starts, the timeline needs to represent that, otherwise
> there are contradictory user experiences.

If the media resource really does explicitly define an offset then I might
agree that it makes sense to adjust the timeline.

However, for plain Ogg or any other container format that just happens to
have a non-zero timestamp at the beginning of the file I think we should
definitely align them to zero. You can easily get such files by cutting
streams, and it would be confusing if the timeline were relative to the
original file. As an example, in MPEG the PTS (Presentation Time Stamp)
can start at non-zero, be discontinuous and wrap around to 0, so
normalization is necessary. I'm not sure if anyone disagrees, but it would
be a very bad idea to infer any semantics from the container time stamps
in the absence of some explicit mapping like Ogg Skeleton.
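To illustrate the kind of normalization I mean, here is a rough sketch
(names and structure are my own, not from any spec): MPEG PTS values are a
33-bit counter in 90 kHz ticks, so a demuxer has to both shift the first
timestamp to zero and unwrap overflow to keep the exposed timeline
monotonic.

```python
PTS_MODULUS = 1 << 33  # MPEG PTS is a 33-bit counter (90 kHz ticks)


def normalize(raw_pts_values):
    """Shift a stream of raw PTS values so playback starts at zero,
    unwrapping 33-bit counter overflow so the result is monotonic.

    Illustrative only: a real demuxer would also consult explicit
    discontinuity flags rather than guessing from the values alone.
    """
    if not raw_pts_values:
        return []
    out = []
    prev_raw = raw_pts_values[0]
    unwrapped = 0
    for raw in raw_pts_values:
        # A backwards jump is treated here as counter wraparound;
        # modular arithmetic turns it back into a forward step.
        step = (raw - prev_raw) % PTS_MODULUS
        unwrapped += step
        prev_raw = raw
        out.append(unwrapped)
    return out
```

So a file whose timestamps begin at, say, tick 100 would still expose a
timeline starting at 0, and a wrap past 2^33 would not make currentTime
jump backwards.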

Not generally requiring low-level inspection of the container format time
stamps is important to us, as that would effectively require the UA itself
to demux and inspect the time stamps of every container format. If a
platform media framework is used, time is normalized in some way, likely
differently depending on the framework and plugins used.

> For example, take a video that is a subpart of a larger video and has
> been delivered through a media fragment URI
> (http://www.w3.org/2008/WebVideo/Fragments/WD-media-fragments-reqs/).
> When a user watches both, the fragment and the full resource, and both
> start at 0, he/she will assume they are different resources, when in
> fact one is just a fragment of the other. Worse still: if he/she tries
> to send a URI with a link to a specific time offset in the video to a
> friend, he/she will come up with diverging URIs based on whether
> he/she watched the fragment or the full resource. Representing the
> wrong timeline for a media fragment will only cause confusion and
> inconsistencies.

OK, I agree with this.

>> Taking Ogg as an example, there's no requirement that the granulepos  
>> start
>> at zero nor does a non-zero granulepos imply any special semantics such  
>> as
>> "the beginning of the file has been chopped off". A tool like oggz-chop
>> might retain the original granulepos of each packet or it could just as  
>> well
>> adjust the stream to start at granulepos 0. Neither is more correct  
>> than the
>> other, so I'd prefer we not try to infer anything from it, especially  
>> since
>> such low-level timing information might be hidden deep inside the  
>> platform
>> media framework (if it normalizes the time like XiphQT presumably does).
> For Ogg and the definition of Skeleton
> (http://wiki.xiph.org/index.php/Ogg_Skeleton), both the original
> basetime of the beginning of the file and the presentation time of the
> chopped off part are recorded, so it actually does imply special
> semantics.

The most consistent behavior in my opinion would be to report duration as  
the duration of the whole resource and for buffered/played/seekable to  
only return ranges within the range indicated by the media fragment.  
Exposing the start and stop position of the fragment via the DOM seems  
like overkill to me at this point.
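As a sketch of what I mean (the fragment bounds and (start, end) range
representation here are my own illustration, not spec text): duration
would report the whole resource, while buffered/played/seekable would be
intersected with the fragment's window.

```python
def clip_ranges(ranges, frag_start, frag_end):
    """Intersect a list of (start, end) time ranges with the
    media-fragment window, dropping anything outside it.

    Illustrative sketch of clipping buffered/played/seekable to a
    fragment; duration would still be that of the whole resource.
    """
    out = []
    for start, end in ranges:
        s, e = max(start, frag_start), min(end, frag_end)
        if s < e:  # keep only non-empty intersections
            out.append((s, e))
    return out
```

For example, with a fragment covering 20-100 s, a buffered range of
0-30 s would be reported as 20-30 s, and anything entirely before 20 s
would be dropped.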

Philip Jägenstedt
Opera Software
