[whatwg] HTML5 video: frame accuracy / SMPTE

Gregory Maxwell gmaxwell at gmail.com
Fri Jan 21 14:05:18 PST 2011

On Fri, Jan 21, 2011 at 4:42 PM, Roger Hågensen <rescator at emsai.net> wrote:
> Accurate seeking also assumes things about the codec/container/encoding.
> If a format does not have keyframes then it "does" have something
> equivalent.
> Formats without keyframes can probably (I might be wrong there) seek more
> accurate than those with keyframes.

You can _always_ seek accurately if you can seek at all, just not
necessarily efficiently: if all else fails you decode the video from
the start.

Keyframes are orthogonal to this: I can construct Theora streams (and
presumably VP8, and other interframe formats) where you can begin
decoding at any point and after decoding no more than N frames (where
N is some value of my choosing, perhaps 24) the decoder is completely
synchronized and bit-accurate.  A stream with keyframes is no more
seekable than such a stream; it is just less computationally
expensive to seek, if and only if you don't mind seeking only to the
keyframes.  For seeking to arbitrary locations, a rolling intra scheme
and an exact recovery scheme are the same. (Which is why Firefox
correctly decodes Theora files constructed in this manner, even though
that was never a consideration.)
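To make the cost argument concrete, here is a minimal sketch of how a
player might plan an exact seek. It is an illustration under stated
assumptions, not any real decoder's API: exact_seek_plan, keyframes
(a sorted list of frame indices) and recovery_window (the N above)
are hypothetical names.

```python
from bisect import bisect_right

def exact_seek_plan(target, keyframes, recovery_window=None):
    """Return (start, discard): the frame index to begin decoding at and
    how many decoded frames to throw away before showing `target`.

    Hypothetical helper: `keyframes` is a sorted list of frame indices,
    `recovery_window` is the N after which a rolling-intra stream is
    assumed to be bit-accurate (None if the stream gives no such bound).
    """
    candidates = [0]  # decoding from the start always works
    # Latest keyframe at or before the target, if the stream has one.
    i = bisect_right(keyframes, target)
    if i > 0:
        candidates.append(keyframes[i - 1])
    # Rolling-intra stream: start N frames early, then the output is exact.
    if recovery_window is not None:
        candidates.append(max(0, target - recovery_window))
    start = max(candidates)  # the latest safe start is the cheapest
    return start, target - start
```

In every case the frame shown is exactly the requested one; only the
number of discarded frames (the cost) changes.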

Exact should be exact.  Consider a video editor application created
using the video tag and canvas. A failure to operate exactly may cause
data corruption.  A stream which isn't incrementally seekable should
be decoded from the front in the case of an exact request.

The potentially high cost of an exact seek is the primary reason why I
wouldn't want to make the default behavior mandate exact, but exact
still needs to be available.

On Fri, Jan 21, 2011 at 4:57 PM, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:
> * the default is best effort

I fear that the "best effort" language is misleading.  You can always
do an exact seek on a stream that you can seek to the beginning of. So
the "best" would be exact.

The language I'd prefer is "fast".  Fast may be exact, or it might
just be to the nearest keyframe, or something in between. It might
just start you over at the beginning of the stream.
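A minimal sketch of what "fast" could mean, assuming a hypothetical
decode budget: the player jumps to the nearest usable sync point and
decodes forward only as far as the budget allows. The names
fast_seek_landing, keyframes and max_decode are illustrative and not
part of any proposal in this thread.

```python
from bisect import bisect_right

def fast_seek_landing(target, keyframes, max_decode=0):
    """Return the frame index a budget-limited "fast" seek lands on.

    Hypothetical helper: `keyframes` is a sorted list of frame indices
    and `max_decode` is how many frames past the sync point the player
    is willing to decode and discard.
    """
    i = bisect_right(keyframes, target)
    # No keyframe at or before the target: start over at the beginning.
    sync = keyframes[i - 1] if i > 0 else 0
    # Walk forward toward the target, but never beyond the budget.
    return min(target, sync + max_decode)
```

With a zero budget this lands on the previous keyframe, with a large
enough budget it is exact, and anything else falls in between.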

One question about inexact seeking is what should the client do when
the current playtime is closer to the requested time than what the
inexact seek would provide?
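One candidate answer, sketched only to make the question concrete
(nothing in this thread settles it): compare the two distances and
keep the current position when it is at least as close. The function
name and parameters are hypothetical.

```python
def should_stay_put(current, target, landing):
    """Return True if the current position is at least as close to the
    requested target as the position an inexact seek would land on.

    All three arguments are positions in the same unit (frames or
    seconds); this is an illustrative policy, not a specified behavior.
    """
    return abs(current - target) <= abs(landing - target)
```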

> * KEYFRAME is keyframe-accurate seeking, so to the previous keyframe

What does this mean when a seekable stream doesn't have interior
keyframes? Should the client always seek to the beginning? Why is this
valuable over a "fast" option?
