[whatwg] Video with MIME type application/octet-stream

Aryeh Gregor Simetrical+w3c at gmail.com
Mon Sep 6 12:19:08 PDT 2010

On Mon, Sep 6, 2010 at 4:14 AM, Philip Jägenstedt <philipj at opera.com> wrote:
> The Ogg page begins with the 4 bytes "OggS", which is what Opera (GStreamer)
> checks for. For additional safety, one could also check for the trailing
> version indicator, which ought to be a NULL byte for current Ogg. [1] [2]

"OggS\0" as the first five bytes seems safe to check for.  It's rather
short, I guess because it's repeated on every page, but five bytes is
long enough that it should occur by chance only negligibly often, in
either text or binary files.
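
A minimal sketch of that check, assuming the sniffer is handed the
leading bytes of the resource as a bytes object.  The function name
looks_like_ogg is made up for illustration and isn't taken from any
browser's code:

```python
# Hypothetical helper: not from Opera, GStreamer, or any spec.
OGG_SIGNATURE = b"OggS\x00"  # capture pattern "OggS" plus version byte 0


def looks_like_ogg(data: bytes) -> bool:
    """Return True if the data begins with the five-byte Ogg signature."""
    return data[:len(OGG_SIGNATURE)] == OGG_SIGNATURE
```

Anything shorter than five bytes simply fails the comparison, so a
truncated read is treated as "not Ogg" rather than as an error.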

> For WebM, the first 4 bytes are the EBML header: the bytes 0x1A, 0x45, 0xDF,
> 0xA3. [3] The EBML DocType in the header must be "webm". Since parsing the
> EBML header is a little bit complicated, Opera (GStreamer) simply checks for
> the string "webm" somewhere in the header. I've heard rumors that WebM files
> are allowed to contain arbitrary garbage before the EBML header, but this is
> something we happily ignore, i.e., such files would fail to play in Opera,
> regardless of MIME type. I haven't encountered any such files yet, and think
> that browsers should not support this "feature".
> [1] http://www.xiph.org/ogg/doc/framing.html#page_header
> [2] http://www.xiph.org/ogg/doc/rfc3533.txt
> [3] http://ebml.sourceforge.net/specs/

It looks like you could check for 0x1a 0x45 0xdf 0xa3 as the first
four bytes, followed by 0x42 0x82 0x84 "webm" somewhere in the first
255 bytes or whatever.  (0x42 0x82 is the DocType marker, and 0x84 is
the length, encoded UTF-8 style: 1 for a one-byte length, 0000100 for
the actual length, i.e., 4.)  That seems very safe.  If WebM allows
degenerate stuff that makes sniffing hard, we can just prohibit it in
the WebM spec, I assume.
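
Roughly, in code, and again only as a sketch: the name looks_like_webm
and the 255-byte window are assumptions for illustration (the window is
the "or whatever" above), and this deliberately doesn't parse EBML
properly.  It also rejects files with garbage before the EBML header:

```python
# Hypothetical helper: a byte-pattern sniff, not a real EBML parser.
EBML_MAGIC = b"\x1a\x45\xdf\xa3"      # EBML header ID
DOCTYPE_WEBM = b"\x42\x82\x84webm"    # DocType ID, length byte 0x84, "webm"
SNIFF_WINDOW = 255                    # assumed limit, not from any spec

# The length byte 0x84 is 1000 0100 in binary: the leading 1 marks a
# one-byte length, and the remaining seven bits hold the value 4.
assert 0x84 & 0x7F == len(b"webm")


def looks_like_webm(data: bytes) -> bool:
    """Return True if data starts with the EBML magic and the first
    SNIFF_WINDOW bytes contain a DocType element equal to "webm"."""
    head = data[:SNIFF_WINDOW]
    if not head.startswith(EBML_MAGIC):
        return False
    return DOCTYPE_WEBM in head[len(EBML_MAGIC):]
```

A Matroska file with DocType "matroska" starts with the same four bytes
but fails the second test, which is the point of looking for the
DocType rather than the magic alone.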
