[whatwg] Video with MIME type application/octet-stream

Roger Hågensen rescator at emsai.net
Fri Sep 10 15:51:01 PDT 2010

  On 2010-09-09 09:24, Philip Jägenstedt wrote:
> On Thu, 09 Sep 2010 02:15:27 +0200, David Singer <singer at apple.com> 
> wrote:
>>> On Wed, Sep 8, 2010 at 3:13 PM, And Clover <and-py at doxdesk.com> wrote:
>>>> Perhaps I *meant* to serve a non-video
>>>> file with something that looks a fingerprint from a video format at 
>>>> the top.
>>> Anything's possible, but it's vastly more likely that you just made 
>>> a mistake.
>> It may be possible to make one file that is valid under two formats.  
>> Kinda like those old competitions "write a single file that when 
>> compiled and run through as many languages as possible prints "hello, 
>> world!" :-).
> For at least WAVE, Ogg and WebM it's not possible as they begin with 
> different magic bytes.
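
For reference, a minimal sketch of what sniffing those three formats by their leading bytes could look like. The function name and return labels are made up for illustration. The byte signatures are the well-known ones: "RIFF" plus "WAVE" at offset 8 for WAVE, "OggS" for Ogg, and the EBML header 1A 45 DF A3 for WebM (which Matroska shares, so the DocType would still need checking to tell them apart).

```python
from typing import Optional


def sniff_container(head: bytes) -> Optional[str]:
    """Guess a container format from the first bytes of a resource.

    Illustrative only: the name and the returned labels are assumptions,
    not part of any spec.
    """
    # WAVE: a RIFF chunk whose form type (bytes 8..11) is "WAVE".
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "wave"
    # Ogg: every page starts with the capture pattern "OggS".
    if head[:4] == b"OggS":
        return "ogg"
    # EBML header ID; used by both WebM and Matroska.
    if head[:4] == b"\x1a\x45\xdf\xa3":
        return "ebml"
    return None
```

Since the three signatures differ within the first four bytes, no single file can match more than one of them.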

Then why not define a new "magic" that is universal, so that if a proper 
content type is not stated, the browser sniffs for a standardized 
universal magic?

Yep, I'm referring to my BINID proposal.
If a content type is missing, sniff the first 265 bytes and see if it is 
a BINID; if it is a BINID, check if it's a supported/expected one, and if 
it is, then play away, all is good.
If a content type is given, then just in case sniff the first 265 bytes 
and see if it is a BINID; if it is a BINID, check if it's a 
supported/expected one, and if it is, then play away, all is good.
If a content type is missing, and the sniffing of the first 265 bytes 
shows it is not a BINID or not a supported one, then it can only be 
treated as unknown binary and would fail (though in the case of an 
unsupported BINID the user would be shown what the BINID is so they 
won't be fully stuck if they miss a particular codec or the browser 
doesn't support it).
If a content type is given, and sniffing the first 265 bytes shows it's 
not a BINID or not a supported one, then treat it as per the context 
(video or audio) and hope the video or audio codec layer is able to find 
out what it is (what "should" happen currently right?).
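
The four cases above can be sketched roughly like this. It is only a sketch: the actual BINID byte layout is not described in this mail, so the parser is passed in as a parameter, and the toy parser below uses an invented "BINID:<id>\n" layout purely so the example runs. All names (Action, decide, toy_parse_binid) are hypothetical.

```python
from enum import Enum
from typing import Callable, Optional, Set, Tuple

SNIFF_LIMIT = 265  # number of leading bytes to inspect, as stated above


class Action(Enum):
    PLAY = "play"
    FALLBACK_TO_CODEC = "treat per context, let the codec layer work it out"
    FAIL_SHOW_ID = "fail, but show the BINID to the user"
    FAIL_UNKNOWN_BINARY = "fail as unknown binary"


def decide(
    content_type: Optional[str],
    head: bytes,
    parse_binid: Callable[[bytes], Optional[str]],
    supported: Set[str],
) -> Tuple[Action, Optional[str]]:
    """Apply the four cases; returns the action and the BINID, if any."""
    binid = parse_binid(head[:SNIFF_LIMIT])
    # Cases 1 and 2: a supported BINID wins, with or without a content type.
    if binid is not None and binid in supported:
        return Action.PLAY, binid
    # Case 4: content type given, but no usable BINID.
    if content_type:
        return Action.FALLBACK_TO_CODEC, binid
    # Case 3: no content type and no usable BINID.
    if binid is not None:
        return Action.FAIL_SHOW_ID, binid
    return Action.FAIL_UNKNOWN_BINARY, None


def toy_parse_binid(head: bytes) -> Optional[str]:
    """Toy stand-in, NOT the real BINID layout: b"BINID:" + ASCII id + newline."""
    marker = b"BINID:"
    if not head.startswith(marker):
        return None
    end = head.find(b"\n", len(marker))
    if end == -1:
        return None
    return head[len(marker):end].decode("ascii", errors="replace")
```

For example, decide(None, b"BINID:webm\n...", toy_parse_binid, {"webm"}) gives Action.PLAY, while the same bytes with an unsupported ID and no content type give Action.FAIL_SHOW_ID along with the ID to display.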

It would be very easy to add support for something like BINID as it can 
be output at the start of a file or stream as the server sends it, a 
script could even output it or it could be at the start of the actual 
file itself,
and in the case of live streaming a server could easily add it to the 
start of the stream even if it's mid-stream. Even a wrongly configured 
webserver wouldn't be able to mess up the handling of this.
The benefit is that the browser would see that, Oh, this is a BINID and 
it's WebM, I'll pass this on to the video codec then.
Or if <audio> and the browser sees it is a BINID and it's MP3 it would 
pass it to the MP3 audio codec.
In time something like BINID might even propagate elsewhere beyond just 
<video> and <audio>.

I'm not saying that BINID must be used, but at least something very 
close to it (as unknown formats can be shown to a human user and make 
sense and be searchable), and maybe the first 8 bytes should be 
constructed slightly differently?
Oh and although I haven't tested this, I suspect that most current 
codecs would ignore the first 265 bytes when they sniff for the start of 
the data anyway, so a BINID would be partially backwards compatible,
and in any case it would certainly be easy to patch in support.
And the best part is that the browser could easily strip or skip past 
the BINID when passing the data to the OS or codecs (if such do not 
support BINID at all), or if saving the audio or video locally per user 
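
A rough sketch of the two ends of that idea: a server prepending an identification header to a stream, and a browser splitting it off again before handing the payload to a codec that knows nothing about it. The real BINID layout is not given in this mail, so the "BINID:<id>\n" layout and the function names here are invented for illustration only.

```python
from typing import Iterable, Iterator, Optional, Tuple

SNIFF_LIMIT = 265          # header must fit in the sniffed prefix
TOY_MARKER = b"BINID:"     # invented marker, NOT the real BINID bytes


def with_toy_header(binid: str, chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield a toy header first, then the untouched payload chunks."""
    if "\n" in binid:
        raise ValueError("ID must not contain a newline")
    header = TOY_MARKER + binid.encode("ascii") + b"\n"
    if len(header) > SNIFF_LIMIT:
        raise ValueError("header exceeds the sniff limit")
    yield header
    yield from chunks


def split_toy_header(data: bytes) -> Tuple[Optional[str], bytes]:
    """Return (id, payload); data without a toy header passes through as-is."""
    if not data.startswith(TOY_MARKER):
        return None, data
    # Only look for the terminator inside the sniffed prefix.
    end = data.find(b"\n", len(TOY_MARKER), SNIFF_LIMIT)
    if end == -1:
        return None, data
    binid = data[len(TOY_MARKER):end].decode("ascii", errors="replace")
    return binid, data[end + 1:]
```

Because the header has a bounded length, the receiver never has to scan past the first 265 bytes to decide whether to strip anything.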

Something like BINID (short for Binary Identification actually) is 
needed, and there is nothing wrong with HTML5 and <video> <audio> 
standard defining it,
it wouldn't be the first time a web standard has been adopted elsewhere 
later, it would surely see adoption outside of this, I certainly would 
use it elsewhere.

I invented BINID for a reason, because .*** file extensions just aren't 
good enough, and sniffing binary files is a real pain, the same pain as 
the <video> and <audio> discussion here is pointing out right now.

So if sniffing is bad, but sniffing can't be avoided, then why not 
simply standardize the sniffing by defining a universal, simple and end 
user friendly (the BINID can be displayed to the user, even if 
and the sniffing would be limited to the first 265 bytes (in the case of 
the BINID proposal), and if this limited sniffing can't determine what 
something is, and the context and extra info (like content type) do not 
clarify what it is or what to do with it, then simply fail and inform the 
user. It doesn't have to be more complicated than that.

As simple as possible, but no simpler. Isn't that the ideal mantra of 
all coders here?

Remember, I'm not saying you must use BINID (but hey it's there and 
fleshed out already), if you must change the name, do so, if you must 
change the 8 byte sequence, do so, just make sure it has a max length, 
and the "ID" is humanly displayable if the format is unsupported. Just 
make it into an RFC or something, and spec it in the HTML standard that 
it must be supported, and spec how to behave if it's not present (like I 
pointed out further above) and it's solved as best as is possible. (unless 
somebody has an even better idea here, that is?)

And yeah, this kinda stretched beyond the scope of HTML5 specs, but 
you'd be swatting two flies at once, solving the sniffing issue with 
<video> and <audio>, but also the sniffing issue that every OS has had 
for the last couple of um... decades?! (poke your OS/Filesystem 
colleagues and ask them what they think of something like this.)
Then again, HTML5 is kinda an OS in its own right, being an app platform 
(not to mention supporting local storage of databases and files even), 
so maybe it's not that far outside the scope anyway to define something 
like this?

Roger "Rescator" Hågensen.
Freelancer - http://EmSai.net/
