[whatwg] Content type sniffing
bzbarsky at MIT.EDU
Mon Jan 12 07:54:15 PST 2009
Adam Barth wrote:
> Extensions are bad news for content sniffing because they can often be
> chosen by the attacker. For example, suppose user-uploaded content
> can be downloaded at:
> In most PHP configurations, the attacker can choose whatever file
> extension he likes by directing the user's browser to:
> And the PHP script will happily run.
Right, I understand that.
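
For readers following along, here is a minimal Python sketch of the problem described in the quoted text. The URLs and the url_extension helper are hypothetical, made up for illustration; they are not the URLs from the original message, and no browser necessarily computes the extension this way. The sketch assumes a sniffer that looks only at the last path segment, and a server that still runs the same script when extra path components are appended.

```python
from posixpath import splitext
from urllib.parse import urlsplit


def url_extension(url: str) -> str:
    """Return the extension of the last path segment (hypothetical helper)."""
    path = urlsplit(url).path
    last_segment = path.rsplit("/", 1)[-1]
    return splitext(last_segment)[1].lower()


# Hypothetical URLs. Both are assumed to be served by the same script,
# yet the extension a naive sniffer sees differs.
print(url_extension("http://example.com/download.php"))            # .php
print(url_extension("http://example.com/download.php/evil.html"))  # .html
```

Since the trailing segment is picked by whoever builds the link, the extension says nothing reliable about what the server will actually send.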
> Yes. We do have lots of data from opt-in user metrics from Chrome.
> Here is a somewhat recent summary:
I'm not quite sure what to make of this, actually. Specifically, where
is the "22.19%" number for "HTML Tags" coming from? 22.19% of what?
The magic numbers stuff actually adds up to 100%, but of what?
> To address your particular concern, <body occurs 6899 times less often
> than <script on Web content that lacks a Content-Type (or has a bogus
> Content-Type like */*), assuming I did my arithmetic correctly.
OK, that's good to know.
> I'm sympathetic to adding more HTML tags to the list, but I'm not sure
> how far down the tail we should go. In Chrome, we went for 99.999%
> compatibility, which might be a bit far down the tail.
Doesn't seem that way to me, given the number of web pages out there.
Ah, ok. The relevant Gecko code is
I'd probably be fine with trimming that list down a bit, but I'm not
quite sure what the downsides of having more tags in it are here.
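
As a rough illustration of what such a tag list does, here is a minimal Python sketch of tag-based HTML sniffing. The tag list, the 512-byte window, and the looks_like_html name are all assumptions made for this example; this is not the Gecko or Chrome implementation, and neither browser's actual list is reproduced here.

```python
# Hypothetical tag list for illustration only; not Gecko's or Chrome's list.
HTML_TAGS = (b"<!doctype html", b"<html", b"<head", b"<script", b"<body")


def looks_like_html(data: bytes, tags=HTML_TAGS) -> bool:
    """Guess whether untyped content is HTML from its leading bytes.

    Skips leading whitespace, then does a case-insensitive prefix match
    against each tag. It does not check what follows the tag name.
    """
    prefix = data[:512].lstrip(b" \t\r\n\x0c").lower()
    return any(prefix.startswith(tag) for tag in tags)


print(looks_like_html(b"  <SCRIPT>alert(1)</script>"))      # True
print(looks_like_html(b"just some plain text"))             # False
print(looks_like_html(b"<body>hi", tags=(b"<script",)))     # False
```

In this sketch, every tag added to the list widens the set of untyped responses that get treated as HTML, and every tag removed means pages starting with that tag are no longer recognized.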