[whatwg] updateWithSanitizedHTML (was Re: innerStaticHTML)

Mon Nov 30 17:43:47 PST 2009

On Nov 30, 2009, at 3:55 PM, Adam Barth wrote:

> On Fri, Jun 5, 2009 at 5:09 PM, Ian Hickson <ian at hixie.ch> wrote:
>> Defining a spec-blessed whitelist of elements, attributes, and attribute
>> values and filtering at the parser level is a significant new feature.
>> While I see that it has value, I think on the short term it would be
>> better to wait for a future version of HTML before introducing this
>> feature; ideally once we have more implementation experience with
>> experimental versions of this idea.
>>
>> I would encourage browser vendors to introduce APIs similar to that
>> discussed below, clearly marked as vendor-specific (e.g. for Firefox,
>> something like .mozStaticInnerHTML).
>
> The WebKit community is considering taking up such an experimental
> implementation.  Here's my current proposal for how this might work:
>
> http://docs.google.com/Doc?docid=0AZpchfQ5mBrEZGQ0cDh3YzRfMTJzbTY1cWJrNA&hl=en
>
> I would appreciate any feedback on the design.

I neglected to give feedback on webkit-dev, but here are my comments:

1) It seems like this API is harder to use than a sandboxed iframe. To  
use it correctly, you need to determine a whitelist of safe elements  
and attributes; providing an explicit whitelist, at least of tags, is
mandatory. With a sandboxed iframe, as a Web developer you can just  
ask the browser to turn off unsafe things and not worry about  
designing a security policy. Besides ease of use, there is also the  
concern that a server-side filtering whitelist may be buggy. If you
apply the same whitelist on the client side as a backup, instead of
doing something high level like "disable scripting", then you are less
likely to benefit from defense in depth, since you may just replicate
the bug.
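
To illustrate the defense-in-depth concern in point 1, here is a minimal sketch in TypeScript. It is not the proposed WebKit API: the SketchElement model, the Whitelist shape, and filterByWhitelist are all hypothetical names invented for this example. It assumes the author's whitelist allows href but forgets to check the URL scheme, and shows that running the same whitelist a second time on the client removes nothing the first pass missed.

```typescript
// Hypothetical, simplified model of parsed markup; not a real DOM or any proposed API.
interface SketchElement {
  tag: string;
  attrs: Record<string, string>;
  children: SketchElement[];
}

// Hypothetical policy shape: allowed tag names and allowed attribute names.
interface Whitelist {
  tags: Set<string>;
  attrs: Set<string>;
}

// Keep only whitelisted elements and attributes; a disallowed element is dropped with its subtree.
function filterByWhitelist(node: SketchElement, list: Whitelist): SketchElement | null {
  if (!list.tags.has(node.tag)) return null;
  const attrs: Record<string, string> = {};
  for (const [attrName, value] of Object.entries(node.attrs)) {
    if (list.attrs.has(attrName)) attrs[attrName] = value;
  }
  const children = node.children
    .map((child) => filterByWhitelist(child, list))
    .filter((child): child is SketchElement => child !== null);
  return { tag: node.tag, attrs, children };
}

// Assumed author policy, with a deliberate bug: href is allowed but its scheme is never checked.
const policy: Whitelist = { tags: new Set(["p", "a"]), attrs: new Set(["href"]) };

const untrusted: SketchElement = {
  tag: "p",
  attrs: {},
  children: [
    { tag: "a", attrs: { href: "javascript:steal()", onclick: "steal()" }, children: [] },
    { tag: "script", attrs: {}, children: [] },
  ],
};

// Server-side pass, then the same policy again on the client as "backup".
const afterServer = filterByWhitelist(untrusted, policy);
const afterClient = afterServer && filterByWhitelist(afterServer, policy);
```

Both passes remove the script element and the onclick attribute, and both let the javascript: URL through, because the second pass shares the first pass's blind spot. A high-level switch such as "disable scripting" would not depend on the author's list being complete.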

2) It seems like this API loses one of the big benefits of sanitizing  
HTML in the browser implementation. Specifically, in theory it's safe  
to say "allow everything except any construct that would result in  
script/code running". You can't do that on the server side -  
blacklisting is not sound because you can't predict the capabilities  
of all browsers. But the browser can predict its own capabilities.  
Sandboxed iframes do allow for this.
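
A small TypeScript sketch of why a server-side blacklist is unsound while a browser-side one need not be. Everything here is an assumption made for illustration: the attribute lists are not taken from any real browser, and "onnewevent" is a made-up name standing in for a script-running feature added after the server's blacklist was written.

```typescript
type Attrs = Record<string, string>;

// Server-side blacklist: the script-running attributes its author knew about when it was written.
const serverBlacklist = new Set(["onclick", "onload", "onerror"]);

// Hypothetical: the full set of script-running attributes in one particular browser.
// "onnewevent" is invented; it stands for a capability the server author could not predict.
const browserScriptAttrs = new Set(["onclick", "onload", "onerror", "onnewevent"]);

// Return a copy of attrs without any attribute whose lowercased name is in the banned set.
function dropAttrs(attrs: Attrs, banned: Set<string>): Attrs {
  const kept: Attrs = {};
  for (const [attrName, value] of Object.entries(attrs)) {
    if (!banned.has(attrName.toLowerCase())) kept[attrName] = value;
  }
  return kept;
}

const incoming: Attrs = { title: "hi", onclick: "a()", onnewevent: "b()" };

// The server's filter misses the attribute it has never heard of.
const serverResult = dropAttrs(incoming, serverBlacklist);

// The browser filters against its own capabilities, so nothing script-running is left.
const browserResult = dropAttrs(incoming, browserScriptAttrs);
```

The same function is sound or unsound depending only on who supplies the list: the browser's list is complete for that browser by construction, while the server's list is a guess about every browser, present and future.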

I think the benefits of filtering by tag/attribute/scheme for advanced  
experts are outweighed by these two disadvantages for basic use,  
compared to something simple like the original staticInnerHTML idea.  
Another possible alternative is to express how to sanitize at a higher  
level, using something similar to sandboxed iframe feature strings.

Here's a problem that exists with both this API and also  
innerStaticHTML:

3) There is no secure and efficient way to append sanitized contents  
to an element that already has children. This may result in authors  
appending with innerHTML += (inefficient and insecure!) or
insertAdjacentHTML() (efficient but still insecure!). I'm willing to
concede that use cases other than "replace existing contents" and  
"append to existing contents" are fairly exotic.
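
The gap described in point 3 can be sketched with a toy model in TypeScript. This is not a DOM and not a proposed API: a "node" is just a tag name, "markup" is a space-separated list of tags, and toyParse, toySanitize, appendViaReparse, and appendSanitized are hypothetical names. The counter only tracks how much input each approach has to parse.

```typescript
// Toy stand-ins, NOT a DOM: a node is a tag name, markup is a space-separated list of tags.
type ToyNode = string;

let nodesParsed = 0; // counts parser work so the two approaches can be compared

function toyParse(markup: string): ToyNode[] {
  const nodes = markup.split(" ").filter((tag) => tag.length > 0);
  nodesParsed += nodes.length;
  return nodes;
}

// Hypothetical sanitizer: drops script, keeps everything else.
function toySanitize(node: ToyNode): ToyNode | null {
  return node === "script" ? null : node;
}

// Models `innerHTML +=`: serialize existing children, concatenate, reparse everything, no filtering.
function appendViaReparse(children: ToyNode[], markup: string): ToyNode[] {
  return toyParse(children.join(" ") + " " + markup);
}

// Models the missing operation: parse and sanitize only the new markup, keep existing children as is.
function appendSanitized(children: ToyNode[], markup: string): ToyNode[] {
  const added = toyParse(markup)
    .map(toySanitize)
    .filter((node): node is ToyNode => node !== null);
  return [...children, ...added];
}

const existing: ToyNode[] = ["p", "p", "p"];

nodesParsed = 0;
const reparsed = appendViaReparse(existing, "b script");
const reparseCost = nodesParsed; // existing children plus the new markup

nodesParsed = 0;
const appended = appendSanitized(existing, "b script");
const appendCost = nodesParsed; // only the new markup
```

In this model the reparse path does work proportional to the whole container and lets script through, while the append path touches only the new markup and filters it. A real implementation would need the sanitizing step to be the browser's own, which is exactly what neither API under discussion offers for appends.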

Regards,
Maciej