[whatwg] some thoughts on sandboxed IFRAMEs
slightlyoff at google.com
Mon Jan 25 15:45:56 PST 2010
On Sun, Jan 24, 2010 at 2:52 AM, Ian Hickson <ian at hixie.ch> wrote:
> On Fri, 11 Dec 2009, Michal Zalewski wrote:
>> 1) IFRAME semantics make it exceedingly cumbersome to sandbox short
>> snippets of text, and this task is perhaps the most common and pressing
>> XSS-related challenge. Unless the document is constructed on client side,
>> a lot of additional HTTP roundtrips are needed to utilize sandboxed IFRAMEs for
>> this purpose. [ There is also the problem of formatting and positioning
>> IFRAME content, although the seamless attribute would fix this. ]
> I've introduced srcdoc="" to largely handle this. There is an example in
> the spec showing how it can be used.
>> The ability to sandbox SPANs or DIVs using a token-guarded approach
>> (<span sandbox="random_token"></span sandbox="same_token">) is, on the
>> other hand, considerably easier on the developer, and probably has a
>> very similar implementation complexity.
> This has been proposed before. The concern is that many authors would be
> likely to make mistakes in their selection of "random" tokens that would
> lead to significant flaws in the deployment of the feature.
> srcdoc="" is less prone to errors. Only " and & characters need to be
> escaped. If the " character is not escaped, then a single " character in
> the input will cause the comment to break. This is likely to be caught
> early. If the & character is not escaped, correctness and fidelity will
> suffer, but it will not lead to security errors.
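For illustration only, the escaping rule described above (just " and & need escaping inside srcdoc="") could be sketched in Python as follows. The function names escape_srcdoc and sandboxed_iframe are hypothetical helpers, not part of any spec or library, and the sketch assumes the attribute value is always wrapped in double quotes.

```python
def escape_srcdoc(markup: str) -> str:
    """Escape untrusted markup for use inside a double-quoted srcdoc attribute."""
    # Escape ampersands first, so the "&" introduced by "&quot;" is not escaped twice.
    return markup.replace("&", "&amp;").replace('"', "&quot;")


def sandboxed_iframe(markup: str) -> str:
    """Wrap untrusted markup in a sandboxed iframe (hypothetical helper)."""
    return '<iframe sandbox srcdoc="{}"></iframe>'.format(escape_srcdoc(markup))
```

With this, a stray quote in the input ends up as &quot; rather than terminating the attribute early, which is the failure mode the quoted text expects authors to notice quickly when they forget the escaping step.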
Sorry I'm late to this discussion. Would like to add my objection to
using attribute string escaping as a security "feature" in any way. I
strongly prefer required nonces attached to opening and closing of
>> 2) Renderers suck dealing with IFRAMEs, and will probably continue to
>> do so for time being. This means that a typical, moderately complex
>> application (say, as a discussion forum or a social site), where
>> hundreds of user-controlled strings may need to be present to display
>> user content - the mechanism would have an unacceptable load time and
>> memory footprint. In fact, people are already coming up with
>> lightweight alternatives with a significant functionality overlap (and
>> different security controls). Microsoft has toStaticHTML(), while a
>> standardized implementation is being discussed here right now in a
>> separate thread.
> I agree that we should investigate other options too (<iframe> boxes
> aren't suitable for everything), but I don't think that current
> implementation problems with <iframe> should necessarily prevent us from
> investigating sandboxed iframes too.
> In certain contexts, e.g. reddit comments, it may be the case that instead
> of one sandboxed <iframe> per comment, the best way to do things is
> instead one sandboxed iframe for all the comments, with scripts disabled
> and allow-same-origin enabled, so that scripts can poke into the page and
> set event handlers on all the relevant links.
>> Isn't the benefit of keeping the design slightly simpler (and
>> realistically, limited to relatively few usage scenarios) negated by the
>> fact that alternative solutions to other narrow problems would need to
>> emerge elsewhere? The browser coming with several different script
>> sanitizers with completely different APIs and security controls does not
>> strike me as a desirable outcome (all the flavors of SOP are a testament
>> to this). If the answer is not a strong "no", maybe the token-guarded DIV
>> / SPAN approach is a better alternative?
> I agree in principle that fewer features are better than more features,
> but we have to take into account that many of the people deploying these
> features know nothing about security. We have to ensure that the security
> aspects of features like this (like what to escape, what security tokens
> need to be generated) are aligned with the practical aspects of features
> like this (like what results in the page appearing to work, regardless of
> the state of security).
>> Now, that aside - on a more pragmatic level, I have two extra comments:
>> 1) The utility of the SOP sandboxing behavior outlined in the spec is
>> diminished if we have no way to actually *enforce* that the IFRAMEd
>> resource would only be rendered in such a context. If I am serving
>> user-supplied, unsanitized HTML, it is obviously safe to do <iframe
>> sandbox src="show.cgi?id=1234"></iframe> - but where do we prevent the
>> attacker from calling http://my_site/show.cgi?id=1234 directly, and
>> bypassing the filter?
> I've introduced text/html-sandboxed for this purpose.
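As a rough sketch of how a server might apply this, the WSGI application below serves the user-supplied document with the text/html-sandboxed type mentioned above, so that a direct visit to the URL would not be treated as ordinary text/html. The handler name show_comment and the placeholder body are assumptions made for the example; only the MIME type comes from the thread.

```python
def show_comment(environ, start_response):
    """Minimal WSGI app that serves untrusted markup as text/html-sandboxed."""
    # Placeholder content; a real handler would look up the stored comment.
    body = b"<p>user-supplied markup goes here</p>"
    headers = [
        ("Content-Type", "text/html-sandboxed"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```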
>> 2.1) The ability to disable loading of external resources (images,
>> scripts, etc) in the sandboxed document. The common usage scenario is
>> when you do not want the displayed document to "phone home" for privacy
>> reasons, for example in a web mail system.
> Good point. Should we make sandbox="" disable off-origin network requests?
>> 2.2) The ability to disable HTML parsing. On IFRAMEs, this can actually
>> be approximated with the excommunicated <plaintext> tag, or with
>> Content-Type: text/plain / data:text/plain,. On token-guarded SPANs or
>> DIVs, however, it would be pretty damn useful for displaying text
>> content without the need to escape &, <, >, etc. "Pure" security benefit
>> is limited, but as a phishing prevention and display correctness
>> measure, it makes sense.
> I don't really understand the use case here; could you elaborate?
> On Sun, 13 Dec 2009, Michal Zalewski replied to Tab:
>> > I believe that the @doc attribute, discussed in the original threads
>> > about @sandbox, will be introduced to deal with that. It'll take
>> > plain html as a string, avoiding the opaqueness and larger escaping
>> > requirements of a data:// url, as the only thing you'll have to escape
>> > is whichever quote you're using to surround the value.
>> That doesn't strike me as a robust way to prevent XSS - the primary
>> reason why we need sandboxing to begin with is that people have a
>> difficulty properly parsing, serializing, or escaping HTML; so replacing
>> this with a mechanism that still requires escaping is perhaps
> There's a world of difference between "properly parsing, serializing, or
> escaping HTML" and "escaping quotes and ampersands".
>> > More importantly, though, it puts a significant burden on authors to
>> > generate unpredictable tokens. Is this difficult? No, of course not.
>> > But people *will* do it badly, copypasting a single token in all their
>> > <iframe>s or similar.
>> People already need to do this well for XSRF defenses to work, and I'd
>> wager it's a much simpler and better-defined problem than real-world
>> HTML parsing and escaping could realistically be. It is also very easy
>> to delegate this task to existing functions in common web frameworks.
> Do people get CSRF right more often than simply escaping characters? It
> seems implausible that authors get complex cryptographic properties right
> more often than a simple set of substitutions, but I suppose stranger
> things are true on the Web.
>> Also, a single token on a returned page, as long as it's unpredictable
>> across user sessions, should not be a significant issue.
> I'm just worried that some people would just use a constant string.
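To make the disagreement concrete, here is a minimal Python sketch of what generating a fresh guard token per element might look like, using the standard library's secrets module. The guarded_span helper and the <span sandbox="token"> ... </span sandbox="token"> syntax are hypothetical; the syntax is taken from the proposal quoted earlier in this thread and was never specified.

```python
import secrets


def guarded_span(untrusted: str) -> str:
    """Wrap untrusted markup in a token-guarded span (hypothetical syntax)."""
    # A fresh, unpredictable token for every element; never a constant string.
    token = secrets.token_urlsafe(16)
    # Extremely unlikely, but refuse content that happens to contain the token.
    if token in untrusted:
        raise ValueError("token collision with content")
    return '<span sandbox="{0}">{1}</span sandbox="{0}">'.format(token, untrusted)
```

The failure mode under discussion is an author replacing the secrets.token_urlsafe call with a hard-coded value, which would still render correctly and so would go unnoticed.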
> On Sun, 13 Dec 2009, Adam Barth wrote:
>> I agree that we need something to help with content received by
>> cross-site XMLHttpRequest and postMessage. For those use cases, we're
>> already running script, so a design like toStaticHTML seems better than
> If the data is to be rendered into a block-level box, it seems that
> srcdoc="" might actually handle that case too.
> On Sun, 13 Dec 2009, Michal Zalewski replied to Adam:
>> > The @sandbox seems like a better fit for the advertising use case.
>> I am not contesting this, to be clear - I am aware of many cases where
>> it would be very useful - but gadgets are a fairly small part of the
>> Internet, and seems like a unified solution would be more desirable than
>> several very different APIs with different granularity.
>> The toStaticHTML-alike will address other specific uses, but will
>> leave applications that can't rely on JS exclusively for their rendering
>> needs (which I'd wager is still a majority) out in the cold; which would
>> probably lead to a yet another XSS prevention / HTML sandboxing approach
>> emerging later on.
>> I haven't really seen a compelling argument why all these can't be
>> unified without a significant increase in code or spec complexity -
>> maybe one exists.
> What would they be unified under? I don't think anyone has proposed
> anything that solves all the problems that CSP, sandbox="", srcdoc="",
> toStaticHTML(), httpOnly, text/html-sandboxed, and the various other
> "security" mechanisms introduced to the platform over the past few years
> would solve without introducing more complexity overall.
> There are many problems to solve. It seems logical that we'd end up with
> many solutions.
> On Sun, 13 Dec 2009, Michal Zalewski replied to Adam:
>> > That seems like a backwards way of proceeding. Do you have a proposal
>> > for unification besides the <jail> tag?
>> The only fundamental objection I have heard against it is the trouble
>> with XML representation.
> Well, it also doesn't really solve all the problems. For example, it
> doesn't solve the "embedding external content safely" problem.
>> The other option is to simply require a traditional CDATA-esque behavior
>> or a tag parameter - which would place the burden on the author to
>> filter out / escape a single exact string or a quote, but would be
>> similar otherwise.
> That's similar to what srcdoc="" does when used with sandbox="".
>> It's obviously less secure - because while the token-based approach
>> actually requires the user to explicitly come up with a token, however
>> poor it might be; whereas here, there is no way to enforce escaping.
> The token-based approach could lead an author to just coming up with a
> constant token, which is just as useless as not enforcing escaping, except
> that the author had to wonder how to get security to use it, and thus the
> author will have a false sense of security whose only likely failure mode
> is an actual attack. Compare this to srcdoc="", where the failure mode is
> the use of a quote mark, and is thus likely to happen much earlier than an
> attack. It's also easier to understand the failure mode. "The token has to
> be unguessable" is harder to explain than "quotes have to be escaped".
>> From Tab's response, looks like it's being considered, too - @doc +
>> @seamless. What strikes me as a bit ironic is that this way, we're
>> overloading IFRAME to become something else entirely, and after
>> rejecting token-guards, settling for an option that is definitely not
>> perfect, and in practice, I think, is bound to be less secure.
> I don't really follow the "something else entirely" bit. Also, why would
> it be less secure? What is the attack scenario?
> On Sun, 13 Dec 2009, Michal Zalewski wrote:
>> Huh? But that's not the point I am making... I am not arguing that
>> iframe sandbox should be abandoned as a bad idea - quite the opposite.
>> I was merely suggesting that we *expand* the same logic, and the same
>> excellent security control granularity, to span and div; this seems like
>> it would not increase the implementation complexity in any significant
> I don't understand the proposal then. What is the problem it is solving,
> and how does it solve it?
>> We could then allow these to be populated with secure contents in three
>> 1) Guarded closing tag - this is simple and bullet-proof; but may
>> conflict with XML serializations, and hence require some hacks,
> I strongly disagree with the characterisation of this idea as "simple and
> bullet-proof", at least for anyone who doesn't understand cryptography.
>> 2) CDATA or @doc-like approaches. Less secure because it does not
>> enforce a security control, but less contentious, and already being
>> considered for IFRAMEs.
> I don't understand what you mean by "does not enforce a security control",
> or how a guarded closing tag does "enforce a security control".
>> 3) .innerHTML, which would be then safe by default, without the need for
>> .innerSafeHTML (and the associated ambiguities) or explicit
>> .toStaticHTML calls.
> To run scripts in a safe environment, we need to have a separate global
> object, which is why we're using <iframe> for it. This supports the
> equivalent of ".innerHTML" as you describe (.srcdoc).
> If you just want something that blocks scripts, plugins, forms, targeted
> links, etc, without a separate document, then it's not clear to me that
> that is something that is sanely achievable. It would require complex
> changes all over the place.
> What is the use case this is targeted at?
> On Sun, 13 Dec 2009, Adam Barth wrote:
>> I'm very interested in a solution that works for the following use
>> 1) A web page wants to display untrusted (i.e., restricted) HTML
>> received via cross-site XMLHttpRequest or postMessage.
> Do you have a concrete use case for which <iframe> doesn't work?
>> 2) A blog wishes to display many comments containing untrusted (i.e.,
>> restricted) HTML.
> It seems <iframe srcdoc> works well for this case. You can even safely
> enable scripts in the comments, so that people can upload little
> calculator-like things or games, not that I would recommend that!
> On Sun, 13 Dec 2009, Michal Zalewski wrote:
>> [...] this really strikes me as throwing random ideas at the wall, and
>> seeing which ones stick.
> Welcome to Web standards development. :-)
>> Furthermore, in this particular case, I am really concerned that the
>> spec is at odds with itself - you mention certain specific use cases,
>> but the spec seems to be after a broader goal: sandboxing user-supplied
>> content in general. In doing so, it gives some bad advice (again, the
>> user content example is exploitable, at least until the arrival of some
>> out-of-scope security mechanism to prevent it).
> I've added a warning to the spec pointing out that the text/html-sandboxed
> MIME type has to be used in that case.
> On Sun, 13 Dec 2009, Aryeh Gregor wrote:
>> So instead, why not just use the standard escaping mechanisms we already
>> have? Allow a sandbox attribute on all elements that can contain
>> phrasing or flow content. Any such element with a sandbox attribute
>> will be required to contain no literal <>'" before the closing tag. If
>> any of those four characters is encountered, the element is treated as
>> having no contents. Otherwise, the browser unescapes all characters
>> with special meanings ("&lt;" -> "<", "&gt;" -> ">", "&amp;" -> "&",
>> etc.) and then treats the resulting string as the inner HTML of the
>> element, parsing it like regular HTML, but the contents are sandboxed.
>> <span sandbox>This span will work normally, except for being
>> <span sandbox>This span will be <em>empty</em> in the DOM, even though
>> it contains no evil content, because otherwise authors will forget to
>> escape the contents of the sandbox.</span>
>> <span sandbox>&lt;span&gt;But this span will have another span as its
>> child, sandboxed. The regular parser sees no entities here, only a
>> nested span!&lt;/span&gt;</span>
>> <span sandbox>It would be safe to allow this to work, since it only
>> contains an apostrophe, but let's not, so that lack of escaping is
>> easier to catch. This span is therefore also empty.</span>
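The rule proposed in the quoted text above can be sketched in a few lines of Python: content containing any literal <, >, ' or " is treated as empty, and anything else is unescaped before being handed on as the element's inner HTML. The function name sandbox_inner_html is made up for this example, and html.unescape is only a stand-in for whatever entity decoding a browser would actually perform.

```python
import html

# Characters that must never appear literally inside a sandboxed element.
FORBIDDEN = set("<>'\"")


def sandbox_inner_html(raw: str) -> str:
    """Return the markup to parse inside the sandbox, or '' if raw is unescaped."""
    if any(ch in FORBIDDEN for ch in raw):
        # Lack of escaping is made obvious: the element ends up empty.
        return ""
    # One level of unescaping; the result is then parsed as sandboxed HTML.
    return html.unescape(raw)
```

Under this sketch an escaped nested span is decoded into real markup for the sandbox, while content with a literal tag or even a lone apostrophe yields an empty element, matching the quoted examples.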
> What would the "sandbox" do, other than require one level of escaping?
> i.e. what is it protecting against?
> Ian Hickson U+1047E )\._.,--....,'``. fL
> http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
> Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'