[whatwg] The <iframe> element and sandboxing ideas

Wed May 21 15:30:48 PDT 2008

Summary:

 * I've added a sandbox="" attribute to <iframe>, which by default
   disables a number of features and takes a space-separated list of
   features to re-enable:

     - by default, content in sandboxed browsing contexts, and any
       browsing contexts nested in them, have a unique origin
       (independent of the origin of their URI); this can be overridden
       using the "allow-same-origin" keyword

     - by default, all form controls in those browsing contexts are
       disabled; this can be overridden using the "allow-forms"
       keyword

     - by default, script in those browsing contexts cannot run; this can
       be overridden using the "allow-scripts" keyword

     - content in those browsing contexts cannot navigate other
       browsing contexts outside of the sandbox (seamless="" below
       overrides this)

     - content in those browsing contexts cannot create new browsing
       contexts or open modal dialogs or alerts

     - all plugins in those browsing contexts are disabled

 * I've added a seamless="" boolean attribute to <iframe>, which, if
   the content's active document's URI has the same origin as the
   container, causes the iframe to size vertically to the bounding box
   of the contents, and horizontally to the width of the container,
   and which causes the initial containing block of the contents to be
   treated as zero height. In addition, styles on the root element of
   the content must inherit from the <iframe> instead of being the
   initial values, and the style sheets that apply to the <iframe>
   must also apply to the contents. In addition, any time the browsing
   context navigates itself, the parent browsing context gets
   navigated instead.
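
To make the sandbox="" keyword handling above concrete, here is a rough 
model of it in Python. This is only a sketch: the restriction labels are 
names I made up for illustration (they are not identifiers from the spec), 
exact-match keyword comparison and ignoring unknown tokens are both 
assumptions, and the seamless="" override of the navigation restriction is 
not modelled.

```python
# Illustrative model only. The restriction labels are hypothetical names,
# not identifiers taken from the spec.
KEYWORD_TO_RESTRICTION = {
    "allow-same-origin": "unique-origin",
    "allow-forms": "forms-disabled",
    "allow-scripts": "scripts-disabled",
}

# Restrictions that none of the keywords listed in the summary can lift.
ALWAYS_ON = frozenset({
    "no-navigating-outside-contexts",
    "no-new-contexts-or-dialogs",
    "plugins-disabled",
})


def active_restrictions(sandbox_value):
    """Return the restrictions in force for a given sandbox="" value.

    Pass None when the attribute is absent (no sandboxing at all).
    """
    if sandbox_value is None:
        return frozenset()
    # Start with everything disabled, then re-enable what is asked for.
    restrictions = set(ALWAYS_ON) | set(KEYWORD_TO_RESTRICTION.values())
    for token in sandbox_value.split():
        lifted = KEYWORD_TO_RESTRICTION.get(token)
        if lifted is not None:  # unknown tokens are ignored in this sketch
            restrictions.discard(lifted)
    return frozenset(restrictions)
```

With this model, sandbox="allow-scripts allow-forms" still leaves the 
content with a unique origin and with plugins disabled.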

This is all HIGHLY EXPERIMENTAL. I am looking for feedback on the general 
approaches taken.

There are various things that this doesn't address yet; e.g. there's no 
way to force (or even allow) a non-seamless iframe to open links in the 
parent window.

On Thu, 9 Mar 2006, Alexey Feldgendler wrote:
> 
> Let's imagine a blogging website that allows anybody to create a blog 
> which is available as http://www.example.com/blogs/username/. Many such 
> sites allow various user customization, so imagine this site lets the 
> blog owner supply custom HTML to display on top of the blog page. 
> This is primarily used by blog authors to design stylish navigation. To 
> make such navigation menus more attractive, the authors wish to use 
> JavaScript and Flash, but unrestricted JavaScript would make it possible 
> for the blog owner to steal visitors' session cookies.
> 
> The blog author logs in and opens some kind of customization screen:
> 
> HTML to display on top of your blog: [TEXTAREA]
> [SUBMIT]
> 
> So, imagine the blog author enters into the textarea:
> 
> Welcome to my blog!</sandbox><a href="#"
> onclick="alert(document.cookie)">Click here</a>
> 
> After submission, this code is fed to the HTML cleaner. At present, HTML 
> cleaners are usually complicated scripts which try to catch known quirks 
> of the user agents, and still they usually have security holes found one 
> after another. See for example 
> http://cvs.livejournal.org/browse.cgi/livejournal/cgi-bin/cleanhtml.pl. 
> With HTML 5 parsing spec, there will be one single algorithm for parsing 
> HTML code with well-defined error recovery. So, the HTML cleaner at the 
> server side runs the HTML 5 parser on the user-supplied text, which 
> produces the following DOM:
> 
> * Welcome to my blog!
> * A
>     href="#"
>     onclick="alert(document.cookie)"
>   * Click here
> 
> The </sandbox> tag is ignored as an easy parse error because there is no 
> matching <sandbox> tag in the user-supplied text. After parsing, the 
> HTML cleaner iterates through the tree, renaming potentially unsafe 
> elements and attributes, producing the following:
> 
> * Welcome to my blog!
> * A
>     href="#"
>     safe-onclick="alert(document.cookie)"
>   * Click here
> 
> At the final stage, the HTML cleaner re-serializes the DOM into the 
> following code, which is saved into the database:
> 
> Welcome to my blog!<a href="#" 
> safe-onclick="alert(document.cookie)">Click here</a>
> 
> When the site renders the blog page, it puts the "HTML for page top" 
> inside a sandbox:
> 
> <body>
> <sandbox>
> Welcome to my blog!<a href="#" safe-onclick="alert(document.cookie)">Click
> here</a>
> </sandbox>
> ...
> </body>
> 
> Each blog entry is probably also contained in its own sandbox. This is 
> even more important on the so-called friends pages, where entries by 
> different authors are displayed on the same page.
> 
> When the page is rendered in a modern user agent which supports 
> sandboxing, the safe-onclick attribute is interpreted exactly the same 
> as onclick. When the user clicks the link, the event handler is 
> executed. Because the code is inside the sandbox, it operates on a fake 
> document object, so it doesn't retrieve the cookies (I think 
> document.cookie should just return an empty string). The visitor's 
> session cookies are safe.
> 
> When the page is rendered in an older user agent which doesn't support 
> sandboxing, the safe-onclick attribute is ignored because it is unknown. 
> When the user clicks the link, no event handler is executed, and the 
> cookies are safe again.
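
For reference, the renaming step described above can be sketched roughly 
as follows. This is a sketch under loud assumptions: it uses Python's 
html.parser, which is a tolerant tokenizer and NOT the HTML5 parsing 
algorithm; it renames only attributes whose names start with "on"; and the 
names RenamingCleaner and clean are invented for this example. It is not a 
complete sanitizer.

```python
from html import escape
from html.parser import HTMLParser

VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "param", "source", "track", "wbr"}
RAW_TEXT_ELEMENTS = {"script", "style"}


class RenamingCleaner(HTMLParser):
    """Re-serializes markup, renaming on* attributes to safe-on*.

    Assumption: html.parser stands in for a real HTML5 parser here.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if name.startswith("on"):
                name = "safe-" + name  # e.g. onclick -> safe-onclick
            if value is None:
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s>" % " ".join(parts))
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag such as </sandbox>: dropped
        while True:
            closed = self.open_tags.pop()
            self.out.append("</%s>" % closed)
            if closed == tag:
                break

    def handle_data(self, data):
        in_raw = bool(self.open_tags) and self.open_tags[-1] in RAW_TEXT_ELEMENTS
        self.out.append(data if in_raw else escape(data, quote=False))


def clean(markup):
    cleaner = RenamingCleaner()
    cleaner.feed(markup)
    cleaner.close()
    while cleaner.open_tags:  # close anything the author left open
        cleaner.out.append("</%s>" % cleaner.open_tags.pop())
    return "".join(cleaner.out)
```

Feeding it the textarea content from the example drops the unmatched 
</sandbox> and yields the safe-onclick form shown in the quoted text.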

You can do this now (though it's far uglier) by taking the author's markup 
and converting it to base64, and then stuffing it into an iframe something 
like this:

   <iframe seamless sandbox="allow-scripts allow-forms"
           src="data:text/html;base64,PCFET0NUWVBFIEhUTUw%2BPHRpdGxlPjwvdGl0bGU%2BV2VsY29tZSB0byBteSBibG9nITwvc2FuZGJveD48YSBocmVmPSIjIiBvbmNsaWNrPSJhbGVydChkb2N1bWVudC5jb29raWUpIj5DbGljayBoZXJlPC9hPg0K">
   </iframe>
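
Producing such a URI on the server is mechanical. A minimal sketch in 
Python, assuming UTF-8 markup and percent-encoding of every base64 
character that is not URI-safe (which is how "+" becomes "%2B" above); the 
function names are made up for this example:

```python
import base64
from urllib.parse import quote, unquote

PREFIX = "data:text/html;base64,"


def html_to_data_uri(markup):
    """Encode author markup as a base64 text/html data: URI."""
    payload = base64.b64encode(markup.encode("utf-8")).decode("ascii")
    # Percent-encode "+", "/" and "=" so the URI survives in an attribute.
    return PREFIX + quote(payload, safe="")


def data_uri_to_html(uri):
    """Inverse of html_to_data_uri, useful for checking the round trip."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a base64 text/html data: URI")
    return base64.b64decode(unquote(uri[len(PREFIX):])).decode("utf-8")
```

The result contains no quotes or angle brackets, so it can be dropped into 
src="" without further escaping.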

This isn't very readable, I'll grant you. I'm thinking of introducing a 
new attribute. I haven't worked out what to call it yet, but definitely 
not "src", "source", "src2", "content", "value", or "data" -- maybe 
"html" or "doc", though neither of those is great. This attribute would 
take a string which would then be interpreted as the source document 
markup of an HTML document, much like the above; it would override src="" 
if it was present, allowing src="" to be used for legacy UAs:

   <iframe seamless sandbox="allow-scripts allow-forms" doc="
     <!DOCTYPE HTML>
     <title></title>
     Welcome to my blog!
     </sandbox>
     <a href='#' onclick='alert(document.cookie)'>Click here</a>
   "></iframe>

(There are things we can do to make this better, e.g. make the <!DOCTYPE 
HTML> and <title></title> bits implicit, maybe introducing type="" to say 
whether it's HTML or XML instead of only supporting HTML, maybe saying 
that if src="" and doc="" are both specified they must have identical 
data, etc.)
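
On the server side, generating this would need nothing more than attribute 
escaping. A minimal sketch in Python, assuming the attribute ends up being 
called doc="" (which, as noted, is undecided) and using invented helper 
names; the reader class is only there to show that a parser gets the 
original markup back:

```python
from html import escape
from html.parser import HTMLParser


def iframe_with_doc(markup, sandbox="allow-scripts allow-forms"):
    """Wrap untrusted markup in an <iframe> using the hypothetical doc=""."""
    return '<iframe seamless sandbox="%s" doc="%s"></iframe>' % (
        escape(sandbox, quote=True),
        escape(markup, quote=True),  # escapes & < > " and '
    )


class _DocReader(HTMLParser):
    """Pulls the doc="" value back out of a serialized <iframe>."""

    def __init__(self):
        super().__init__()
        self.doc = None

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            self.doc = dict(attrs).get("doc")


def read_doc_attribute(html):
    reader = _DocReader()
    reader.feed(html)
    reader.close()
    return reader.doc
```

Because quotes and ampersands are escaped, nothing in the author's markup 
can terminate the attribute value early, whatever it contains.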

Comments and suggestions on this are welcome. I haven't added it to the 
spec yet. I do agree that without this or something equivalent we don't 
have a solution for sandboxing embedded blog comments yet.

On Mon, 23 Apr 2007, Jonas Sicking wrote:
>
> The idea is basically an element like <iframe> but that renders the 
> linked page, instead of inside a square area, in flow with the main 
> page. This idea is really rough still, but I hope to try to implement it 
> in a not too distant future to solidify it a bit. One thing very much up 
> in the air is what the element would be called. Suggestions welcome, but 
> I'm using the name <include> below.

I've basically added this to <iframe> using the seamless="" attribute.

> Should the stylesheets of the outer or the inner document be used?

I went with "yes".

> When a fragment identifier is specified, should we render that element, 
> or its children?

I went with making that work the same as with normal <iframe>s (so likely 
no effect if the default shrink-wrapping-to-bounding-box behaviour is in 
effect).

> Should style be inherited from the parent of the <include>, or from the 
> DOM parent in the inner document?

I've made inheritance happen from <iframe> to root element.

> Should the inner DOM be rendered inside of, or in place of the <include>?

I've made this happen as with <iframe>.

On Mon, 23 Apr 2007, Gervase Markham wrote:
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=80713

I've taken the notes there into account.

On Mon, 23 Apr 2007, Jonas Sicking wrote:
> 
> There's a big difference to that and to what I'm proposing. With what's 
> in bug 80713 you're still limited to a box that basically doesn't take 
> part of the outer page at all. For example in the table example in my 
> original post the headers of the table would not resize to fit the 
> column sizes in the <include>ed table.

Woah. That's far more radical. I have no idea how to do that. How would 
you make the parser not generate the implied elements and switch straight 
to the "in table" mode? How would you make the CSS model work with this? 
How would you define conformance for the document fragments?

On Thu, 26 Apr 2007, Martin Atkins wrote:
> 
> Would documents included via <include> run in the security context of 
> the including page, as with the script technique, or would they run in 
> the context of the included document, as with iframes?

The sandbox="" attribute can be specified to change it from the former to 
the latter (and in fact, from the former to an isolated origin regardless 
of the true origin of the document).
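
As a rough illustration of that last point, here is a sketch in Python of 
how the origin used for security checks might be picked. The tuple 
representation, the default-port table and the function names are all 
assumptions made for this example; nested browsing contexts and data: URIs 
are not modelled.

```python
import itertools
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
_unique_ids = itertools.count(1)


def url_origin(url):
    """Origin of a URL as a (scheme, host, port) tuple."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def effective_origin(url, sandbox_value=None):
    """Origin for a document; sandbox_value is None when there is no sandbox=""."""
    sandboxed = sandbox_value is not None
    if sandboxed and "allow-same-origin" not in sandbox_value.split():
        # A fresh origin that compares equal to nothing else, whatever the URL.
        return ("unique", next(_unique_ids))
    return url_origin(url)
```

Two sandboxed frames loaded from the same URI therefore get different 
origins unless "allow-same-origin" is given.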

On Fri, 27 Apr 2007, Jonas Sicking wrote:
> 
> They would run in the context of the included page, just like an iframe. 
> The processing of <include> is exactly that of <iframe> the only 
> difference is in the rendering.

It may be worth bringing this up with the CSSWG if it really is just a 
rendering issue.

On Tue, 8 May 2007, Dean Edwards wrote:
> 
> XBL has an attribute to cover inherited styles, so you're right. 
> Realistically, I can't see Microsoft ever implementing XBL (I hope I'm 
> wrong). So adding it to HTML might be the only way to achieve this 
> functionality.

Inventing a new technology that does the same as another on the basis that 
the UAs will implement one but not the other seems dubious at best.

> Kind of like an <iframe> but without an external source.

Would the doc="" proposal above be enough?

On Tue, 8 May 2007, Henri Sivonen wrote:
> 
> I wonder if this issue could be solved on the layout/CSS level by 
> providing a way to make the height of an iframe depend on the actual 
> height of the root element of the document loaded in the iframe. That 
> is, would it be feasible to make the iframe contents have the layout/UI 
> feel of a part of the parent page while keeping the DOMs and script 
> security contexts separate?

That's pretty much what seamless="" does, yes.

On Tue, 8 May 2007, Jon Barnett wrote:
> 
> http://www.w3.org/TR/css3-box/#intrinsic0 (and also CSS2 10.6)
> Since CSS doesn't attempt to specify the intrinsic width of a document in an
> iframe, maybe HTML5 should specify that the intrinsic width of a document
> is:
> - if the CSS width property is specified on the html element, the margin-box
> of the page at that width (which may have overflow)
> - else, if the CSS min-width property is specified on the html element, the
> margin-box of the page at that width (which may have overflow)
> - else, the smallest width the page can have without horizontal scrolling
> and the intrinsic height of the document is:
> - if the CSS height or min-height property are set, similar to above,
> - else, the smallest height the page can have at the intrinsic width of the
> document without vertical scrolling

That seems overly complicated, but the spec says something similar in 
fewer words.

On Thu, 10 May 2007, Magnus Gasslander wrote:
>
> I see you have done some work to prevent reflow loops with percentage 
> root heights > 100%, but how does your patch handle an iframe document 
> that looks like this? (I can think of nastier testcases also, where 
> "bottom"  is embedded further down in the document)
> 
> <html>
> <head>
> </head>
> <body>
> <div style="position:absolute;bottom:-5px;">This will force a scrollbar on the
> document</div>
> </body>
> </html>

As far as I can tell, the spec handles this fine.

On Mon, 14 May 2007, Michel Fortin wrote:
> 
> What about encoding the content of each comment iframe in a "data:" URI?

That unfortunately isn't compatible with IE, and has rather unfortunate 
non-trivial escaping requirements.

On Mon, 14 May 2007, Jon Barnett wrote:
> 
> The contents of an iframe with a data: URI source should be trusted, 
> unlike an iframe with an http: URI source from another domain.  A script 
> in an iframe with a data: URI source should, by default, be able to 
> communicate with the parent window.  So, that alone doesn't solve the 
> problem.

Adding sandbox="" solves this (at least for new UAs).

On Mon, 14 May 2007, Alexey Feldgendler wrote:
> 
> Not to mention that data: URIs are ugly, wasteful (because of the BASE64 
> encoding), cannot be read and written by humans directly, and have 
> maximum length problems in some implementations.

Right.

On Mon, 14 May 2007, Alexey Feldgendler wrote:
> 
> Yes, I want the sandbox to degrade securely, as does any webmaster who 
> might be going to allow some user-supplied scripting while relying on 
> sandboxing for security. To cover its use cases, this feature must 
> degrade securely.

Degrade securely _and usefully_, or just securely (and maybe to nothing)?

The latter is handled by the doc="" proposal. The former may be impossible 
without server-side filtering.

> This does degrade securely, doesn't require separate HTTP requests, and
> maintains human readability.
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2005-December/005301.html

This still requires server-side filtering, though.

On Mon, 14 May 2007, Michel Fortin wrote:
> 
> People are already struggling to remove all scripts from HTML snippets. 
> I don't think finding all these occurrence and replacing them is going 
> to be much better. Also, you'd need safe-style="" and <safe-style> too, 
> since IE can embed javascript expressions into style rules. (And now 
> lets hope IE does not allow expression elsewhere.)

Indeed.

> This principle could be transposed to <sandbox>, where it could be 
> defined as taking the unsafe HTML content from an attribute. And the 
> best part: you don't need anything else like the safe-* substitutions as 
> suggested earlier for <sandbox>:
> 
>     <sandbox type="text/html" content="
>       <p>"Unsafe" content here:</p>
>       <script>
>         document.write(window.parent.location)
>       </script>
>     ">
>        Alternative, possibly degraded but safe content for older browsers.
>     </sandbox>

I think we'd want to use <iframe> for this, but otherwise, yes.

On Tue, 15 May 2007, Gervase Markham wrote:
> 
> Would you really want separate security contexts for each comment?

On Tue, 15 May 2007, Alexey Feldgendler wrote:
> 
> I wouldn't want to allow people screw up others' comments, making it 
> look that other users wrote what they didn't write. So, yes, it's 
> important that any code within a comment cannot change anything but 
> itself. This also means that the comment should be unable to change the 
> header/footer around it to pretend that someone else wrote it.

Documents per comment are expensive, but they do seem to be what we need 
(or maybe want) here.

On Tue, 15 May 2007, Kristof Zelechovski wrote:
>
> The OP probably meant that maintaining so many contexts would cause a 
> comparable deterioration in performance.  All user comments should be 
> put in one security context.
>
> With all comments grouped together in such a manner, you could even use 
> an inline frame.

While simple, this wouldn't let you do things like have trusted content 
interleaved with comments (e.g. "edit" and "reply" links), which is 
common.

On Tue, 15 May 2007, Jon Barnett wrote:
> 
> I really think comments are a bad use case.  Why would someone allow 
> scripts in comments in any context, much less a sandboxed one?

You wouldn't, but you would want to prevent scripts from running 
altogether.

> The best use case I have thought of so far is MySpace et. al., a site 
> where users have their own page with limited permission in the context 
> of the overall site.  MySpace solves this by not allowing scripts at 
> all, as most such web sites do.  If possible, such sites might allow a 
> user to insert widget scripts with limited permissions.  For this use 
> case, iframe isn't ideal, either, but limited scripting and styling are 
> desired.

Would the spec's current proposals work?

On Wed, 9 May 2007, Alexey Feldgendler wrote:
> On Tue, 08 May 2007 05:50:38 +0200, Ian Hickson <ian at hixie.ch> wrote:
> > 
> > This probably depends on the use cases in question. For some use 
> > cases, the status quo is in fact the script running with full 
> > privileges, so while not being ideal, it is indeed acceptable; in 
> > other cases, you wouldn't want scripts to run at all if they weren't 
> > limited in some way.
> 
> A security feature, by definition, protects the users from a certain 
> class of attacks. An attack needs to be only successful in one browser 
> to do harm. For example, a malicious advertising script which actually 
> steals passwords entered by users on the host page is dangerous enough 
> even if the attacker only succeeds in stealing passwords of just a 
> fraction of the users.
> 
> I can't really imagine a scenario in which sandbox restrictions could be 
> somehow considered optional. Wherever there is need for such 
> restrictions, it's unacceptable to run the script without them 
> implemented.

In some cases the sandbox would be "defence in depth" -- for example, in 
all cases where user-generated content is embedded today.

> The key differences from <iframe> are:
> 
> 1. Doesn't require loading of a separate document via a separate HTTP 
> request, and without the ugliness of data: URIs. If there was some 
> "inline" version of <iframe>, such as <iframe>content</iframe>, that 
> would be just fine.

doc="" would handle this, then...

> 2. Implements the security barrier even though the inner content doesn't 
> come from a different domain. <iframe> would require a separate domain 
> for that.

sandbox="" does this now.

> 3. The security barrier is asymmetric, i.e. the outer scripts have 
> access to the inner content, but not the other way round.

What's the use case for this?

> All attempts to treat user-submitted HTML as a string are doomed to 
> having such vulnerabilities. <sandbox> alone doesn't add much to this 
> problem. Just look at how complex is the HTML sanitizer in LiveJournal 
> which allows some user-submitted markup but not all.

That's one advantage of the doc="" idea; it makes sanitising mostly 
trivial compared to all other ideas for this.

On Thu, 10 May 2007, Gervase Markham wrote:
> 
> If attributes on closing tags were allowed, you could do:
> 
> <sandbox secret="09f9...">Hello World</sandbox secret="09F9...">
> 
> In other words, make them match. So any inserted </sandbox> tags 
> wouldn't close the sandbox unless they knew the secret - which they 
> couldn't do, because they have the chicken-and-egg problem of having to 
> be able to read the page first.

This relies on the author being able to reliably produce unpredictable 
content, which is a very dubious responsibility to put on many authors.

Also, it would make the XML guys have a fit. Then again, maybe that goes 
in the "pro" column and not the "con" column...
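
To make the first objection concrete: the author's server would have to do 
something like the following for every page it serves. This is a sketch in 
Python; the <sandbox secret=""> syntax is the hypothetical one proposed 
above, and the function names are made up.

```python
import secrets


def new_sandbox_secret():
    """Return a fresh 128-bit secret from the OS's secure random source."""
    return secrets.token_hex(16)


def wrap_in_sandbox(user_markup):
    """Wrap untrusted markup using the proposed matching-secret syntax.

    Returns (wrapped_markup, secret). A new secret is drawn per call, so
    the untrusted content cannot have been written with knowledge of it.
    """
    secret = new_sandbox_secret()
    if secret in user_markup:
        # Astronomically unlikely, but cheap to rule out.
        raise ValueError("secret collides with user content")
    wrapped = '<sandbox secret="%s">%s</sandbox secret="%s">' % (
        secret, user_markup, secret)
    return wrapped, secret
```

Anything weaker (a fixed string, a timestamp, a counter) is guessable, 
which is exactly the mistake I would expect many authors to make.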

> http://www.gerv.net/security/content-restrictions/ , specifically the 
> "hierarchy" restriction, allows the <iframe> content to be isolated from 
> the parent.

It's not entirely clear what the proposal here is; as far as I can tell 
it's an HTTP header. Is that right? Self-describing the security 
restrictions on content works for same-site serving, but not really for 
third-party content.

> IE has the proprietary "security" attribute on <iframe> which restricts 
> script in various ways: 
> http://msdn2.microsoft.com/en-us/library/ms534622.aspx

I tried using this, but it was tied too closely to IE's own security 
concepts to really make use of it, sadly.

On Thu, 15 May 2008, Henri Sivonen wrote:
> > 
> > Documents don't have intrinsic dimensions, and the user's default font 
> > size is likely to vary from user to user. How would you know what 
> > height and width to give?
> 
> You give it the dimensions of an industry-standard ad banner size.

On Fri, 16 May 2008, James Justin Harrell wrote:
> 
> The same way you would know what height and width to give to a 
> non-replaced element. Why should an embedded document not be able to 
> render as if the contents of the document were present inline in the 
> parent document? Backwards compatibility should probably trump better 
> behavior here, but why is it not possible to specify this through CSS?
> 
> I've heard of this problem multiple times. For example, 
> http://weblogs.mozillazine.org/gerv/archives/2005/02/autosizing_ifra.html

I've added height/width back.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'