[whatwg] Content Restrictions
Alexey Feldgendler
alexey at feldgendler.ru
Thu Mar 2 06:30:23 PST 2006
On Tue, 21 Feb 2006 10:31:51 +0600, Hallvord Reiar Michaelsen Steen
<hallvord at hallvord.com> wrote:
>> What is or what isn't technically simple to implement in existing
>> implementations should perhaps not be what decides how specifications
>> are written. It is clear that it is possible to implement per-function
>> security tracking (though slightly unclear how such security tracking
>> should work; which of all currently executing functions determine the
>> security context?)
Only the innermost one does. I posted the exact rules a couple of weeks
ago.
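
To illustrate only the "innermost function decides" idea (the exact rules
are in the earlier message and are not reproduced here), a minimal sketch
in Python. The Frame type, its fields, and the example origins are
hypothetical names invented for this illustration, not part of any
implementation.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One entry on a (hypothetical) call stack."""
    function: str
    origin: str  # origin of the script that defined the function


def effective_origin(call_stack):
    """Return the origin used for a security check.

    Assumption for this sketch: only the innermost (most recently
    entered) frame matters; outer frames are not consulted.
    """
    if not call_stack:
        raise ValueError("no function is currently executing")
    return call_stack[-1].origin


# A trusted page function calls into a function from untrusted content.
stack = [
    Frame("pageInit", "http://example.org"),
    Frame("userWidget", "http://untrusted.example"),
]
print(effective_origin(stack))
```

In this sketch the check sees the untrusted origin even though the outer
caller is trusted.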
>> It is also clear that it hasn't exactly been required by
>> implementations yet, so it is likely that an implementation doesn't
>> have it already. And since it involves storing more information,
>> implementing it is likely to cost some in terms of memory use.
In Gecko, as far as I can see from its source code, it wouldn't add memory
overhead. Gecko already has origin tracking of some kind (used to implement
today's usual security restrictions).
>> why doesn't the author simply make
>> sure to serve the untrusted content from another server (with another
>> host name or port number, that is, not necessarily another machine)?
This is what LiveJournal does now. However:
1. For many small sites it's not an option,
2. It doesn't solve the problem of untrusted JS included by a page.
>> Seems that brings another (although simpler) set of problems though:
>> what if the untrusted content contains a "</SANDBOX>" tag, or if it
>> ends with "<!--", or possibly other syntax anomalies?
I never said that the website won't have to do HTML cleaning for
user-supplied content. But with the HTML 5 reference parsing algorithm,
such cleaning is going to be much easier and more straightforward: parse
the text into a DOM (as if it were inside BODY, for example), remove or
modify forbidden elements, then serialize it. That way, </SANDBOX> will be
ignored as an easy parse error because it doesn't match an opening tag
within the user-supplied text. An unclosed comment will be ignored, too.
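
The parse, filter, serialize steps above can be sketched roughly as
follows. This is only an illustration under stated assumptions: Python's
standard html.parser is used as a stand-in and is NOT the HTML 5 parsing
algorithm, the FORBIDDEN and VOID sets are hypothetical examples, and the
filter is far from a complete sanitizer (it does not look at URLs in
attributes, for instance).

```python
from html import escape
from html.parser import HTMLParser

FORBIDDEN = {"script", "style", "iframe", "object", "embed"}  # example policy
VOID = {"br", "hr", "img", "input"}  # elements that never have an end tag


class Node:
    def __init__(self, tag, attrs=(), parent=None):
        self.tag, self.attrs, self.parent = tag, list(attrs), parent
        self.children = []  # child Nodes and text strings, in order


class TreeBuilder(HTMLParser):
    """Builds a small tree; comments are dropped (default handler)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node(None)
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID:
            self.current = node

    def handle_endtag(self, tag):
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is self.root:
            return  # stray end tag such as </SANDBOX>: ignore it
        self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def serialize(node):
    out = []
    for child in node.children:
        if isinstance(child, str):
            out.append(escape(child, quote=False))
        elif child.tag not in FORBIDDEN:
            # Drop event-handler attributes such as onclick.
            attrs = "".join(
                f' {k}="{escape(v or "")}"'
                for k, v in child.attrs if not k.startswith("on")
            )
            out.append(f"<{child.tag}{attrs}>")
            if child.tag not in VOID:
                # Every element left open in the input is closed here.
                out.append(serialize(child) + f"</{child.tag}>")
    return "".join(out)


def clean(fragment):
    builder = TreeBuilder()
    builder.feed(fragment)
    builder.close()
    return serialize(builder.root)


print(clean('Hi</SANDBOX><script>alert(1)</script><b onclick="x()">bold'))
```

Because the output is re-serialized from the tree, an unmatched
</SANDBOX> never reaches the page, while a properly nested
<sandbox>...</sandbox> pair inside the user's text is kept.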
>> What if it doesn't contain exactly that, but something else that
>> triggers equivalent behaviour in the HTML parser in some implementation?
>> HTML parsers are traditionally quite complex, and quite "fuzzy". The
>> fuzziness hasn't been a security problem before, now all of a sudden it
>> might be.
HTML 5 will make HTML parsing in standards mode well-defined, with
predictable error recovery.
> Did we discuss how the UA should handle a closing </sandbox> tag?
> Would it need to scan forward in the markup to find other closing
> tags and determine if the current one is a part of the enclosed
> markup or the end of the SANDBOX in that page? Perhaps only the first
> and the last SANDBOX open/close tags can be taken into account and
> others discarded?
No need to do that. SANDBOX elements can be nested like many others.
Nevertheless, a </SANDBOX> tag without a matching opening tag inside the
user-supplied content will be ignored during the HTML cleanup process
described above.
There is one more such case: when </SANDBOX> is injected using
document.write("</SANDBOX>"), but that can easily be worked around.
--
Opera M2 8.5 on Debian Linux 2.6.12-1-k7
* Origin: X-Man's Station [ICQ: 115226275] <alexey at feldgendler.ru>