[whatwg] base64 entities
Ryosuke Niwa
ryosuke.niwa at gmail.com
Wed Aug 25 14:32:05 PDT 2010
Does ECMAScript currently have a built-in function for encoding and decoding
base-64? We might want a built-in base-64 encoder / decoder if we are
implementing these base64-encoded entities.
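
For reference, a rough sketch of what script-side encoding and decoding could
look like. It assumes the host provides btoa/atob and TextEncoder/TextDecoder,
which are browser (DOM) functions rather than part of ECMAScript itself. The
names encodeEntity and decodeEntity are made up for illustration, and the
&%...; syntax is only the one proposed below, not anything a parser supports.

```javascript
// Sketch only: btoa/atob and TextEncoder/TextDecoder are host-provided,
// not ECMAScript built-ins. encodeEntity/decodeEntity are hypothetical names.

// Encode text as UTF-8 bytes, base64 those bytes, and wrap in &%...;
function encodeEntity(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b); // one char per byte, as btoa expects
  }
  return "&%" + btoa(binary) + ";";
}

// Reverse the process. Returns null if the input is not of the form &%...;
function decodeEntity(entity) {
  const match = /^&%([A-Za-z0-9+\/]*={0,2});$/.exec(entity);
  if (match === null) {
    return null;
  }
  const binary = atob(match[1]);
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}
```

Going through UTF-8 bytes first matters because btoa only accepts characters
in the range 0-255, so non-Latin-1 text would otherwise throw.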
- Ryosuke
On Wed, Aug 25, 2010 at 1:50 PM, Adam Barth <w3c at adambarth.com> wrote:
> == Summary ==
>
> HTML should support Base64-encoded entities to make it easier for
> authors to include untrusted content in their documents without
> risking XSS. For example,
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> would decode to "HTML5's <canvas> element is awesome." Notice that
> the < and > characters get emitted by the parser as character tokens.
> That means they can't be used by an attacker for XSS. These entities
> can be used safely both in intertag content as well as in attribute
> values.
>
> == Use Case ==
>
> Authors often combine trusted and untrusted text into HTML documents.
> If done naively, an attacker can supply HTML markup, including script,
> in the untrusted text, resulting in a cross-site scripting attack.
> Authors want a way to include untrusted content safely in HTML
> documents without risking XSS.
>
> == Workarounds ==
>
> Currently, authors must carefully escape all untrusted content to
> prevent an attacker from injecting HTML. Unfortunately, authors often
> apply the incorrect escaping or forget to escape entirely, resulting
> in security vulnerabilities. Escaping content in HTML is tricky
> because authors need to use different escaping rules for different
> contexts. For example, PHP's htmlspecialchars isn't sufficient in the
> following contexts:
>
> <img alt=<?php echo htmlspecialchars($name) ?> src="...">
>
> <script>
> elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
> </script>
>
> Some frameworks convert untrusted content to a series of hex entities,
> but that greatly increases the length of the content.
>
> == Proposal ==
>
> We should add a new kind of HTML entity that authors can use to
> include untrusted content. In particular, authors should be able to
> supply untrusted content in base64, which nicely avoids any scary
> characters. We can avoid clashes with existing or future entities by
> using a new character after the & escape character. In particular, we
> could use the % character:
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> Authors could then supply untrusted content as follows:
>
> <img alt=<?php echo htmlescape($name) ?> src="...">
>
> where htmlescape is defined as follows:
>
> function htmlescape($text) {
>   return "&%" . base64_encode($text) . ";";
> }
>
> Adam
>