[whatwg] base64 entities
Adam Barth
w3c at adambarth.com
Wed Aug 25 14:32:53 PDT 2010
btoa and atob should do the trick.
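
A rough sketch of how those two could back the proposed entity, for illustration only. The names htmlEscapeBase64 and decodeBase64Entity are made up here, and the &%...; syntax is just the proposal quoted below, not something any parser accepts. It also assumes the text should be UTF-8 encoded first, since btoa only takes characters in the 0-255 range:

```javascript
// Hypothetical counterpart to the htmlescape() helper in the quoted proposal.
function htmlEscapeBase64(text) {
  // btoa throws on code units above 255, so convert to UTF-8 bytes first
  // and build a "binary string" with one character per byte.
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return "&%" + btoa(binary) + ";";
}

// Hypothetical decoder: roughly what a parser would do with such an entity.
function decodeBase64Entity(entity) {
  const match = /^&%([A-Za-z0-9+\/]*={0,2});$/.exec(entity);
  if (match === null) {
    throw new Error("not a base64 entity");
  }
  // atob yields one character per byte; map them back to bytes, then
  // decode those bytes as UTF-8.
  const binary = atob(match[1]);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// The example payload from the proposal ends with a newline character.
const entity = htmlEscapeBase64("HTML5's <canvas> element is awesome.\n");
console.log(entity);
console.log(JSON.stringify(decodeBase64Entity(entity)));
```

The decoded result is a plain string, so a caller would still have to insert it as text (for example via textContent) rather than as markup.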
Adam
On Wed, Aug 25, 2010 at 2:32 PM, Ryosuke Niwa <ryosuke.niwa at gmail.com> wrote:
> Does ECMAScript currently have a built-in function for encoding & decoding
> base-64? We might want a built-in base-64 encoder / decoder if we are
> implementing these base64-encoded entities.
> - Ryosuke
> On Wed, Aug 25, 2010 at 1:50 PM, Adam Barth <w3c at adambarth.com> wrote:
>>
>> == Summary ==
>>
>> HTML should support Base64-encoded entities to make it easier for
>> authors to include untrusted content in their documents without
>> risking XSS. For example,
>>
>> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>>
>> would decode to "HTML5's <canvas> element is awesome." Notice that
>> the < and > characters get emitted by the parser as character tokens.
>> That means they can't be used by an attacker for XSS. These entities
>> can be used safely both in intertag content as well as in attribute
>> values.
>>
>> == Use Case ==
>>
>> Authors often combine trusted and untrusted text into HTML documents.
>> If done naively, an attacker can supply HTML markup, including script,
>> in the untrusted text, resulting in a cross-site scripting attack.
>> Authors want a way to include untrusted content safely in HTML
>> documents without risking XSS.
>>
>> == Workarounds ==
>>
>> Currently, authors must carefully escape all untrusted content to
>> prevent an attacker from injecting HTML. Unfortunately, authors often
>> apply the incorrect escaping or forget to escape entirely, resulting
>> in security vulnerabilities. Escaping content in HTML is tricky
>> because authors need to use different escaping rules for different
>> contexts. For example, PHP's htmlspecialchars isn't sufficient in the
>> following contexts:
>>
>> <img alt=<?php echo htmlspecialchars($name) ?> src="...">
>>
>> <script>
>> elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
>> </script>
>>
>> Some frameworks convert untrusted content to a series of hex entities,
>> but that greatly increases the length of the content.
>>
>> == Proposal ==
>>
>> We should add a new kind of HTML entity that authors can use to
>> include untrusted content. In particular, authors should be able to
>> supply untrusted content in base64, which nicely avoids any scary
>> characters. We can avoid clashes with existing or future entities by
>> using a new character after the & escape character. In particular, we
>> could use the % character:
>>
>> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>>
>> Authors could then supply untrusted content as follows:
>>
>> <img alt=<?php echo htmlescape($name) ?> src="...">
>>
>> where htmlescape is defined as follows:
>>
>> function htmlescape($text) {
>>   return "&%" . base64_encode($text) . ";";
>> }
>>
>> Adam
>
>