[whatwg] base64 entities

Adam Barth w3c at adambarth.com
Wed Aug 25 16:41:18 PDT 2010


On Wed, Aug 25, 2010 at 1:55 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Wed, 25 Aug 2010, Adam Barth wrote:
>> HTML should support Base64-encoded entities to make it easier for
>> authors to include untrusted content in their documents without
>> risking XSS.
>
> Seems like a fine idea. Get browsers to implement it and I'll spec it.

I've posted a patch for WebKit:

https://bugs.webkit.org/show_bug.cgi?id=44641

Some subtleties:

1) Some base64 decoders tolerate newlines.  We don't want to decode
entities that contain newlines.
2) Decoding base64 results in binary data.  We'll need to convert that
data to characters in order to deal with it in the DOM.  We always
use UTF-8 for that transformation, regardless of the document's
encoding.
3) Null characters are replaced with U+FFFD.
4) The empty base64 entity &%; is consumed and is replaced with the
empty string.
5) Invalid base64 is rejected and the entity is not decoded.
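
To make the five rules above concrete, here is a rough sketch in Python.
It is not the WebKit patch, only an illustration.  It assumes the entity
is written as "&%" followed by the base64 payload and a ";" (only the
empty form &%; appears above), and the names decode_entity and
expand_entities are made up for this example.  How malformed UTF-8 is
handled is not covered above; the sketch substitutes U+FFFD, which is
an assumption.

```python
import base64
import binascii
import re

# Assumed syntax: "&%" + base64 payload + ";" (hypothetical beyond "&%;").
_ENTITY = re.compile(r"&%([^;]*);")


def decode_entity(payload):
    """Return the decoded text, or None if the entity must stay undecoded."""
    try:
        # validate=True rejects every character outside the base64
        # alphabet, newlines included (rule 1); bad padding also raises
        # (rule 5).  An empty payload decodes to b"" (rule 4).
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Rule 2: always UTF-8, regardless of the document's encoding.
    # errors="replace" for malformed bytes is an assumption of this sketch.
    text = raw.decode("utf-8", errors="replace")
    # Rule 3: null characters become U+FFFD.
    return text.replace("\x00", "\ufffd")


def expand_entities(source):
    """Replace each well-formed entity; leave rejected ones untouched."""
    def _sub(match):
        decoded = decode_entity(match.group(1))
        return match.group(0) if decoded is None else decoded
    return _ENTITY.sub(_sub, source)
```

With this sketch, expand_entities("a &%aGk=; b") gives "a hi b", the
empty entity disappears, and an entity whose payload contains a newline
or other invalid base64 is left in the text as written.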

Adam


