[whatwg] data: URI origin
Adam Barth
w3c at adambarth.com
Mon Mar 14 14:20:42 PDT 2011
On Mon, Mar 14, 2011 at 1:27 PM, Luis Marsnao <l0mars01 at yahoo.com> wrote:
> Can data: URIs be used insecurely?
Yes, but everything can be used insecurely, even a butter knife.
> I'm attempting to write a client-side script that processes a user-selected file through an input element. Since the input element's interface conceals the file: URI, the best solution I can think of is to access the file through the input element's interface, get its data: URI through readAsDataURL in the File API's FileReader interface, and process the data: URI. However, I get not-same-origin errors when I try to use this URI. Specifically, this happens when I try to use XMLHttpRequest to retrieve an XML resource with the data: URI.
>
> Is this correct?
> http://www.w3.org/TR/html5/origin-0.html#origin-0 appears to suggest so: "If url does not use a server-based naming authority, or if parsing url failed, or if url is not an absolute URL, then return a new globally unique identifier." Data URIs do not use server-based naming authorities, and an opaque identifier is same-origin only with itself.
Are you using WebKit? There are long-standing bugs in WebKit where
WebKit is more conservative about the security context for data URLs
than what's in the spec. I'd like to fix them, but I've got a bunch
of other things to do first.
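
As a rough illustration of the "globally unique identifier" rule quoted above, the sketch below asks the URL API for the origin of a data: URL. It assumes an environment that provides the URL constructor, which postdates this thread; serializedOrigin and both example URLs are made up for the illustration. It shows only how the origin of the URL itself is serialized. It says nothing about which security context a browser assigns when the URL is actually loaded, which is the part where WebKit and the spec differ.

```javascript
// Hypothetical helper (not from the thread): returns the serialized
// origin of a URL string as reported by the URL API.
function serializedOrigin(urlString) {
  return new URL(urlString).origin;
}

// Example URLs chosen for illustration only.
const dataUrl = "data:text/xml,%3Croot%2F%3E";
const httpUrl = "http://example.com/doc.xml";

// A data: URL has no server-based authority, so its origin is opaque
// and serializes to the string "null".
console.log(serializedOrigin(dataUrl)); // "null"

// An http: URL has a scheme/host/port tuple as its origin.
console.log(serializedOrigin(httpUrl)); // "http://example.com"
```

An opaque origin never compares equal to the page's own origin, which is consistent with the not-same-origin errors described in the question.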
> Is there a better way to process files in a client-side script? I considered using blob: URIs, but the support is not yet there.
Blob is a much better way to interact with files. With Blob, you can
interact with much larger files and you don't need to access the disk
synchronously (which can be arbitrarily slow).
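
A minimal sketch of the Blob route, skipping the data: URI and XMLHttpRequest steps entirely. It assumes an environment where Blob.prototype.text() is available, which postdates this thread; FileReader's readAsText plays the same role in older browsers. The helper name readFileText and the sample XML are hypothetical.

```javascript
// Hypothetical helper (not from the thread): reads a File or Blob as
// text asynchronously, without building a data: URL first.
async function readFileText(fileOrBlob) {
  // Blob.prototype.text() resolves with the contents decoded as UTF-8.
  return fileOrBlob.text();
}

// In a page, the argument would typically be a File taken from an
// <input type="file"> element, for example input.files[0].
// Here a Blob built in memory stands in for the user's file.
const sample = new Blob(["<root><item>1</item></root>"], { type: "text/xml" });

readFileText(sample).then((text) => {
  // The script now holds the file's contents and can parse them itself.
  console.log(text);
});
```

Because the read is asynchronous, the script does not block while the contents are fetched from disk.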
> Can data: URIs be abused with the other same-origin policies in effect? I'm trying to imagine a situation where the data: URI origin policy is necessary for security. But I'm under the impression data: URIs literally are the resources they denote, and current policies allow input only from same-origin resources or the user, so scripts get input only from those sources. If that input literally is a resource, then that resource /should/ be treated as same-origin or from the user. Am I wrong?
The security context of data URLs is a subtle issue. Life is more
complex than you state above.
Adam