[whatwg] Proposal for improved handling of '#' inside of data URIs

Boris Zbarsky bzbarsky at MIT.EDU
Sat Sep 10 19:31:14 PDT 2011


On 9/10/11 9:04 PM, Nils Dagsson Moskopp wrote:
> Oops, partial misunderstanding. While I did not think of SVG (thanks),
> I wanted to know how often authors have erred here by not properly
> encoding their data, expecting it to work.

Good question.

Given that an unescaped '#' in a data: URI used to work in Gecko, WebKit, 
and Presto (unlike SVG from data:, which did not really work in Gecko), 
it might have been reasonably common.

On the other hand, this would presumably mostly be a problem for people 
hand-writing data: URIs.  Any sort of data: URI generator would get this 
right, as you point out.
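
As a rough illustration of what "getting this right" means for a generator, here is a minimal Python sketch. The helper name make_data_uri is hypothetical, and percent-encoding everything outside the unreserved set is just one conservative choice, not something any spec mandates.

```python
from urllib.parse import quote, unquote

def make_data_uri(body: str, mime: str = "text/html") -> str:
    """Build a data: URI, percent-encoding the body (hypothetical helper).

    safe="" makes quote() escape every reserved character, so a literal
    '#' in the body becomes %23 and cannot start a fragment.
    """
    return f"data:{mime}," + quote(body, safe="")

uri = make_data_uri("<a href='#top'>up</a>")
# The '#' from the markup is escaped, so no URI parser sees a fragment here.
print(uri)
# Decoding the part after the first ',' gives back the original markup.
print(unquote(uri.split(",", 1)[1]))
```

A hand-written URI skips the quote() step, which is exactly where the stray '#' comes from.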

I suspect that data: URI usage on the web is rare enough so far that 
there are no serious backwards-compat issues.

> Btw: Are there possible security implications of data URI parse changes?

Not so much implications of the "changes", since it's not like UAs 
actually parse them per spec... but yes, a URI like this:

   data:text/html,#<script>doStuff()</script>

is very difficult to sanitize if your URI parser just treats the part 
before '#' as the data while a browser treats everything after the ',' 
as the data.  So there are definitely security implications to the fact 
that the browser behavior is not consistent across browsers, within a 
given browser, or with the specs.
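
To make the mismatch concrete, here is a small Python sketch. It assumes the sanitizer is built on a generic URI parser (Python's urllib.parse.urlsplit stands in for it), and the function names payload_generic and payload_lenient are made up for illustration. The "lenient" function only models the behavior described above; it is not any browser's actual code.

```python
from urllib.parse import urlsplit

URI = "data:text/html,#<script>doStuff()</script>"

def payload_generic(uri: str) -> str:
    """Data as a generic URI parser sees it: the fragment is split off first."""
    path = urlsplit(uri).path          # 'text/html,' for the URI above
    _, _, data = path.partition(",")
    return data

def payload_lenient(uri: str) -> str:
    """Data as seen by a UA that treats everything after the first ',' as data."""
    _, _, data = uri.partition(",")
    return data

# The sanitizer sees an empty document and has nothing to reject...
print(repr(payload_generic(URI)))
# ...while the lenient reading still contains the script element.
print(repr(payload_lenient(URI)))
```

The first call returns an empty string and the second returns the '#' followed by the script element, so a filter that only inspects the first result would let the URI through.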

-Boris


