[whatwg] Proposal for improved handling of '#' inside of data URIs

Daniel Holbert dholbert at mozilla.com
Sat Sep 10 18:30:20 PDT 2011

On 09/10/2011 04:53 PM, Nils Dagsson Moskopp wrote:
 >> Browsers handle the "#" character in data URIs very differently, and
 >> the arguably "correct" behavior is probably not what authors actually
 >> want in many cases.
 > Do you have any evidence for that assertion, e.g. author surveys,
 > occurrence in sites, number of duplicates in mozilla bugzilla (relative
 > to other common bugs)?

No large-scale data like that, just a few anecdotal reports on IRC of
Firefox purportedly being "broken" on particular content (content that
contained a "#"), whereas Chromium was "working". One such report, about
a week ago, is what prompted this proposal.

Plus, a concern that people can *almost* just stick pure HTML/SVG into a 
data URI (see examples below) except for "#" characters which break things.

 > This change would probably have to be communicated to other software
 > working with data URIs (Python's urlparse module comes to mind).

Sure, ultimately. One step at a time.

 > Do you
 > intend to update the RFC on the point or leave that usage
 > non-conforming?

I'm not sure. Right now this is just a proposal for better 
interoperability, but ultimately, yeah, it'd be great to have this 

 >> Note that in cases where an author *accidentally* includes "#" inside
 >> their data URI (e.g. <body background="#f00">),
 > What's with the unencoded bracket (should be %3C) and space (should be
 > %20) beforehand? Why wouldn't parsing stop at those points?

Those are fine, actually -- but I should have included an actual URI 
that loads in browsers, like the following (this is what I meant):
   data:text/html,<html><body style='background: #f00'>

So to answer your question -- that does render just fine (giant red
page) in Chromium, without any need to encode the space or the brackets.
It also renders fine in Opera if you point an <iframe> at it, but not if
you type it directly into the URLbar. (That's the inconsistency on their
part that I mentioned in my post.)
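
To illustrate why the "#" matters: a generic URI parser treats everything
after the first "#" as a fragment, so the document payload gets cut short.
Below is a minimal sketch using Python's urllib.parse (the Python 3
successor of the urlparse module mentioned earlier). It only shows what a
generic parser does with the URI; it makes no claim about what any
particular browser does internally.

```python
from urllib.parse import urlsplit

# The example URI from this message, with an unencoded "#".
uri = "data:text/html,<html><body style='background: #f00'>"

parts = urlsplit(uri)

# A generic parser splits at the first "#": the payload (path) stops
# just before it, and the rest is treated as a fragment identifier.
print(repr(parts.scheme))    # 'data'
print(repr(parts.path))      # "text/html,<html><body style='background: "
print(repr(parts.fragment))  # "f00'>"
```

Note that the unencoded space and angle brackets pass through this parser
untouched; only the "#" changes how the URI is split.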

And in Firefox, it renders fine if you just encode the # character:
   data:text/html,<html><body style="background:%23f00">
(that makes it load fine from Opera's URLbar, too.)
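
For authors generating such URIs from a script, encoding only the "#" can
be sketched as follows. The helper name escape_hash is made up for this
illustration, and the sketch assumes every "#" in the input belongs to the
document payload (i.e. no real fragment identifier is intended).

```python
from urllib.parse import unquote, urlsplit


def escape_hash(data_uri: str) -> str:
    """Percent-encode every '#' so it stays part of the payload.

    Hypothetical helper: assumes no '#' in the input is meant to start
    a fragment identifier.
    """
    return data_uri.replace("#", "%23")


raw = 'data:text/html,<html><body style="background:#f00">'
escaped = escape_hash(raw)
print(escaped)

# With the "#" encoded, a generic parser no longer finds a fragment,
# and percent-decoding the result gives back the original text.
print(repr(urlsplit(escaped).fragment))
print(unquote(escaped) == raw)
```

The space and the angle brackets are deliberately left alone here, matching
the observation that browsers load them unencoded.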

So no -- in practice, at least, there's no need to encode the angle
brackets (< and >) or the space character.

