[whatwg] Proposal for improved handling of '#' inside of data URIs

Glenn Maynard glenn at zewt.org
Sun Sep 11 09:14:08 PDT 2011

On Sat, Sep 10, 2011 at 5:15 PM, Daniel Holbert <dholbert at mozilla.com> wrote:

> This could be more intuitive/do-what-I-mean if we restricted the cases
> under which "#" is treated as a fragment-ID delimiter inside of data URIs.
>  In particular: when a "#" character is followed by ">" or "<" in a data
> URI, I propose that we *don't* treat the "#" as a delimiter, and instead
> just treat it as part of the encoded document.
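To make the two rules concrete, here is a minimal sketch comparing ordinary fragment splitting with one possible reading of the proposal. The function names are hypothetical, and the sketch assumes "followed by" means "anywhere later in the URI", which the proposal text does not spell out. The sample URIs are made up for illustration and are not the ones from this thread.

```python
def split_strict(uri: str) -> tuple[str, str]:
    """Ordinary URI parsing: the first '#' starts the fragment."""
    head, _, fragment = uri.partition("#")
    return head, fragment


def split_proposed(uri: str) -> tuple[str, str]:
    """Assumed reading of the proposal: a '#' with a '<' or '>' anywhere
    after it is kept as data, so the delimiter is the first '#' that
    comes after the last angle bracket (if there is one)."""
    last_bracket = max(uri.rfind("<"), uri.rfind(">"))
    pos = uri.find("#", last_bracket + 1)
    if pos == -1:
        return uri, ""
    return uri[:pos], uri[pos + 1:]


# A '#' inside markup: strict parsing truncates the document,
# the assumed heuristic keeps it whole.
link = 'data:text/html,<a href="#top">up</a>'
print(split_strict(link))
print(split_proposed(link))

# A genuine fragment that happens to contain '>': strict parsing finds it,
# the assumed heuristic swallows it into the document.
real = 'data:text/html,<p id="a>b">hi</p>#a>b'
print(split_strict(real))
print(split_proposed(real))
```

The second case is the kind of ambiguity at issue: the heuristic cannot tell a '#' that belongs to the data from one that really introduces a fragment.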

An HTML document in a data: URI containing a # is probably followed by a >
or <; but that's an "if", not "iff".  It doesn't imply that a # followed by
a > or < is *always* intended as part of the data and not an actual fragment identifier.

data:text/html,foo<div style=height:3000px></div><span

I don't think adding black magic to URI parsing will make things less

> Firefox parses fragment-identifiers strictly, potentially giving authors
> headaches and truncating content that renders fine in Opera/Webkit.

I'd say the opposite: WebKit breaks this author's expectations and
encourages headaches, by not parsing the above URIs in the ordinary way,
where Firefox matches my expectations.  I was certainly surprised to find
that Chrome fails the above.

On Sun, Sep 11, 2011 at 10:21 AM, Michael A. Puls II
<shadow2531 at gmail.com> wrote:

> Not only must "#" be "%23" if you don't want it as a frag id, but ">" and
> "<" should be "%3E" and "%3C".

I'm not sure about the spec on this, but Firefox actively unencodes %3E and
%3C.  Pasting this into the address bar and copying it back out turns them
back into literal < and > characters:

data:text/html,foo<div style=height:3000px></div><span

which suggests that escaping these characters isn't necessary or encouraged.
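As a rough illustration of the escaping being discussed, the sketch below builds a data: URI with Python's urllib.parse, escaping "#" as "%23" while leaving "<" and ">" literal. The helper name html_to_data_uri and the choice of "safe" characters are assumptions made for this example, not anything a spec or browser mandates.

```python
from urllib.parse import quote, unquote, urlsplit


def html_to_data_uri(html: str) -> str:
    """Hypothetical helper: percent-encode the payload but keep
    '<', '>', '/', '=' and '"' literal. '#' is not in the safe set,
    so it becomes %23 and cannot be taken for a fragment delimiter."""
    return "data:text/html," + quote(html, safe='<>/="')


uri = html_to_data_uri('foo<a href="#x">bar</a>')
print(uri)

# With '#' escaped, a generic URI parser sees no fragment at all.
parts = urlsplit(uri)
print(repr(parts.fragment))

# Decoding the path recovers the original markup, '#' included.
print(unquote(parts.path))

# %3C and %3E decode to the same characters as the literal forms.
print(unquote("%3Cspan%3E"))
```

Whether "<" and ">" should also be escaped is the open question above; this sketch only shows that escaping "#" alone is enough to keep a generic parser from splitting the URI.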

Glenn Maynard
