[whatwg] Proposal for improved handling of '#' inside of data URIs
Michael A. Puls II
shadow2531 at gmail.com
Sun Sep 11 11:44:13 PDT 2011
On Sun, 11 Sep 2011 12:14:08 -0400, Glenn Maynard <glenn at zewt.org> wrote:
> On Sun, Sep 11, 2011 at 10:21 AM, Michael A. Puls II
> <shadow2531 at gmail.com> wrote:
>
>> Not only must "#" be "%23" if you don't want it as a frag id, but ">"
>> and "<" should be "%3E" and "%3C".
>>
>
> I'm not sure about the spec on this, but Firefox actively unencodes %3E
> and %3C. Pasting this into the address bar and copying it back out turns
> them back into literal < and > characters:
>
> data:text/html,foo<div style=height:3000px></div><span
> id='vector<int>'>bar</span>#vector%3Cint%3E
>
> which suggests that escaping these characters isn't necessary or
> encouraged.
Firefox aggressively decodes %HH in the address field to make the URI
human-readable (which I hate btw, but that's another discussion). It
usually copies the correct/original value to the clipboard, though. In this
case, however, Firefox copies < and > to the clipboard decoded, just as you
say. I don't think that's a good idea, and I wonder whether it's
intentional, as you suspect. Chrome does it too. Opera doesn't; it leaves
them encoded, which I think is better.
I love what Safari does and I think what it does is the right thing to do.
It will resolve the data URI and convert raw spaces to %20 and convert <
and > to %3C and %3E (and anything else that should be encoded in a URI to
%HH) if they're not encoded. It could be argued that Safari shouldn't do
that visually. But, as far as copying to the clipboard, it should and
does, which is awesome.
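To illustrate the kind of normalization I mean, here is a rough Python
sketch. It is not Safari's actual algorithm; the function name and the
KEEP set are assumptions made up for this example. It percent-encodes
anything that isn't an RFC 3986 reserved or unreserved character and
leaves "%" alone so existing %HH escapes aren't encoded a second time
(a stray "%" that isn't part of an escape is also left as-is).

```python
from urllib.parse import quote

# Assumption for this sketch: keep RFC 3986 reserved characters, plus "%"
# so that escapes already present in the URI are not double-encoded.
# quote() never encodes letters, digits, or "_.-~".
KEEP = ":/?#[]@!$&'()*+,;=%"


def normalize_for_clipboard(uri: str) -> str:
    """Percent-encode raw spaces, '<', '>' and similar characters in a URI.

    Hypothetical helper: it only approximates the behavior described
    above and does not validate existing escapes.
    """
    return quote(uri, safe=KEEP)


if __name__ == "__main__":
    raw = "data:text/html,a b<i>c</i>#x%3Cy%3E"
    print(normalize_for_clipboard(raw))
```

With that, raw spaces and angle brackets in the data part get encoded,
while the "#" delimiter and the already-encoded fragment are untouched.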
I don't think < and > are in the list of safe URI characters. All
URI-based functions seem to percent-encode them too. Keeping them encoded
is definitely good for data URIs in text/plain documents so they don't
interfere with the < and > that encase the URI.
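As a small Python sketch of that point (the helper name, the default
media type, and the choice of characters left unencoded are assumptions
for illustration only): encode the data part so a literal "#" becomes
%23 and < and > become %3C and %3E, then add the real fragment
separately. The result can be wrapped in <...> in plain text without
ambiguity, and a generic URI parser sees only the intended fragment.

```python
from urllib.parse import quote, urlsplit


def make_data_uri(payload: str, media_type: str = "text/html",
                  fragment: str = "") -> str:
    """Build a data URI whose payload cannot be mistaken for a fragment.

    Hypothetical helper. '#', '<', '>' and spaces in the payload are
    percent-encoded; '/', ':', '=', ',' and ';' are left readable.
    """
    uri = f"data:{media_type},{quote(payload, safe='/:=,;')}"
    if fragment:
        # Encode the fragment fully so it contains no raw '<' or '>'.
        uri += "#" + quote(fragment, safe="")
    return uri


if __name__ == "__main__":
    uri = make_data_uri("a<b>#c d", fragment="vector<int>")
    print(f"<{uri}>")              # safe to encase in < and > in text/plain
    print(urlsplit(uri).fragment)  # only the intended fragment is split off
```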
--
Michael