[whatwg] URL query component
Anne van Kesteren
annevk at opera.com
Fri Apr 20 02:15:16 PDT 2012
The URL query component for URLs found in HTML (the exact set is still to be
defined, I think) is encoded using the page encoding, unless the page encoding
is utf-8 or utf-16, in which case utf-8 is used.
E.g. "?€" maps to "?%80" in a windows-1252 encoded page.
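A minimal sketch of that rule in Python, for illustration only. The function name encode_query is hypothetical, and Python's codec tables stand in for whatever tables a browser actually uses:

```python
from urllib.parse import quote


def encode_query(query: str, page_encoding: str) -> str:
    """Percent-encode a query value using the page encoding.

    Hypothetical helper: utf-8 and utf-16 pages fall back to utf-8,
    as described in the message above. The input is treated as a single
    value, so delimiters such as "=" and "&" are escaped as well.
    """
    normalized = page_encoding.lower().replace("_", "-")
    if normalized in ("utf-8", "utf8") or normalized.startswith("utf-16"):
        page_encoding = "utf-8"
    # errors="strict" raises UnicodeEncodeError for code points the
    # encoding cannot represent; that case is the open question here.
    return quote(query, safe="", encoding=page_encoding, errors="strict")


print("?" + encode_query("€", "windows-1252"))  # ?%80
print("?" + encode_query("€", "utf-16"))        # ?%E2%82%AC
```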
Currently browsers differ in what happens when a code point cannot be
encoded in the page encoding. E.g. for "?€" in an encoding that lacks that
code point:
Opera uses "?". Internet Explorer uses "?" (but when the URL hits the
network layer, not when you inspect it via script). WebKit uses "&#...;".
Gecko encodes it using utf-8.
What Gecko does makes the resulting data impossible to interpret reliably,
since the receiver cannot tell utf-8 bytes from bytes in the page encoding.
What WebKit does is consistent with form submission. I like it.
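The three visible outcomes described above can be sketched roughly as follows. This is an illustration, not any browser's actual code: the function name and the strategy labels are made up, the snowman (U+2603) is just a convenient code point that windows-1252 lacks, and percent-encoding the "&#...;" reference the way form submission does is an assumption.

```python
from urllib.parse import quote


def encode_query_fallback(query: str, page_encoding: str, strategy: str) -> str:
    """Encode a query value, handling unencodable code points per strategy.

    Hypothetical strategy labels:
      "question-mark": emit a literal "?" (Opera, and IE at the network layer)
      "ncr":           emit a decimal "&#...;" reference (WebKit)
      "utf-8":         emit the utf-8 bytes of the code point (Gecko)
    """
    out = []
    for ch in query:
        try:
            data = ch.encode(page_encoding)
        except UnicodeEncodeError:
            if strategy == "question-mark":
                out.append("?")
                continue
            if strategy == "ncr":
                # Assumption: the reference is percent-encoded below,
                # as it is in form submission.
                data = f"&#{ord(ch)};".encode("ascii")
            elif strategy == "utf-8":
                data = ch.encode("utf-8")
            else:
                raise ValueError(f"unknown strategy: {strategy}")
        out.append(quote(data, safe=""))
    return "".join(out)


for name in ("question-mark", "ncr", "utf-8"):
    print(name, encode_query_fallback("☃", "windows-1252", name))

# The utf-8 fallback is ambiguous: three code points that windows-1252
# can encode produce exactly the same bytes as the snowman.
lookalike = b"\xe2\x98\x83".decode("windows-1252")
print(encode_query_fallback(lookalike, "windows-1252", "utf-8"))
```

With the "ncr" strategy the receiver can at least recognise the fallback, whereas the last line shows why the utf-8 fallback cannot be told apart from ordinary windows-1252 input.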
Also, given that encoding behavior is not exposed besides form submission
and URLs, consistently using "&#...;" for code points not represented in
legacy encodings makes sense to me. Am I missing something?
--
Anne van Kesteren
http://annevankesteren.nl/