[whatwg] URL: query encoding

Boris Zbarsky bzbarsky at MIT.EDU
Tue Oct 30 09:51:14 PDT 2012


On 10/30/12 11:43 AM, Simon Pieters wrote:
> The above applies to what gets sent over the wire when using the
> WebSocket(...) constructor. For <a href>, the results are different:
>
> http://simon.html5.org/test/url/url-encoding.html
>
> I don't have an opinion at this point about what to do here.

In Gecko, at least, when a URL object is constructed from a string the 
caller can specify an encoding to use for the URL.  The URL code then 
does things that depend on what that encoding was.  Apart from the 
hierarchical vs not distinction, I believe the handling of the encoding 
does not depend on scheme in Gecko.  If no encoding is specified, UTF-8 
is assumed.

<a href> passes in the document encoding as the encoding to use when 
constructing the URL object.

The WebSocket constructor does not pass in an encoding when constructing 
the URL object, so UTF-8 is used.
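
As a rough illustration of the difference between the two callers (a 
sketch in Python, not Gecko's actual code; encode_query is a made-up 
helper, windows-1252 is only an example document encoding, and 
characters the encoding cannot represent are not handled):

```python
from urllib.parse import quote


def encode_query(query, encoding="utf-8"):
    """Percent-encode a query string using the given character encoding.

    Hypothetical helper: the caller may pass an encoding; if it does not,
    UTF-8 is assumed, mirroring the behavior described above.
    """
    # Keep the query delimiters "=" and "&" literal; every other
    # non-unreserved character is encoded to bytes and emitted as %XX.
    return quote(query, safe="=&", encoding=encoding)


# <a href> in a windows-1252 document: the document encoding is passed in.
href_query = encode_query("q=é", encoding="windows-1252")

# WebSocket constructor: no encoding is passed, so UTF-8 is used.
ws_query = encode_query("q=é")

print(href_query)  # q=%E9
print(ws_query)    # q=%C3%A9
```

The same input string ends up as different bytes on the wire depending 
only on which encoding the caller handed to the URL constructor.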

I would not be opposed to us explicitly specifying things this way. 
That would incidentally require specs to say exactly when some non-UTF-8 
encoding is supposed to be used for their URIs and what that encoding 
should be, which seems like a good thing to me.

-Boris



