[whatwg] Question about document.referrer (and document.URL, document.location.href) when IDN domains are in use

Ian Hickson ian at hixie.ch
Fri Jul 12 11:15:06 PDT 2013


On Wed, 20 Mar 2013, Boris Zbarsky wrote:
>
> The spec for document.referrer says:
> 
>   The referrer attribute must return the document's referrer.
> 
> The "document's referrer" is not really defined anywhere in a useful way 
> that I can find.

What's not useful about the way it's defined? It's set to a specific 
string.


> This then follows with a non-normative note:
> 
>    Note: In the case of HTTP, the referrer IDL attribute will match the
>    Referer (sic) header that was sent when fetching the current page.
> 
> In cases when the hostname is non-ASCII, the Referer header will have it 
> encoded in punycode.

Is that defined anywhere?


> The question is what should happen for document.referrer.

The spec says (normatively, not just in the note) that it's the exact 
string that the HTTP spec says must be generated for the Referer: header. 
See the "Creating a new Document object" algorithm:

   http://whatwg.org/html/#create-a-document-object

...which refers to the "fetch" algorithm, which refers to HTTP.


> Right now, I see the following behavior:
> 
> 1)  Gecko shows exactly the string we put on the wire in 
> document.referrer (punycode and all).  document.URL and 
> document.location.href show the non-ASCII chars in some cases.

That's correct per spec (assuming punycoding is actually required 
somewhere). The latter two are set separately from document.referrer:

   http://whatwg.org/html/#set-the-document's-address

If other browsers don't match this, file bugs on them. :-)
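
As a rough illustration of the difference being discussed, here is a 
minimal Python sketch. The hostname and path are made up, and Python's 
built-in "idna" codec (IDNA 2003) is only a stand-in for whatever a given 
browser actually does; it is not taken from any spec text.

```python
# Hypothetical IDN hostname, used only for illustration.
unicode_host = "bücher.example"

# ASCII-compatible (punycode) form of the hostname, i.e. the kind of
# string the quoted message says goes on the wire in the Referer header.
ascii_host = unicode_host.encode("idna").decode("ascii")

# Converting back gives the Unicode form that a UI or document.URL
# might show "in some cases".
display_host = ascii_host.encode("ascii").decode("idna")

# Assumed shapes of the two values under discussion (path is made up).
referrer_like = f"http://{ascii_host}/page"
url_like = f"http://{display_host}/page"

print(referrer_like)
print(url_like)
```

Under these assumptions the first string is what document.referrer would 
expose if it mirrors the header exactly, while the second is the 
non-ASCII form that document.URL may show.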

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

