[whatwg] URL decomposition on HTMLAnchorElement interface
Boris Zbarsky
bzbarsky at MIT.EDU
Fri Mar 27 11:14:35 PDT 2009
Kartikaya Gupta wrote:
> I was trying different things to see what happens and came across some particularly weird behavior in Gecko/2009021910 Firefox/3.0.7:
>
> var a = document.createElement('a');
> a.setAttribute('href', 'http://example.org:123/foo?bar#baz');
> a.hostname = null;
> alert(a.hostname); // displays "foo"
> alert(a.href); // displays "http://foo/?bar#baz"
Indeed. The behavior you're seeing is due to setting the hostname to the
empty string, basically. That said, this code should probably bail
out when that happens instead of pressing on. I've filed
https://bugzilla.mozilla.org/show_bug.cgi?id=485562 on this.
Interestingly, it looks like Opera doesn't support the hostname setter
at all. Safari ignores the call in this case. I don't have IE to test
offhand.
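To make the "bail out" idea concrete, here is a minimal sketch of a hostname setter that ignores an empty value instead of pressing on. It is not Gecko's actual code: the helper name setHostname is hypothetical, it works on a plain string rather than a real anchor element, it treats null as the empty string (as the behavior above suggests), and it assumes a URL with no userinfo and no IPv6 literal.

```javascript
// Hypothetical helper, not any browser's implementation.
// Assumes "scheme://host[:port]..." with no userinfo and no IPv6 literal.
function setHostname(href, newHostname) {
  // Assumption: null/undefined are treated as the empty string.
  const value = newHostname == null ? "" : String(newHostname);

  // Bail out on an empty hostname instead of rewriting the URL.
  if (value === "") {
    return href;
  }

  // Capture "scheme://", the hostname, and everything after it.
  const match = /^([a-z][a-z0-9+.-]*:\/\/)([^\/?#:]*)(.*)$/i.exec(href);
  if (!match) {
    return href; // Not in the assumed shape; leave it untouched.
  }
  return match[1] + value + match[3];
}

const href = "http://example.org:123/foo?bar#baz";
console.log(setHostname(href, null));          // http://example.org:123/foo?bar#baz
console.log(setHostname(href, "example.com")); // http://example.com:123/foo?bar#baz
```

With a guard like this, the null assignment from the example above would leave the href unchanged rather than promoting "foo" to the hostname.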
> a.setAttribute('href', 'scheme://host/path');
> a.host = null;
> alert(a.host); // displays ""
> alert(a.pathname); // displays ""
> alert(a.href); // displays "scheme:////host/path"
This case is more fun. It's an unknown scheme, so it's assumed to be a
no-authority non-hierarchical scheme and the URI is parsed that way.
This does cause issues, since RFC 3986 says that if there is no authority
then the path cannot begin with two slashes (so if "scheme" is a
non-authority protocol then the URI is invalid, in fact). But deciding
whether this is an invalid URI or not involves knowing something about
the "scheme" protocol, which is rather hard in this case, since you just
made it up. ;)
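The dependence on scheme knowledge can be sketched as follows. This is only an illustration under stated assumptions, not how any browser's parser works: the function name classify and the schemeUsesAuthority flag are hypothetical, and the flag stands in for the out-of-band knowledge about the scheme that the parser would need.

```javascript
// Hypothetical sketch: the same string is valid or invalid depending on
// whether the caller says the scheme has an authority component.
function classify(uri, schemeUsesAuthority) {
  const m = /^([a-z][a-z0-9+.-]*):(.*)$/i.exec(uri);
  if (!m) {
    return { valid: false, reason: "no scheme" };
  }

  // Drop the query and fragment; only the hierarchical part matters here.
  const hierPart = m[2].split(/[?#]/)[0];

  if (hierPart.startsWith("//")) {
    if (!schemeUsesAuthority) {
      // RFC 3986: with no authority, the path cannot begin with "//".
      return { valid: false, reason: "path begins with // but there is no authority" };
    }
    const afterSlashes = hierPart.slice(2);
    const slash = afterSlashes.indexOf("/");
    return {
      valid: true,
      authority: slash === -1 ? afterSlashes : afterSlashes.slice(0, slash),
      path: slash === -1 ? "" : afterSlashes.slice(slash),
    };
  }

  return { valid: true, authority: null, path: hierPart };
}

console.log(classify("scheme://host/path", true));
// { valid: true, authority: 'host', path: '/path' }
console.log(classify("scheme://host/path", false).valid);
// false
```

For a made-up scheme there is no principled way to choose the flag, which is exactly the difficulty described above.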
In general, parsing a URI for a scheme you know nothing about is a huge
pain, especially if your URL parser is expected to do fixup on invalid
URIs (which the parser for the "href" attribute of <a> is certainly
expected to do).
-Boris