[whatwg] New URL Standard

Anne van Kesteren annevk at annevk.nl
Fri Sep 21 08:16:02 PDT 2012

I took a crack at defining URLs: http://url.spec.whatwg.org/

At the moment it defines parsing (minus domain names / IP addresses)
and the JavaScript API (minus the query manipulation methods proposed
by Adam Barth). It defines things like setting .pathname to "hello
world" (notice the space), and what happens if you resolve "http:test"
against a data URL (you get "http://test/") or against http://teehee
(you get "http://teehee/test"). It is based on the various URL code
paths found in WebKit and Gecko, and it treats \ as / in various
places because that seemed better for compatibility.
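
To make the cases above concrete, here is a rough sketch using the URL
class that browsers and recent Node.js versions expose globally. The
example.org host and the contents of the data URL are placeholders of
mine, and the percent-encoded and backslash results reflect what
current implementations do, which I am assuming matches the intent of
the draft.

```typescript
// "http:test" against a data URL: the schemes differ, so "test" is
// parsed as the host rather than as a relative path.
const againstData = new URL("http:test", "data:text/plain,hi").href;

// "http:test" against a base with the same scheme: "test" is
// resolved as a relative path.
const againstHttp = new URL("http:test", "http://teehee").href;

// Setting .pathname to a value containing a space (placeholder host).
const spaced = new URL("http://example.org/");
spaced.pathname = "hello world";
const withSpace = spaced.href;

// Backslashes used where slashes would normally go (placeholder host).
const backslashed = new URL("http:\\\\example.org\\a\\b").href;

console.log(againstData);  // http://test/
console.log(againstHttp);  // http://teehee/test
console.log(withSpace);    // http://example.org/hello%20world
console.log(backslashed);  // http://example.org/a/b
```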

I'm looking for some feedback/ideas on how to handle various aspects, e.g.:

* data URLs: in Gecko these appear to be parsed as part of the URL
layer, because they can make a URL invalid. Other browsers do not do
this. Opinions? Should data URLs support .search?
* In the current text only a select few URLs support host/port/query.
The rest are path/fragment only. But maybe we want mailto to support
query? Should it support host? (mailto supporting e.g. host would also
mean normalising host via IDNA toASCII and friends. Not sure I'm fond
of that.)
* Advice on file URLs would be nice.
* IDNA: what are your plans? IDNA2003 / IDNA2008 / UTS #46 / something
else? It would be nice to get agreement on this.
* Terminology: should we align the terminology with the API or would
that just be too confusing?
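
For anyone who wants to poke at the mailto and IDNA questions, the
sketch below reads the relevant components through the URL class as
it exists in current browsers and Node.js. That behaviour may well
differ from the draft text described here, and the addresses and host
names are made-up placeholders.

```typescript
// A mailto URL with a query part (placeholder address).
const mail = new URL("mailto:user@example.org?subject=hello");

// In current implementations the address ends up in the path, the
// host is empty, and the query is exposed through .search.
const mailParts = {
  host: mail.host,
  pathname: mail.pathname,
  search: mail.search,
};

// A non-ASCII host name ("b\u00fccher" is "bücher"): current
// implementations convert it to its ASCII (Punycode) form on parsing.
const idnHost = new URL("http://b\u00fccher.example/").hostname;

console.log(mailParts); // host "", pathname "user@example.org", search "?subject=hello"
console.log(idnHost);   // xn--bcher-kva.example
```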


PS: It also does the query encoding thing correctly for the first time
ever in the history of URL standards, although the wording can
probably be improved.

