[whatwg] form charset

Olav Junker Kjær olav at olav.dk
Tue Apr 19 15:02:08 PDT 2005


I understand that the _charset_ field is needed in URL-encoded requests,
since any encoding can be chosen through accept-charset and the server
has no other way to know which encoding was used.

However, is it really the right thing to allow arbitrary encodings of
GET queries in the first place? The official Right Way to encode URLs is 
to use UTF-8, and it seems strange to allow a different encoding after 
the question mark.

Also, URLs are supposed to be context independent, e.g. you should be 
able to bookmark a query, send it in a mail and so on. This might be 
problematic if the correct interpretation of the URL is dependent on the 
encoding or the accept-charset attribute on the form in the originating 
page.
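
To illustrate the problem, here is a minimal Python sketch. The sample
value is only an example, and the two charsets are assumptions chosen for
illustration. It shows that the same text yields different query strings
under different charsets, and that a query decoded with the wrong charset
comes out garbled:

```python
from urllib.parse import quote, unquote

value = "Kjær"  # example form value, chosen only for illustration

# The same text percent-encodes differently depending on the charset.
latin1_query = quote(value, encoding="iso-8859-1")
utf8_query = quote(value, encoding="utf-8")
print(latin1_query)  # Kj%E6r
print(utf8_query)    # Kj%C3%A6r

# A receiver that only sees the URL has to guess the charset.
# Guessing wrong silently corrupts the value.
wrong_guess_1 = unquote(utf8_query, encoding="iso-8859-1")
wrong_guess_2 = unquote(latin1_query, encoding="utf-8", errors="replace")
print(wrong_guess_1)        # KjÃ¦r
print(repr(wrong_guess_2))  # 'Kj\ufffdr'
```

Nothing in the URL itself says which of the two decodings is the
intended one.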

Of course we cannot just mandate UTF-8 always, since there is the issue 
of backwards compatibility. If I'm not mistaken, browsers usually 
urlencode forms using the same charset as the page. If we want to
avoid breakage of server scripts, this should remain the default. 
However, the only legal value in accept-charset should be UTF-8 when the 
method is GET.
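
As a rough sketch of that rule in Python: the function name and its
parameters are hypothetical, not taken from any spec or browser, and
accept-charset is treated as a single value for simplicity.

```python
from typing import Optional


def form_charset(method: str, page_charset: str,
                 accept_charset: Optional[str]) -> str:
    """Pick the charset used to urlencode a form (hypothetical helper)."""
    # No accept-charset attribute: keep today's default, the page charset.
    if accept_charset is None:
        return page_charset

    requested = accept_charset.strip()
    is_utf8 = requested.lower().replace("-", "") == "utf8"

    # Proposed restriction: GET forms may only ask for UTF-8.
    if method.upper() == "GET" and not is_utf8:
        raise ValueError("accept-charset must be UTF-8 when method is GET")

    return requested
```

A GET form without accept-charset on an iso-8859-1 page would still be
sent as iso-8859-1, so existing server scripts keep working.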

regards
Olav Junker Kjær




