[whatwg] form charset
Olav Junker Kjær
olav at olav.dk
Tue Apr 19 15:02:08 PDT 2005
I understand that the _charset_ field is needed in URL-encoded requests,
since any encoding can be chosen through accept-charset and
the server has no other way to know which encoding was used.
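
To illustrate how a server script might use the field, here is a minimal
Python sketch. It assumes the browser fills a hidden field named _charset_
with the encoding it used for the submission; the function name
decode_query and the UTF-8 fallback are hypothetical choices made for this
example, not something any specification prescribes.

```python
from urllib.parse import parse_qs


def decode_query(query: str, fallback: str = "utf-8") -> dict:
    """Decode a URL-encoded query using the charset named in _charset_.

    Hypothetical helper: the fallback encoding is an assumption.
    """
    # First pass: latin-1 maps every byte to a character, so this cannot
    # fail, and the _charset_ value itself is plain ASCII.
    probe = parse_qs(query, encoding="latin-1")
    charset = probe.get("_charset_", [fallback])[0]
    try:
        # Second pass: decode the percent-escapes with the declared charset.
        return parse_qs(query, encoding=charset, errors="strict")
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name, or bytes that do not match it.
        return parse_qs(query, encoding=fallback, errors="replace")


latin1_form = decode_query("q=Kj%E6r&_charset_=iso-8859-1")
utf8_form = decode_query("q=Kj%C3%A6r&_charset_=utf-8")
```

Both calls recover the same text for q, even though the bytes on the wire
differ. Without the _charset_ field the script would have to guess.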
However, is it really the right thing to allow arbitrary encodings of
GET queries in the first place? The official Right Way to encode URLs is
to use UTF-8, and it seems strange to allow a different encoding after
the question mark.
Also, URLs are supposed to be context independent, e.g. you should be
able to bookmark a query, send it in a mail and so on. This might be
problematic if the correct interpretation of the URL is dependent on the
encoding or the accept-charset attribute on the form in the originating
page.
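
A small Python sketch of the problem, using only the standard urllib.parse
module. The sample value is just an illustration. The same text produces
different query strings depending on the encoding, and a recipient who
assumes the wrong encoding gets a different string back.

```python
from urllib.parse import quote, unquote

name = "Kjær"

# The same form value, percent-encoded under two different charsets.
utf8_query = quote(name, encoding="utf-8")
latin1_query = quote(name, encoding="iso-8859-1")

# A recipient who sees only the URL has to guess the encoding.
# Guessing wrong silently corrupts the value.
latin1_read_as_utf8 = unquote(latin1_query, encoding="utf-8", errors="replace")
utf8_read_as_latin1 = unquote(utf8_query, encoding="iso-8859-1")

print(utf8_query, latin1_query)
print(latin1_read_as_utf8, utf8_read_as_latin1)
```

Nothing in the URL itself says which reading is correct, which is why a
bookmarked or mailed query can change meaning once it leaves the
originating page.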
Of course we cannot just mandate UTF-8 always, since there is the issue
of backwards compatibility. If I'm not mistaken, browsers usually
URL-encode forms using the same charset as the page. If we want to
avoid breakage of server scripts, this should remain the default.
However, the only legal value in accept-charset should be UTF-8 when the
method is GET.
regards
Olav Junker Kjær