[whatwg] Internal character encoding declaration
Henri Sivonen
hsivonen at iki.fi
Thu Mar 16 03:09:42 PST 2006
On Mar 14, 2006, at 15:07, Peter Karlsson wrote:
> Henri Sivonen on 2006-03-14:
>>> Transcoding is very popular, especially in Russia.
>> In *proxies* *today*? What's the point considering that browsers
>> have supported the Cyrillic encoding soup *and* UTF-8 for years?
>
> The mod_charset is not proxying, it's on the server level.
Right. So, as a data point, it neither proves nor disproves the
legends about transcoding *proxies* around Russia and Japan.
>> How could proxies properly transcode form submissions coming back
>> without messing everything up spectacularly?
>
> That's why the "hidden-string" technique was invented. Introduce a
> hidden <input> with a character string that will get encoded
> differently depending on the encoding used. When data comes in, use
> this character string to determine what encoding was used.
I thought that method was for detecting broken browsers and users
meddling with the encoding menu, and I thought using that method was
relatively rare.
In order for deploying a transcoding proxy to be safe for a Russian
ISP, virtually every form handler in Russia would have to take
countermeasures against the adverse effects of transcoding proxies.
Are the countermeasures ubiquitous?
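For concreteness, here is a rough sketch of what such a countermeasure might look like on the form-handler side, using the hidden-string technique described above. The sentinel string, the candidate list and the function name are all illustrative assumptions, not taken from any particular deployment.

```python
# Hypothetical sentinel: two Cyrillic letters whose byte sequences differ
# across the common Cyrillic encodings and UTF-8.
SENTINEL = "\u0416\u044f"

# Assumed candidate encodings, tried in order.
CANDIDATES = ["utf-8", "windows-1251", "koi8-r", "iso-8859-5", "cp866"]


def detect_form_encoding(sentinel_bytes, candidates=CANDIDATES):
    """Return the first candidate encoding under which the submitted
    hidden-field bytes decode back to SENTINEL, or None if none does."""
    for name in candidates:
        try:
            if sentinel_bytes.decode(name) == SENTINEL:
                return name
        except UnicodeDecodeError:
            # The bytes are not valid in this encoding; try the next one.
            continue
    return None
```

The page would carry the sentinel in a hidden <input>, and the handler would call detect_form_encoding() on the raw bytes of that field before decoding the other fields. A None result means the submission arrived in an encoding outside the candidate list, or was damaged on the way.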
>> Easy parse errors are not fatal in browsers. Surely it is OK for a
>> conformance checker to complain that much at server operators
>> whose HTTP layer and meta do not match.
>
> I just reacted at the notion of calling such documents invalid. It
> is the transport layer that defines the encoding, whatever the
> document says or how it looks like is irrelevant, and is just
> something that you can look at if the transport layer neglects to
> say anything.
If two layers disagree, it suggests there is a problem and, in my
opinion, it should be flagged as an error. (Especially considering
Ruby's Postulate[1].) Operators of transcoding origin servers (or
reverse proxies, which count as origin servers when viewed from the
Web) are free not to send a disagreeing charset meta.
[1] http://intertwingly.net/slides/2004/devcon/69.html
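As an illustration of the check I have in mind, a conformance checker could compare the charset in the HTTP Content-Type header with the one in the meta element and complain when both are present and differ. The following is only a sketch: the function names, the regular expressions and the 1024-byte scan window are my assumptions, and a real checker would need a proper tokenizer rather than regex matching.

```python
import codecs
import re

# Crude pattern for a charset parameter; an assumption, not a spec algorithm.
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)
_META_RE = re.compile(r'<meta\b[^>]*>', re.IGNORECASE)


def _normalize(label):
    """Map aliases such as 'UTF8' and 'utf-8' to one canonical name."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return label.lower()


def http_charset(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    match = _CHARSET_RE.search(content_type or "")
    return _normalize(match.group(1)) if match else None


def meta_charset(body, scan_bytes=1024):
    """Look for a charset declaration in <meta> tags near the start of body."""
    head = body[:scan_bytes].decode("ascii", errors="replace")
    for tag in _META_RE.findall(head):
        match = _CHARSET_RE.search(tag)
        if match:
            return _normalize(match.group(1))
    return None


def charset_mismatch(content_type, body):
    """Return (http, meta) if both layers declare a charset and they
    disagree; return None if they agree or either is silent."""
    http = http_charset(content_type)
    meta = meta_charset(body)
    if http and meta and http != meta:
        return (http, meta)
    return None
```

A checker would report an error whenever charset_mismatch() returns a pair. A transcoding server that strips or rewrites the meta never triggers it.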
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/