[whatwg] Internal character encoding declaration
hsivonen at iki.fi
Thu Mar 16 03:09:42 PST 2006
On Mar 14, 2006, at 15:07, Peter Karlsson wrote:
> Henri Sivonen on 2006-03-14:
>>> Transcoding is very popular, especially in Russia.
>> In *proxies* *today*? What's the point considering that browsers
>> have supported the Cyrillic encoding soup *and* UTF-8 for years?
> The mod_charset is not proxying, it's on the server level.
Right. So, as a data point, it neither proves nor disproves the
legends about transcoding *proxies* around Russia and Japan.
>> How could proxies properly transcode form submissions coming back
>> without messing everything up spectacularly?
> That's why the "hidden-string" technique was invented. Introduce a
> hidden <input> with a character string that will get encoded
> differently depending on the encoding used. When data comes in, use
> this character string to determine what encoding was used.
I thought that method was for detecting broken browsers and users
meddling with the encoding menu, and I thought using that method was
In order for deploying a transcoding proxy to be safe for a Russian
ISP, virtually every form handler in Russia would have to take
countermeasures against the adverse effects of transcoding proxies.
Are the countermeasures ubiquitous?
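
For concreteness, here is a minimal sketch of what such a countermeasure
might look like on the form handler side, assuming the "hidden-string"
technique works as Peter describes it. The sentinel value, the candidate
list and the function name are all hypothetical choices of mine, not
taken from any real deployment.

```python
# Hypothetical sentinel placed in a hidden <input>. Any string whose byte
# representation differs across the candidate encodings would do.
SENTINEL = "Привет"

# Assumed set of encodings a submission might arrive in.
CANDIDATES = ["utf-8", "koi8-r", "windows-1251", "iso-8859-5", "cp866"]


def detect_form_encoding(raw_sentinel: bytes, candidates=CANDIDATES):
    """Return the first candidate under which the raw bytes of the hidden
    field decode back to SENTINEL, or None if no candidate matches."""
    for enc in candidates:
        try:
            if raw_sentinel.decode(enc) == SENTINEL:
                return enc
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    return None
```

The handler would then decode the remaining fields with the detected
encoding. This only helps if the handler sees the raw bytes and if
whatever sits in between rewrote the hidden field and the user-entered
fields consistently.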
>> Easy parse errors are not fatal in browsers. Surely it is OK for a
>> conformance checker to complain that much at server operators
>> whose HTTP layer and meta do not match.
> I just reacted at the notion of calling such documents invalid. It
> is the transport layer that defines the encoding, whatever the
> document says or how it looks like is irrelevant, and is just
> something that you can look at if the transport layer neglects to
> say anything.
If two layers disagree, it suggests there is a problem and, in my
opinion, it should be flagged as an error. (Especially considering
Ruby's Postulate.) Operators of transcoding origin servers (or
reverse proxies, which, viewed from the Web, count as origin servers)
are free not to send a disagreeing charset meta.
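
To illustrate the kind of check I have in mind, here is a rough sketch
of how a conformance checker could flag a disagreement between the HTTP
layer and the meta. The function names are hypothetical, and the regular
expression is a deliberate simplification, not the prescan algorithm of
any spec.

```python
import codecs
import re
from email.message import Message

# Simplified pattern: finds charset=... inside the first <meta ...> tag
# that carries one. A real checker would need a proper prescan.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def normalize(label):
    """Map encoding aliases to one canonical name where Python knows them."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return label.lower()


def http_charset(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # None if no charset parameter


def layers_disagree(content_type, body_prefix):
    """True only if both layers declare an encoding and the two differ."""
    declared_http = http_charset(content_type)
    match = META_CHARSET.search(body_prefix)
    if declared_http is None or match is None:
        return False  # one layer is silent, so there is nothing to compare
    declared_meta = match.group(1).decode("ascii")
    return normalize(declared_http) != normalize(declared_meta)
```

Under this sketch, a transcoding origin server that rewrites the bytes
and the HTTP charset but leaves the old meta in place would be flagged,
while one that strips the meta would not.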
hsivonen at iki.fi