[whatwg] Guessing the fallback encoding from the top-level domain name before trying to guess from the browser localization

Ian Hickson ian at hixie.ch
Fri Feb 7 14:37:34 PST 2014


On Thu, 19 Dec 2013, Henri Sivonen wrote:
> 
> Considering that the encoding of the content browsed is not really a 
> function of the UI localization of the browser, though the two are often 
> correlated, I have developed a patch for Firefox to make the guess based 
> on the top-level domain name of the URL of the document when possible.
> 
> Before deciding whether to land that patch, I'd like to get feedback 
> from the broader Web standards community.
> 
> Does this seem like a good idea? Good idea if the mapping details are 
> tweaked? Bad idea? (Why?)

Seems like a reasonable idea to me. As far as I can tell, the TLD should 
correlate with the content encoding at least as well as the UI locale 
does. But that's just a guess. Data would be good: for example, 
instrumenting an existing locale-based browser to see how often the guess 
from the locale disagrees with the guess from the TLD, and checking how 
often the guess from the locale is wrong (by looking at how often people 
override the encoding manually). Or maybe a 50%/50% experiment, with the 
default coming from the UI locale in the first 50% and from the TLD in 
the second 50%, with the corresponding instrumentation in both, to see 
how the results compare.
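
To make the first kind of instrumentation concrete, here is a rough sketch 
of the counting involved. Everything in it is an assumption for 
illustration: the two lookup tables are tiny made-up samples (not the 
mapping in Henri's patch), and the function names and the shape of the 
page-load records are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

# Illustrative sample tables only; NOT the mapping from the Firefox patch.
TLD_FALLBACK = {"ru": "windows-1251", "jp": "Shift_JIS", "gr": "ISO-8859-7"}
LOCALE_FALLBACK = {"ru": "windows-1251", "ja": "Shift_JIS", "el": "ISO-8859-7"}
DEFAULT_FALLBACK = "windows-1252"


def guess_from_locale(ui_locale):
    """Fallback encoding guessed from the browser UI locale (e.g. "en-US")."""
    lang = ui_locale.split("-")[0].lower()
    return LOCALE_FALLBACK.get(lang, DEFAULT_FALLBACK)


def guess_from_tld(url, ui_locale):
    """Fallback encoding guessed from the TLD, else from the UI locale."""
    host = urlsplit(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_FALLBACK.get(tld) or guess_from_locale(ui_locale)


def tally(page_loads):
    """Count disagreements and manual overrides in a locale-based browser.

    page_loads is an iterable of (url, ui_locale, override) tuples, where
    override is the encoding the user picked manually, or None.
    """
    counts = Counter()
    for url, ui_locale, override in page_loads:
        counts["loads"] += 1
        if guess_from_locale(ui_locale) != guess_from_tld(url, ui_locale):
            counts["guesses_disagree"] += 1
        if override is not None:
            # The user overrode the locale-based guess, so it was wrong.
            counts["locale_guess_overridden"] += 1
            if override == guess_from_tld(url, ui_locale):
                counts["tld_guess_matches_override"] += 1
    return counts
```

Comparing "locale_guess_overridden" against "tld_guess_matches_override" 
would give a first hint of how often the TLD guess would have spared the 
user a manual override.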

Have you tried deploying this? What have you learnt so far?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

