[whatwg] Default encoding to UTF-8?
Leif Halvard Silli
xn--mlform-iua at xn--mlform-iua.no
Sun Dec 11 03:44:37 PST 2011
Leif Halvard Silli Sun Dec 11 03:21:40 PST 2011
> W.r.t. iframe, then the "big in Norway" newspaper Dagbladet.no is
> declared ISO-8859-1 encoded and it includes at least one ads-iframe that
...
> * Let's say that I *kept* ISO-8859-1 as default encoding, but instead
> enabled the Universal detector. The frame then works.
> * But if I make the frame page very short, with just 10 times the letter
> "ø" as content, then the Universal detector fails - in a test on my own
> computer, it guesses the page to be Cyrillic rather than Norwegian.
> * What's the problem? The Universal detector is too greedy - it tries
> to fix more problems than I have. I only want it to guess on "UTF-8".
> And if it doesn't detect UTF-8, then it should fall back to the locale
> default (including fall back to the encoding of the parent frame).
The above illustrates that the current charset-detection solutions are
starting to show their age: they are neither geared nor optimized towards
UTF-8 as the firmly recommended and - in principle - anticipated default.
The above may also catch a real problem with switching to UTF-8: that
one may need to embed pages which do not use UTF-8. If one could trust
UAs to attempt UTF-8 detection (but not "Universal detection") before
defaulting, then it would become virtually risk-free to switch a page to
UTF-8, even if it contains iframe pages. No?
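
To make the idea concrete, here is a rough sketch of "detect UTF-8 only, otherwise fall back". It is not how any UA actually implements detection; the function name, its parameters, and the choice to prefer the parent frame's encoding over the locale default are all my own assumptions for illustration.

```python
from typing import Optional


def guess_encoding(data: bytes,
                   locale_default: str = "windows-1252",
                   parent_encoding: Optional[str] = None) -> str:
    """Hypothetical helper: guess 'utf-8' only when the bytes prove it.

    Assumed fallback order: parent frame encoding, then locale default.
    """
    fallback = parent_encoding or locale_default

    # Pure ASCII is valid UTF-8 but gives no evidence either way,
    # so keep the fallback instead of claiming UTF-8.
    if all(byte < 0x80 for byte in data):
        return fallback

    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"


short_page = "ø" * 10
print(guess_encoding(short_page.encode("utf-8"), parent_encoding="iso-8859-1"))
print(guess_encoding(short_page.encode("iso-8859-1"), parent_encoding="iso-8859-1"))
```

With the ten-letter "ø" page, the UTF-8 bytes are reported as "utf-8", while the ISO-8859-1 bytes (ten times 0xF8, which is never valid in UTF-8) fall back to the parent frame's encoding instead of being guessed as Cyrillic.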
Leif H Silli