[whatwg] Web Encodings
Ian Hickson
ian at hixie.ch
Sat Aug 29 18:47:34 PDT 2009
On Wed, 19 Aug 2009, Anne van Kesteren wrote:
>
> Today every browser implements its own encoding label matching
> algorithm, supports its own list of encodings and its own list of
> encoding label aliases, and everything sort of works, but not really.
>
> HTML5 solves part of this problem by defining exactly how to identify an
> encoding label alias in a text/html stream. It also defines which
> encoding label matching algorithm to use, UTS22, but we found out that
> this is incompatible with (existing) sites that specify EUC_JP at the
> HTTP level and actually want to be decoded per UTF-8 according to a
> <meta> in the text/html stream. This works fine if you have a strict
> encoding label matching algorithm, but with UTS22, EUC_JP and EUC-JP
> become the same thing, while only the latter is the actual encoding
> label.

I've backed off UTS22. I think we need the IANA list updated, though, to
include the aliases browsers support. I understand you are working on
this? I would like to remove the table in the HTML5 spec that defines such
mappings, once that is done.

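To make the EUC_JP case concrete, here is a minimal Python sketch of the two matching approaches. It is an illustration under stated assumptions, not any browser's actual algorithm: the label table `KNOWN_LABELS` and all function names are hypothetical, "strict" is taken to mean a plain ASCII case-insensitive comparison, and the loose matcher follows one reading of the UTS22 charset alias matching rules (delete non-alphanumerics, lowercase, drop each 0 not preceded by a digit).

```python
import re

# Hypothetical label table for illustration; real clients support far more.
KNOWN_LABELS = {"euc-jp", "utf-8"}

def strict_match(label):
    """Strict matching (assumed): ASCII case-insensitive comparison only."""
    candidate = label.strip().lower()
    return candidate if candidate in KNOWN_LABELS else None

def uts22_normalize(label):
    """Loose normalisation, following one reading of the UTS22 rules."""
    # Steps 1 and 2: keep only ASCII letters and digits, then lowercase.
    s = re.sub(r"[^A-Za-z0-9]", "", label).lower()
    # Step 3: left to right, drop each 0 that is not preceded by a digit.
    out = []
    prev_is_digit = False
    for ch in s:
        if ch == "0" and not prev_is_digit:
            continue  # dropped; the preceding kept character is unchanged
        out.append(ch)
        prev_is_digit = ch.isdigit()
    return "".join(out)

def loose_match(label):
    """Return the known label whose normalised form equals the input's."""
    key = uts22_normalize(label)
    for known in KNOWN_LABELS:
        if uts22_normalize(known) == key:
            return known
    return None

def pick_encoding(http_label, meta_label, match):
    """Use the HTTP-level label if recognised, else fall back to <meta>."""
    return match(http_label) or match(meta_label)

# A site that sends EUC_JP over HTTP but declares UTF-8 in <meta>:
print(pick_encoding("EUC_JP", "UTF-8", strict_match))  # <meta> wins
print(pick_encoding("EUC_JP", "UTF-8", loose_match))   # HTTP label wins
```

With strict matching, EUC_JP is not a recognised label, so the `<meta>` declaration decides; with the loose matcher, EUC_JP collapses to the same key as EUC-JP and overrides it, which is the incompatibility described above.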
> Another problem HTML5 does not solve is giving a definitive list of
> encodings clients have to implement to be compatible with a large body
> of Web content. This means new clients will have to reverse-engineer
> that list from existing clients, which I think is bad.

If you can get browser vendors to agree on a comprehensive and accurate
list, I'm happy to add it to the spec. But unless a plurality of browser
vendors actually decide to standardise on a single set of encodings, I
don't know that it makes sense to spec something here.

--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'