[whatwg] A comment to character encoding declaration

Jjgod Jiang gzjjgod at gmail.com
Mon Mar 3 07:11:02 PST 2008


Hi,

This is a comment on the "character encoding declaration"
section of the HTML 5 spec:

http://www.w3.org/html/wg/html5/#character1

As CJK information processing developed, many text encodings
ended up as strict subsets of later ones: for example, GB2312
is a subset of GBK, and GBK is a subset of GB18030. For
compatibility, a lot of web pages use a character encoding
declaration like this:

<meta http-equiv="Content-Type" content="text/html; charset=gb2312">

in their header, yet they may use characters that are in GBK
but not in GB2312. So I think the spec could suggest that
clients simply treat such encodings as their largest superset,
for instance, treat GB2312 as GB18030.
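
As a rough sketch of what I mean, here is how a client might
apply such a mapping, written in Python. The names SUPERSET,
effective_encoding and decode_page are hypothetical, made up
for illustration, and the table only covers the GB family
mentioned above; it is not taken from any spec or browser.

```python
# Hypothetical table: declared charset -> largest known superset.
# Only the GB family from this mail is listed; this is an assumption,
# not a complete or authoritative list.
SUPERSET = {
    "gb2312": "gb18030",
    "gbk": "gb18030",
}


def effective_encoding(declared: str) -> str:
    """Return the encoding a client would actually decode with."""
    key = declared.strip().lower()
    return SUPERSET.get(key, key)


def decode_page(raw: bytes, declared: str) -> str:
    """Decode page bytes using the superset of the declared charset."""
    return raw.decode(effective_encoding(declared))


if __name__ == "__main__":
    # A traditional character that GBK can encode but GB2312 cannot.
    raw = "龍".encode("gbk")
    # The page claims gb2312, but decoding as gb18030 still works.
    print(decode_page(raw, "gb2312"))
```

Encodings that are not in the table are passed through
unchanged, so a page declared as utf-8 is still decoded as
utf-8.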

BTW, browsers like Firefox already seem to handle such cases
well, but Safari/WebKit does not seem to.

Regards,
Jiang



