[whatwg] Internal character encoding declaration
lachlan.hunt at lachy.id.au
Mon Mar 13 06:12:21 PST 2006
Henri Sivonen wrote:
> If a meta element whose http-equiv attribute has the value
> "Content-Type" (compare case-insensitively) and whose content attribute
> has a value that begins with "text/html; charset=", the string in the
> content attribute following the start "text/html; charset=" is taken,
> white space removed from the sides and considered the tentative encoding
This will need to handle common mistakes such as the following:
<meta ... content="application/xhtml+xml;charset=X">
<meta ... content="foo/bar;charset=X">
<meta ... content="foo/bar;charset='X'">
<meta ... content="charset=X">
<meta ... charset="X">
I'm not sure which browsers support each one; they'll all need to be tested.
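To make the gap concrete, here is a rough Python sketch comparing the quoted rule with a more forgiving one. Everything in it is an assumption for illustration: the function names, the idea that a meta element's attributes arrive as a dict with lowercased names, and the choice to have the lenient version ignore http-equiv entirely. It is not what any browser is known to do.

```python
import re

# Strict rule as quoted above: content must begin with this exact prefix.
# (Assumption: the prefix itself is compared case-sensitively.)
STRICT_PREFIX = "text/html; charset="


def strict_charset(attrs):
    """Tentative encoding under the quoted rule, or None.

    attrs is a hypothetical dict of lowercased attribute names to values.
    """
    if attrs.get("http-equiv", "").lower() != "content-type":
        return None
    content = attrs.get("content", "")
    if not content.startswith(STRICT_PREFIX):
        return None
    # Take what follows the prefix and trim white space from both sides.
    return content[len(STRICT_PREFIX):].strip() or None


# Lenient rule: any (or no) MIME type before charset=, optional quotes
# around the value, and a bare charset attribute.
CHARSET_RE = re.compile(r"""charset\s*=\s*["']?([^"';\s]+)""", re.IGNORECASE)


def lenient_charset(attrs):
    """Tentative encoding for the mistaken forms listed above, or None."""
    if "charset" in attrs:
        return attrs["charset"].strip() or None
    match = CHARSET_RE.search(attrs.get("content", ""))
    return match.group(1) if match else None
```

With this sketch, strict_charset() rejects all five forms listed above, while lenient_charset() returns X for each of them. Whether real browsers behave that way is exactly what needs testing.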
> Authors are advised not to use the UTF-32 encoding or legacy encodings.
> (Note: I think UTF-32 on the Web is harmful and utterly pointless,
I agree about it being pointless, but why is it considered harmful?
> I'd like to have some text in the spec that justifies whining
> about legacy encodings.
What are your reasons for whining about legacy encodings and what would
you like the spec to say?
> Also, the spec should probably give guidance on what encodings need to
> be supported. That set should include at least UTF-8, US-ASCII,
> ISO-8859-1 and Windows-1252.
And probably UTF-16 as well.
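As a quick illustration of what a "must support" list could look like in practice, here is a small Python sketch that checks whether a runtime has a codec for each label mentioned in this thread. The list and the helper name are hypothetical; the spec has not settled on a set.

```python
import codecs

# Labels named in this thread; whether this is the right minimum set
# is still an open question.
REQUIRED_LABELS = ["UTF-8", "US-ASCII", "ISO-8859-1", "Windows-1252", "UTF-16"]


def unsupported_labels(labels):
    """Return the labels for which this Python runtime has no codec."""
    missing = []
    for label in labels:
        try:
            codecs.lookup(label)
        except LookupError:
            missing.append(label)
    return missing
```

An empty list from unsupported_labels(REQUIRED_LABELS) means every label resolves to some codec in that runtime; it says nothing about whether a browser handles the encoding the same way.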