[whatwg] Character-encoding-related threads
simonp at opera.com
Mon Feb 13 22:54:18 PST 2012
On Mon, 13 Feb 2012 18:22:13 +0100, Ian Hickson <ian at hixie.ch> wrote:
>> I think this is like saying that requiring <!DOCTYPE HTML> is an undue
>> burden on authors...
> It is. You may recall we tried really hard to make it shorter. At the end
> of the day, however, "<!DOCTYPE HTML>" is the best we could do.
It is a burden, but not a particularly heavy one.
>> In practice, authors who don't declare their encoding can silence the
>> validator by using entities for their non-ASCII characters, but they
>> will still get bitten by encoding problems as soon as they want to
>> submit forms or resolve URLs with %-escaped stuff in the query
>> component, and so forth, so it seems to me authors would be better off
>> if we said that the encoding cruft is required cruft just like the
>> doctype cruft.
> Hm, that's an interesting point. Can we make a list of features that rely
> on the character encoding and have the spec require an encoding if any of
> those are used?
> If the list is long or includes anything that it's unreasonable to expect
> will not be used in most Web pages, then we should remove this particular
> "hole" in the conformance criteria.
The list may well be longer (I haven't checked), but I don't think that
matters. The URL-resolving problem is bad because links stop working for
users whose browser has a different default encoding, so those users
leave and go to a competitor's site. The form problem is bad because the
database fills up with content in various encodings with no record of
which is which. By the time the author realizes this and "fixes" it by
declaring the encoding, it's already too late: the data is broken and
very hard to repair.
Letting authors get themselves in a situation where they have broken data
even though it could have been easily prevented seems more like an undue
burden to me.
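To make both failure modes concrete, here is a minimal sketch in Python. It
only imitates what happens when the bytes behind a %-escape depend on the
page's encoding; the helper name escape_query_value and the sample string
"café" are made up for illustration, and windows-1252 is just one assumed
example of a locale default.

```python
from urllib.parse import quote, unquote


def escape_query_value(text: str, page_encoding: str) -> str:
    """Percent-encode text as if it sat in a page decoded as page_encoding.

    Hypothetical helper: it only models the fact that the escaped bytes
    depend on the encoding the page ended up with.
    """
    return quote(text, safe="", encoding=page_encoding)


# The same link text yields different URLs depending on the encoding
# the user's browser picked for an undeclared page.
utf8_link = escape_query_value("café", "utf-8")
legacy_link = escape_query_value("café", "windows-1252")
print(utf8_link)
print(legacy_link)

# Form data sent as UTF-8 but read by a server that assumes
# windows-1252 is stored as mojibake.
stored = unquote(utf8_link, encoding="windows-1252")
print(stored)
```

A server that expects one of the two escaped forms will not match the other,
which is the broken-link case. The last line shows the kind of value that
ends up in the database once submissions in mixed encodings are stored
without any record of which encoding was used.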
Note that both of these features can be hidden in scripts where validators
currently don't even look, so I think it's not a good idea to make the
requirement conditional on these features.