[whatwg] Default encoding to UTF-8?

Jukka K. Korpela jkorpela at cs.tut.fi
Tue Dec 6 13:27:11 PST 2011

2011-12-06 22:58, Leif Halvard Silli wrote:

> There is now a bug, and the editor says the outcome depends on "a
> browser vendor to ship it":
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=15076
> Jukka K. Korpela Tue Dec 6 00:39:45 PST 2011
>> what is this proposed change to defaults supposed to achieve. […]
> I'd say the same as in XML: UTF-8 as a reliable, common default.

The argument given when the "bug" was filed was:
"It would be nice to minimize number of declarations a page needs to 

That is, author convenience - so that authors could work sloppily and 
produce documents that could fail on user agents that haven't 
implemented this change.

This sounds more absurd than I can describe.

XML was created as a new data format; it was an entirely different issue.

>> If there's something that should be added to or modified in the
>> algorithm for determining character encoding, then I'd say it's error
>> processing. I mean user agent behavior when it detects, [...]
> There is already an (optional) detection step in the algorithm - but UAs
> treat that step differently, it seems.

I'm afraid I can't find it - I mean the treatment of a document for 
which some encoding has been deduced (say, directly from HTTP headers) 
and which then turns out to violate the rules of the encoding.
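
To make the case concrete, here is a minimal Python sketch of the
situation I mean: an encoding has been declared (say, in HTTP headers),
and the data is then checked against the rules of that encoding. The
function name first_encoding_violation is hypothetical, and this is only
an illustration under my own assumptions, not part of any algorithm in
the HTML draft.

```python
from typing import Optional


def first_encoding_violation(data: bytes, declared: str) -> Optional[int]:
    """Return the byte offset of the first sequence that is invalid in the
    declared encoding, or None if the data decodes cleanly.

    Hypothetical helper for illustration; a real user agent would also
    have to decide what to do once a violation has been found.
    """
    try:
        # Strict decoding raises instead of silently substituting U+FFFD.
        data.decode(declared, errors="strict")
    except UnicodeDecodeError as exc:
        return exc.start
    return None


# A lone 0xE9 byte is valid ISO-8859-1 but malformed as UTF-8.
latin1_bytes = b"caf\xe9"
print(first_encoding_violation(latin1_bytes, "utf-8"))
print(first_encoding_violation(latin1_bytes, "iso-8859-1"))
```

What the draft leaves open, as far as I can see, is what should happen
after such a check fails: report an error, re-detect, or carry on with
replacement characters.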


