[whatwg] Charset sniffing from XML prolog
lists.whatwg at stakface.com
Wed Oct 7 18:29:17 PDT 2009
On Wed, 07 Oct 2009 20:23:35 -0400, Boris Zbarsky <bzbarsky at MIT.EDU> wrote:
> On 10/7/09 7:51 PM, Kartikaya Gupta wrote:
> > I tried it again in Chrome and if I paste the above in the address bar I get US-ASCII. But if I save it to a file and then load it I get UTF-8. I checked the headers being sent from Apache and they don't include any sneaky encoding hints, just Content-Type: text/html.
> Can you attach the exact file you saved? Does it have a BOM, perchance?
No BOM (I created the files using vim, and checked them with xxd).
Using a degree symbol in UTF-8:
In both cases the _iso version has a tweaked prolog such that Firefox goes back to ISO-8859-1. Chrome still detects fakexml_iso.html as UTF-8. I've now also tested in Firefox on Mac (Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3), which also has a default encoding of ISO-8859-1 as per the preferences.
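
For anyone who wants to repeat the BOM check without xxd, a rough Python sketch follows. The helper names detect_bom and file_bom are hypothetical, made up for illustration, and the check only looks at the first four bytes; it says nothing about how any browser actually sniffs the encoding.

```python
import codecs

# Known byte order marks. The UTF-32 marks come first because the
# UTF-32LE mark begins with the same two bytes as the UTF-16LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM that `data` starts with, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def file_bom(path):
    """Read the first four bytes of the file at `path` and check for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

Calling file_bom("fakexml_iso.html") on a file saved from vim without a BOM should return None, which matches what xxd showed.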