[html5] r7958 - [giow] (3) New encoding defaults based on more data. Fixing https://www.w3.org/B [...]
whatwg at whatwg.org
Tue Jun 11 21:49:03 PDT 2013
Author: ianh
Date: 2013-06-11 21:49:02 -0700 (Tue, 11 Jun 2013)
New Revision: 7958
Modified:
complete.html
index
source
Log:
[giow] (3) New encoding defaults based on more data.
Fixing https://www.w3.org/Bugs/Public/show_bug.cgi?id=21087
Affected topics: HTML Syntax and Parsing
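
For readers who want to see the effect of this revision outside the diff, the rows added to the table below can be read as a plain lookup from locale to suggested default encoding. The following is a minimal sketch and is not part of r7958: the names `SUGGESTED_DEFAULTS`, `FALLBACK_ENCODING`, and `suggested_default_encoding` are hypothetical, and the matching rule (case-insensitive exact match on the whole tag) is an assumption, since the change does not say how a user agent should match locales against the table.

```python
# Illustrative sketch only; not part of r7958.
# Rows copied from the "+" lines of the table in the diff below.
SUGGESTED_DEFAULTS = {
    "ar": "windows-1256",
    "bg": "windows-1251",
    "cs": "windows-1250",
    "et": "windows-1257",
    "fa": "windows-1256",
    "he": "windows-1255",
    "hr": "windows-1250",
    "hu": "ISO-8859-2",
    "ja": "Shift_JIS",
    "ko": "windows-949",
    "ku": "windows-1254",
    "lt": "windows-1257",
    "lv": "windows-1257",
    "pl": "ISO-8859-2",
    "ru": "windows-1251",
    "sk": "windows-1250",
    "sl": "ISO-8859-2",
    "sr": "windows-1251",
    "th": "windows-874",
    "tr": "windows-1254",
    "uk": "windows-1251",
    "vi": "windows-1258",
    "zh-cn": "GB18030",
    "zh-tw": "Big5",
}

# The table's final row: "All other locales".
FALLBACK_ENCODING = "windows-1252"


def suggested_default_encoding(locale: str) -> str:
    """Return the suggested default encoding for a BCP 47 language tag.

    Assumption: the tag is compared as a whole, ignoring case; any tag
    that is not a row in the table gets the "All other locales" value.
    """
    return SUGGESTED_DEFAULTS.get(locale.strip().lower(), FALLBACK_ENCODING)
```

Under this assumed matching rule, a region-qualified tag such as `hu-HU`, which the diff's comments describe as not listed, falls through to windows-1252. A user agent that truncates tags before lookup would give a different answer.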
Modified: complete.html
===================================================================
--- complete.html 2013-06-11 22:23:54 UTC (rev 7957)
+++ complete.html 2013-06-12 04:49:02 UTC (rev 7958)
@@ -256,7 +256,7 @@
<header class=head id=head><p><a class=logo href=http://www.whatwg.org/><img alt=WHATWG height=101 src=/images/logo width=101></a></p>
<hgroup><h1 class=allcaps>HTML</h1>
- <h2 class="no-num no-toc">Living Standard — Last Updated 11 June 2013</h2>
+ <h2 class="no-num no-toc">Living Standard — Last Updated 12 June 2013</h2>
</hgroup><dl><dt><strong>Web developer edition:</strong></dt>
<dd><strong><a href=http://developers.whatwg.org/>http://developers.whatwg.org/</a></strong></dd>
<dt>Multiple-page version:</dt>
@@ -84717,103 +84717,334 @@
to frequent). The following table gives suggested defaults based on the user's locale, for
compatibility with legacy content. Locales are identified by BCP 47 language tags. <a href=#refsBCP47>[BCP47]</a></p>
- <!-- based on mozilla 1.9.1 localizations:
- http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+ <!-- based on three sources:
+ 1. mozilla 1.9.1 localizations: http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint=
+ 2. windows vista encodings: http://msdn.microsoft.com/en-us/goglobal/bb896001
+ 3. chrome encodings: https://code.google.com/p/chromium/codesearch#search/&q=IDS_DEFAULT_ENCODING
+ several assumptions were made in this process; amongst them:
+ - ISO-8859-1 and Windows-1252 are the same (supported by encoding.spec.whatwg.org)
+ - ISO-8859-9 and Windows-1254 are the same (supported by encoding.spec.whatwg.org)
+ - Windows-31J and Shift_JIS are the same (supported by encoding.spec.whatwg.org)
+ - Windows-932 is close enough to Shift_JIS to be treated as equivalent (supported by wikipedia)
+ - Windows-936 is a basically a subset of GBK which is basically a subset of GB18030 (supported by wikipedia)
+ - Windows-950 is basically the same as Big5 (supported by wikipedia)
+ - Firefox's UTF-8 defaults are all bogus
+ -->
- <table><thead><tr><th>Locale language
+ <table><thead><tr><th colspan=2>Locale language
<th>Suggested default encoding
- <tbody><tr><td>ar
- <td>UTF-8
+ <tbody><!-- af, Afrikaans, uses windows-1252: Windows Vista and Firefox agreed --><!-- am, Amharic, uses windows-1252: Firefox and Chrome agreed --><tr><td>ar
+ <td>Arabic
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
- <tr><td>be
- <td>ISO-8859-5
+ <!-- arn-CL, Mapudungun (Chile), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- az, Azeri, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- az-Cyrl-AZ, Azeri (Cyrillic, Azerbaijan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- ba-RU, Bashkir (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- be, Belarusian, is not listed here because Windows Vista wanted windows-1251, Chrome wanted <none>, and Firefox wanted ISO-8859-5 -->
+
+ <!-- be-BY, Belarusian (Belarus), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>bg
- <td>windows-1251
+ <td>Bulgarian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>cs<!-- -CZ -->
- <td>ISO-8859-2
+ <!-- bn, Bengali, uses windows-1252: Firefox and Chrome agreed -->
- <tr><td>cy
- <td>UTF-8
+ <!-- br-FR, Breton (France), uses windows-1252: Windows Vista and Firefox agreed -->
- <tr><td>fa<!-- -IR -->
- <td>UTF-8
+ <!-- bs-Cyrl-BA, Bosnian (Cyrillic, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>he<!-- -IL -->
- <td>windows-1255
+ <!-- bs-Latn-BA, Bosnian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+ <!-- ca, Catalan, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- co-FR, Corsican (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>cs
+ <td>Czech
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-2) -->
+
+ <!-- cy-GB, Welsh (United Kingdom), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- da, Danish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- de, German, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- el, Greek, is not listed here because Windows Vista wanted windows-1253, Chrome wanted ISO-8859-7, and Firefox wanted windows-1252 -->
+
+ <!-- el-GR, Greek (Greece), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1253 -->
+
+ <!-- en, English, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- es, Spanish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <tr><td>et
+ <td>Estonian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed -->
+
+ <!-- eu, Basque, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>fa
+ <td>Persian
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
+
+ <!-- fi, Finnish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fil, Filipino, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- fo, Faroese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- fr, French, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fy-NL, Frisian (Netherlands), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ga-IE, Irish (Ireland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gl, Galician, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gsw-FR, Alsatian (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gu, Gujarati, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ha-Latn-NG, Hausa (Latin, Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>he
+ <td>Hebrew
+ <td>windows-1255 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- hi, Hindi, uses windows-1252: Firefox and Chrome agreed -->
+
<tr><td>hr
- <td>UTF-8
+ <td>Croatian
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed -->
- <tr><td>hu<!-- -HU -->
- <td>ISO-8859-2
+ <tr><td>hu
+ <td>Hungarian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
- <tr><td>ja <!-- and ja-JP-mac -->
- <td>Windows-31J <!-- Shift_JIS -->
+ <!-- hu-HU, Hungarian (Hungary), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
- <tr><td>kk
- <td>UTF-8
+ <!-- id, Indonesian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>ko<!-- -KR -->
- <td>windows-949 <!-- EUC-KR -->
+ <!-- ig-NG, Igbo (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- is, Icelandic, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- it, Italian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- iu-Latn-CA, Inuktitut (Latin, Canada), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>ja
+ <td>Japanese
+ <td>Shift_JIS <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- kk, Kazakh, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- kl-GL, Greenlandic (Greenland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- kn, Kannada, uses windows-1252: Firefox and Chrome agreed -->
+
+ <tr><td>ko
+ <td>Korean
+ <td>windows-949 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
<tr><td>ku
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <td>Kurdish
+ <td>windows-1254 <!-- Best guess -->
+ <!-- ky, Kyrgyz, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- lb-LU, Luxembourgish (Luxembourg), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>lt
- <td>windows-1257
+ <td>Lithuanian
+ <td>windows-1257 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>lv<!-- -LV -->
- <td>ISO-8859-13
+ <tr><td>lv
+ <td>Latvian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-13) -->
- <tr><td>mk<!-- -MK -->
- <td>UTF-8
+ <!-- mk, Macedonian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>or
- <td>UTF-8
+ <!-- ml, Malayalam, uses windows-1252: Firefox and Chrome agreed -->
- <tr><td>pl<!-- -PL -->
- <td>ISO-8859-2
+ <!-- mn, Mongolian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>ro
- <td>UTF-8
+ <!-- moh-CA, Mohawk (Mohawk), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- mr, Marathi, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ms, Malay, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nb, Norwegian Bokmål, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- nl, Dutch, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- nn-NO, Norwegian, Nynorsk (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- no, Norwegian, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nso-ZA, Sesotho sa Leboa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- oc-FR, Occitan (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>pl
+ <td>Polish
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+
+ <!-- pl-PL, Polish (Poland), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- prs-AF, Dari (Afghanistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- pt, Portuguese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- qut-GT, K'iche (Guatemala), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-BO, Quechua (Bolivia), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-EC, Quechua (Ecuador), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-PE, Quechua (Peru), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- rm-CH, Romansh (Switzerland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ro, Romanian, is not listed here because Windows Vista wanted windows-1250, Chrome wanted ISO-8859-2, and Firefox wanted <none> -->
+
+ <!-- ro-RO, Romanian (Romania), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr><td>ru
- <td>windows-1251
+ <td>Russian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- rw-RW, Kinyarwanda (Rwanda), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sah-RU, Yakut (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- se-FI, Sami, Northern (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-NO, Sami, Northern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-SE, Sami, Northern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>sk
- <td>windows-1250
+ <td>Slovak
+ <td>windows-1250 <!-- Windows Vista, Chrome, and Firefox agreed -->
<tr><td>sl
- <td>ISO-8859-2
+ <td>Slovenian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+ <!-- sl-SI, Slovenian (Slovenia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sma-NO, Sami, Southern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sma-SE, Sami, Southern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-NO, Sami, Lule (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-SE, Sami, Lule (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smn-FI, Sami, Inari (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sms-FI, Sami, Skolt (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sq, Albanian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr><td>sr
- <td>UTF-8
+ <td>Serbian
+ <td>windows-1251 <!-- Windows Vista and Chrome agreed -->
+ <!-- sr-Latn-BA, Serbian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sr-Latn-SP, Serbian (Latin, Serbia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sv, Swedish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- sw, Kiswahili, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- ta, Tamil, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- te, Telugu, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- tg-Cyrl-TJ, Tajik (Cyrillic, Tajikistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>th
- <td>windows-874 <!-- TIS-620 -->
+ <td>Thai
+ <td>windows-874 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>tr<!-- -TR -->
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <!-- tk-TM, Turkmen (Turkmenistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+ <!-- tn-ZA, Setswana (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>tr
+ <td>Turkish
+ <td>windows-1254 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- tt, Tatar, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- tzm-Latn-DZ, Tamazight (Latin, Algeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ug-CN, Uighur (PRC), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
<tr><td>uk
- <td>windows-1251
+ <td>Ukrainian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- ur, Urdu, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- uz, Uzbek, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- uz-Cyrl-UZ, Uzbek (Cyrillic, Uzbekistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>vi
- <td>UTF-8
+ <td>Vietnamese
+ <td>windows-1258 <!-- Windows Vista and Chrome agreed -->
+ <!-- wee-DE, Lower Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wen-DE, Upper Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wo-SN, Wolof (Senegal), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- xh-ZA, isiXhosa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- yo-NG, Yoruba (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>zh-CN
- <td>GB18030
+ <td>Chinese (People's Republic of China)
+ <td>GB18030 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- zh-HK, Chinese (Hong Kong S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-Hans, Chinese (Simplified), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
+ <!-- zh-Hant, Chinese (Traditional), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-MO, Chinese (Macao S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-SG, Chinese (Singapore), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
<tr><td>zh-TW
- <td>Big5
+ <td>Chinese (Taiwan)
+ <td>Big5 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>All other locales
+ <!-- zu-ZA, isiZulu (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td colspan=2>All other locales
<td>windows-1252
- </table></li>
+ </table><p class=tablenote><small>The contents of this table are derived from the intersection of
+ Windows, Chrome, and Firefox defaults. For locales where these disagreed, user agents are
+ encouraged to try using UTF-8, and to report if another encoding is more successful.</small></p>
+
+ </li>
+
</ol><p>The <a href="#document's-character-encoding">document's character encoding</a> must immediately be set to the value returned
from this algorithm, at the same time as the user agent uses the returned value to select the
decoder to use for the input byte stream.</p>
Modified: index
===================================================================
--- index 2013-06-11 22:23:54 UTC (rev 7957)
+++ index 2013-06-12 04:49:02 UTC (rev 7958)
@@ -256,7 +256,7 @@
<header class=head id=head><p><a class=logo href=http://www.whatwg.org/><img alt=WHATWG height=101 src=/images/logo width=101></a></p>
<hgroup><h1 class=allcaps>HTML</h1>
- <h2 class="no-num no-toc">Living Standard — Last Updated 11 June 2013</h2>
+ <h2 class="no-num no-toc">Living Standard — Last Updated 12 June 2013</h2>
</hgroup><dl><dt><strong>Web developer edition:</strong></dt>
<dd><strong><a href=http://developers.whatwg.org/>http://developers.whatwg.org/</a></strong></dd>
<dt>Multiple-page version:</dt>
@@ -84717,103 +84717,334 @@
to frequent). The following table gives suggested defaults based on the user's locale, for
compatibility with legacy content. Locales are identified by BCP 47 language tags. <a href=#refsBCP47>[BCP47]</a></p>
- <!-- based on mozilla 1.9.1 localizations:
- http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+ <!-- based on three sources:
+ 1. mozilla 1.9.1 localizations: http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint=
+ 2. windows vista encodings: http://msdn.microsoft.com/en-us/goglobal/bb896001
+ 3. chrome encodings: https://code.google.com/p/chromium/codesearch#search/&q=IDS_DEFAULT_ENCODING
+ several assumptions were made in this process; amongst them:
+ - ISO-8859-1 and Windows-1252 are the same (supported by encoding.spec.whatwg.org)
+ - ISO-8859-9 and Windows-1254 are the same (supported by encoding.spec.whatwg.org)
+ - Windows-31J and Shift_JIS are the same (supported by encoding.spec.whatwg.org)
+ - Windows-932 is close enough to Shift_JIS to be treated as equivalent (supported by wikipedia)
+ - Windows-936 is a basically a subset of GBK which is basically a subset of GB18030 (supported by wikipedia)
+ - Windows-950 is basically the same as Big5 (supported by wikipedia)
+ - Firefox's UTF-8 defaults are all bogus
+ -->
- <table><thead><tr><th>Locale language
+ <table><thead><tr><th colspan=2>Locale language
<th>Suggested default encoding
- <tbody><tr><td>ar
- <td>UTF-8
+ <tbody><!-- af, Afrikaans, uses windows-1252: Windows Vista and Firefox agreed --><!-- am, Amharic, uses windows-1252: Firefox and Chrome agreed --><tr><td>ar
+ <td>Arabic
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
- <tr><td>be
- <td>ISO-8859-5
+ <!-- arn-CL, Mapudungun (Chile), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- az, Azeri, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- az-Cyrl-AZ, Azeri (Cyrillic, Azerbaijan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- ba-RU, Bashkir (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- be, Belarusian, is not listed here because Windows Vista wanted windows-1251, Chrome wanted <none>, and Firefox wanted ISO-8859-5 -->
+
+ <!-- be-BY, Belarusian (Belarus), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>bg
- <td>windows-1251
+ <td>Bulgarian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>cs<!-- -CZ -->
- <td>ISO-8859-2
+ <!-- bn, Bengali, uses windows-1252: Firefox and Chrome agreed -->
- <tr><td>cy
- <td>UTF-8
+ <!-- br-FR, Breton (France), uses windows-1252: Windows Vista and Firefox agreed -->
- <tr><td>fa<!-- -IR -->
- <td>UTF-8
+ <!-- bs-Cyrl-BA, Bosnian (Cyrillic, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>he<!-- -IL -->
- <td>windows-1255
+ <!-- bs-Latn-BA, Bosnian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+ <!-- ca, Catalan, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- co-FR, Corsican (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>cs
+ <td>Czech
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-2) -->
+
+ <!-- cy-GB, Welsh (United Kingdom), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- da, Danish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- de, German, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- el, Greek, is not listed here because Windows Vista wanted windows-1253, Chrome wanted ISO-8859-7, and Firefox wanted windows-1252 -->
+
+ <!-- el-GR, Greek (Greece), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1253 -->
+
+ <!-- en, English, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- es, Spanish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <tr><td>et
+ <td>Estonian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed -->
+
+ <!-- eu, Basque, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>fa
+ <td>Persian
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
+
+ <!-- fi, Finnish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fil, Filipino, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- fo, Faroese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- fr, French, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fy-NL, Frisian (Netherlands), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ga-IE, Irish (Ireland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gl, Galician, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gsw-FR, Alsatian (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gu, Gujarati, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ha-Latn-NG, Hausa (Latin, Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>he
+ <td>Hebrew
+ <td>windows-1255 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- hi, Hindi, uses windows-1252: Firefox and Chrome agreed -->
+
<tr><td>hr
- <td>UTF-8
+ <td>Croatian
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed -->
- <tr><td>hu<!-- -HU -->
- <td>ISO-8859-2
+ <tr><td>hu
+ <td>Hungarian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
- <tr><td>ja <!-- and ja-JP-mac -->
- <td>Windows-31J <!-- Shift_JIS -->
+ <!-- hu-HU, Hungarian (Hungary), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
- <tr><td>kk
- <td>UTF-8
+ <!-- id, Indonesian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>ko<!-- -KR -->
- <td>windows-949 <!-- EUC-KR -->
+ <!-- ig-NG, Igbo (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- is, Icelandic, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- it, Italian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- iu-Latn-CA, Inuktitut (Latin, Canada), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>ja
+ <td>Japanese
+ <td>Shift_JIS <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- kk, Kazakh, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- kl-GL, Greenlandic (Greenland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- kn, Kannada, uses windows-1252: Firefox and Chrome agreed -->
+
+ <tr><td>ko
+ <td>Korean
+ <td>windows-949 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
<tr><td>ku
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <td>Kurdish
+ <td>windows-1254 <!-- Best guess -->
+ <!-- ky, Kyrgyz, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- lb-LU, Luxembourgish (Luxembourg), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>lt
- <td>windows-1257
+ <td>Lithuanian
+ <td>windows-1257 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>lv<!-- -LV -->
- <td>ISO-8859-13
+ <tr><td>lv
+ <td>Latvian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-13) -->
- <tr><td>mk<!-- -MK -->
- <td>UTF-8
+ <!-- mk, Macedonian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>or
- <td>UTF-8
+ <!-- ml, Malayalam, uses windows-1252: Firefox and Chrome agreed -->
- <tr><td>pl<!-- -PL -->
- <td>ISO-8859-2
+ <!-- mn, Mongolian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr><td>ro
- <td>UTF-8
+ <!-- moh-CA, Mohawk (Mohawk), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- mr, Marathi, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ms, Malay, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nb, Norwegian Bokmål, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- nl, Dutch, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- nn-NO, Norwegian, Nynorsk (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- no, Norwegian, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nso-ZA, Sesotho sa Leboa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- oc-FR, Occitan (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>pl
+ <td>Polish
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+
+ <!-- pl-PL, Polish (Poland), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- prs-AF, Dari (Afghanistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- pt, Portuguese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- qut-GT, K'iche (Guatemala), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-BO, Quechua (Bolivia), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-EC, Quechua (Ecuador), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-PE, Quechua (Peru), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- rm-CH, Romansh (Switzerland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ro, Romanian, is not listed here because Windows Vista wanted windows-1250, Chrome wanted ISO-8859-2, and Firefox wanted <none> -->
+
+ <!-- ro-RO, Romanian (Romania), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr><td>ru
- <td>windows-1251
+ <td>Russian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- rw-RW, Kinyarwanda (Rwanda), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sah-RU, Yakut (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- se-FI, Sami, Northern (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-NO, Sami, Northern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-SE, Sami, Northern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>sk
- <td>windows-1250
+ <td>Slovak
+ <td>windows-1250 <!-- Windows Vista, Chrome, and Firefox agreed -->
<tr><td>sl
- <td>ISO-8859-2
+ <td>Slovenian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+ <!-- sl-SI, Slovenian (Slovenia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sma-NO, Sami, Southern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sma-SE, Sami, Southern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-NO, Sami, Lule (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-SE, Sami, Lule (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smn-FI, Sami, Inari (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sms-FI, Sami, Skolt (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sq, Albanian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr><td>sr
- <td>UTF-8
+ <td>Serbian
+ <td>windows-1251 <!-- Windows Vista and Chrome agreed -->
+ <!-- sr-Latn-BA, Serbian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sr-Latn-SP, Serbian (Latin, Serbia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sv, Swedish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- sw, Kiswahili, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- ta, Tamil, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- te, Telugu, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- tg-Cyrl-TJ, Tajik (Cyrillic, Tajikistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>th
- <td>windows-874 <!-- TIS-620 -->
+ <td>Thai
+ <td>windows-874 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>tr<!-- -TR -->
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <!-- tk-TM, Turkmen (Turkmenistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+ <!-- tn-ZA, Setswana (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td>tr
+ <td>Turkish
+ <td>windows-1254 <!-- Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- tt, Tatar, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- tzm-Latn-DZ, Tamazight (Latin, Algeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ug-CN, Uighur (PRC), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
<tr><td>uk
- <td>windows-1251
+ <td>Ukrainian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- ur, Urdu, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- uz, Uzbek, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- uz-Cyrl-UZ, Uzbek (Cyrillic, Uzbekistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr><td>vi
- <td>UTF-8
+ <td>Vietnamese
+ <td>windows-1258 <!-- Windows Vista and Chrome agreed -->
+ <!-- wee-DE, Lower Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wen-DE, Upper Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wo-SN, Wolof (Senegal), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- xh-ZA, isiXhosa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- yo-NG, Yoruba (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr><td>zh-CN
- <td>GB18030
+ <td>Chinese (People's Republic of China)
+ <td>GB18030 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- zh-HK, Chinese (Hong Kong S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-Hans, Chinese (Simplified), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
+ <!-- zh-Hant, Chinese (Traditional), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-MO, Chinese (Macao S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-SG, Chinese (Singapore), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
<tr><td>zh-TW
- <td>Big5
+ <td>Chinese (Taiwan)
+ <td>Big5 <!-- Windows Vista, Chrome, and Firefox agreed -->
- <tr><td>All other locales
+ <!-- zu-ZA, isiZulu (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <tr><td colspan=2>All other locales
<td>windows-1252
- </table></li>
+ </table><p class=tablenote><small>The contents of this table are derived from the intersection of
+ Windows, Chrome, and Firefox defaults. For locales where these disagreed, user agents are
+ encouraged to try using UTF-8, and to report if another encoding is more successful.</small></p>
+
+ </li>
+
</ol><p>The <a href="#document's-character-encoding">document's character encoding</a> must immediately be set to the value returned
from this algorithm, at the same time as the user agent uses the returned value to select the
decoder to use for the input byte stream.</p>
Modified: source
===================================================================
--- source 2013-06-11 22:23:54 UTC (rev 7957)
+++ source 2013-06-12 04:49:02 UTC (rev 7958)
@@ -94508,138 +94508,368 @@
compatibility with legacy content. Locales are identified by BCP 47 language tags. <a
href="#refsBCP47">[BCP47]</a></p>
- <!-- based on mozilla 1.9.1 localizations:
- http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+ <!-- based on three sources:
+ 1. mozilla 1.9.1 localizations: http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint=
+ 2. windows vista encodings: http://msdn.microsoft.com/en-us/goglobal/bb896001
+ 3. chrome encodings: https://code.google.com/p/chromium/codesearch#search/&q=IDS_DEFAULT_ENCODING
+ several assumptions were made in this process; amongst them:
+ - ISO-8859-1 and Windows-1252 are the same (supported by encoding.spec.whatwg.org)
+ - ISO-8859-9 and Windows-1254 are the same (supported by encoding.spec.whatwg.org)
+ - Windows-31J and Shift_JIS are the same (supported by encoding.spec.whatwg.org)
+ - Windows-932 is close enough to Shift_JIS to be treated as equivalent (supported by wikipedia)
+ - Windows-936 is basically a subset of GBK which is basically a subset of GB18030 (supported by wikipedia)
+ - Windows-950 is basically the same as Big5 (supported by wikipedia)
+ - Firefox's UTF-8 defaults are all bogus
+ -->
<table>
<thead>
<tr>
- <th>Locale language
+ <th colspan=2>Locale language
<th>Suggested default encoding
<tbody>
+ <!-- af, Afrikaans, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- am, Amharic, uses windows-1252: Firefox and Chrome agreed -->
+
<tr>
<td>ar
- <td>UTF-8
+ <td>Arabic
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
- <tr>
- <td>be
- <td>ISO-8859-5
+ <!-- arn-CL, Mapudungun (Chile), uses windows-1252: Windows Vista and Firefox agreed -->
+ <!-- az, Azeri, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- az-Cyrl-AZ, Azeri (Cyrillic, Azerbaijan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- ba-RU, Bashkir (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- be, Belarusian, is not listed here because Windows Vista wanted windows-1251, Chrome wanted <none>, and Firefox wanted ISO-8859-5 -->
+
+ <!-- be-BY, Belarusian (Belarus), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr>
<td>bg
- <td>windows-1251
+ <td>Bulgarian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- bn, Bengali, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- br-FR, Breton (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- bs-Cyrl-BA, Bosnian (Cyrillic, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- bs-Latn-BA, Bosnian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- ca, Catalan, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- co-FR, Corsican (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>cs<!-- -CZ -->
- <td>ISO-8859-2
+ <td>cs
+ <td>Czech
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-2) -->
+ <!-- cy-GB, Welsh (United Kingdom), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- da, Danish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- de, German, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- el, Greek, is not listed here because Windows Vista wanted windows-1253, Chrome wanted ISO-8859-7, and Firefox wanted windows-1252 -->
+
+ <!-- el-GR, Greek (Greece), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1253 -->
+
+ <!-- en, English, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- es, Spanish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
<tr>
- <td>cy
- <td>UTF-8
+ <td>et
+ <td>Estonian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed -->
+ <!-- eu, Basque, uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>fa<!-- -IR -->
- <td>UTF-8
+ <td>fa
+ <td>Persian
+ <td>windows-1256 <!-- Windows Vista and Chrome agreed -->
+ <!-- fi, Finnish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fil, Filipino, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- fo, Faroese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- fr, French, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- fy-NL, Frisian (Netherlands), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ga-IE, Irish (Ireland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gl, Galician, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gsw-FR, Alsatian (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- gu, Gujarati, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ha-Latn-NG, Hausa (Latin, Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>he<!-- -IL -->
- <td>windows-1255
+ <td>he
+ <td>Hebrew
+ <td>windows-1255 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- hi, Hindi, uses windows-1252: Firefox and Chrome agreed -->
+
<tr>
<td>hr
- <td>UTF-8
+ <td>Croatian
+ <td>windows-1250 <!-- Windows Vista and Chrome agreed -->
<tr>
- <td>hu<!-- -HU -->
- <td>ISO-8859-2
+ <td>hu
+ <td>Hungarian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
- <tr>
- <td>ja <!-- and ja-JP-mac -->
- <td>Windows-31J <!-- Shift_JIS -->
+ <!-- hu-HU, Hungarian (Hungary), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+ <!-- id, Indonesian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- ig-NG, Igbo (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- is, Icelandic, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- it, Italian, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- iu-Latn-CA, Inuktitut (Latin, Canada), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>kk
- <td>UTF-8
+ <td>ja
+ <td>Japanese
+ <td>Shift_JIS <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- kk, Kazakh, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- kl-GL, Greenlandic (Greenland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- kn, Kannada, uses windows-1252: Firefox and Chrome agreed -->
+
<tr>
- <td>ko<!-- -KR -->
- <td>windows-949 <!-- EUC-KR -->
+ <td>ko
+ <td>Korean
+ <td>windows-949 <!-- Windows Vista, Chrome, and Firefox agreed -->
<tr>
<td>ku
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <td>Kurdish
+ <td>windows-1254 <!-- Best guess -->
+ <!-- ky, Kyrgyz, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- lb-LU, Luxembourgish (Luxembourg), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
<td>lt
- <td>windows-1257
+ <td>Lithuanian
+ <td>windows-1257 <!-- Windows Vista, Chrome, and Firefox agreed -->
<tr>
- <td>lv<!-- -LV -->
- <td>ISO-8859-13
+ <td>lv
+ <td>Latvian
+ <td>windows-1257 <!-- Windows Vista and Chrome agreed (but disagreed with Firefox, which thought the encoding should be ISO-8859-13) -->
- <tr>
- <td>mk<!-- -MK -->
- <td>UTF-8
+ <!-- mk, Macedonian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
- <tr>
- <td>or
- <td>UTF-8
+ <!-- ml, Malayalam, uses windows-1252: Firefox and Chrome agreed -->
- <tr>
- <td>pl<!-- -PL -->
- <td>ISO-8859-2
+ <!-- mn, Mongolian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+ <!-- moh-CA, Mohawk (Mohawk), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- mr, Marathi, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- ms, Malay, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nb, Norwegian Bokmål, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- nl, Dutch, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- nn-NO, Norwegian, Nynorsk (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- no, Norwegian, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- nso-ZA, Sesotho sa Leboa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- oc-FR, Occitan (France), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>ro
- <td>UTF-8
+ <td>pl
+ <td>Polish
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+ <!-- pl-PL, Polish (Poland), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- prs-AF, Dari (Afghanistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- pt, Portuguese, uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- qut-GT, K'iche (Guatemala), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-BO, Quechua (Bolivia), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-EC, Quechua (Ecuador), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- quz-PE, Quechua (Peru), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- rm-CH, Romansh (Switzerland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ro, Romanian, is not listed here because Windows Vista wanted windows-1250, Chrome wanted ISO-8859-2, and Firefox wanted <none> -->
+
+ <!-- ro-RO, Romanian (Romania), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr>
<td>ru
- <td>windows-1251
+ <td>Russian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- rw-RW, Kinyarwanda (Rwanda), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sah-RU, Yakut (Russia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- se-FI, Sami, Northern (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-NO, Sami, Northern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- se-SE, Sami, Northern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
<td>sk
- <td>windows-1250
+ <td>Slovak
+ <td>windows-1250 <!-- Windows Vista, Chrome, and Firefox agreed -->
<tr>
<td>sl
- <td>ISO-8859-2
+ <td>Slovenian
+ <td>ISO-8859-2 <!-- Chrome and Firefox agreed (but disagreed with Windows Vista, which thought the encoding should be windows-1250) -->
+ <!-- sl-SI, Slovenian (Slovenia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sma-NO, Sami, Southern (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sma-SE, Sami, Southern (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-NO, Sami, Lule (Norway), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smj-SE, Sami, Lule (Sweden), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- smn-FI, Sami, Inari (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sms-FI, Sami, Skolt (Finland), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- sq, Albanian, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
<tr>
<td>sr
- <td>UTF-8
+ <td>Serbian
+ <td>windows-1251 <!-- Windows Vista and Chrome agreed -->
+ <!-- sr-Latn-BA, Serbian (Latin, Bosnia and Herzegovina), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sr-Latn-SP, Serbian (Latin, Serbia), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- sv, Swedish, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- sw, Kiswahili, uses windows-1252: Windows Vista, Chrome, and Firefox agreed -->
+
+ <!-- ta, Tamil, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- te, Telugu, uses windows-1252: Firefox and Chrome agreed -->
+
+ <!-- tg-Cyrl-TJ, Tajik (Cyrillic, Tajikistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr>
<td>th
- <td>windows-874 <!-- TIS-620 -->
+ <td>Thai
+ <td>windows-874 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- tk-TM, Turkmen (Turkmenistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1250 -->
+
+ <!-- tn-ZA, Setswana (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>tr<!-- -TR -->
- <td>windows-1254 <!-- ISO-8859-9 -->
+ <td>tr
+ <td>Turkish
+ <td>windows-1254 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- tt, Tatar, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
+ <!-- tzm-Latn-DZ, Tamazight (Latin, Algeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- ug-CN, Uighur (PRC), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
<tr>
<td>uk
- <td>windows-1251
+ <td>Ukrainian
+ <td>windows-1251 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- ur, Urdu, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1256 -->
+
+ <!-- uz, Uzbek, is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1254 -->
+
+ <!-- uz-Cyrl-UZ, Uzbek (Cyrillic, Uzbekistan), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted windows-1251 -->
+
<tr>
<td>vi
- <td>UTF-8
+ <td>Vietnamese
+ <td>windows-1258 <!-- Windows Vista and Chrome agreed -->
+ <!-- wee-DE, Lower Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wen-DE, Upper Sorbian (Germany), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- wo-SN, Wolof (Senegal), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- xh-ZA, isiXhosa (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
+ <!-- yo-NG, Yoruba (Nigeria), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
<td>zh-CN
- <td>GB18030
+ <td>Chinese (People's Republic of China)
+ <td>GB18030 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- zh-HK, Chinese (Hong Kong S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-Hans, Chinese (Simplified), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
+ <!-- zh-Hant, Chinese (Traditional), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-MO, Chinese (Macao S.A.R.), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted Big5 -->
+
+ <!-- zh-SG, Chinese (Singapore), is not listed here because neither Chrome nor Firefox knew about it. For what it's worth, Windows Vista wanted GB18030 -->
+
<tr>
<td>zh-TW
- <td>Big5
+ <td>Chinese (Taiwan)
+ <td>Big5 <!-- Windows Vista, Chrome, and Firefox agreed -->
+ <!-- zu-ZA, isiZulu (South Africa), uses windows-1252: Windows Vista and Firefox agreed -->
+
<tr>
- <td>All other locales
+ <td colspan=2>All other locales
<td>windows-1252
</table>
+ <p class="tablenote"><small>The contents of this table are derived from the intersection of
+ Windows, Chrome, and Firefox defaults. For locales where these disagreed, user agents are
+ encouraged to try using UTF-8, and to report if another encoding is more successful.</small></p>
+
+
</li>
</ol>
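
For readers who want to see how the revised table might be consulted, here is a minimal sketch in Python. It is not part of the commit and not spec text. The dictionary holds only the rows visible in the "source" hunk above. The function name, the lowercasing of tags, and the matching strategy (exact tag first, then the primary language subtag, then the "All other locales" row) are assumptions made for illustration; the diff itself does not say how a user agent should match a locale against the table.

```python
# Illustrative only: rows copied from the table in the diff above.
# The lookup strategy is an assumption, not something the spec text defines.
SUGGESTED_DEFAULTS = {
    "ar": "windows-1256",
    "bg": "windows-1251",
    "cs": "windows-1250",
    "et": "windows-1257",
    "fa": "windows-1256",
    "he": "windows-1255",
    "hr": "windows-1250",
    "hu": "ISO-8859-2",
    "ja": "Shift_JIS",
    "ko": "windows-949",
    "ku": "windows-1254",
    "lt": "windows-1257",
    "lv": "windows-1257",
    "pl": "ISO-8859-2",
    "ru": "windows-1251",
    "sk": "windows-1250",
    "sl": "ISO-8859-2",
    "sr": "windows-1251",
    "th": "windows-874",
    "tr": "windows-1254",
    "uk": "windows-1251",
    "vi": "windows-1258",
    "zh-CN": "GB18030",
    "zh-TW": "Big5",
}

# The "All other locales" row.
FALLBACK = "windows-1252"

# Case-insensitive view of the table (BCP 47 tags are case-insensitive).
_TABLE = {tag.lower(): enc for tag, enc in SUGGESTED_DEFAULTS.items()}


def suggested_default_encoding(locale: str) -> str:
    """Return the suggested default encoding for a BCP 47 language tag.

    Assumed matching order (hypothetical): exact tag, then primary
    language subtag, then the fallback row.
    """
    tag = locale.strip().lower()
    if tag in _TABLE:
        return _TABLE[tag]
    primary = tag.split("-", 1)[0]
    return _TABLE.get(primary, FALLBACK)


if __name__ == "__main__":
    for example in ("ru", "zh-TW", "pl-PL", "en-US", "zh-HK"):
        print(example, suggested_default_encoding(example))
```

Under these assumptions a regional tag such as pl-PL inherits the row for pl, and a tag with no matching row (for example zh-HK, which the comments above note is deliberately not listed) falls through to windows-1252. A real implementation might reasonably choose a different matching rule.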