[html5] r4126 - [giow] (2) List the default encodings by locale.

whatwg at whatwg.org
Tue Oct 13 03:45:03 PDT 2009


Author: ianh
Date: 2009-10-13 03:45:01 -0700 (Tue, 13 Oct 2009)
New Revision: 4126

Modified:
   complete.html
   index
   source
Log:
[giow] (2) List the default encodings by locale.

Modified: complete.html
===================================================================
--- complete.html	2009-10-13 09:44:17 UTC (rev 4125)
+++ complete.html	2009-10-13 10:45:01 UTC (rev 4126)
@@ -69445,15 +69445,121 @@
 
    </li>
 
-   <li><p>Otherwise, return an implementation-defined or
-   user-specified default character encoding, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
-   <i>tentative</i>. In controlled environments or in environments
-   where the encoding of documents can be prescribed (for example, for
-   user agents intended for dedicated use in new networks), the more
-   comprehensive <code title="">UTF-8</code> encoding is
-   suggested. Due to its use in legacy content, <code title="">windows-1252</code> is suggested as a default in
-   predominantly Western locales instead.</li>
+   <li>
 
+    <p>Otherwise, return an implementation-defined or user-specified
+    default character encoding, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
+    <i>tentative</i>.</p>
+
+    <p>In controlled environments or in environments where the
+    encoding of documents can be prescribed (for example, for user
+    agents intended for dedicated use in new networks), the
+    comprehensive <code title="">UTF-8</code> encoding is
+    suggested.</p>
+
+    <p>In other environments, the default encoding is typically
+    dependent on the user's locale (an approximation of the languages,
+    and thus typically encodings, of the pages that the user is likely
+    to frequent). The following table gives suggested defaults based
+    on the user's locale, for compatibility with legacy content:</p>
+
+    <!-- based on mozilla 1.9.1 localizations: 
+         http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+
+    <table><thead><tr><th>Locale
+       <th>Suggested default encoding
+     <tbody><tr><td>ar
+       <td>UTF-8
+
+      <tr><td>be
+       <td>ISO-8859-5
+
+      <tr><td>bg
+       <td>windows-1251
+
+      <tr><td>cs<!-- -CZ -->
+       <td>ISO-8859-2
+
+      <tr><td>cy
+       <td>UTF-8
+
+      <tr><td>fa<!-- -IR -->
+       <td>UTF-8
+
+      <tr><td>he<!-- -IL -->
+       <td>windows-1255
+
+      <tr><td>hr
+       <td>UTF-8
+
+      <tr><td>hu<!-- -HU -->
+       <td>ISO-8859-2
+
+      <tr><td>ja <!-- and ja-JP-mac -->
+       <td>windows-31J <!-- Shift_JIS -->
+
+      <tr><td>kk
+       <td>UTF-8
+
+      <tr><td>ko<!-- -KR -->
+       <td>windows-949 <!-- EUC-KR -->
+
+      <tr><td>ku
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr><td>lt
+       <td>windows-1257
+
+      <tr><td>lv<!-- -LV -->
+       <td>ISO-8859-13
+
+      <tr><td>mk<!-- -MK -->
+       <td>UTF-8
+
+      <tr><td>or
+       <td>UTF-8
+
+      <tr><td>pl<!-- -PL -->
+       <td>ISO-8859-2
+
+      <tr><td>ro
+       <td>UTF-8
+
+      <tr><td>ru
+       <td>windows-1251
+
+      <tr><td>sk
+       <td>windows-1250
+
+      <tr><td>sl
+       <td>ISO-8859-2
+
+      <tr><td>sr
+       <td>UTF-8
+
+      <tr><td>th
+       <td>windows-874 <!-- TIS-620 -->
+
+      <tr><td>tr<!-- -TR -->
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr><td>uk
+       <td>windows-1251
+
+      <tr><td>vi
+       <td>UTF-8
+
+      <tr><td>zh-CN
+       <td>GB18030
+
+      <tr><td>zh-TW
+       <td>Big5
+
+      <tr><td>All other locales
+       <td>windows-1252
+
+    </table></li>
+
   </ol><p>The <a href="#document's-character-encoding">document's character encoding</a> must immediately
   be set to the value returned from this algorithm, at the same time
   as the user agent uses the returned value to select the decoder to

Modified: index
===================================================================
--- index	2009-10-13 09:44:17 UTC (rev 4125)
+++ index	2009-10-13 10:45:01 UTC (rev 4126)
@@ -60466,15 +60466,121 @@
 
    </li>
 
-   <li><p>Otherwise, return an implementation-defined or
-   user-specified default character encoding, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
-   <i>tentative</i>. In controlled environments or in environments
-   where the encoding of documents can be prescribed (for example, for
-   user agents intended for dedicated use in new networks), the more
-   comprehensive <code title="">UTF-8</code> encoding is
-   suggested. Due to its use in legacy content, <code title="">windows-1252</code> is suggested as a default in
-   predominantly Western locales instead.</li>
+   <li>
 
+    <p>Otherwise, return an implementation-defined or user-specified
+    default character encoding, with the <a href=#concept-encoding-confidence title=concept-encoding-confidence>confidence</a>
+    <i>tentative</i>.</p>
+
+    <p>In controlled environments or in environments where the
+    encoding of documents can be prescribed (for example, for user
+    agents intended for dedicated use in new networks), the
+    comprehensive <code title="">UTF-8</code> encoding is
+    suggested.</p>
+
+    <p>In other environments, the default encoding is typically
+    dependent on the user's locale (an approximation of the languages,
+    and thus typically encodings, of the pages that the user is likely
+    to frequent). The following table gives suggested defaults based
+    on the user's locale, for compatibility with legacy content:</p>
+
+    <!-- based on mozilla 1.9.1 localizations: 
+         http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+
+    <table><thead><tr><th>Locale
+       <th>Suggested default encoding
+     <tbody><tr><td>ar
+       <td>UTF-8
+
+      <tr><td>be
+       <td>ISO-8859-5
+
+      <tr><td>bg
+       <td>windows-1251
+
+      <tr><td>cs<!-- -CZ -->
+       <td>ISO-8859-2
+
+      <tr><td>cy
+       <td>UTF-8
+
+      <tr><td>fa<!-- -IR -->
+       <td>UTF-8
+
+      <tr><td>he<!-- -IL -->
+       <td>windows-1255
+
+      <tr><td>hr
+       <td>UTF-8
+
+      <tr><td>hu<!-- -HU -->
+       <td>ISO-8859-2
+
+      <tr><td>ja <!-- and ja-JP-mac -->
+       <td>windows-31J <!-- Shift_JIS -->
+
+      <tr><td>kk
+       <td>UTF-8
+
+      <tr><td>ko<!-- -KR -->
+       <td>windows-949 <!-- EUC-KR -->
+
+      <tr><td>ku
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr><td>lt
+       <td>windows-1257
+
+      <tr><td>lv<!-- -LV -->
+       <td>ISO-8859-13
+
+      <tr><td>mk<!-- -MK -->
+       <td>UTF-8
+
+      <tr><td>or
+       <td>UTF-8
+
+      <tr><td>pl<!-- -PL -->
+       <td>ISO-8859-2
+
+      <tr><td>ro
+       <td>UTF-8
+
+      <tr><td>ru
+       <td>windows-1251
+
+      <tr><td>sk
+       <td>windows-1250
+
+      <tr><td>sl
+       <td>ISO-8859-2
+
+      <tr><td>sr
+       <td>UTF-8
+
+      <tr><td>th
+       <td>windows-874 <!-- TIS-620 -->
+
+      <tr><td>tr<!-- -TR -->
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr><td>uk
+       <td>windows-1251
+
+      <tr><td>vi
+       <td>UTF-8
+
+      <tr><td>zh-CN
+       <td>GB18030
+
+      <tr><td>zh-TW
+       <td>Big5
+
+      <tr><td>All other locales
+       <td>windows-1252
+
+    </table></li>
+
   </ol><p>The <a href="#document's-character-encoding">document's character encoding</a> must immediately
   be set to the value returned from this algorithm, at the same time
   as the user agent uses the returned value to select the decoder to

Modified: source
===================================================================
--- source	2009-10-13 09:44:17 UTC (rev 4125)
+++ source	2009-10-13 10:45:01 UTC (rev 4126)
@@ -78202,17 +78202,159 @@
 
    </li>
 
-   <li><p>Otherwise, return an implementation-defined or
-   user-specified default character encoding, with the <span
-   title="concept-encoding-confidence">confidence</span>
-   <i>tentative</i>. In controlled environments or in environments
-   where the encoding of documents can be prescribed (for example, for
-   user agents intended for dedicated use in new networks), the more
-   comprehensive <code title="">UTF-8</code> encoding is
-   suggested. Due to its use in legacy content, <code
-   title="">windows-1252</code> is suggested as a default in
-   predominantly Western locales instead.</p></li>
+   <li>
 
+    <p>Otherwise, return an implementation-defined or user-specified
+    default character encoding, with the <span
+    title="concept-encoding-confidence">confidence</span>
+    <i>tentative</i>.</p>
+
+    <p>In controlled environments or in environments where the
+    encoding of documents can be prescribed (for example, for user
+    agents intended for dedicated use in new networks), the
+    comprehensive <code title="">UTF-8</code> encoding is
+    suggested.</p>
+
+    <p>In other environments, the default encoding is typically
+    dependent on the user's locale (an approximation of the languages,
+    and thus typically encodings, of the pages that the user is likely
+    to frequent). The following table gives suggested defaults based
+    on the user's locale, for compatibility with legacy content:</p>
+
+    <!-- based on mozilla 1.9.1 localizations: 
+         http://mxr.mozilla.org/l10n-mozilla1.9.1/find?string=global%2Fintl.properties&tree=l10n-mozilla1.9.1&hint= -->
+
+    <table>
+     <thead>
+      <tr>
+       <th>Locale
+       <th>Suggested default encoding
+     <tbody>
+
+      <tr>
+       <td>ar
+       <td>UTF-8
+
+      <tr>
+       <td>be
+       <td>ISO-8859-5
+
+      <tr>
+       <td>bg
+       <td>windows-1251
+
+      <tr>
+       <td>cs<!-- -CZ -->
+       <td>ISO-8859-2
+
+      <tr>
+       <td>cy
+       <td>UTF-8
+
+      <tr>
+       <td>fa<!-- -IR -->
+       <td>UTF-8
+
+      <tr>
+       <td>he<!-- -IL -->
+       <td>windows-1255
+
+      <tr>
+       <td>hr
+       <td>UTF-8
+
+      <tr>
+       <td>hu<!-- -HU -->
+       <td>ISO-8859-2
+
+      <tr>
+       <td>ja <!-- and ja-JP-mac -->
+       <td>windows-31J <!-- Shift_JIS -->
+
+      <tr>
+       <td>kk
+       <td>UTF-8
+
+      <tr>
+       <td>ko<!-- -KR -->
+       <td>windows-949 <!-- EUC-KR -->
+
+      <tr>
+       <td>ku
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr>
+       <td>lt
+       <td>windows-1257
+
+      <tr>
+       <td>lv<!-- -LV -->
+       <td>ISO-8859-13
+
+      <tr>
+       <td>mk<!-- -MK -->
+       <td>UTF-8
+
+      <tr>
+       <td>or
+       <td>UTF-8
+
+      <tr>
+       <td>pl<!-- -PL -->
+       <td>ISO-8859-2
+
+      <tr>
+       <td>ro
+       <td>UTF-8
+
+      <tr>
+       <td>ru
+       <td>windows-1251
+
+      <tr>
+       <td>sk
+       <td>windows-1250
+
+      <tr>
+       <td>sl
+       <td>ISO-8859-2
+
+      <tr>
+       <td>sr
+       <td>UTF-8
+
+      <tr>
+       <td>th
+       <td>windows-874 <!-- TIS-620 -->
+
+      <tr>
+       <td>tr<!-- -TR -->
+       <td>windows-1254 <!-- ISO-8859-9 -->
+
+      <tr>
+       <td>uk
+       <td>windows-1251
+
+      <tr>
+       <td>vi
+       <td>UTF-8
+
+      <tr>
+       <td>zh-CN
+       <td>GB18030
+
+      <tr>
+       <td>zh-TW
+       <td>Big5
+
+      <tr>
+       <td>All other locales
+       <td>windows-1252
+
+    </table>
+
+   </li>
+
   </ol>
 
   <p>The <span>document's character encoding</span> must immediately
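
For readers who want to see how the table added in this revision might be consulted, here is a minimal sketch in Python. It is not part of the commit or of the specification. The names `SUGGESTED_DEFAULTS`, `FALLBACK`, and `suggested_default_encoding` are hypothetical, and the matching strategy (case-insensitive comparison, accepting `_` as a separator, and falling back from a full tag such as `ru-RU` to its primary language subtag) is an assumption; the table itself does not say how locale tags are to be matched.

```python
# Suggested defaults copied from the table in r4126.
# Keys are lowercased so that lookups can be case-insensitive (an assumption).
SUGGESTED_DEFAULTS = {
    "ar": "UTF-8",
    "be": "ISO-8859-5",
    "bg": "windows-1251",
    "cs": "ISO-8859-2",
    "cy": "UTF-8",
    "fa": "UTF-8",
    "he": "windows-1255",
    "hr": "UTF-8",
    "hu": "ISO-8859-2",
    "ja": "windows-31J",
    "kk": "UTF-8",
    "ko": "windows-949",
    "ku": "windows-1254",
    "lt": "windows-1257",
    "lv": "ISO-8859-13",
    "mk": "UTF-8",
    "or": "UTF-8",
    "pl": "ISO-8859-2",
    "ro": "UTF-8",
    "ru": "windows-1251",
    "sk": "windows-1250",
    "sl": "ISO-8859-2",
    "sr": "UTF-8",
    "th": "windows-874",
    "tr": "windows-1254",
    "uk": "windows-1251",
    "vi": "UTF-8",
    "zh-cn": "GB18030",
    "zh-tw": "Big5",
}

# The "All other locales" row of the table.
FALLBACK = "windows-1252"


def suggested_default_encoding(locale: str) -> str:
    """Return the suggested default encoding for a locale tag.

    Matching order (an assumption, not specified by the table):
    1. the full tag, e.g. "zh-TW";
    2. the primary language subtag, e.g. "ru" for "ru-RU";
    3. the "All other locales" fallback.
    """
    tag = locale.strip().replace("_", "-").lower()
    if tag in SUGGESTED_DEFAULTS:
        return SUGGESTED_DEFAULTS[tag]
    primary = tag.split("-", 1)[0]
    return SUGGESTED_DEFAULTS.get(primary, FALLBACK)
```

Under these assumptions, `suggested_default_encoding("ru-RU")` resolves through the `ru` row, while a bare `zh` has no row of its own and falls through to the `windows-1252` default. A real user agent would return the result with the confidence tentative, as the surrounding algorithm step requires.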