[html5] r1661 - /

whatwg at whatwg.org
Thu May 22 02:21:59 PDT 2008


Author: ianh
Date: 2008-05-22 02:21:58 -0700 (Thu, 22 May 2008)
New Revision: 1661

Modified:
   index
   source
Log:
[c] (0) Encoding aliases for CJK environments.

Modified: index
===================================================================
--- index	2008-05-21 23:40:01 UTC (rev 1660)
+++ index	2008-05-22 09:21:58 UTC (rev 1661)
@@ -25,7 +25,7 @@
 
    <h1 id=html-5>HTML 5</h1>
 
-   <h2 class="no-num no-toc" id=draft>Draft Recommendation — 21 May
+   <h2 class="no-num no-toc" id=draft>Draft Recommendation — 22 May
     2008</h2>
 
    <p>You can take part in this work. <a
@@ -42050,9 +42050,73 @@
    control character, be considered <a href="#parse1" title="parse
    error">parse errors</a>.
 
+  <p>In addition, when a user agent would otherwise use an encoding given in
+   the first column of the following table, it must instead use the encoding
+   given in the cell in the second column of the same row. Any bytes that are
+   treated differently due to this encoding aliasing must be considered <a
+   href="#parse1" title="parse error">parse errors</a>.
+
+  <table>
+   <caption>Encoding aliases</caption>
+
+   <thead>
+    <tr>
+     <th> Input encoding
+
+     <th> Replacement encoding
+
+     <th> References
+
+   <tbody>
+    <tr>
+     <td> GB2312
+
+     <td> GBK
+
+     <td> <a href="#refsGB2312">[GB2312]</a><!-- XXX ? --> <a
+      href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+      
+
+    <tr>
+     <td> GB_2312-80
+
+     <td> GBK
+
+     <td> <a href="#refsRFC1345">[RFC1345]</a> <a
+      href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+      
+
+    <tr>
+     <td> EUC-KR
+
+     <td> Windows-949
+
+     <td> <a href="#refsEUCKR">[EUCKR]</a>
+      <!-- see reference for [EUC-KR] in RFC1557 --> <a
+      href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+      
+
+    <tr>
+     <td> KS_C_5601-1987
+
+     <td> Windows-949
+
+     <td> <a href="#refsRFC1345">[RFC1345]</a> <a
+      href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+      
+
+    <tr>
+     <td> x-x-big5
+
+     <td> Big5
+
+     <td> <a href="#BIG5">[BIG5]</a> <!-- XXX ? -->
+  </table>
+
   <p class=note>The requirement to treat certain ISO-8859 encodings as
-   Windows encodings is a willful violation of the W3C Character Model
-   specification. <a href="#refsCHARMOD">[CHARMOD]</a>
+   Windows encodings, and the requirement to alias certain encodings
+   according to the table above, are willful violations of the W3C Character
+   Model specification. <a href="#refsCHARMOD">[CHARMOD]</a>
 
   <p>User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
    encodings. <a href="#refsCESU8">[CESU8]</a> <a href="#refsUTF7">[UTF7]</a>

Modified: source
===================================================================
--- source	2008-05-21 23:40:01 UTC (rev 1660)
+++ source	2008-05-22 09:21:58 UTC (rev 1661)
@@ -39716,9 +39716,40 @@
   encoding instead of as a control character, be considered <span
   title="parse error">parse errors</span>.</p>
 
+  <p>In addition, when a user agent would otherwise use an encoding
+  given in the first column of the following table, it must instead
+  use the encoding given in the cell in the second column of the same
+  row. Any bytes that are treated differently due to this encoding
+  aliasing must be considered <span title="parse error">parse
+  errors</span>.</p>
+
+  <table>
+   <caption>Encoding aliases</caption>
+   <thead>
+    <tr> <th> Input encoding <th> Replacement encoding <th> References
+   <tbody>
+    <tr> <td> GB2312 <td> GBK <td>
+         <a href="#refsGB2312">[GB2312]</a><!-- XXX ? -->
+         <a href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+    <tr> <td> GB_2312-80 <td> GBK <td>
+         <a href="#refsRFC1345">[RFC1345]</a>
+         <a href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+    <tr> <td> EUC-KR <td> Windows-949 <td>
+         <a href="#refsEUCKR">[EUCKR]</a> <!-- see reference for [EUC-KR] in RFC1557 -->
+         <a href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+    <tr> <td> KS_C_5601-1987 <td> Windows-949 <td>
+         <a href="#refsRFC1345">[RFC1345]</a>
+         <a href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+    <tr> <td> x-x-big5 <td> Big5 <td>
+         <a href="#BIG5">[BIG5]</a> <!-- XXX ? -->
+   </tbody>
+  </table>
+
   <p class="note">The requirement to treat certain ISO-8859 encodings
-  as Windows encodings is a willful violation of the W3C Character
-  Model specification. <a href="#refsCHARMOD">[CHARMOD]</a></p>
+  as Windows encodings, and the requirement to alias certain encodings
+  according to the table above, are willful violations of the W3C
+  Character Model specification. <a
+  href="#refsCHARMOD">[CHARMOD]</a></p>
 
   <p>User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
   encodings. <a href="#refsCESU8">[CESU8]</a> <a
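
Illustration (not part of the commit): the aliasing rule added in this
revision can be sketched as a lookup table. This is a minimal sketch, not
the specification's algorithm. The names ENCODING_ALIASES, resolve_encoding
and differs_under_alias are hypothetical; case-insensitive label matching is
an assumption of the sketch; and the whole-buffer comparison only roughly
approximates the "bytes that are treated differently" condition, using
Python's built-in "gb2312" and "gbk" codecs as stand-ins.

```python
# Alias table transcribed from the "Encoding aliases" table in r1661.
# Keys are lowercased so lookups can ignore case (an assumption of this
# sketch; the commit text does not say how labels are compared).
ENCODING_ALIASES = {
    "gb2312": "GBK",
    "gb_2312-80": "GBK",
    "euc-kr": "Windows-949",
    "ks_c_5601-1987": "Windows-949",
    "x-x-big5": "Big5",
}


def resolve_encoding(label: str) -> str:
    """Return the replacement encoding for an aliased label.

    Labels that are not in the table are returned unchanged.
    """
    return ENCODING_ALIASES.get(label.strip().lower(), label)


def differs_under_alias(data: bytes, declared_codec: str,
                        replacement_codec: str) -> bool:
    """Report whether data decodes differently under the replacement codec.

    Undecodable bytes become U+FFFD in both decodes, so the comparison
    never raises. A True result marks input that the new rule would flag
    as a parse error, judged over the whole buffer rather than per byte.
    """
    declared = data.decode(declared_codec, errors="replace")
    replaced = data.decode(replacement_codec, errors="replace")
    return declared != replaced
```

For example, resolve_encoding("EUC-KR") yields "Windows-949", and the byte
pair 0x81 0x40, which is valid in GBK but outside the GB2312 range, makes
differs_under_alias(b"\x81\x40", "gb2312", "gbk") return True.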



