[html5] r1661 - /
whatwg at whatwg.org
Thu May 22 02:21:59 PDT 2008
Author: ianh
Date: 2008-05-22 02:21:58 -0700 (Thu, 22 May 2008)
New Revision: 1661
Modified:
index
source
Log:
[c] (0) Encoding aliases for CJK environments.
Modified: index
===================================================================
--- index 2008-05-21 23:40:01 UTC (rev 1660)
+++ index 2008-05-22 09:21:58 UTC (rev 1661)
@@ -25,7 +25,7 @@
<h1 id=html-5>HTML 5</h1>
- <h2 class="no-num no-toc" id=draft>Draft Recommendation — 21 May
+ <h2 class="no-num no-toc" id=draft>Draft Recommendation — 22 May
2008</h2>
<p>You can take part in this work. <a
@@ -42050,9 +42050,73 @@
control character, be considered <a href="#parse1" title="parse
error">parse errors</a>.
+ <p>In addition, when a user agent would otherwise use an encoding given in
+ the first column of the following table, it must instead use the encoding
+ given in the cell in the second column of the same row. Any bytes that are
+ treated differently due to this encoding aliasing must be considered <a
+ href="#parse1" title="parse error">parse errors</a>.
+
+ <table>
+ <caption>Encoding aliases</caption>
+
+ <thead>
+ <tr>
+ <th> Input encoding
+
+ <th> Replacement encoding
+
+ <th> References
+
+ <tbody>
+ <tr>
+ <td> GB2312
+
+ <td> GBK
+
+ <td> <a href="#refsGB2312">[GB2312]</a><!-- XXX ? --> <a
+ href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+
+
+ <tr>
+ <td> GB_2312-80
+
+ <td> GBK
+
+ <td> <a href="#refsRFC1345">[RFC1345]</a> <a
+ href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+
+
+ <tr>
+ <td> EUC-KR
+
+ <td> Windows-949
+
+ <td> <a href="#refsEUCKR">[EUCKR]</a>
+ <!-- see reference for [EUC-KR] in RFC1557 --> <a
+ href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+
+
+ <tr>
+ <td> KS_C_5601-1987
+
+ <td> Windows-949
+
+ <td> <a href="#refsRFC1345">[RFC1345]</a> <a
+ href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+
+
+ <tr>
+ <td> x-x-big5
+
+ <td> Big5
+
+ <td> <a href="#BIG5">[BIG5]</a> <!-- XXX ? -->
+ </table>
+
<p class=note>The requirement to treat certain ISO-8859 encodings as
- Windows encodings is a willful violation of the W3C Character Model
- specification. <a href="#refsCHARMOD">[CHARMOD]</a>
+ Windows encodings, and the requirement to alias certain encodings
+ according to the table above, are willful violations of the W3C Character
+ Model specification. <a href="#refsCHARMOD">[CHARMOD]</a>
<p>User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
encodings. <a href="#refsCESU8">[CESU8]</a> <a href="#refsUTF7">[UTF7]</a>
Modified: source
===================================================================
--- source 2008-05-21 23:40:01 UTC (rev 1660)
+++ source 2008-05-22 09:21:58 UTC (rev 1661)
@@ -39716,9 +39716,40 @@
encoding instead of as a control character, be considered <span
title="parse error">parse errors</span>.</p>
+ <p>In addition, when a user agent would otherwise use an encoding
+ given in the first column of the following table, it must instead
+ use the encoding given in the cell in the second column of the same
+ row. Any bytes that are treated differently due to this encoding
+ aliasing must be considered <span title="parse error">parse
+ errors</span>.</p>
+
+ <table>
+ <caption>Encoding aliases</caption>
+ <thead>
+ <tr> <th> Input encoding <th> Replacement encoding <th> References
+ <tbody>
+ <tr> <td> GB2312 <td> GBK <td>
+ <a href="#refsGB2312">[GB2312]</a><!-- XXX ? -->
+ <a href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+ <tr> <td> GB_2312-80 <td> GBK <td>
+ <a href="#refsRFC1345">[RFC1345]</a>
+ <a href="#refsGBK">[GBK]</a><!-- http://www.iana.org/assignments/charset-reg/GBK -->
+ <tr> <td> EUC-KR <td> Windows-949 <td>
+ <a href="#refsEUCKR">[EUCKR]</a> <!-- see reference for [EUC-KR] in RFC1557 -->
+ <a href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+ <tr> <td> KS_C_5601-1987 <td> Windows-949 <td>
+ <a href="#refsRFC1345">[RFC1345]</a>
+ <a href="#refsWin949">[WIN949]</a><!-- http://www.microsoft.com/globaldev/reference/dbcs/949.mspx -->
+ <tr> <td> x-x-big5 <td> Big5 <td>
+ <a href="#BIG5">[BIG5]</a> <!-- XXX ? -->
+ </tbody>
+ </table>
+
<p class="note">The requirement to treat certain ISO-8859 encodings
- as Windows encodings is a willful violation of the W3C Character
- Model specification. <a href="#refsCHARMOD">[CHARMOD]</a></p>
+ as Windows encodings, and the requirement to alias certain encodings
+ according to the table above, are willful violations of the W3C
+ Character Model specification. <a
+ href="#refsCHARMOD">[CHARMOD]</a></p>
<p>User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
encodings. <a href="#refsCESU8">[CESU8]</a> <a
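
The aliasing rule added in this revision can be sketched as a small lookup. This is only an illustration, not part of the commit: the table entries are transcribed from the diff above, while the function names, the case-insensitive label matching, and the mapping to Python codec names (gbk, cp949, big5) are assumptions made for the example. It does not report the parse errors the text requires for bytes that are treated differently.

```python
# Alias table transcribed from the diff: input encoding -> replacement encoding.
# Keys are lowercased; case-insensitive matching is an assumption of this sketch.
ENCODING_ALIASES = {
    "gb2312": "GBK",
    "gb_2312-80": "GBK",
    "euc-kr": "Windows-949",
    "ks_c_5601-1987": "Windows-949",
    "x-x-big5": "Big5",
}

# Assumption: the Python codec that corresponds to each replacement encoding.
PYTHON_CODECS = {"GBK": "gbk", "Windows-949": "cp949", "Big5": "big5"}


def resolve_encoding(label: str) -> str:
    """Return the replacement encoding for label, or label unchanged."""
    return ENCODING_ALIASES.get(label.strip().lower(), label)


def decode_with_aliases(data: bytes, label: str) -> str:
    """Decode data using the replacement encoding for the declared label."""
    resolved = resolve_encoding(label)
    codec = PYTHON_CODECS.get(resolved, resolved)
    return data.decode(codec)
```

For instance, a page labelled GB2312 would be decoded as GBK, so bytes that encode a character found only in the larger GBK repertoire would still decode rather than fail. Labels that are not in the table pass through unchanged.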