[html5] r4282 - [a] (0) discourage use of HZ-GB-2312; explain why.
whatwg at whatwg.org
Thu Oct 22 20:13:39 PDT 2009
Author: ianh
Date: 2009-10-22 20:13:34 -0700 (Thu, 22 Oct 2009)
New Revision: 4282
Modified:
complete.html
index
source
Log:
[a] (0) discourage use of HZ-GB-2312; explain why.
Modified: complete.html
===================================================================
--- complete.html 2009-10-23 02:34:24 UTC (rev 4281)
+++ complete.html 2009-10-23 03:13:34 UTC (rev 4282)
@@ -11888,12 +11888,13 @@
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>
<p>Authors should not use JIS-X-0208 <!-- x-JIS0208 -->
- (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on
- ISO-2022<!-- http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422
- -->, and encodings based on EBCDIC. Authors should not use
- UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU
- encodings.
+ (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), HZ-GB-2312<!-- has
+ crazy handling of ASCII "~" -->, encodings based on ISO-2022<!--
+ http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422 -->, and
+ encodings based on EBCDIC. Authors should not use UTF-32.
+ Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings.
<a href=#refsRFC1345>[RFC1345]</a><!-- for the JIS types -->
+ <a href=#refsRFC1842>[RFC1842]</a><!-- HZ-GB-2312 -->
<a href=#refsRFC1468>[RFC1468]</a><!-- ISO-2022-JP -->
<a href=#refsRFC2237>[RFC2237]</a><!-- ISO-2022-JP-1 -->
<a href=#refsRFC1554>[RFC1554]</a><!-- ISO-2022-JP-2 -->
@@ -11907,8 +11908,18 @@
<!-- no idea what to reference for EBCDIC, so... -->
</p>
+ <p class=note>Most of these encodings are discouraged because of
+ security concerns. If a hostile user can contribute text to a site
+ using these encodings, bugs in the site's whitelisting filter or in
+ a user agent can easily lead to the filter interpreting the
+ contribution as "safe" while the user agent interprets the same
+ contribution as containing a <code><a href=#script>script</a></code> element. This would
+ enable cross-site scripting attacks. By avoiding these encodings,
+ and always providing a <a href=#character-encoding-declaration>character encoding declaration</a>,
+ an author is less likely to run into this kind of problem.</p>
+
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise against authors using legacy encodings.</p>
+ advise authors against using legacy encodings.</p>
<div class=impl>
@@ -86522,6 +86533,13 @@
Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF,
December 1993.</dd>
+ <dt id=refsRFC1842>[RFC1842]</dt>
+
+ <dd><cite><a href=http://www.ietf.org/rfc/rfc1842.txt>ASCII
+ Printable Characters-Based Chinese Character Encoding for Internet
+ Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang.
+ IETF, August 1995.</dd>
+
<dt id=refsRFC1922>[RFC1922]</dt>
<dd><cite><a href=http://www.ietf.org/rfc/rfc1922.txt>Chinese Character
Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao,
Modified: index
===================================================================
--- index 2009-10-23 02:34:24 UTC (rev 4281)
+++ index 2009-10-23 03:13:34 UTC (rev 4282)
@@ -11718,12 +11718,13 @@
<a href=#ascii-compatible-character-encoding>ASCII-compatible character encoding</a>.</p>
<p>Authors should not use JIS-X-0208 <!-- x-JIS0208 -->
- (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on
- ISO-2022<!-- http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422
- -->, and encodings based on EBCDIC. Authors should not use
- UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU
- encodings.
+ (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), HZ-GB-2312<!-- has
+ crazy handling of ASCII "~" -->, encodings based on ISO-2022<!--
+ http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422 -->, and
+ encodings based on EBCDIC. Authors should not use UTF-32.
+ Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings.
<a href=#refsRFC1345>[RFC1345]</a><!-- for the JIS types -->
+ <a href=#refsRFC1842>[RFC1842]</a><!-- HZ-GB-2312 -->
<a href=#refsRFC1468>[RFC1468]</a><!-- ISO-2022-JP -->
<a href=#refsRFC2237>[RFC2237]</a><!-- ISO-2022-JP-1 -->
<a href=#refsRFC1554>[RFC1554]</a><!-- ISO-2022-JP-2 -->
@@ -11737,8 +11738,18 @@
<!-- no idea what to reference for EBCDIC, so... -->
</p>
+ <p class=note>Most of these encodings are discouraged because of
+ security concerns. If a hostile user can contribute text to a site
+ using these encodings, bugs in the site's whitelisting filter or in
+ a user agent can easily lead to the filter interpreting the
+ contribution as "safe" while the user agent interprets the same
+ contribution as containing a <code><a href=#script>script</a></code> element. This would
+ enable cross-site scripting attacks. By avoiding these encodings,
+ and always providing a <a href=#character-encoding-declaration>character encoding declaration</a>,
+ an author is less likely to run into this kind of problem.</p>
+
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise against authors using legacy encodings.</p>
+ advise authors against using legacy encodings.</p>
<div class=impl>
@@ -77700,6 +77711,13 @@
Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF,
December 1993.</dd>
+ <dt id=refsRFC1842>[RFC1842]</dt>
+
+ <dd><cite><a href=http://www.ietf.org/rfc/rfc1842.txt>ASCII
+ Printable Characters-Based Chinese Character Encoding for Internet
+ Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang.
+ IETF, August 1995.</dd>
+
<dt id=refsRFC1922>[RFC1922]</dt>
<dd><cite><a href=http://www.ietf.org/rfc/rfc1922.txt>Chinese Character
Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao,
Modified: source
===================================================================
--- source 2009-10-23 02:34:24 UTC (rev 4281)
+++ source 2009-10-23 03:13:34 UTC (rev 4282)
@@ -12379,12 +12379,13 @@
<span>ASCII-compatible character encoding</span>.</p>
<p>Authors should not use JIS-X-0208 <!-- x-JIS0208 -->
- (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), encodings based on
- ISO-2022<!-- http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422
- -->, and encodings based on EBCDIC. Authors should not use
- UTF-32. Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU
- encodings.
+ (JIS_C6226-1983), JIS-X-0212 (JIS_X0212-1990), HZ-GB-2312<!-- has
+ crazy handling of ASCII "~" -->, encodings based on ISO-2022<!--
+ http://krijnhoetmer.nl/irc-logs/whatwg/20090628#l-422 -->, and
+ encodings based on EBCDIC. Authors should not use UTF-32.
+ Authors must not use the CESU-8, UTF-7, BOCU-1 and SCSU encodings.
<a href="#refsRFC1345">[RFC1345]</a><!-- for the JIS types -->
+ <a href="#refsRFC1842">[RFC1842]</a><!-- HZ-GB-2312 -->
<a href="#refsRFC1468">[RFC1468]</a><!-- ISO-2022-JP -->
<a href="#refsRFC2237">[RFC2237]</a><!-- ISO-2022-JP-1 -->
<a href="#refsRFC1554">[RFC1554]</a><!-- ISO-2022-JP-2 -->
@@ -12398,8 +12399,18 @@
<!-- no idea what to reference for EBCDIC, so... -->
</p>
+ <p class="note">Most of these encodings are discouraged because of
+ security concerns. If a hostile user can contribute text to a site
+ using these encodings, bugs in the site's whitelisting filter or in
+ a user agent can easily lead to the filter interpreting the
+ contribution as "safe" while the user agent interprets the same
+ contribution as containing a <code>script</code> element. This would
+ enable cross-site scripting attacks. By avoiding these encodings,
+ and always providing a <span>character encoding declaration</span>,
+ an author is less likely to run into this kind of problem.</p>
+
<p>Authors are encouraged to use UTF-8. Conformance checkers may
- advise against authors using legacy encodings.</p>
+ advise authors against using legacy encodings.</p>
<div class="impl">
@@ -95692,6 +95703,13 @@
Encoding for Internet Messages</a></cite>, U. Choi, K. Chon, H. Park. IETF,
December 1993.</dd>
+ <dt id="refsRFC1842">[RFC1842]</dt>
+
+ <dd><cite><a href="http://www.ietf.org/rfc/rfc1842.txt">ASCII
+ Printable Characters-Based Chinese Character Encoding for Internet
+ Messages</a></cite>, Y. Wei, Y. Zhang, J. Li, J. Ding, Y. Jiang.
+ IETF, August 1995.</dd>
+
<dt id="refsRFC1922">[RFC1922]</dt>
<dd><cite><a href="http://www.ietf.org/rfc/rfc1922.txt">Chinese Character
Encoding for Internet Messages</a></cite>, HF. Zhu, DY. Hu, ZG. Wang, TC. Kao,
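
The note added in this revision describes a filter and a user agent disagreeing about what the same bytes mean. A minimal sketch of that failure mode for HZ-GB-2312 follows; it is an illustration, not part of the commit. It assumes Python's built-in "hz" codec as a stand-in for a user agent's decoder, and the filter function and payload are hypothetical. It relies on HZ treating "~" followed by a newline as a line continuation that the decoder drops.

```python
# Hypothetical byte-level filter: it inspects the raw bytes as if they were
# plain ASCII and rejects anything containing "<script".
def naive_filter_allows(raw: bytes) -> bool:
    return b"<script" not in raw.lower()


# Hypothetical hostile contribution. In HZ, "~\n" is a line continuation,
# so an HZ-aware decoder removes it and joins "<scr" and "ipt>".
payload = b"<scr~\nipt>alert(1)</scr~\nipt>"

# Python's "hz" codec stands in for the user agent's decoder here.
seen_by_user_agent = payload.decode("hz")

print("filter allows payload:", naive_filter_allows(payload))
print("user agent sees:", repr(seen_by_user_agent))
```

The filter and the decoder each behave reasonably on their own; the problem is only that they do not agree on how "~" is handled, which is the kind of mismatch the note warns about.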