[html5] r1263 - /
whatwg at whatwg.org
Wed Feb 27 12:43:42 PST 2008
Author: ianh
Date: 2008-02-27 12:43:37 -0800 (Wed, 27 Feb 2008)
New Revision: 1263
Modified:
index
source
Log:
[ac] (1) Make control characters and non-Unicode characters be parse errors, for compatibility with XML.
Modified: index
===================================================================
--- index 2008-02-27 19:35:50 UTC (rev 1262)
+++ index 2008-02-27 20:43:37 UTC (rev 1263)
@@ -37154,7 +37154,9 @@
href="#charset">character encoding declarations</a> are to be serialised,
as discussed in the section on that topic.
- <p>The U+0000 NULL character must not appear anywhere in a document.
+ <p>The U+0000 NULL character, control characters other than the <a
+ href="#space" title="space character">space characters</a>, and characters
+ that are not defined by Unicode, must not appear anywhere in a document.
<p class=note>Space characters before the root <code><a
href="#html">html</a></code> element will be dropped when the document is
@@ -38428,6 +38430,21 @@
REPLACEMENT CHARACTERs. Any occurrences of such characters is a <a
href="#parse0">parse error</a>.
+ <p>Any occurances of any characters in the ranges U+0001 to U+0008,
+ <!-- space characters allowed --> U+000E to U+001F, <!-- ASCII
+ allowed -->
+ U+007F <!--to U+0084, (U+0085 NEL not allowed),
+ U+0086--> to U+009F,
+ U+D800 to U+DFFF <!-- surrogates not allowed
+ -->, U+FDD0 to U+FDDF, and
+ characters U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE,
+ U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE,
+ U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE,
+ U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
+ U+FFFFF, U+10FFFE, and U+10FFFF are <a href="#parse0" title="parse
+ error">parse errors</a>. (These are all control characters or permanently
+ undefined Unicode characters.)
+
<p>U+000D CARRIAGE RETURN (CR) characters, and U+000A LINE FEED (LF)
characters, are treated specially. Any CR characters that are followed by
LF characters must be removed, and any CR characters not followed by LF
Modified: source
===================================================================
--- source 2008-02-27 19:35:50 UTC (rev 1262)
+++ source 2008-02-27 20:43:37 UTC (rev 1263)
@@ -34677,8 +34677,10 @@
href="#charset">character encoding declarations</a> are to be
serialised, as discussed in the section on that topic.</p>
- <p>The U+0000 NULL character must not appear anywhere in a
- document.</p>
+ <p>The U+0000 NULL character, control characters other than the
+ <span title="space character">space characters</span>, and
+ characters that are not defined by Unicode, must not appear anywhere
+ in a document.</p>
<p class="note">Space characters before the root <code>html</code>
element will be dropped when the document is parsed; space
@@ -35997,6 +35999,19 @@
U+FFFD REPLACEMENT CHARACTERs. Any occurrences of such characters is
a <span>parse error</span>.</p>
+ <p>Any occurances of any characters in the ranges U+0001 to U+0008,
+ <!-- space characters allowed --> U+000E to U+001F, <!-- ASCII
+ allowed --> U+007F <!--to U+0084, (U+0085 NEL not allowed),
+ U+0086--> to U+009F, U+D800 to U+DFFF <!-- surrogates not allowed
+ -->, U+FDD0 to U+FDDF, and characters U+FFFE, U+FFFF, U+1FFFE,
+ U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE, U+4FFFF,
+ U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE,
+ U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF,
+ U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
+ U+FFFFF, U+10FFFE, and U+10FFFF are <span title="parse error">parse
+ errors</span>. (These are all control characters or permanently
+ undefined Unicode characters.)</p>
+
<p>U+000D CARRIAGE RETURN (CR) characters, and U+000A LINE FEED (LF)
characters, are treated specially. Any CR characters that are
followed by LF characters must be removed, and any CR characters not
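
The paragraph added in this revision amounts to a membership test on code points. A minimal sketch of that test in Python follows, with the ranges transcribed from the diff as written. The names PARSE_ERROR_RANGES, is_parse_error_code_point and find_parse_errors are hypothetical and are not part of the specification. The sketch only reports offending code points. It does not cover U+0000 or the CR/LF handling described in the neighbouring paragraphs.

```python
# Inclusive code point ranges listed in the paragraph added by r1263.
# U+0009 to U+000D are absent from the list and so are not flagged here.
PARSE_ERROR_RANGES = [
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xD800, 0xDFFF),  # surrogates
    (0xFDD0, 0xFDDF),
]


def is_parse_error_code_point(cp):
    """Return True if cp is one of the code points listed in r1263."""
    if any(lo <= cp <= hi for lo, hi in PARSE_ERROR_RANGES):
        return True
    # U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ..., U+10FFFE, U+10FFFF:
    # the last two code points of each of the 17 planes.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFF) >= 0xFFFE


def find_parse_errors(text):
    """Return (index, 'U+XXXX') pairs for each listed character in text."""
    return [
        (i, "U+%04X" % ord(ch))
        for i, ch in enumerate(text)
        if is_parse_error_code_point(ord(ch))
    ]
```

Calling find_parse_errors on a decoded input stream yields the position and code point of each character that the new paragraph classifies as a parse error. What a tokeniser does after reporting the error is outside the scope of this sketch.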