[html5] r8182 - [c] (2) Disallow surrogates in the input stream; make the syntax section match t [...]
whatwg at whatwg.org
Fri Sep 13 14:27:14 PDT 2013
Author: ianh
Date: 2013-09-13 14:27:11 -0700 (Fri, 13 Sep 2013)
New Revision: 8182
Modified:
complete.html
index
source
Log:
[c] (2) Disallow surrogates in the input stream; make the syntax section match the parser for character references to surrogates; add a redundant paragraph regarding namespaces
Affected topics: HTML Syntax and Parsing
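Illustration (not part of the commit): the new input-stream paragraph makes any isolated surrogate a parse error and notes that such characters can only arrive via script APIs such as document.write(). A minimal sketch of how a script might check a string before writing it, assuming a hypothetical helper named findIsolatedSurrogates that is not defined anywhere in the spec:

```javascript
// Hypothetical helper, not defined by the spec: returns the indices of
// isolated (unpaired) surrogate code units in a JavaScript string.
function findIsolatedSurrogates(input) {
  const positions = [];
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    if (unit >= 0xD800 && unit <= 0xDBFF) {
      // High surrogate: well-formed only if a low surrogate follows.
      const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the second half of the well-formed pair
      } else {
        positions.push(i);
      }
    } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
      // Low surrogate that was not consumed as part of a pair.
      positions.push(i);
    }
  }
  return positions;
}
```

A caller could, for example, refuse to pass a string to document.write() when findIsolatedSurrogates(string).length is nonzero. The range checks follow the U+D800 to U+DFFF range named in the log message; the split into high (U+D800 to U+DBFF) and low (U+DC00 to U+DFFF) halves is the standard UTF-16 convention rather than something stated in this commit.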
Modified: complete.html
===================================================================
--- complete.html 2013-09-12 20:36:49 UTC (rev 8181)
+++ complete.html 2013-09-13 21:27:11 UTC (rev 8182)
@@ -256,7 +256,7 @@
<header class=head id=head><p><a href=http://www.whatwg.org/ class=logo><img width=101 src=/images/logo alt=WHATWG height=101></a></p>
<hgroup><h1 class=allcaps>HTML</h1>
- <h2 class="no-num no-toc">Living Standard — Last Updated 12 September 2013</h2>
+ <h2 class="no-num no-toc">Living Standard — Last Updated 13 September 2013</h2>
</hgroup><dl><dt><strong>Web developer edition:</strong></dt>
<dd><strong><a href=http://developers.whatwg.org/>http://developers.whatwg.org/</a></strong></dd>
<dt>Multiple-page version:</dt>
@@ -85313,7 +85313,7 @@
character (;).</dd>
</dl><p>The numeric character reference forms described above are allowed to reference any Unicode code
- point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), and
+ point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), surrogates (U+D800–U+DFFF), and
<a href=#control-characters>control characters</a> other than <a href=#space-character title="space character">space characters</a>.</p>
<p>An <dfn id=syntax-ambiguous-ampersand title=syntax-ambiguous-ampersand>ambiguous ampersand</dfn> is a U+0026 AMPERSAND
@@ -85416,6 +85416,13 @@
<p>For the purposes of conformance checkers, if a resource is determined to be in <a href=#syntax>the HTML
syntax</a>, then it is an <a href=#html-documents title="HTML documents">HTML document</a>.</p>
+ <p class=note>As stated <a href=#html-elements class=no-backref title="HTML elements">in the terminology
+ section</a>, references to <a href=#element-type title="element type">element types</a> that do not
+ explicitly specify a namespace always refer to elements in the <a href=#html-namespace-0>HTML namespace</a>. For
+ example, if the spec talks about "a <code><a href=#the-menuitem-element>menuitem</a></code> element", then that is an element with
+ the local name "<code title="">menuitem</code>", the namespace "<code title="">http://www.w3.org/1999/xhtml</code>", and the interface <code><a href=#htmlmenuitemelement>HTMLMenuItemElement</a></code>.
+ Where possible, references to such elements are hyperlinked to their definition.</p>
+
</div>
@@ -86410,6 +86417,10 @@
errors</a>. These are all <a href=#control-characters>control characters</a> or permanently
undefined Unicode characters (noncharacters).</p>
+ <p>Any <a href=#character>character</a> that is not a <a href=#unicode-character>Unicode character</a>, i.e. any isolated
+ surrogates, is a <a href=#parse-error>parse error</a>. (These can only find their way into the input stream
+ via script APIs such as <code title=dom-document-write><a href=#dom-document-write>document.write()</a></code>.)</p>
+
<p>U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
characters are treated specially. All CR characters must be
converted to LF characters, and any LF characters that immediately
Modified: index
===================================================================
--- index 2013-09-12 20:36:49 UTC (rev 8181)
+++ index 2013-09-13 21:27:11 UTC (rev 8182)
@@ -256,7 +256,7 @@
<header class=head id=head><p><a href=http://www.whatwg.org/ class=logo><img width=101 src=/images/logo alt=WHATWG height=101></a></p>
<hgroup><h1 class=allcaps>HTML</h1>
- <h2 class="no-num no-toc">Living Standard — Last Updated 12 September 2013</h2>
+ <h2 class="no-num no-toc">Living Standard — Last Updated 13 September 2013</h2>
</hgroup><dl><dt><strong>Web developer edition:</strong></dt>
<dd><strong><a href=http://developers.whatwg.org/>http://developers.whatwg.org/</a></strong></dd>
<dt>Multiple-page version:</dt>
@@ -85313,7 +85313,7 @@
character (;).</dd>
</dl><p>The numeric character reference forms described above are allowed to reference any Unicode code
- point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), and
+ point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), surrogates (U+D800–U+DFFF), and
<a href=#control-characters>control characters</a> other than <a href=#space-character title="space character">space characters</a>.</p>
<p>An <dfn id=syntax-ambiguous-ampersand title=syntax-ambiguous-ampersand>ambiguous ampersand</dfn> is a U+0026 AMPERSAND
@@ -85416,6 +85416,13 @@
<p>For the purposes of conformance checkers, if a resource is determined to be in <a href=#syntax>the HTML
syntax</a>, then it is an <a href=#html-documents title="HTML documents">HTML document</a>.</p>
+ <p class=note>As stated <a href=#html-elements class=no-backref title="HTML elements">in the terminology
+ section</a>, references to <a href=#element-type title="element type">element types</a> that do not
+ explicitly specify a namespace always refer to elements in the <a href=#html-namespace-0>HTML namespace</a>. For
+ example, if the spec talks about "a <code><a href=#the-menuitem-element>menuitem</a></code> element", then that is an element with
+ the local name "<code title="">menuitem</code>", the namespace "<code title="">http://www.w3.org/1999/xhtml</code>", and the interface <code><a href=#htmlmenuitemelement>HTMLMenuItemElement</a></code>.
+ Where possible, references to such elements are hyperlinked to their definition.</p>
+
</div>
@@ -86410,6 +86417,10 @@
errors</a>. These are all <a href=#control-characters>control characters</a> or permanently
undefined Unicode characters (noncharacters).</p>
+ <p>Any <a href=#character>character</a> that is not a <a href=#unicode-character>Unicode character</a>, i.e. any isolated
+ surrogates, is a <a href=#parse-error>parse error</a>. (These can only find their way into the input stream
+ via script APIs such as <code title=dom-document-write><a href=#dom-document-write>document.write()</a></code>.)</p>
+
<p>U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
characters are treated specially. All CR characters must be
converted to LF characters, and any LF characters that immediately
Modified: source
===================================================================
--- source 2013-09-12 20:36:49 UTC (rev 8181)
+++ source 2013-09-13 21:27:11 UTC (rev 8182)
@@ -95187,7 +95187,7 @@
</dl>
<p>The numeric character reference forms described above are allowed to reference any Unicode code
- point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), and
+ point other than U+0000, U+000D, permanently undefined Unicode characters (noncharacters), surrogates (U+D800–U+DFFF), and
<span>control characters</span> other than <span title="space character">space characters</span>.</p>
<p>An <dfn title="syntax-ambiguous-ampersand">ambiguous ampersand</dfn> is a U+0026 AMPERSAND
@@ -95296,6 +95296,14 @@
<p>For the purposes of conformance checkers, if a resource is determined to be in <span>the HTML
syntax</span>, then it is an <span title="HTML documents">HTML document</span>.</p>
+ <p class="note">As stated <span class="no-backref" title="HTML elements">in the terminology
+ section</span>, references to <span title="element type">element types</span> that do not
+ explicitly specify a namespace always refer to elements in the <span>HTML namespace</span>. For
+ example, if the spec talks about "a <code>menuitem</code> element", then that is an element with
+ the local name "<code title="">menuitem</code>", the namespace "<code
+ title="">http://www.w3.org/1999/xhtml</code>", and the interface <code>HTMLMenuItemElement</code>.
+ Where possible, references to such elements are hyperlinked to their definition.</p>
+
</div>
@@ -96436,6 +96444,10 @@
errors</span>. These are all <span>control characters</span> or permanently
undefined Unicode characters (noncharacters).</p>
+ <p>Any <span>character</span> that is not a <span>Unicode character</span>, i.e. any isolated
+ surrogates, is a <span>parse error</span>. (These can only find their way into the input stream
+ via script APIs such as <code title="dom-document-write">document.write()</code>.)</p>
+
<p>U+000D CARRIAGE RETURN (CR) characters and U+000A LINE FEED (LF)
characters are treated specially. All CR characters must be
converted to LF characters, and any LF characters that immediately