[html5] r4933 - [cgiowt] (2) Make &#x0D; map to U+000D and not U+000A. This has ramifications thr [...]

whatwg at whatwg.org whatwg at whatwg.org
Wed Mar 31 18:21:33 PDT 2010


Author: ianh
Date: 2010-03-31 18:21:32 -0700 (Wed, 31 Mar 2010)
New Revision: 4933

Modified:
   complete.html
   index
   source
Log:
[cgiowt] (2) Make &#x0D; map to U+000D and not U+000A. This has ramifications throughout the parser.
Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=9144
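
For readers skimming the diff below, the two kinds of change can be sketched roughly as follows. This is an illustration only, not code from the specification: the names OVERRIDES, resolve_numeric_reference, PARSER_WHITESPACE and is_parser_whitespace are hypothetical, the table holds only the rows visible in this diff, and error handling for out-of-range or surrogate code points is left out.

```python
# Hypothetical sketch; only the override rows shown in this diff are listed.
OVERRIDES = {
    0x00: 0xFFFD,  # REPLACEMENT CHARACTER
    0x0D: 0x000D,  # CARRIAGE RETURN (CR); mapped to U+000A before r4933
    0x80: 0x20AC,  # EURO SIGN
    0x81: 0x0081,  # <control>
    0x82: 0x201A,  # SINGLE LOW-9 QUOTATION MARK
}

# Characters the tree-construction rules in this diff treat as whitespace.
# U+000D was commented out of these lists before r4933 and is now included.
PARSER_WHITESPACE = frozenset("\t\n\f\r ")


def resolve_numeric_reference(number: int) -> str:
    """Return the character a numeric reference resolves to in this sketch.

    Numbers outside the Unicode range are not handled here (chr() raises).
    """
    return chr(OVERRIDES.get(number, number))


def is_parser_whitespace(ch: str) -> bool:
    """True if a character token would match the whitespace rules above."""
    return ch in PARSER_WHITESPACE
```

Under these assumptions, resolve_numeric_reference(0x0D) yields a CR character, and is_parser_whitespace then accepts that CR, which is why the insertion-mode lists had to change along with the table.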

Modified: complete.html
===================================================================
--- complete.html	2010-04-01 01:00:38 UTC (rev 4932)
+++ complete.html	2010-04-01 01:21:32 UTC (rev 4933)
@@ -73323,7 +73323,12 @@
   LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
   U+000A LINE FEED (LF) characters in that order.</p>
 
+  <p>Where <a href=#syntax-charref title=syntax-charref>character references</a>
+  are allowed, a character reference of a U+000A LINE FEED (LF)
+  character (but not a U+000D CARRIAGE RETURN (CR) character) also
+  represents a <a href=#syntax-newlines title=syntax-newlines>newline</a>.</p>
 
+
   <h4 id=character-references><span class=secno>12.1.4 </span>Character references</h4>
 
   <p>In certain cases described in other sections, <a href=#syntax-text title=syntax-text>text</a> may be mixed with <dfn id=syntax-charref title=syntax-charref>character references</dfn>. These can be used
@@ -73367,9 +73372,9 @@
    (;).</dd>
 
   </dl><p>The numeric character reference forms described above are allowed
-  to reference any Unicode code point other than U+0000, permanently
-  undefined Unicode characters (noncharacters), and control characters
-  other than <a href=#space-character title="space character">space
+  to reference any Unicode code point other than U+0000, U+000D,
+  permanently undefined Unicode characters (noncharacters), and
+  control characters other than <a href=#space-character title="space character">space
   characters</a>.</p>
 
   <p>An <dfn id=syntax-ambiguous-ampersand title=syntax-ambiguous-ampersand>ambiguous
@@ -76700,7 +76705,7 @@
 
     <table><thead><tr><th>Number <th colspan=2>Unicode character
      <tbody><tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
-      <tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+      <tr><td>0x0D <td>U+000D <td>CARRIAGE RETURN (CR)
       <tr><td>0x80 <td>U+20AC <td>EURO SIGN (€)
       <tr><td>0x81 <td>U+0081 <td><control>
       <tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK (‚)
@@ -77125,7 +77130,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -77331,7 +77336,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -77403,7 +77408,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p> <!-- :-( -->
    </dd>
@@ -77469,7 +77474,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -77654,7 +77659,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A comment token</dt>
    <dt>A start tag whose tag name is one of: "link", "meta", "noframes", "style"</dt>
    <dd>
@@ -77691,7 +77696,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -77789,8 +77794,8 @@
     character</a> into the <a href=#current-node>current node</a>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
     flag</a> to "not ok".</p>
 
    </dd>
@@ -77986,6 +77991,9 @@
     one. (Newlines at the start of <code><a href=#the-pre-element>pre</a></code> blocks are
     ignored as an authoring convenience.)</p>
 
+    <!-- <pre>[CR]X will eat the [CR], <pre>&#x10;X will eat the
+    &#x10;, but <pre>&#x13;X will not eat the &#x13;. -->
+
     <p>Set the <a href=#frameset-ok-flag>frameset-ok flag</a> to "not ok".</p>
 
    </dd>
@@ -78722,6 +78730,8 @@
      token, then ignore that token and move on to the next
      one. (Newlines at the start of <code><a href=#the-textarea-element>textarea</a></code> elements are
      ignored as an authoring convenience.)</li>
+     
+     <!-- see comment in <pre> start tag bit -->
 
      <li><p>Switch the tokenizer to the <a href=#rcdata-state>RCDATA
      state</a>.</li>
@@ -79349,7 +79359,7 @@
     <p>If any of the tokens in the <var><a href=#pending-table-character-tokens>pending table character
     tokens</a></var> list are character tokens that are not one of U+0009
     CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
-    (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE, then
+    (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
     reprocess those character tokens using the rules given in the
     "anything else" entry in the <a href=#parsing-main-intable title="insertion mode: in
     table">in table</a>" insertion mode.</p>
@@ -79428,7 +79438,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -79974,8 +79984,8 @@
     character</a> into the <a href=#current-node>current node</a>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
     flag</a> to "not ok".</p>
 
    </dd>
@@ -80195,7 +80205,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion
     mode</a>.</p>
@@ -80253,7 +80263,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -80347,7 +80357,7 @@
   <!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -80408,7 +80418,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion
@@ -80442,7 +80452,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion

Modified: index
===================================================================
--- index	2010-04-01 01:00:38 UTC (rev 4932)
+++ index	2010-04-01 01:21:32 UTC (rev 4933)
@@ -66595,7 +66595,12 @@
   LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
   U+000A LINE FEED (LF) characters in that order.</p>
 
+  <p>Where <a href=#syntax-charref title=syntax-charref>character references</a>
+  are allowed, a character reference of a U+000A LINE FEED (LF)
+  character (but not a U+000D CARRIAGE RETURN (CR) character) also
+  represents a <a href=#syntax-newlines title=syntax-newlines>newline</a>.</p>
 
+
   <h4 id=character-references><span class=secno>10.1.4 </span>Character references</h4>
 
   <p>In certain cases described in other sections, <a href=#syntax-text title=syntax-text>text</a> may be mixed with <dfn id=syntax-charref title=syntax-charref>character references</dfn>. These can be used
@@ -66639,9 +66644,9 @@
    (;).</dd>
 
   </dl><p>The numeric character reference forms described above are allowed
-  to reference any Unicode code point other than U+0000, permanently
-  undefined Unicode characters (noncharacters), and control characters
-  other than <a href=#space-character title="space character">space
+  to reference any Unicode code point other than U+0000, U+000D,
+  permanently undefined Unicode characters (noncharacters), and
+  control characters other than <a href=#space-character title="space character">space
   characters</a>.</p>
 
   <p>An <dfn id=syntax-ambiguous-ampersand title=syntax-ambiguous-ampersand>ambiguous
@@ -69972,7 +69977,7 @@
 
     <table><thead><tr><th>Number <th colspan=2>Unicode character
      <tbody><tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
-      <tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+      <tr><td>0x0D <td>U+000D <td>CARRIAGE RETURN (CR)
       <tr><td>0x80 <td>U+20AC <td>EURO SIGN (€)
       <tr><td>0x81 <td>U+0081 <td><control>
       <tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK (‚)
@@ -70397,7 +70402,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -70603,7 +70608,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -70675,7 +70680,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p> <!-- :-( -->
    </dd>
@@ -70741,7 +70746,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -70926,7 +70931,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A comment token</dt>
    <dt>A start tag whose tag name is one of: "link", "meta", "noframes", "style"</dt>
    <dd>
@@ -70963,7 +70968,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -71061,8 +71066,8 @@
     character</a> into the <a href=#current-node>current node</a>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
     flag</a> to "not ok".</p>
 
    </dd>
@@ -71258,6 +71263,9 @@
     one. (Newlines at the start of <code><a href=#the-pre-element>pre</a></code> blocks are
     ignored as an authoring convenience.)</p>
 
+    <!-- <pre>[CR]X will eat the [CR], <pre>&#x10;X will eat the
+    &#x10;, but <pre>&#x13;X will not eat the &#x13;. -->
+
     <p>Set the <a href=#frameset-ok-flag>frameset-ok flag</a> to "not ok".</p>
 
    </dd>
@@ -71994,6 +72002,8 @@
      token, then ignore that token and move on to the next
      one. (Newlines at the start of <code><a href=#the-textarea-element>textarea</a></code> elements are
      ignored as an authoring convenience.)</li>
+     
+     <!-- see comment in <pre> start tag bit -->
 
      <li><p>Switch the tokenizer to the <a href=#rcdata-state>RCDATA
      state</a>.</li>
@@ -72621,7 +72631,7 @@
     <p>If any of the tokens in the <var><a href=#pending-table-character-tokens>pending table character
     tokens</a></var> list are character tokens that are not one of U+0009
     CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
-    (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE, then
+    (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
     reprocess those character tokens using the rules given in the
     "anything else" entry in the <a href=#parsing-main-intable title="insertion mode: in
     table">in table</a>" insertion mode.</p>
@@ -72700,7 +72710,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -73246,8 +73256,8 @@
     character</a> into the <a href=#current-node>current node</a>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <a href=#frameset-ok-flag>frameset-ok
     flag</a> to "not ok".</p>
 
    </dd>
@@ -73467,7 +73477,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion
     mode</a>.</p>
@@ -73525,7 +73535,7 @@
 
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -73619,7 +73629,7 @@
   <!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
   <dl class=switch><dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><a href=#insert-a-character title="insert a character">Insert the character</a> into
     the <a href=#current-node>current node</a>.</p>
@@ -73680,7 +73690,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion
@@ -73714,7 +73724,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <a href=#using-the-rules-for>using the rules for</a> the "<a href=#parsing-main-inbody title="insertion mode: in body">in body</a>" <a href=#insertion-mode>insertion

Modified: source
===================================================================
--- source	2010-04-01 01:00:38 UTC (rev 4932)
+++ source	2010-04-01 01:21:32 UTC (rev 4933)
@@ -83536,7 +83536,12 @@
   LINE FEED (LF) characters, or pairs of U+000D CARRIAGE RETURN (CR),
   U+000A LINE FEED (LF) characters in that order.</p>
 
+  <p>Where <span title="syntax-charref">character references</span>
+  are allowed, a character reference of a U+000A LINE FEED (LF)
+  character (but not a U+000D CARRIAGE RETURN (CR) character) also
+  represents a <span title="syntax-newlines">newline</span>.</p>
 
+
   <h4>Character references</h4>
 
   <p>In certain cases described in other sections, <span
@@ -83586,9 +83591,9 @@
   </dl>
 
   <p>The numeric character reference forms described above are allowed
-  to reference any Unicode code point other than U+0000, permanently
-  undefined Unicode characters (noncharacters), and control characters
-  other than <span title="space character">space
+  to reference any Unicode code point other than U+0000, U+000D,
+  permanently undefined Unicode characters (noncharacters), and
+  control characters other than <span title="space character">space
   characters</span>.</p>
 
   <p>An <dfn title="syntax-ambiguous-ampersand">ambiguous
@@ -87470,7 +87475,7 @@
       <tr><th>Number <th colspan=2>Unicode character
      <tbody>
       <tr><td>0x00 <td>U+FFFD <td>REPLACEMENT CHARACTER
-      <tr><td>0x0D <td>U+000A <td>LINE FEED (LF)
+      <tr><td>0x0D <td>U+000D <td>CARRIAGE RETURN (CR)
       <tr><td>0x80 <td>U+20AC <td>EURO SIGN (&#x20AC;)
       <tr><td>0x81 <td>U+0081 <td><control>
       <tr><td>0x82 <td>U+201A <td>SINGLE LOW-9 QUOTATION MARK (&#x201A;)
@@ -87945,7 +87950,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -88176,7 +88181,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p>
    </dd>
@@ -88258,7 +88263,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Ignore the token.</p> <!-- :-( -->
    </dd>
@@ -88331,7 +88336,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><span title="insert a character">Insert the character</span> into
     the <span>current node</span>.</p>
@@ -88536,7 +88541,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A comment token</dt>
    <dt>A start tag whose tag name is one of: "link", "meta", "noframes", "style"</dt>
    <dd>
@@ -88579,7 +88584,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><span title="insert a character">Insert the character</span> into
     the <span>current node</span>.</p>
@@ -88688,8 +88693,8 @@
     character</span> into the <span>current node</span>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <span>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <span>frameset-ok
     flag</span> to "not ok".</p>
 
    </dd>
@@ -88893,6 +88898,9 @@
     one. (Newlines at the start of <code>pre</code> blocks are
     ignored as an authoring convenience.)</p>
 
+    <!-- <pre>[CR]X will eat the [CR], <pre>&#x10;X will eat the
+    &#x10;, but <pre>&#x13;X will not eat the &#x13;. -->
+
     <p>Set the <span>frameset-ok flag</span> to "not ok".</p>
 
    </dd>
@@ -89696,6 +89704,8 @@
      token, then ignore that token and move on to the next
      one. (Newlines at the start of <code>textarea</code> elements are
      ignored as an authoring convenience.)</p></li>
+     
+     <!-- see comment in <pre> start tag bit -->
 
      <li><p>Switch the tokenizer to the <span>RCDATA
      state</span>.</p></li>
@@ -90377,7 +90387,7 @@
     <p>If any of the tokens in the <var>pending table character
     tokens</var> list are character tokens that are not one of U+0009
     CHARACTER TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED
-    (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE, then
+    (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE, then
     reprocess those character tokens using the rules given in the
     "anything else" entry in the <span title="insertion mode: in
     table">in table</span>" insertion mode.</p>
@@ -90469,7 +90479,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><span title="insert a character">Insert the character</span> into
     the <span>current node</span>.</p>
@@ -91064,8 +91074,8 @@
     character</span> into the <span>current node</span>.</p>
 
     <p>If the token is not one of U+0009 CHARACTER TABULATION, U+000A
-    LINE FEED (LF), U+000C FORM FEED (FF), <!--U+000D CARRIAGE RETURN
-    (CR),--> or U+0020 SPACE, then set the <span>frameset-ok
+    LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN
+    (CR), or U+0020 SPACE, then set the <span>frameset-ok
     flag</span> to "not ok".</p>
 
    </dd>
@@ -91304,7 +91314,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p>Process the token <span>using the rules for</span> the "<span
     title="insertion mode: in body">in body</span>" <span>insertion
@@ -91370,7 +91380,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><span title="insert a character">Insert the character</span> into
     the <span>current node</span>.</p>
@@ -91472,7 +91482,7 @@
 
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dd>
     <p><span title="insert a character">Insert the character</span> into
     the <span>current node</span>.</p>
@@ -91541,7 +91551,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <span>using the rules for</span> the "<span
@@ -91581,7 +91591,7 @@
    <dt>A DOCTYPE token</dt>
    <dt>A character token that is one of U+0009 CHARACTER
    TABULATION, U+000A LINE FEED (LF), U+000C FORM FEED (FF),
-   <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
+   U+000D CARRIAGE RETURN (CR), or U+0020 SPACE</dt>
    <dt>A start tag whose tag name is "html"</dt>
    <dd>
     <p>Process the token <span>using the rules for</span> the "<span
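
The comment added to the <pre> start tag rule describes which leading characters are dropped. A minimal sketch of that behaviour follows, assuming the references in the comment are meant to denote LF and CR (written here as &#10; and &#13;), and assuming the input stream has already had literal CR and CRLF normalized to LF before tokenization. The function names are hypothetical, and only decimal references are decoded.

```python
import re

_DECIMAL_REF = re.compile(r"&#([0-9]+);")


def preprocess(raw: str) -> str:
    """Normalize literal CRLF pairs and lone CRs to LF (input stream step)."""
    return raw.replace("\r\n", "\n").replace("\r", "\n")


def decode_decimal_refs(text: str) -> str:
    """Decode decimal character references; 13 stays U+000D as of r4933."""
    return _DECIMAL_REF.sub(lambda m: chr(int(m.group(1))), text)


def pre_content(raw_after_start_tag: str) -> str:
    """Return the text a <pre> would hold, dropping one leading LF if present."""
    text = decode_decimal_refs(preprocess(raw_after_start_tag))
    # Only an LF character token is ignored; a CR produced by a
    # character reference is kept.
    return text[1:] if text.startswith("\n") else text


print(repr(pre_content("\rX")))      # 'X'   (literal CR becomes LF, then eaten)
print(repr(pre_content("&#10;X")))   # 'X'   (reference to LF is eaten)
print(repr(pre_content("&#13;X")))   # '\rX' (reference to CR is kept)
```

The same reasoning would apply to the textarea rule, which points back to the <pre> comment.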



