[html5] r946 - /

whatwg at whatwg.org
Thu Jun 21 21:11:21 PDT 2007


Author: ianh
Date: 2007-06-21 21:08:52 -0700 (Thu, 21 Jun 2007)
New Revision: 946

Modified:
   index
   source
Log:
[ct] (2) Update to the rules for handling of entities (require semicolons, and some changes for parsing entities without semicolons when in attributes).

Modified: index
===================================================================
--- index	2007-06-22 02:20:36 UTC (rev 945)
+++ index	2007-06-22 04:08:52 UTC (rev 946)
@@ -32748,9 +32748,9 @@
    <dt>Named entities
 
    <dd>The ampersand must be followed by one of the names given in the <a
-    href="#entities0">entities</a> section, using the same case. <!--Finally,
-   after the name, the entity must be terminated by a U+003B SEMICOLON
-   character (<code title="">;</code>).-->
+    href="#entities0">entities</a> section, using the same case. The name
+    must be one that is terminated by a U+003B SEMICOLON (<code
+    title="">;</code>) character.
 
    <dt>Decimal numeric entities
 
@@ -35060,11 +35060,21 @@
 
     <p>If no match can be made, then this is a <a href="#parse">parse
      error</a>. No characters are consumed, and nothing is returned.</p>
-    <!--<p>If the last character matched is not a U+003B SEMICOLON,
-    there is a <span>parse error</span>.</p>-->
-    
-    <p>Return a character token for the character corresponding to the entity
-     name (as given by the second column of the <a
+
+    <p>If the last character matched is not a U+003B SEMICOLON (<code
+     title="">;</code>), there is a <a href="#parse">parse error</a>.</p>
+
+    <p>If the entity is being consumed <a href="#entity0" title="entity in
+     attribute value state">as part of an attribute</a>, and the last
+     character matched is not a U+003B SEMICOLON (<code title="">;</code>),
+     and the next character is in the range U+0030 DIGIT ZERO to U+0039 DIGIT
+     NINE, U+0041 LATIN CAPITAL LETTER A to U+005A LATIN CAPITAL LETTER Z, or
+     U+0061 LATIN SMALL LETTER A to U+007A LATIN SMALL LETTER Z, then, for
+     historical reasons, all the characters that were matched after the
+     U+0026 AMPERSAND (&) must be unconsumed, and nothing is returned.</p>
+
+    <p>Otherwise, return a character token for the character corresponding to
+     the entity name (as given by the second column of the <a
      href="#entities0">entities</a> table).</p>
 
     <div class=example>

Modified: source
===================================================================
--- source	2007-06-22 02:20:36 UTC (rev 945)
+++ source	2007-06-22 04:08:52 UTC (rev 946)
@@ -30243,9 +30243,9 @@
    <dt>Named entities</dt>
 
    <dd>The ampersand must be followed by one of the names given in the
-   <span>entities</span> section, using the same case. <!--Finally,
-   after the name, the entity must be terminated by a U+003B SEMICOLON
-   character (<code title="">;</code>).--></dd>
+   <span>entities</span> section, using the same case. The name must
+   be one that is terminated by a U+003B SEMICOLON (<code
+   title="">;</code>) character.</dd>
 
 
    <dt>Decimal numeric entities</dt>
@@ -32376,13 +32376,23 @@
     error</span>. No characters are consumed, and nothing is
     returned.</p>
 
-    <!--<p>If the last character matched is not a U+003B SEMICOLON,
-    there is a <span>parse error</span>.</p>-->
+    <p>If the last character matched is not a U+003B SEMICOLON (<code
+    title="">;</code>), there is a <span>parse error</span>.</p>
 
-    <p>Return a character token for the character corresponding to the
-    entity name (as given by the second column of the
-    <span>entities</span> table).</p>
+    <p>If the entity is being consumed <span title="entity in
+    attribute value state">as part of an attribute</span>, and the
+    last character matched is not a U+003B SEMICOLON (<code
+    title="">;</code>), and the next character is in the range U+0030
+    DIGIT ZERO to U+0039 DIGIT NINE, U+0041 LATIN CAPITAL LETTER A to
+    U+005A LATIN CAPITAL LETTER Z, or U+0061 LATIN SMALL LETTER A to
+    U+007A LATIN SMALL LETTER Z, then, for historical reasons, all the
+    characters that were matched after the U+0026 AMPERSAND (&)
+    must be unconsumed, and nothing is returned.</p>
 
+    <p>Otherwise, return a character token for the character
+    corresponding to the entity name (as given by the second column of
+    the <span>entities</span> table).</p>
+
     <div class="example">
 
      <p>If the markup contains <code title="">I'm &notit; I tell
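
The revised steps for consuming a named entity, as changed in the diff above, can be sketched roughly as follows in Python. This is an illustrative sketch under stated assumptions, not the specification's own algorithm text: the function name consume_named_entity, the in_attribute flag, the (character, position, parse_error) return shape, and the four-name ENTITIES excerpt are all hypothetical and chosen only for the example.

```python
import string

# Hypothetical excerpt of the entities table (assumption: the real table is
# far larger). Names appear both with and without the trailing semicolon.
ENTITIES = {
    "amp;": "&", "amp": "&",
    "lt;": "<", "lt": "<",
    "not;": "\u00ac", "not": "\u00ac",
    "copy;": "\u00a9", "copy": "\u00a9",
}

# U+0030..U+0039, U+0041..U+005A, U+0061..U+007A
ASCII_ALNUM = set(string.digits + string.ascii_letters)


def consume_named_entity(text, pos, in_attribute=False):
    """Sketch of the named-entity steps described in r946.

    `pos` is the index of the first character after the ampersand.
    Returns (character or None, new position, parse_error).
    """
    # Find the longest table name that matches at `pos`.
    match = None
    for name in ENTITIES:
        if text.startswith(name, pos) and (match is None or len(name) > len(match)):
            match = name

    # No match: parse error, nothing consumed, nothing returned.
    if match is None:
        return None, pos, True

    end = pos + len(match)
    # A match that does not end in a semicolon is a parse error.
    parse_error = not match.endswith(";")

    # In an attribute value, a semicolon-less match followed by an ASCII
    # letter or digit is unconsumed and nothing is returned.
    if in_attribute and parse_error and end < len(text) and text[end] in ASCII_ALNUM:
        return None, pos, True

    # Otherwise return the character for the entity name.
    return ENTITIES[match], end, parse_error
```

With this sketch, the text following the ampersand in "I'm &notit; I tell" yields the NOT SIGN character in ordinary content (with a parse error), while the same text inside an attribute value is left unconsumed because the character after "not" is a letter.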



