[html5] r945 - /

Thu Jun 21 19:22:48 PDT 2007

Author: ianh
Date: 2007-06-21 19:20:36 -0700 (Thu, 21 Jun 2007)
New Revision: 945

Modified:
   index
   source
Log:
[a] (0) allow arbitrary characters to be in attribute names; ban U+0000; reflect the recent rule changes for entities in the writing section.

Modified: index
===================================================================

--- index	2007-06-22 02:03:13 UTC (rev 944)
+++ index	2007-06-22 02:20:36 UTC (rev 945)
@@ -32123,6 +32123,8 @@
    title=attr-meta-charset>character encoding declarations</a> are to be
    serialised, as discussed in the section on that topic.
 
+  <p>The U+0000 NULL character must not appear anywhere in a document.
+
   <p class=note>Space characters before the root <code><a
    href="#html">html</a></code> element will be dropped when the document is
    parsed; space characters <em>after</em> the root <code><a
@@ -32374,13 +32376,16 @@
    element are expressed inside the element's start tag.
 
   <p>Attributes have a name and a value. <dfn id=attribute
-   title=syntax-attribute-name>Attribute names</dfn> use characters in the
-   range U+0061 LATIN SMALL LETTER A .. U+007A LATIN SMALL LETTER Z, or, in
-   uppercase, U+0041 LATIN CAPITAL LETTER A .. U+005A LATIN CAPITAL LETTER Z,
-   and U+002D HYPHEN-MINUS (<code>-</code>). In the HTML syntax, attribute
-   names may be written with any mix of lower- and uppercase letters that,
-   when converted to all-lowercase, matches the attribute's name; attribute
-   names are case-insensitive.
+   title=syntax-attribute-name>Attribute names</dfn> must consist of one
+   character other than the <a href="#space" title="space character">space
+   characters</a>, U+003E GREATER-THAN SIGN (>), and U+002F SOLIDUS (/),
+   followed by zero or more characters other than the <a href="#space"
+   title="space character">space characters</a>, U+003E GREATER-THAN SIGN
+   (>), U+002F SOLIDUS (/), and U+003D EQUALS SIGN (=). In the HTML
+   syntax, attribute names may be written with any mix of lower- and
+   uppercase letters that, when converted to
+   all-lowercase<!-- ASCII case-insensitive -->, matches the attribute's
+   name; attribute names are case-insensitive.
 
   <p><dfn id=attribute0 title=syntax-attribute-value>Attribute values</dfn>
    are a mixture of <a href="#text1" title=syntax-text>text</a> and <a
@@ -32752,9 +32757,10 @@
    <dd>The ampersand must be followed by a U+0023 NUMBER SIGN
     (<code>#</code>) character, followed by one or more digits in the range
     U+0030 DIGIT ZERO .. U+0039 DIGIT NINE, representing a base-ten integer
-    that itself is a valid Unicode code point that is neither U+0000 nor a
-    character in the range U+0080 .. U+009F. The digits must then be followed
-    by a U+003B SEMICOLON character (<code title="">;</code>).
+    that itself is a valid Unicode code point that is not U+0000, U+000D, in
+    the range U+0080 .. U+009F, or in the range 0xD800 .. 0xDFFF
+    (surrogates). The digits must then be followed by a U+003B SEMICOLON
+    character (<code title="">;</code>).
 
    <dt>Hexadecimal numeric entities
 
@@ -32765,9 +32771,10 @@
     ZERO .. U+0039 DIGIT NINE, U+0061 LATIN SMALL LETTER A .. U+0066 LATIN
     SMALL LETTER F, and U+0041 LATIN CAPITAL LETTER A .. U+0046 LATIN CAPITAL
     LETTER F, representing a base-sixteen integer that itself is a valid
-    Unicode code point that is neither U+0000 nor a character in the range
-    U+0080 .. U+009F. The digits must then be followed by a U+003B SEMICOLON
-    character (<code title="">;</code>).
+    Unicode code point that is not U+0000, U+000D, in the range U+0080 ..
+    U+009F, or in the range 0xD800 .. 0xDFFF (surrogates). The digits must
+    then be followed by a U+003B SEMICOLON character (<code
+    title="">;</code>).
   </dl>
 
   <p>An <dfn id=ambiguous title=syntax-ambiguous-ampersand>ambiguous

Modified: source
===================================================================
--- source	2007-06-22 02:03:13 UTC (rev 944)
+++ source	2007-06-22 02:20:36 UTC (rev 945)
@@ -29618,6 +29618,9 @@
   title="attr-meta-charset">character encoding declarations</span> are
   to be serialised, as discussed in the section on that topic.</p>
 
+  <p>The U+0000 NULL character must not appear anywhere in a
+  document.</p>
+
   <p class="note">Space characters before the root <code>html</code>
   element will be dropped when the document is parsed; space
   characters <em>after</em> the root <code>html</code> element will be
@@ -29862,14 +29865,16 @@
   are expressed inside the element's start tag.</p>
 
   <p>Attributes have a name and a value. <dfn
-  title="syntax-attribute-name">Attribute names</dfn> use characters
-  in the range U+0061 LATIN SMALL LETTER A .. U+007A LATIN SMALL
-  LETTER Z, or, in uppercase, U+0041 LATIN CAPITAL LETTER A .. U+005A
-  LATIN CAPITAL LETTER Z, and U+002D HYPHEN-MINUS (<code>-</code>). In
-  the HTML syntax, attribute names may be written with any mix of
-  lower- and uppercase letters that, when converted to all-lowercase,
-  matches the attribute's name; attribute names are
-  case-insensitive.</p>
+  title="syntax-attribute-name">Attribute names</dfn> must consist of
+  one character other than the <span title="space character">space
+  characters</span>, U+003E GREATER-THAN SIGN (>), and U+002F
+  SOLIDUS (/), followed by zero or more characters other than the
+  <span title="space character">space characters</span>, U+003E
+  GREATER-THAN SIGN (>), U+002F SOLIDUS (/), and U+003D EQUALS SIGN
+  (=). In the HTML syntax, attribute names may be written with any mix
+  of lower- and uppercase letters that, when converted to
+  all-lowercase<!-- ASCII case-insensitive -->, matches the
+  attribute's name; attribute names are case-insensitive.</p>
 
   <p><dfn title="syntax-attribute-value">Attribute values</dfn> are a
   mixture of <span title="syntax-text">text</span> and <span
@@ -30249,9 +30254,9 @@
    (<code>#</code>) character, followed by one or more digits in the
    range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE, representing a
    base-ten integer that itself is a valid Unicode code point that is
-   neither U+0000 nor a character in the range U+0080 .. U+009F. The
-   digits must then be followed by a U+003B SEMICOLON character (<code
-   title="">;</code>).</dd>
+   not U+0000, U+000D, in the range U+0080 .. U+009F, or in the range
+   0xD800 .. 0xDFFF (surrogates). The digits must then be followed by
+   a U+003B SEMICOLON character (<code title="">;</code>).</dd>
 
 
    <dt>Hexadecimal numeric entities</dt>
@@ -30264,9 +30269,10 @@
    LETTER A .. U+0066 LATIN SMALL LETTER F, and U+0041 LATIN CAPITAL
    LETTER A .. U+0046 LATIN CAPITAL LETTER F, representing a
    base-sixteen integer that itself is a valid Unicode code point that
-   is neither U+0000 nor a character in the range U+0080
-   .. U+009F. The digits must then be followed by a U+003B SEMICOLON
-   character (<code title="">;</code>).</dd>
+   is not U+0000, U+000D, in the range U+0080 .. U+009F, or in the
+   range 0xD800 .. 0xDFFF (surrogates). The digits must then be
+   followed by a U+003B SEMICOLON character (<code
+   title="">;</code>).</dd>
 
   </dl>
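
Illustration (not part of the commit): a minimal Python sketch of the three writing rules this revision touches, for readers who want to check the new wording against concrete inputs. Everything named here is hypothetical and is not taken from the spec. The set of "space characters" is an assumption, since the diff only links to the definition. "Valid Unicode code point" is read as 0 .. 0x10FFFF, and the x/X prefix for hexadecimal entities comes from spec text outside the hunks shown.

```python
import re

# ASSUMPTION: the diff links to the "space character" definition without
# listing its members; tab, LF, FF, CR and space are used here.
SPACE_CHARACTERS = set("\t\n\x0c\r ")

# Banned in the first position of an attribute name (r945 wording).
FIRST_BANNED = SPACE_CHARACTERS | {">", "/"}
# Banned in every later position: the same set plus "=".
REST_BANNED = FIRST_BANNED | {"="}


def is_valid_attribute_name(name: str) -> bool:
    """Check a name against the r945 attribute-name rule."""
    # Empty names fail ("must consist of one character ..."), and U+0000
    # must not appear anywhere in a document.
    if not name or "\x00" in name:
        return False
    if name[0] in FIRST_BANNED:
        return False
    return all(ch not in REST_BANNED for ch in name[1:])


def is_allowed_entity_code_point(cp: int) -> bool:
    """Check a numeric entity's value against the r945 exclusions."""
    # ASSUMPTION: "valid Unicode code point" means 0 .. 0x10FFFF.
    if cp < 0 or cp > 0x10FFFF:
        return False
    if cp in (0x0000, 0x000D):      # U+0000 and U+000D
        return False
    if 0x0080 <= cp <= 0x009F:      # U+0080 .. U+009F
        return False
    if 0xD800 <= cp <= 0xDFFF:      # surrogates
        return False
    return True


# "&#" + decimal digits + ";"  or  "&#" + x/X + hex digits + ";".
# ASSUMPTION: the x/X prefix is not visible in the hunks above.
_NUMERIC_ENTITY = re.compile(r"&#(?:([0-9]+)|[xX]([0-9a-fA-F]+));")


def is_valid_numeric_entity(text: str) -> bool:
    """True if text is exactly one well-formed, allowed numeric entity."""
    match = _NUMERIC_ENTITY.fullmatch(text)
    if match is None:
        return False
    decimal, hexadecimal = match.groups()
    if decimal is not None:
        code_point = int(decimal, 10)
    else:
        code_point = int(hexadecimal, 16)
    return is_allowed_entity_code_point(code_point)
```

Under this reading, "data-x" and "=x" are accepted as attribute names (the new text bans "=" only after the first character), while "a=b" and "a b" are rejected. "&#13;" and "&#xD800;" are rejected as entities, and "&#xA0;" is accepted.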