[html5] r960 - /

Sat Jun 23 02:53:30 PDT 2007

Author: ianh
Date: 2007-06-23 02:49:27 -0700 (Sat, 23 Jun 2007)
New Revision: 960

Modified:
   index
   source
Log:
[e] (1) Make a new section to put various charset support reqs together; fix a xref

Modified: index
===================================================================

--- index	2007-06-23 09:40:45 UTC (rev 959)
+++ index	2007-06-23 09:49:27 UTC (rev 960)
@@ -1475,10 +1475,13 @@
          <li><a href="#determining0"><span class=secno>8.2.2.1.
           </span>Determining the character encoding</a>
 
-         <li><a href="#preprocessing"><span class=secno>8.2.2.2.
+         <li><a href="#character0"><span class=secno>8.2.2.2.
+          </span>Character encoding requirements</a>
+
+         <li><a href="#preprocessing"><span class=secno>8.2.2.3.
           </span>Preprocessing the input stream</a>
 
-         <li><a href="#changing"><span class=secno>8.2.2.3. </span>Changing
+         <li><a href="#changing"><span class=secno>8.2.2.4. </span>Changing
           the encoding while parsing</a>
         </ul>
 
@@ -32363,14 +32366,14 @@
    described below.
 
   <p>RCDATA elements can have <a href="#text1" title=syntax-text>text</a> and
-   <a href="#character0" title=syntax-entities>character entity
+   <a href="#character1" title=syntax-entities>character entity
    references</a>, but the text must not contain an <a href="#ambiguous"
    title=syntax-ambiguous-ampersand>ambiguous ampersand</a>. There are also
    <a href="#cdata-rcdata-restrictions">further restrictions</a> described
    below.
 
   <p>Normal elements can have <a href="#text1" title=syntax-text>text</a>, <a
-   href="#character0" title=syntax-entities>character entity references</a>,
+   href="#character1" title=syntax-entities>character entity references</a>,
    other <a href="#elements2" title=syntax-elements>elements</a>, and <a
    href="#comments0" title=syntax-comments>comments</a>, but the text must
    not contain the character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an
@@ -32466,7 +32469,7 @@
 
   <p><dfn id=attribute0 title=syntax-attribute-value>Attribute values</dfn>
    are a mixture of <a href="#text1" title=syntax-text>text</a> and <a
-   href="#character0" title=syntax-entities>character entity references</a>,
+   href="#character1" title=syntax-entities>character entity references</a>,
    except with the additional restriction that the text cannot contain an <a
    href="#ambiguous" title=syntax-ambiguous-ampersand>ambiguous
    ampersand</a>.
@@ -32805,7 +32808,7 @@
 
   <p>An <dfn id=escaping title=syntax-escape>escaping text span</dfn> is a
    span of <a href="#text1" title=syntax-text>text</a> (in CDATA and RCDATA
-   elements) and <a href="#character0" title=syntax-entities>character entity
+   elements) and <a href="#character1" title=syntax-entities>character entity
    references</a> (in RCDATA elements) that starts with an <a
    href="#escaping0" title=syntax-escape-start>escaping text span start</a>
    that is not itself in an <a href="#escaping" title=syntax-escape>escaping
@@ -32854,7 +32857,7 @@
    references</h4>
 
   <p>In certain cases described in other sections, <a href="#text1"
-   title=syntax-text>text</a> may be mixed with <dfn id=character0
+   title=syntax-text>text</a> may be mixed with <dfn id=character1
    title=syntax-entities>character entity references</dfn>. These can be used
    to escape characters that couldn't otherwise legally be included in <a
    href="#text1" title=syntax-text>text</a>.
@@ -33466,6 +33469,9 @@
      may heuristically decide which to use as a default.
   </ol>
 
+  <h5 id=character0><span class=secno>8.2.2.2. </span>Character encoding
+   requirements</h5>
+
   <p>User agents must at a minimum support the UTF-8 and Windows-1252
    encodings, but may support more.
 
@@ -33477,13 +33483,6 @@
    all the IANA-registered aliases. <a
    href="#refsIANACHARSET">[IANACHARSET]</a>
 
-  <h5 id=preprocessing><span class=secno>8.2.2.2. </span>Preprocessing the
-   input stream</h5>
-
-  <p>Given an encoding, the bytes in the input stream must be converted to
-   Unicode characters for the tokeniser, as described by the rules for that
-   encoding.
-
   <p>When a user agent would otherwise use the ISO-8859-1 encoding, it must
    instead use the Windows-1252 encoding. User agents must not support the
    CESU-8, UTF-7, BOCU-1 and SCSU encodings. <a href="#refsCESU8">[CESU8]</a>
@@ -33493,6 +33492,13 @@
   <p>Support for UTF-32 is not recommended. This encoding is rarely used, and
    frequently misimplemented.
 
+  <h5 id=preprocessing><span class=secno>8.2.2.3. </span>Preprocessing the
+   input stream</h5>
+
+  <p>Given an encoding, the bytes in the input stream must be converted to
+   Unicode characters for the tokeniser, as described by the rules for that
+   encoding.
+
   <p>Bytes or sequences of bytes in the original byte stream that could not
    be converted to Unicode characters must be converted to U+FFFD REPLACEMENT
    CHARACTER code points.
@@ -33532,7 +33538,7 @@
    method) is consumed. Otherwise, the "EOF" character is not a real
    character in the stream, but rather the lack of any further characters.
 
-  <h5 id=changing><span class=secno>8.2.2.3. </span>Changing the encoding
+  <h5 id=changing><span class=secno>8.2.2.4. </span>Changing the encoding
    while parsing</h5>
 
   <p>When the parser requires the user agent to <dfn id=change>change the
@@ -36386,13 +36392,13 @@
         <p><a href="#insert" title="insert an html element">Insert an HTML
          element</a> for the token.</p>
 
-        <p>If the element has a <code title=attr-meta-charset><a
-         href="#charset0">charset</a></code> attribute, and its value is a
-         supported encoding, and the <a href="#confidence"
-         title=concept-encoding-confidence>confidence</a> is currently
-         <i>tentative</i>, then <a href="#change">change the encoding</a> to
-         the encoding given by the value of the <code
+        <p id=meta-charset-during-parse>If the element has a <code
          title=attr-meta-charset><a href="#charset0">charset</a></code>
+         attribute, and its value is a supported encoding, and the <a
+         href="#confidence" title=concept-encoding-confidence>confidence</a>
+         is currently <i>tentative</i>, then <a href="#change">change the
+         encoding</a> to the encoding given by the value of the <code
+         title=attr-meta-charset><a href="#charset0">charset</a></code>
          attribute.</p>
 
         <p>Otherwise, if the element has a <code title=attr-meta-charset><a

Modified: source
===================================================================
--- source	2007-06-23 09:40:45 UTC (rev 959)
+++ source	2007-06-23 09:49:27 UTC (rev 960)
@@ -30976,6 +30976,9 @@
 
   </ol>
 
+
+  <h5>Character encoding requirements</h5>
+
   <p>User agents must at a minimum support the UTF-8 and Windows-1252
   encodings, but may support more.</p>
 
@@ -30987,13 +30990,6 @@
   should support all the IANA-registered aliases. <a
   href="#refsIANACHARSET">[IANACHARSET]</a></p>
 
-
-  <h5>Preprocessing the input stream</h5>
-
-  <p>Given an encoding, the bytes in the input stream must be
-  converted to Unicode characters for the tokeniser, as described by
-  the rules for that encoding.</p>
-
   <p>When a user agent would otherwise use the ISO-8859-1 encoding, it
   must instead use the Windows-1252 encoding. User agents must not
   support the CESU-8, UTF-7, BOCU-1 and SCSU encodings. <a
@@ -31003,6 +30999,14 @@
   <p>Support for UTF-32 is not recommended. This encoding is rarely
   used, and frequently misimplemented.</p>
 
+
+
+  <h5>Preprocessing the input stream</h5>
+
+  <p>Given an encoding, the bytes in the input stream must be
+  converted to Unicode characters for the tokeniser, as described by
+  the rules for that encoding.</p>
+
   <p>Bytes or sequences of bytes in the original byte stream that
   could not be converted to Unicode characters must be converted to
   U+FFFD REPLACEMENT CHARACTER code points.</p>
@@ -33524,7 +33528,7 @@
         <p><span title="insert an html element">Insert an HTML
         element</span> for the token.</p>
 
-        <p>If the element has a <code
+        <p id="meta-charset-during-parse">If the element has a <code
         title="attr-meta-charset">charset</code> attribute, and its
         value is a supported encoding, and the <span
         title="concept-encoding-confidence">confidence</span> is
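
Note on the regrouped requirements: the paragraphs now collected under "Character encoding requirements", together with the U+FFFD rule left under "Preprocessing the input stream", can be illustrated with a rough Python sketch. It is not part of the revision and is not how any user agent implements this. The helper names `resolve_encoding` and `decode_input` are hypothetical, and Python's codec registry stands in for a user agent's own encoding table, so the set of "supported" encodings here is simply whatever Python ships.

```python
import codecs

# Encodings the text says user agents must not support.
FORBIDDEN = {"cesu-8", "utf-7", "bocu-1", "scsu"}


def resolve_encoding(label):
    """Map an encoding label to a codec name, or None if unusable.

    Hypothetical helper: uses Python's codec registry as the list of
    supported encodings, which is an assumption of this sketch.
    """
    name = label.strip().lower()
    if name in FORBIDDEN:
        return None
    try:
        canonical = codecs.lookup(name).name
    except LookupError:
        return None  # not a supported encoding in this sketch
    if canonical in FORBIDDEN:
        return None  # the label was an alias of a forbidden encoding
    if canonical == "iso8859-1":
        # "When a user agent would otherwise use the ISO-8859-1
        # encoding, it must instead use the Windows-1252 encoding."
        return "cp1252"
    return canonical


def decode_input(data, label):
    """Convert bytes to Unicode, replacing undecodable bytes with U+FFFD."""
    codec = resolve_encoding(label)
    if codec is None:
        raise LookupError("unsupported encoding: " + label)
    return data.decode(codec, errors="replace")


if __name__ == "__main__":
    # Bytes 0x93 and 0x94 are curly quotes in Windows-1252.
    print(decode_input(b"\x93hi\x94", "ISO-8859-1"))
    # 0xFF can never appear in UTF-8, so it becomes U+FFFD.
    print(repr(decode_input(b"a\xffb", "utf-8")))
```

Checking the canonical name as well as the raw label is a design choice of the sketch, so that an alias of a forbidden encoding is rejected too. The text itself only names the encodings and says nothing about aliases of them.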