[html5] r4177 - [ct] (0) Remove the 'content model flag' and expand it into separate states inst [...]

Mon Oct 19 04:00:34 PDT 2009

Author: ianh
Date: 2009-10-19 04:00:31 -0700 (Mon, 19 Oct 2009)
New Revision: 4177

Modified:
   complete.html
   index
   source
Log:
[ct] (0) Remove the 'content model flag' and expand it into separate states instead. This edit *should* have no effect on black-box conformance requirements. Please report any changes you find.
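
Illustrative note (not part of the commit): a minimal sketch of the shape of this refactoring, assuming a table-driven tokenizer. The names TRANSITIONS, step and is_appropriate_end_tag are hypothetical and do not appear in the spec. The transitions are transcribed from the five text-consuming states shown in the diff below; every other state is left out.

```python
from typing import Optional, Tuple

EOF = None  # hypothetical stand-in for the end-of-file marker

# One entry per text-consuming state, listing only the characters that the
# state treats specially. Before this revision, a single data state consulted
# a content model flag (and an escape flag) to choose between these behaviours.
TRANSITIONS = {
    "data": {"&": "character reference in data", "<": "tag open"},
    "RCDATA": {"&": "character reference in data", "<": "RCDATA less-than sign"},
    "RAWTEXT": {"<": "RAWTEXT less-than sign"},
    "script data": {"<": "script data less-than sign"},
    "PLAINTEXT": {},
}


def step(state: str, ch: Optional[str]) -> Tuple[str, Optional[tuple]]:
    """Consume one input character; return (next state, emitted token or None)."""
    if ch is EOF:
        return state, ("eof",)
    target = TRANSITIONS[state].get(ch)
    if target is not None:
        # Special character: switch state without emitting anything yet.
        return target, None
    # "Anything else": emit the character and stay in the same state.
    return state, ("character", ch)


def is_appropriate_end_tag(end_tag_name: str,
                           last_start_tag_name: Optional[str]) -> bool:
    """An end tag is appropriate only if it matches the last emitted start tag."""
    return last_start_tag_name is not None and end_tag_name == last_start_tag_name
```

With this shape, "<" in the PLAINTEXT state is simply another character, while in the data state it switches to the tag open state; no flag has to be checked in either case.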

Modified: complete.html
===================================================================

--- complete.html	2009-10-19 05:52:18 UTC (rev 4176)
+++ complete.html	2009-10-19 11:00:31 UTC (rev 4177)
@@ -1052,47 +1052,65 @@
      <li><a href=#tokenization><span class=secno>11.2.4 </span>Tokenization</a>
       <ol>
        <li><a href=#data-state><span class=secno>11.2.4.1 </span>Data state</a></li>
-       <li><a href=#character-reference-in-data-state><span class=secno>11.2.4.2 </span>Character reference in data state</a></li>
-       <li><a href=#tag-open-state><span class=secno>11.2.4.3 </span>Tag open state</a></li>
-       <li><a href=#close-tag-open-state><span class=secno>11.2.4.4 </span>Close tag open state</a></li>
-       <li><a href=#tag-name-state><span class=secno>11.2.4.5 </span>Tag name state</a></li>
-       <li><a href=#before-attribute-name-state><span class=secno>11.2.4.6 </span>Before attribute name state</a></li>
-       <li><a href=#attribute-name-state><span class=secno>11.2.4.7 </span>Attribute name state</a></li>
-       <li><a href=#after-attribute-name-state><span class=secno>11.2.4.8 </span>After attribute name state</a></li>
-       <li><a href=#before-attribute-value-state><span class=secno>11.2.4.9 </span>Before attribute value state</a></li>
-       <li><a href=#attribute-value-(double-quoted)-state><span class=secno>11.2.4.10 </span>Attribute value (double-quoted) state</a></li>
-       <li><a href=#attribute-value-(single-quoted)-state><span class=secno>11.2.4.11 </span>Attribute value (single-quoted) state</a></li>
-       <li><a href=#attribute-value-(unquoted)-state><span class=secno>11.2.4.12 </span>Attribute value (unquoted) state</a></li>
-       <li><a href=#character-reference-in-attribute-value-state><span class=secno>11.2.4.13 </span>Character reference in attribute value state</a></li>
-       <li><a href=#after-attribute-value-(quoted)-state><span class=secno>11.2.4.14 </span>After attribute value (quoted) state</a></li>
-       <li><a href=#self-closing-start-tag-state><span class=secno>11.2.4.15 </span>Self-closing start tag state</a></li>
-       <li><a href=#bogus-comment-state><span class=secno>11.2.4.16 </span>Bogus comment state</a></li>
-       <li><a href=#markup-declaration-open-state><span class=secno>11.2.4.17 </span>Markup declaration open state</a></li>
-       <li><a href=#comment-start-state><span class=secno>11.2.4.18 </span>Comment start state</a></li>
-       <li><a href=#comment-start-dash-state><span class=secno>11.2.4.19 </span>Comment start dash state</a></li>
-       <li><a href=#comment-state><span class=secno>11.2.4.20 </span>Comment state</a></li>
-       <li><a href=#comment-end-dash-state><span class=secno>11.2.4.21 </span>Comment end dash state</a></li>
-       <li><a href=#comment-end-state><span class=secno>11.2.4.22 </span>Comment end state</a></li>
-       <li><a href=#comment-end-bang-state><span class=secno>11.2.4.23 </span>Comment end bang state</a></li>
-       <li><a href=#comment-end-space-state><span class=secno>11.2.4.24 </span>Comment end space state</a></li>
-       <li><a href=#doctype-state><span class=secno>11.2.4.25 </span>DOCTYPE state</a></li>
-       <li><a href=#before-doctype-name-state><span class=secno>11.2.4.26 </span>Before DOCTYPE name state</a></li>
-       <li><a href=#doctype-name-state><span class=secno>11.2.4.27 </span>DOCTYPE name state</a></li>
-       <li><a href=#after-doctype-name-state><span class=secno>11.2.4.28 </span>After DOCTYPE name state</a></li>
-       <li><a href=#after-doctype-public-keyword-state><span class=secno>11.2.4.29 </span>After DOCTYPE public keyword state</a></li>
-       <li><a href=#before-doctype-public-identifier-state><span class=secno>11.2.4.30 </span>Before DOCTYPE public identifier state</a></li>
-       <li><a href=#doctype-public-identifier-(double-quoted)-state><span class=secno>11.2.4.31 </span>DOCTYPE public identifier (double-quoted) state</a></li>
-       <li><a href=#doctype-public-identifier-(single-quoted)-state><span class=secno>11.2.4.32 </span>DOCTYPE public identifier (single-quoted) state</a></li>
-       <li><a href=#after-doctype-public-identifier-state><span class=secno>11.2.4.33 </span>After DOCTYPE public identifier state</a></li>
-       <li><a href=#between-doctype-public-and-system-identifiers-state><span class=secno>11.2.4.34 </span>Between DOCTYPE public and system identifiers state</a></li>
-       <li><a href=#after-doctype-system-keyword-state><span class=secno>11.2.4.35 </span>After DOCTYPE system keyword state</a></li>
-       <li><a href=#before-doctype-system-identifier-state><span class=secno>11.2.4.36 </span>Before DOCTYPE system identifier state</a></li>
-       <li><a href=#doctype-system-identifier-(double-quoted)-state><span class=secno>11.2.4.37 </span>DOCTYPE system identifier (double-quoted) state</a></li>
-       <li><a href=#doctype-system-identifier-(single-quoted)-state><span class=secno>11.2.4.38 </span>DOCTYPE system identifier (single-quoted) state</a></li>
-       <li><a href=#after-doctype-system-identifier-state><span class=secno>11.2.4.39 </span>After DOCTYPE system identifier state</a></li>
-       <li><a href=#bogus-doctype-state><span class=secno>11.2.4.40 </span>Bogus DOCTYPE state</a></li>
-       <li><a href=#cdata-section-state><span class=secno>11.2.4.41 </span>CDATA section state</a></li>
-       <li><a href=#tokenizing-character-references><span class=secno>11.2.4.42 </span>Tokenizing character references</a></ol></li>
+       <li><a href=#rcdata-state><span class=secno>11.2.4.2 </span>RCDATA state</a></li>
+       <li><a href=#rawtext-state><span class=secno>11.2.4.3 </span>RAWTEXT state</a></li>
+       <li><a href=#script-data-state><span class=secno>11.2.4.4 </span>Script data state</a></li>
+       <li><a href=#plaintext-state><span class=secno>11.2.4.5 </span>PLAINTEXT state</a></li>
+       <li><a href=#character-reference-in-data-state><span class=secno>11.2.4.6 </span>Character reference in data state</a></li>
+       <li><a href=#tag-open-state><span class=secno>11.2.4.7 </span>Tag open state</a></li>
+       <li><a href=#close-tag-open-state><span class=secno>11.2.4.8 </span>Close tag open state</a></li>
+       <li><a href=#tag-name-state><span class=secno>11.2.4.9 </span>Tag name state</a></li>
+       <li><a href=#rcdata-less-than-sign-state><span class=secno>11.2.4.10 </span>RCDATA less-than sign state</a></li>
+       <li><a href=#rcdata-end-tag-open-state><span class=secno>11.2.4.11 </span>RCDATA end tag open state</a></li>
+       <li><a href=#rcdata-end-tag-name-state><span class=secno>11.2.4.12 </span>RCDATA end tag name state</a></li>
+       <li><a href=#rawtext-less-than-sign-state><span class=secno>11.2.4.13 </span>RAWTEXT less-than sign state</a></li>
+       <li><a href=#rawtext-end-tag-open-state><span class=secno>11.2.4.14 </span>RAWTEXT end tag open state</a></li>
+       <li><a href=#rawtext-end-tag-name-state><span class=secno>11.2.4.15 </span>RAWTEXT end tag name state</a></li>
+       <li><a href=#script-data-less-than-sign-state><span class=secno>11.2.4.16 </span>Script data less-than sign state</a></li>
+       <li><a href=#script-data-end-tag-open-state><span class=secno>11.2.4.17 </span>Script data end tag open state</a></li>
+       <li><a href=#script-data-end-tag-name-state><span class=secno>11.2.4.18 </span>Script data end tag name state</a></li>
+       <li><a href=#script-data-escape-start-state><span class=secno>11.2.4.19 </span>Script data escape start state</a></li>
+       <li><a href=#script-data-escape-start-dash-state><span class=secno>11.2.4.20 </span>Script data escape start dash state</a></li>
+       <li><a href=#script-data-escaped-state><span class=secno>11.2.4.21 </span>Script data escaped state</a></li>
+       <li><a href=#script-data-escaped-dash-state><span class=secno>11.2.4.22 </span>Script data escaped dash state</a></li>
+       <li><a href=#script-data-escaped-dash-dash-state><span class=secno>11.2.4.23 </span>Script data escaped dash dash state</a></li>
+       <li><a href=#before-attribute-name-state><span class=secno>11.2.4.24 </span>Before attribute name state</a></li>
+       <li><a href=#attribute-name-state><span class=secno>11.2.4.25 </span>Attribute name state</a></li>
+       <li><a href=#after-attribute-name-state><span class=secno>11.2.4.26 </span>After attribute name state</a></li>
+       <li><a href=#before-attribute-value-state><span class=secno>11.2.4.27 </span>Before attribute value state</a></li>
+       <li><a href=#attribute-value-(double-quoted)-state><span class=secno>11.2.4.28 </span>Attribute value (double-quoted) state</a></li>
+       <li><a href=#attribute-value-(single-quoted)-state><span class=secno>11.2.4.29 </span>Attribute value (single-quoted) state</a></li>
+       <li><a href=#attribute-value-(unquoted)-state><span class=secno>11.2.4.30 </span>Attribute value (unquoted) state</a></li>
+       <li><a href=#character-reference-in-attribute-value-state><span class=secno>11.2.4.31 </span>Character reference in attribute value state</a></li>
+       <li><a href=#after-attribute-value-(quoted)-state><span class=secno>11.2.4.32 </span>After attribute value (quoted) state</a></li>
+       <li><a href=#self-closing-start-tag-state><span class=secno>11.2.4.33 </span>Self-closing start tag state</a></li>
+       <li><a href=#bogus-comment-state><span class=secno>11.2.4.34 </span>Bogus comment state</a></li>
+       <li><a href=#markup-declaration-open-state><span class=secno>11.2.4.35 </span>Markup declaration open state</a></li>
+       <li><a href=#comment-start-state><span class=secno>11.2.4.36 </span>Comment start state</a></li>
+       <li><a href=#comment-start-dash-state><span class=secno>11.2.4.37 </span>Comment start dash state</a></li>
+       <li><a href=#comment-state><span class=secno>11.2.4.38 </span>Comment state</a></li>
+       <li><a href=#comment-end-dash-state><span class=secno>11.2.4.39 </span>Comment end dash state</a></li>
+       <li><a href=#comment-end-state><span class=secno>11.2.4.40 </span>Comment end state</a></li>
+       <li><a href=#comment-end-bang-state><span class=secno>11.2.4.41 </span>Comment end bang state</a></li>
+       <li><a href=#comment-end-space-state><span class=secno>11.2.4.42 </span>Comment end space state</a></li>
+       <li><a href=#doctype-state><span class=secno>11.2.4.43 </span>DOCTYPE state</a></li>
+       <li><a href=#before-doctype-name-state><span class=secno>11.2.4.44 </span>Before DOCTYPE name state</a></li>
+       <li><a href=#doctype-name-state><span class=secno>11.2.4.45 </span>DOCTYPE name state</a></li>
+       <li><a href=#after-doctype-name-state><span class=secno>11.2.4.46 </span>After DOCTYPE name state</a></li>
+       <li><a href=#after-doctype-public-keyword-state><span class=secno>11.2.4.47 </span>After DOCTYPE public keyword state</a></li>
+       <li><a href=#before-doctype-public-identifier-state><span class=secno>11.2.4.48 </span>Before DOCTYPE public identifier state</a></li>
+       <li><a href=#doctype-public-identifier-(double-quoted)-state><span class=secno>11.2.4.49 </span>DOCTYPE public identifier (double-quoted) state</a></li>
+       <li><a href=#doctype-public-identifier-(single-quoted)-state><span class=secno>11.2.4.50 </span>DOCTYPE public identifier (single-quoted) state</a></li>
+       <li><a href=#after-doctype-public-identifier-state><span class=secno>11.2.4.51 </span>After DOCTYPE public identifier state</a></li>
+       <li><a href=#between-doctype-public-and-system-identifiers-state><span class=secno>11.2.4.52 </span>Between DOCTYPE public and system identifiers state</a></li>
+       <li><a href=#after-doctype-system-keyword-state><span class=secno>11.2.4.53 </span>After DOCTYPE system keyword state</a></li>
+       <li><a href=#before-doctype-system-identifier-state><span class=secno>11.2.4.54 </span>Before DOCTYPE system identifier state</a></li>
+       <li><a href=#doctype-system-identifier-(double-quoted)-state><span class=secno>11.2.4.55 </span>DOCTYPE system identifier (double-quoted) state</a></li>
+       <li><a href=#doctype-system-identifier-(single-quoted)-state><span class=secno>11.2.4.56 </span>DOCTYPE system identifier (single-quoted) state</a></li>
+       <li><a href=#after-doctype-system-identifier-state><span class=secno>11.2.4.57 </span>After DOCTYPE system identifier state</a></li>
+       <li><a href=#bogus-doctype-state><span class=secno>11.2.4.58 </span>Bogus DOCTYPE state</a></li>
+       <li><a href=#cdata-section-state><span class=secno>11.2.4.59 </span>CDATA section state</a></li>
+       <li><a href=#tokenizing-character-references><span class=secno>11.2.4.60 </span>Tokenizing character references</a></ol></li>
      <li><a href=#tree-construction><span class=secno>11.2.5 </span>Tree construction</a>
       <ol>
        <li><a href=#creating-and-inserting-elements><span class=secno>11.2.5.1 </span>Creating and inserting elements</a></li>
@@ -9785,9 +9803,9 @@
     <p>If <var title="">type</var> is <em>not</em> now an <a href=#ascii-case-insensitive>ASCII
     case-insensitive</a> match for the string
     "<code><a href=#text/html>text/html</a></code>", then act as if the tokenizer had emitted
-    a start tag token with the tag name "pre", then set the <a href=#html-parser>HTML
-    parser</a>'s <a href=#tokenization>tokenization</a> stage's <a href=#content-model-flag>content
-    model flag</a> to <i title="">PLAINTEXT</i>.</p>
+    a start tag token with the tag name "pre", then switch the
+    <a href=#html-parser>HTML parser</a>'s tokenizer to the <a href=#plaintext-state>PLAINTEXT
+    state</a>.</p>
 
     <!--
  http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E...%3Ciframe%3E%3C%2Fiframe%3E%3Cscript%3Eonload%20%3D%20function%20()%20%7B%20%0D%0A%20%20var%20d%20%3D%20document.getElementsByTagName('iframe')%5B0%5D.contentDocument%3B%0D%0A%20%20d.open('image%2Fsvg%2Bxml')%3B%0D%0A%20%20d.write(%22%3Cinput%20xmlns%3D'http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml'%20value%3D'(x)html'%2F%3E%22)%3B%0D%0A%20%20d.close()%3B%0D%0A%7D%3B%3C%2Fscript%3E
@@ -55758,9 +55776,9 @@
   context</a>, the user agent should <a href=#create-a-document-object>create a
   <code>Document</code> object</a>, mark it as being an <a href=#html-documents title="HTML documents">HTML document</a>, create an <a href=#html-parser>HTML
   parser</a>, associate it with the document, act as if the
-  tokenizer had emitted a start tag token with the tag name "pre", set
-  the <a href=#tokenization>tokenization</a> stage's <a href=#content-model-flag>content model
-  flag</a> to <i title="">PLAINTEXT</i>, and begin to pass the stream of
+  tokenizer had emitted a start tag token with the tag name "pre",
+  switch the <a href=#html-parser>HTML parser</a>'s tokenizer to the
+  <a href=#plaintext-state>PLAINTEXT state</a>, and begin to pass the stream of
   characters in the plain text document to that tokenizer.</p>
 
   <p>The rules for how to convert the bytes of the plain text document
@@ -70362,16 +70380,13 @@
   switches it to a new state (to consume the next character), or
   repeats the same state (to consume the next character). Some states
   have more complicated behavior and can consume several characters
-  before switching to another state.</p>
+  before switching to another state. In some cases, the tokenizer
+  state is also changed by the tree construction stage.</p>
 
-  <p>The exact behavior of certain states depends on a <dfn id=content-model-flag>content
-  model flag</dfn> that is set after certain tokens are emitted. The
-  flag has several states: <i title="">PCDATA</i>, <i title="">RCDATA</i>, <i title="">RAWTEXT</i>, and <i title="">PLAINTEXT</i>. Initially, it must be in the PCDATA
-  state. In the RCDATA and RAWTEXT states, a further <dfn id=escape-flag>escape
-  flag</dfn> is used to control the behavior of the tokenizer. It is
-  either true or false, and initially must be set to the false
-  state. The <a href=#insertion-mode>insertion mode</a> and the <a href=#stack-of-open-elements>stack of open
-  elements</a> also affects tokenization.</p>
+  <p>The exact behavior of certain states depends on the
+  <a href=#insertion-mode>insertion mode</a> and the <a href=#stack-of-open-elements>stack of open
+  elements</a>. Certain states also use a <dfn id=temporary-buffer><var>temporary
+  buffer</var></dfn> to track progress.</p>
 
   <p>The output of the tokenization step is a series of zero or more
   of the following tokens: DOCTYPE, start tag, end tag, comment,
@@ -70390,8 +70405,8 @@
 
   <p>When a token is emitted, it must immediately be handled by the
   <a href=#tree-construction>tree construction</a> stage. The tree construction stage
-  can affect the state of the <a href=#content-model-flag>content model flag</a>, and can
-  insert additional characters into the stream. (For example, the
+  can affect the state of the tokenization stage, and can insert
+  additional characters into the stream. (For example, the
   <code><a href=#script>script</a></code> element can result in scripts executing and
   using the <a href=#dynamic-markup-insertion>dynamic markup insertion</a> APIs to insert
   characters into the stream being tokenized.)</p>
@@ -70401,15 +70416,18 @@
   self-closing flag">acknowledged</dfn> when it is processed by the
   tree construction stage, that is a <a href=#parse-error>parse error</a>.</p>
 
-  <p>When an end tag token is emitted, the <a href=#content-model-flag>content model
-  flag</a> must be switched to the PCDATA state.</p>
-
   <p>When an end tag token is emitted with attributes, that is a
   <a href=#parse-error>parse error</a>.</p>
 
   <p>When an end tag token is emitted with its <i>self-closing
   flag</i> set, that is a <a href=#parse-error>parse error</a>.</p>
 
+  <p>An <dfn id=appropriate-end-tag-token>appropriate end tag token</dfn> is an end tag token whose
+  tag name matches the tag name of the last start tag to have been
+  emitted from this tokenizer, if any. If no start tag has been
+  emitted from this tokenizer, then no end tag token is
+  appropriate.</p>
+
   <p>Before each step of the tokenizer, the user agent must first
   check the <a href=#parser-pause-flag>parser pause flag</a>. If it is true, then the
   tokenizer must abort the processing of any nested invocations of the
@@ -70418,187 +70436,152 @@
   <p>The tokenizer state machine consists of the states defined in the
   following subsections.</p>
 
+
   <!-- Order of the lists below is supposed to be non-error then
   error, by unicode, then EOF, ending with "anything else" -->
 
+
   <h5 id=data-state><span class=secno>11.2.4.1 </span><dfn>Data state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0026 AMPERSAND (&)</dt>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to one of the
-   PCDATA or RCDATA states and the <a href=#escape-flag>escape flag</a> is
-   false: switch to the <a href=#character-reference-in-data-state>character reference in data
+   <dd>Switch to the <a href=#character-reference-in-data-state>character reference in data
    state</a>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
 
-   <dt>U+002D HYPHEN-MINUS (-)</dt>
-   <dd>
+   <dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#tag-open-state>tag open state</a>.</dd>
 
-    <p>If the <a href=#content-model-flag>content model flag</a> is set to either the
-    RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape flag</a>
-    is false, and there are at least three characters before this
-    one in the input stream, and the last four characters in the
-    input stream, including this one, are U+003C LESS-THAN SIGN,
-    U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D
-    HYPHEN-MINUS ("<!--"), then set the <a href=#escape-flag>escape flag</a>
-    to true.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <a href=#data-state>data state</a>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#data-state>data state</a>.</dd>
 
-   </dd>
+  </dl><h5 id=rcdata-state><span class=secno>11.2.4.2 </span><dfn>RCDATA state</dfn></h5>
 
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0026 AMPERSAND (&)</dt>
+   <dd>Switch to the <a href=#character-reference-in-data-state>character reference in data
+   state</a>.</dd>
+
    <dt>U+003C LESS-THAN SIGN (<)</dt>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to the PCDATA
-   state: switch to the <a href=#tag-open-state>tag open state</a>.</dd>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to either the
-   RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape flag</a>
-   is false: switch to the <a href=#tag-open-state>tag open state</a>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
+   <dd>Switch to the <a href=#rcdata-less-than-sign-state>RCDATA less-than sign state</a>.</dd>
 
-   <dt>U+003E GREATER-THAN SIGN (>)</dt>
-   <dd>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>If the <a href=#content-model-flag>content model flag</a> is set to either the
-    RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape
-    flag</a> is true, and the last three characters in the input
-    stream including this one are U+002D HYPHEN-MINUS, U+002D
-    HYPHEN-MINUS, U+003E GREATER-THAN SIGN ("-->"), set the
-    <a href=#escape-flag>escape flag</a> to false.</p> <!-- no need to check
-    that there are enough characters, since you can only run into
-    this if the flag is true in the first place, which requires four
-    characters. -->
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#rcdata-state>RCDATA state</a>.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <a href=#data-state>data state</a>.</p>
+  </dl><h5 id=rawtext-state><span class=secno>11.2.4.3 </span><dfn>RAWTEXT state</dfn></h5>
 
-   </dd>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
+  <dl class=switch><dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#rawtext-less-than-sign-state>RAWTEXT less-than sign state</a>.</dd>
+
    <dt>EOF</dt>
    <dd>Emit an end-of-file token.</dd>
 
    <dt>Anything else</dt>
-   <dd>Emit the input character as a character token. Stay in the
-   <a href=#data-state>data state</a>.</dd>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
 
-  </dl><h5 id=character-reference-in-data-state><span class=secno>11.2.4.2 </span><dfn>Character reference in data state</dfn></h5>
+  </dl><h5 id=script-data-state><span class=secno>11.2.4.4 </span><dfn>Script data state</dfn></h5>
 
-  <p><i>(This cannot happen if the <a href=#content-model-flag>content model flag</a>
-  is set to the RAWTEXT state.)</i></p>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>, with no
-  <a href=#additional-allowed-character>additional allowed character</a>.</p>
+  <dl class=switch><dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#script-data-less-than-sign-state>script data less-than sign state</a>.</dd>
 
-  <p>If nothing is returned, emit a U+0026 AMPERSAND character
-  token.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-  <p>Otherwise, emit the character token that was returned.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#script-data-state>script data state</a>.</dd>
 
-  <p>Finally, switch to the <a href=#data-state>data state</a>.</p>
+  </dl><h5 id=plaintext-state><span class=secno>11.2.4.5 </span><dfn>PLAINTEXT state</dfn></h5>
 
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <h5 id=tag-open-state><span class=secno>11.2.4.3 </span><dfn>Tag open state</dfn></h5>
+  <dl class=switch><dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-  <p>The behavior of this state depends on the <a href=#content-model-flag>content model
-  flag</a>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#plaintext-state>PLAINTEXT state</a>.</dd>
 
-  <dl><dt>If the <a href=#content-model-flag>content model flag</a> is set to the RCDATA
-   or RAWTEXT states</dt>
+  </dl><h5 id=character-reference-in-data-state><span class=secno>11.2.4.6 </span><dfn>Character reference in data state</dfn></h5>
 
-   <dd>
+  <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>, with no
+  <a href=#additional-allowed-character>additional allowed character</a>.</p>
 
-    <p>Consume the <a href=#next-input-character>next input character</a>. If it is a
-    U+002F SOLIDUS character (/), switch to the <a href=#close-tag-open-state>close tag open
-    state</a>. Otherwise, emit a U+003C LESS-THAN SIGN character
-    token and reconsume the <a href=#current-input-character>current input character</a> in the
-    <a href=#data-state>data state</a>.</p>
+  <p>If nothing is returned, emit a U+0026 AMPERSAND character
+  token.</p>
 
-   </dd>
+  <p>Otherwise, emit the character token that was returned.</p>
 
-   <dt>If the <a href=#content-model-flag>content model flag</a> is set to the PCDATA
-   state</dt>
+  <p>Finally, switch to the <a href=#data-state>data state</a>.</p>
 
-   <dd>
 
-    <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+  <h5 id=tag-open-state><span class=secno>11.2.4.7 </span><dfn>Tag open state</dfn></h5>
 
-    <dl class=switch><dt>U+0021 EXCLAMATION MARK (!)</dt>
-     <dd>Switch to the <a href=#markup-declaration-open-state>markup declaration open state</a>.</dd>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-     <dt>U+002F SOLIDUS (/)</dt>
-     <dd>Switch to the <a href=#close-tag-open-state>close tag open state</a>.</dd>
+  <dl class=switch><dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Switch to the <a href=#markup-declaration-open-state>markup declaration open state</a>.</dd>
 
-     <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the
-     lowercase version of the input character (add 0x0020 to the
-     character's code point), then switch to the <a href=#tag-name-state>tag name
-     state</a>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Switch to the <a href=#close-tag-open-state>close tag open state</a>.</dd>
 
-     <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the input
-     character, then switch to the <a href=#tag-name-state>tag name
-     state</a>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add 0x0020 to the
+   character's code point), then switch to the <a href=#tag-name-state>tag name
+   state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
-     <dt>U+003E GREATER-THAN SIGN (>)</dt>
-     <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
-     character token and a U+003E GREATER-THAN SIGN character
-     token. Switch to the <a href=#data-state>data state</a>.</dd>
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   <a href=#current-input-character>current input character</a>, then switch to the <a href=#tag-name-state>tag
+   name state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
-     <dt>U+003F QUESTION MARK (?)</dt>
-     <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
-     comment state</a>.</dd>
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
+   character token and a U+003E GREATER-THAN SIGN character
+   token. Switch to the <a href=#data-state>data state</a>.</dd>
 
-     <dt>Anything else</dt>
-     <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
-     character token and reconsume the <a href=#current-input-character>current input character</a> in the
-     <a href=#data-state>data state</a>.</dd>
+   <dt>U+003F QUESTION MARK (?)</dt>
+   <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
+   comment state</a>.</dd>
 
-    </dl></dd>
+   <dt>Anything else</dt>
+   <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
+   character token and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#data-state>data state</a>.</dd>
 
-  </dl><h5 id=close-tag-open-state><span class=secno>11.2.4.4 </span><dfn>Close tag open state</dfn></h5>
+  </dl><h5 id=close-tag-open-state><span class=secno>11.2.4.8 </span><dfn>Close tag open state</dfn></h5>
 
-  <p>If the <a href=#content-model-flag>content model flag</a> is set to the RCDATA or
-  RAWTEXT states but no start tag token has ever been emitted by this
-  instance of the tokenizer (<a href=#fragment-case>fragment case</a>), or, if the
-  <a href=#content-model-flag>content model flag</a> is set to the RCDATA or RAWTEXT states
-  and the next few characters do not match the tag name of the last
-  start tag token emitted (compared in an <a href=#ascii-case-insensitive>ASCII
-  case-insensitive</a> manner), or if they do but they are not
-  immediately followed by one of the following characters:</p>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <ul class=brief><li>U+0009 CHARACTER TABULATION</li>
-   <li>U+000A LINE FEED (LF)</li>
-   <li>U+000C FORM FEED (FF)</li>
-   <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
-   <li>U+0020 SPACE</li>
-   <li>U+003E GREATER-THAN SIGN (>)</li>
-   <li>U+002F SOLIDUS (/)</li>
-   <li>EOF</li>
-  </ul><p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F
-  SOLIDUS character token, and switch to the <a href=#data-state>data state</a>
-  to process the <a href=#next-input-character>next input character</a>.</p>
-
-  <p>Otherwise, if the <a href=#content-model-flag>content model flag</a> is set to the
-  PCDATA state, or if the next few characters <em>do</em> match that tag
-  name, consume the <a href=#next-input-character>next input character</a>:</p>
-
   <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new end tag token, set its tag name to the lowercase
-   version of the input character (add 0x0020 to the character's
-   code point), then switch to the <a href=#tag-name-state>tag name
+   version of the <a href=#current-input-character>current input character</a> (add 0x0020 to
+   the character's code point), then switch to the <a href=#tag-name-state>tag name
    state</a>. (Don't emit the token yet; further details will be
    filled in before it is emitted.)</dd>
 
    <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-   <dd>Create a new end tag token, set its tag name to the input
-   character, then switch to the <a href=#tag-name-state>tag name state</a>. (Don't
-   emit the token yet; further details will be filled in before it
-   is emitted.)</dd>
+   <dd>Create a new end tag token, set its tag name to the
+   <a href=#current-input-character>current input character</a>, then switch to the <a href=#tag-name-state>tag
+   name state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#data-state>data
@@ -70613,7 +70596,7 @@
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
    comment state</a>.</dd>
 
-  </dl><h5 id=tag-name-state><span class=secno>11.2.4.5 </span><dfn>Tag name state</dfn></h5>
+  </dl><h5 id=tag-name-state><span class=secno>11.2.4.9 </span><dfn>Tag name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70632,27 +70615,372 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point) to the current tag
-   token's tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Stay in the <a href=#tag-name-state>tag name
+   state</a>.</dd>
 
    <dt>EOF</dt>
    <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current tag token's
-   tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
 
-  </dl><h5 id=before-attribute-name-state><span class=secno>11.2.4.6 </span><dfn>Before attribute name state</dfn></h5>
+  </dl><h5 id=rcdata-less-than-sign-state><span class=secno>11.2.4.10 </span><dfn>RCDATA less-than sign state</dfn></h5>
+  <!-- identical to the RAWTEXT less-than sign state, except s/RAWTEXT/RCDATA/g -->
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#rcdata-end-tag-open-state>RCDATA end tag open state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#rcdata-state>RCDATA
+   state</a>.</dd>
+
+  </dl><h5 id=rcdata-end-tag-open-state><span class=secno>11.2.4.11 </span><dfn>RCDATA end tag open state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag open state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#rcdata-state>RCDATA state</a>.</dd>
+
+  </dl><h5 id=rcdata-end-tag-name-state><span class=secno>11.2.4.12 </span><dfn>RCDATA end tag name state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag name state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
   <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
    <dt>U+000A LINE FEED (LF)</dt>
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#rcdata-state>RCDATA state</a>.</dd>
+
+  </dl><h5 id=rawtext-less-than-sign-state><span class=secno>11.2.4.13 </span><dfn>RAWTEXT less-than sign state</dfn></h5>
+  <!-- identical to the RCDATA less-than sign state, except s/RCDATA/RAWTEXT/g -->
+
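Reviewer note: the three RCDATA states above and the RAWTEXT states defined here share one shape, so they can be sketched as a single function parametrized on the text state to return to. This is a rough illustration, not the specification's algorithm as written. The names `Step` and `after_less_than` are hypothetical, EOF is modelled as `None`, and "appropriate end tag token" is assumed to mean that the tag name equals the name of the last start tag emitted (the wording used by the removed content-model-flag text).

```python
import string
from typing import NamedTuple, Optional


class Step(NamedTuple):
    chars: str               # character tokens emitted, concatenated
    end_tag: Optional[str]   # name of the end tag token kept, or None if abandoned
    pos: int                 # index of the next character to consume
    state: str               # state the tokenizer switches to


def after_less_than(text, pos, last_start_tag, text_state="RCDATA"):
    """Sketch of the less-than sign, end tag open and end tag name states.

    `pos` is the index just after a '<' seen in RCDATA or RAWTEXT.
    `last_start_tag` is None in the fragment case (assumption).
    """
    def peek(i):
        return text[i] if i < len(text) else None  # None stands for EOF

    def is_letter(ch):
        return ch is not None and ch in string.ascii_letters

    # Less-than sign state: only '/' continues; anything else is reconsumed.
    if peek(pos) != "/":
        return Step("<", None, pos, text_state)
    pos += 1
    temporary_buffer = ""

    # End tag open state: a letter is required to start a tag name.
    if not is_letter(peek(pos)):
        return Step("</", None, pos, text_state)

    # End tag open state (first letter) and end tag name state (the rest).
    tag_name = ""
    while is_letter(peek(pos)):
        ch = text[pos]
        tag_name += ch.lower()    # lowercased name for comparison
        temporary_buffer += ch    # original characters, in order
        pos += 1

    ch = peek(pos)
    if last_start_tag is not None and tag_name == last_start_tag:
        if ch is not None and ch in "\t\n\f ":
            return Step("", tag_name, pos + 1, "before attribute name")
        if ch == "/":
            return Step("", tag_name, pos + 1, "self-closing start tag")
        if ch == ">":
            return Step("", tag_name, pos + 1, "data")  # token is emitted here

    # "Anything else": replay what was consumed as character tokens.
    return Step("</" + temporary_buffer, None, pos, text_state)
```

For instance, with `last_start_tag="title"`, the input `/title>` yields an end tag and a switch to the data state, while `/b>` is replayed as the characters `</b` and the `>` is reconsumed in the RCDATA state.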
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#rawtext-end-tag-open-state>RAWTEXT end tag open state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#rawtext-state>RAWTEXT
+   state</a>.</dd>
+
+  </dl><h5 id=rawtext-end-tag-open-state><span class=secno>11.2.4.14 </span><dfn>RAWTEXT end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag open state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
+
+  </dl><h5 id=rawtext-end-tag-name-state><span class=secno>11.2.4.15 </span><dfn>RAWTEXT end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag name state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
+
+  </dl><h5 id=script-data-less-than-sign-state><span class=secno>11.2.4.16 </span><dfn>Script data less-than sign state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#script-data-end-tag-open-state>script data end tag open state</a>.</dd>
+
+   <dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and a U+0021
+   EXCLAMATION MARK character token. Switch to the <a href=#script-data-escape-start-state>script data
+   escape start state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#script-data-state>script data
+   state</a>.</dd>
+
+  </dl><h5 id=script-data-end-tag-open-state><span class=secno>11.2.4.17 </span><dfn>Script data end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag open state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#script-data-end-tag-name-state>script data end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#script-data-end-tag-name-state>script data end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-end-tag-name-state><span class=secno>11.2.4.18 </span><dfn>Script data end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag name state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#script-data-end-tag-name-state>script data end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#script-data-end-tag-name-state>script data end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escape-start-state><span class=secno>11.2.4.19 </span><dfn>Script data escape start state</dfn></h5>
+
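Reviewer note: the five escape states that begin here (escape start, escape start dash, escaped, escaped dash, escaped dash dash) amount to a small transition table. The sketch below transcribes only the entries shown in this diff; `escape_step` and `run_escape` are hypothetical names, EOF is modelled as `None`, and parse errors are not reported. It assumes the script data less-than sign state has already emitted `<!` before the first state is entered.

```python
EOF = None

ESCAPE_STATES = {
    "script data escape start",
    "script data escape start dash",
    "script data escaped",
    "script data escaped dash",
    "script data escaped dash dash",
}


def escape_step(state, ch):
    """Return (emitted_text, next_state, reconsume) for one input character."""
    if state == "script data escape start":
        if ch == "-":
            return "-", "script data escape start dash", False
        return "", "script data", True
    if state == "script data escape start dash":
        if ch == "-":
            return "-", "script data escaped dash dash", False
        return "", "script data", True
    if state in ESCAPE_STATES:  # one of the three "escaped" states
        if ch == "-":
            if state == "script data escaped":
                return "-", "script data escaped dash", False
            return "-", "script data escaped dash dash", False
        if ch == ">" and state == "script data escaped dash dash":
            return ">", "script data", False
        if ch is EOF:
            return "", "data", True  # parse error; EOF is reconsumed in data
        return ch, "script data escaped", False
    raise ValueError("not an escape state: " + state)


def run_escape(text, state="script data escape start"):
    """Drive escape_step until the tokenizer leaves the escape states."""
    out, pos = [], 0
    while state in ESCAPE_STATES:
        ch = text[pos] if pos < len(text) else EOF
        emitted, state, reconsume = escape_step(state, ch)
        out.append(emitted)
        if not reconsume:
            pos += 1
    return "".join(out), state, pos
```

Every reconsume in this table leads to a state outside the escape family, so the driver loop always terminates.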
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escape-start-dash-state>script data escape start dash state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <a href=#current-input-character>current input character</a> in the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escape-start-dash-state><span class=secno>11.2.4.20 </span><dfn>Script data escape start dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <a href=#current-input-character>current input character</a> in the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-state><span class=secno>11.2.4.21 </span><dfn>Script data escaped state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-state>script data escaped dash state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Stay in
+   the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-dash-state><span class=secno>11.2.4.22 </span><dfn>Script data escaped dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Switch
+   to the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-dash-dash-state><span class=secno>11.2.4.23 </span><dfn>Script data escaped dash dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Stay in the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>Emit a U+003E GREATER-THAN SIGN character token. Switch to the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Switch
+   to the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=before-attribute-name-state><span class=secno>11.2.4.24 </span><dfn>Before attribute name state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
    <dd>Stay in the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
    <dt>U+002F SOLIDUS (/)</dt>
@@ -70686,7 +71014,7 @@
    the empty string. Switch to the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
-  </dl><h5 id=attribute-name-state><span class=secno>11.2.4.7 </span><dfn>Attribute name state</dfn></h5>
+  </dl><h5 id=attribute-name-state><span class=secno>11.2.4.25 </span><dfn>Attribute name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70708,9 +71036,9 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point) to the current
-   attribute's name. Stay in the <a href=#attribute-name-state>attribute name
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current attribute's name. Stay in the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
@@ -70724,8 +71052,9 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   name. Stay in the <a href=#attribute-name-state>attribute name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's name. Stay in the <a href=#attribute-name-state>attribute name
+   state</a>.</dd>
 
   </dl><p>When the user agent leaves the attribute name state (and before
   emitting the tag token, if appropriate), the complete attribute's
@@ -70736,7 +71065,7 @@
   associated with it (if any).</p>
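
Reviewer note: several entries in the tag name and attribute name states lowercase a character by adding 0x0020 to its code point, and only for U+0041 through U+005A. A minimal sketch of that rule follows; `lowercase_ascii` and `fold_name` are hypothetical helper names, not part of the specification.

```python
def lowercase_ascii(ch: str) -> str:
    """Lowercase one character the way the tokenizer entries describe."""
    if "A" <= ch <= "Z":                  # U+0041 .. U+005A only
        return chr(ord(ch) + 0x0020)      # add 0x0020 to the code point
    return ch                             # everything else is appended as-is


def fold_name(raw: str) -> str:
    """Apply the rule to every character of a tag or attribute name."""
    return "".join(lowercase_ascii(ch) for ch in raw)
```

Unlike Python's `str.lower()`, this leaves non-ASCII uppercase letters such as `É` untouched, which matches the "anything else" entries that append the current input character unchanged.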
 
 
-  <h5 id=after-attribute-name-state><span class=secno>11.2.4.8 </span><dfn>After attribute name state</dfn></h5>
+  <h5 id=after-attribute-name-state><span class=secno>11.2.4.26 </span><dfn>After attribute name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70759,10 +71088,10 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point), and its value to
-   the empty string. Switch to the <a href=#attribute-name-state>attribute name
-   state</a>.</dd>
+   attribute's name to the lowercase version of the <a href=#current-input-character>current
+   input character</a> (add 0x0020 to the character's code point),
+   and its value to the empty string. Switch to the <a href=#attribute-name-state>attribute
+   name state</a>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
    <dt>U+0027 APOSTROPHE (')</dt>
@@ -70776,11 +71105,11 @@
 
    <dt>Anything else</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the <a href=#current-input-character>current input character</a>, and its value to
-   the empty string. Switch to the <a href=#attribute-name-state>attribute name
+   attribute's name to the <a href=#current-input-character>current input character</a>, and
+   its value to the empty string. Switch to the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
-  </dl><h5 id=before-attribute-value-state><span class=secno>11.2.4.9 </span><dfn>Before attribute value state</dfn></h5>
+  </dl><h5 id=before-attribute-value-state><span class=secno>11.2.4.27 </span><dfn>Before attribute value state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70796,7 +71125,7 @@
 
    <dt>U+0026 AMPERSAND (&)</dt>
    <dd>Switch to the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted) state</a>
-   and reconsume this input character.</dd>
+   and reconsume the <a href=#current-input-character>current input character</a>.</dd>
 
    <dt>U+0027 APOSTROPHE (')</dt>
    <dd>Switch to the <a href=#attribute-value-(single-quoted)-state>attribute value (single-quoted) state</a>.</dd>
@@ -70820,7 +71149,7 @@
    attribute's value. Switch to the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
    state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(double-quoted)-state><span class=secno>11.2.4.10 </span><dfn>Attribute value (double-quoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(double-quoted)-state><span class=secno>11.2.4.28 </span><dfn>Attribute value (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70838,11 +71167,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(double-quoted)-state>attribute value (double-quoted)
-   state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(double-quoted)-state>attribute value
+   (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(single-quoted)-state><span class=secno>11.2.4.11 </span><dfn>Attribute value (single-quoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(single-quoted)-state><span class=secno>11.2.4.29 </span><dfn>Attribute value (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70860,11 +71189,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(single-quoted)-state>attribute value (single-quoted)
-   state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(single-quoted)-state>attribute value
+   (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(unquoted)-state><span class=secno>11.2.4.12 </span><dfn>Attribute value (unquoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(unquoted)-state><span class=secno>11.2.4.30 </span><dfn>Attribute value (unquoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70897,11 +71226,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
    state</a>.</dd>
 
-  </dl><h5 id=character-reference-in-attribute-value-state><span class=secno>11.2.4.13 </span><dfn>Character reference in attribute value state</dfn></h5>
+  </dl><h5 id=character-reference-in-attribute-value-state><span class=secno>11.2.4.31 </span><dfn>Character reference in attribute value state</dfn></h5>
 
   <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>.</p>
 
@@ -70915,7 +71244,7 @@
   in when you were switched into this state.</p>
 
 
-  <h5 id=after-attribute-value-(quoted)-state><span class=secno>11.2.4.14 </span><dfn>After attribute value (quoted) state</dfn></h5>
+  <h5 id=after-attribute-value-(quoted)-state><span class=secno>11.2.4.32 </span><dfn>After attribute value (quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70941,7 +71270,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the character in
    the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
-  </dl><h5 id=self-closing-start-tag-state><span class=secno>11.2.4.15 </span><dfn>Self-closing start tag state</dfn></h5>
+  </dl><h5 id=self-closing-start-tag-state><span class=secno>11.2.4.33 </span><dfn>Self-closing start tag state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -70958,11 +71287,8 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the character in
    the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
-  </dl><h5 id=bogus-comment-state><span class=secno>11.2.4.16 </span><dfn>Bogus comment state</dfn></h5>
+  </dl><h5 id=bogus-comment-state><span class=secno>11.2.4.34 </span><dfn>Bogus comment state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to and including the first U+003E
   GREATER-THAN SIGN character (>) or the end of the file (EOF),
   whichever comes first. Emit a comment token whose data is the
@@ -70979,11 +71305,8 @@
   character.</p>
 
 
-  <h5 id=markup-declaration-open-state><span class=secno>11.2.4.17 </span><dfn>Markup declaration open state</dfn></h5>
+  <h5 id=markup-declaration-open-state><span class=secno>11.2.4.35 </span><dfn>Markup declaration open state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
   characters, consume those two characters, create a comment token
   whose data is the empty string, and switch to the <a href=#comment-start-state>comment
@@ -71007,7 +71330,7 @@
   comment.</p>
 
 
-  <h5 id=comment-start-state><span class=secno>11.2.4.18 </span><dfn>Comment start state</dfn></h5>
+  <h5 id=comment-start-state><span class=secno>11.2.4.36 </span><dfn>Comment start state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71024,10 +71347,10 @@
    the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's
-   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment
+   token's data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-start-dash-state><span class=secno>11.2.4.19 </span><dfn>Comment start dash state</dfn></h5>
+  </dl><h5 id=comment-start-dash-state><span class=secno>11.2.4.37 </span><dfn>Comment start dash state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71044,11 +71367,11 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <a href=#current-input-character>current input character</a> to the comment token's
+   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-state><span class=secno>11.2.4.20 </span><dfn id=comment>Comment state</dfn></h5>
+  </dl><h5 id=comment-state><span class=secno>11.2.4.38 </span><dfn id=comment>Comment state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71061,10 +71384,10 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Stay
-   in the <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment
+   token's data. Stay in the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-dash-state><span class=secno>11.2.4.21 </span><dfn>Comment end dash state</dfn></h5>
+  </dl><h5 id=comment-end-dash-state><span class=secno>11.2.4.39 </span><dfn>Comment end dash state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71077,11 +71400,11 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <a href=#current-input-character>current input character</a> to the comment token's
+   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-state><span class=secno>11.2.4.22 </span><dfn>Comment end state</dfn></h5>
+  </dl><h5 id=comment-end-state><span class=secno>11.2.4.40 </span><dfn>Comment end state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71095,8 +71418,9 @@
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
    <dd><a href=#parse-error>Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <a href=#comment-end-space-state>comment end space state</a>.</dd>
+   characters and the <a href=#current-input-character>current input character</a> to the
+   comment token's data. Switch to the <a href=#comment-end-space-state>comment end space
+   state</a>.</dd>
 
    <dt>U+0021 EXCLAMATION MARK (!)</dt>
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#comment-end-bang-state>comment end bang
@@ -71117,10 +71441,11 @@
 
    <dt>Anything else</dt>
    <dd><a href=#parse-error>Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
+   characters and the <a href=#current-input-character>current input character</a> to the
+   comment token's data. Switch to the <a href=#comment-state>comment
+   state</a>.</dd>
 
-  </dl><h5 id=comment-end-bang-state><span class=secno>11.2.4.23 </span><dfn>Comment end bang state</dfn></h5>
+  </dl><h5 id=comment-end-bang-state><span class=secno>11.2.4.41 </span><dfn>Comment end bang state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71140,11 +71465,11 @@
 
    <dt>Anything else</dt>
    <dd>Append two U+002D HYPHEN-MINUS (-) characters, a U+0021
-   EXCLAMATION MARK character (!), and the input character to the
-   comment token's data. Switch to the <a href=#comment-state>comment
-   state</a>.</dd>
+   EXCLAMATION MARK character (!), and the <a href=#current-input-character>current input
+   character</a> to the comment token's data. Switch to the
+   <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-space-state><span class=secno>11.2.4.24 </span><dfn>Comment end space state</dfn></h5>
+  </dl><h5 id=comment-end-space-state><span class=secno>11.2.4.42 </span><dfn>Comment end space state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71153,7 +71478,7 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Append the input character to the comment token's data. Stay in
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment token's data. Stay in
    the <a href=#comment-end-space-state>comment end space state</a>.</dd>
 
    <dt>U+002D HYPHEN-MINUS (-)</dt>
@@ -71169,10 +71494,10 @@
    comment in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Switch
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment token's data. Switch
    to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=doctype-state><span class=secno>11.2.4.25 </span><dfn>DOCTYPE state</dfn></h5>
+  </dl><h5 id=doctype-state><span class=secno>11.2.4.43 </span><dfn>DOCTYPE state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71192,7 +71517,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current
    character in the <a href=#before-doctype-name-state>before DOCTYPE name state</a>.</dd>
 
-  </dl><h5 id=before-doctype-name-state><span class=secno>11.2.4.26 </span><dfn>Before DOCTYPE name state</dfn></h5>
+  </dl><h5 id=before-doctype-name-state><span class=secno>11.2.4.44 </span><dfn>Before DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71205,7 +71530,7 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new DOCTYPE token. Set the token's name to the
-   lowercase version of the input character (add 0x0020 to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add 0x0020 to the
    character's code point). Switch to the <a href=#doctype-name-state>DOCTYPE name
    state</a>.</dd>
 
@@ -71224,7 +71549,7 @@
    <a href=#current-input-character>current input character</a>. Switch to the <a href=#doctype-name-state>DOCTYPE name
    state</a>.</dd>
 
-  </dl><h5 id=doctype-name-state><span class=secno>11.2.4.27 </span><dfn>DOCTYPE name state</dfn></h5>
+  </dl><h5 id=doctype-name-state><span class=secno>11.2.4.45 </span><dfn>DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71240,9 +71565,10 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the input character (add 0x0020
-   to the character's code point) to the current DOCTYPE token's
-   name. Stay in the <a href=#doctype-name-state>DOCTYPE name state</a>.</dd>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current DOCTYPE token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name
+   state</a>.</dd>
 
    <dt>EOF</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -71250,10 +71576,11 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name
+   state</a>.</dd>
 
-  </dl><h5 id=after-doctype-name-state><span class=secno>11.2.4.28 </span><dfn>After DOCTYPE name state</dfn></h5>
+  </dl><h5 id=after-doctype-name-state><span class=secno>11.2.4.46 </span><dfn>After DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71293,7 +71620,7 @@
 
    </dd>
 
-  </dl><h5 id=after-doctype-public-keyword-state><span class=secno>11.2.4.29 </span><dfn>After DOCTYPE public keyword state</dfn></h5>
+  </dl><h5 id=after-doctype-public-keyword-state><span class=secno>11.2.4.47 </span><dfn>After DOCTYPE public keyword state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71314,7 +71641,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current character in
    the <a href=#before-doctype-public-identifier-state>before DOCTYPE public identifier state</a>.</dd>
 
-  </dl><h5 id=before-doctype-public-identifier-state><span class=secno>11.2.4.30 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>
+  </dl><h5 id=before-doctype-public-identifier-state><span class=secno>11.2.4.48 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71350,7 +71677,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=doctype-public-identifier-(double-quoted)-state><span class=secno>11.2.4.31 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-public-identifier-(double-quoted)-state><span class=secno>11.2.4.49 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71372,7 +71699,7 @@
    token's public identifier. Stay in the <a href=#doctype-public-identifier-(double-quoted)-state>DOCTYPE public
    identifier (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=doctype-public-identifier-(single-quoted)-state><span class=secno>11.2.4.32 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-public-identifier-(single-quoted)-state><span class=secno>11.2.4.50 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71394,7 +71721,7 @@
    token's public identifier. Stay in the <a href=#doctype-public-identifier-(single-quoted)-state>DOCTYPE public
    identifier (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=after-doctype-public-identifier-state><span class=secno>11.2.4.33 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
+  </dl><h5 id=after-doctype-public-identifier-state><span class=secno>11.2.4.51 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71403,7 +71730,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Switch to the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers state</a>.</dd>
+   <dd>Switch to the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system
+   identifiers state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -71429,7 +71757,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=between-doctype-public-and-system-identifiers-state><span class=secno>11.2.4.34 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>
+  </dl><h5 id=between-doctype-public-and-system-identifiers-state><span class=secno>11.2.4.52 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71438,7 +71766,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers state</a>.</dd>
+   <dd>Stay in the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -71464,7 +71793,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=after-doctype-system-keyword-state><span class=secno>11.2.4.35 </span><dfn>After DOCTYPE system keyword state</dfn></h5>
+  </dl><h5 id=after-doctype-system-keyword-state><span class=secno>11.2.4.53 </span><dfn>After DOCTYPE system keyword state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71485,7 +71814,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current character in
    the <a href=#before-doctype-system-identifier-state>before DOCTYPE system identifier state</a>.</dd>
 
-  </dl><h5 id=before-doctype-system-identifier-state><span class=secno>11.2.4.36 </span><dfn>Before DOCTYPE system identifier state</dfn></h5>
+  </dl><h5 id=before-doctype-system-identifier-state><span class=secno>11.2.4.54 </span><dfn>Before DOCTYPE system identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71521,12 +71850,13 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=doctype-system-identifier-(double-quoted)-state><span class=secno>11.2.4.37 </span><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-system-identifier-(double-quoted)-state><span class=secno>11.2.4.55 </span><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0022 QUOTATION MARK (")</dt>
-   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -71539,16 +71869,17 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's system identifier. Stay in the <a href=#doctype-system-identifier-(double-quoted)-state>DOCTYPE system
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's system identifier. Stay in the <a href=#doctype-system-identifier-(double-quoted)-state>DOCTYPE system
    identifier (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=doctype-system-identifier-(single-quoted)-state><span class=secno>11.2.4.38 </span><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-system-identifier-(single-quoted)-state><span class=secno>11.2.4.56 </span><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0027 APOSTROPHE (')</dt>
-   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -71561,11 +71892,11 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's system identifier. Stay in the <a href=#doctype-system-identifier-(single-quoted)-state>DOCTYPE system
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's system identifier. Stay in the <a href=#doctype-system-identifier-(single-quoted)-state>DOCTYPE system
    identifier (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=after-doctype-system-identifier-state><span class=secno>11.2.4.39 </span><dfn>After DOCTYPE system identifier state</dfn></h5>
+  </dl><h5 id=after-doctype-system-identifier-state><span class=secno>11.2.4.57 </span><dfn>After DOCTYPE system identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71574,7 +71905,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Stay in the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -71590,7 +71922,7 @@
    state</a>. (This does <em>not</em> set the DOCTYPE token's
    <i>force-quirks flag</i> to <i>on</i>.)</dd>
 
-  </dl><h5 id=bogus-doctype-state><span class=secno>11.2.4.40 </span><dfn>Bogus DOCTYPE state</dfn></h5>
+  </dl><h5 id=bogus-doctype-state><span class=secno>11.2.4.58 </span><dfn>Bogus DOCTYPE state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -71605,11 +71937,8 @@
    <dt>Anything else</dt>
    <dd>Stay in the <a href=#bogus-doctype-state>bogus DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=cdata-section-state><span class=secno>11.2.4.41 </span><dfn>CDATA section state</dfn></h5>
+  </dl><h5 id=cdata-section-state><span class=secno>11.2.4.59 </span><dfn>CDATA section state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to the next occurrence of the three
   character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
   BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or the
@@ -71625,7 +71954,7 @@
 
 
 
-  <h5 id=tokenizing-character-references><span class=secno>11.2.4.42 </span>Tokenizing character references</h5>
+  <h5 id=tokenizing-character-references><span class=secno>11.2.4.60 </span>Tokenizing character references</h5>
 
   <p>This section defines how to <dfn id=consume-a-character-reference>consume a character
   reference</dfn>. This definition is used when parsing character
@@ -72072,11 +72401,10 @@
   <ol><li><p><a href=#insert-an-html-element>Insert an HTML element</a> for the token.</li>
 
    <li><p>If the algorithm that was invoked is the <a href=#generic-raw-text-element-parsing-algorithm>generic raw
-   text element parsing algorithm</a>, switch the tokenizer's
-   <a href=#content-model-flag>content model flag</a> to the RAWTEXT state; otherwise the
-   algorithm invoked was the <a href=#generic-rcdata-element-parsing-algorithm>generic RCDATA element parsing
-   algorithm</a>, switch the tokenizer's <a href=#content-model-flag>content model
-   flag</a> to the RCDATA state.</li>
+   text element parsing algorithm</a>, switch the tokenizer to the
+   <a href=#rawtext-state>RAWTEXT state</a>; otherwise the algorithm invoked
+   was the <a href=#generic-rcdata-element-parsing-algorithm>generic RCDATA element parsing algorithm</a>,
+   switch the tokenizer to the <a href=#rcdata-state>RCDATA state</a>.</li>
 
    <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the current
    <a href=#insertion-mode>insertion mode</a>.</p>
@@ -72590,8 +72918,8 @@
      and push it onto the <a href=#stack-of-open-elements>stack of open
      elements</a>.</li>
 
-     <li><p>Switch the tokenizer's <a href=#content-model-flag>content model flag</a> to
-     the RAWTEXT state.</li>
+     <li><p>Switch the tokenizer to the <a href=#script-data-state>script data
+     state</a>.</li>
 
      <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the current
      <a href=#insertion-mode>insertion mode</a>.</p>
@@ -73130,14 +73458,12 @@
 
     <p><a href=#insert-an-html-element>Insert an HTML element</a> for the token.</p>
 
-    <p>Switch the <a href=#content-model-flag>content model flag</a> to the PLAINTEXT
-    state.</p>
+    <p>Switch the tokenizer to the <a href=#plaintext-state>PLAINTEXT state</a>.</p>
 
-    <p class=note>Once a start tag with the tag name "plaintext"
-    has been seen, that will be the last token ever seen other
-    than character tokens (and the end-of-file token), because
-    there is no way to switch the <a href=#content-model-flag>content model flag</a>
-    out of the PLAINTEXT state.</p>
+    <p class=note>Once a start tag with the tag name "plaintext" has
+    been seen, that will be the last token ever seen other than
+    character tokens (and the end-of-file token), because there is no
+    way to switch out of the <a href=#plaintext-state>PLAINTEXT state</a>.</p>
 
    </dd>
 
@@ -73733,8 +74059,8 @@
      one. (Newlines at the start of <code><a href=#the-textarea-element>textarea</a></code> elements are
      ignored as an authoring convenience.)</li>
 
-     <li><p>Switch the tokenizer's <a href=#content-model-flag>content model flag</a> to
-     the RCDATA state.</li>
+     <li><p>Switch the tokenizer to the <a href=#rcdata-state>RCDATA
+     state</a>.</li>
 
      <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the
      current <a href=#insertion-mode>insertion mode</a>.</p>
@@ -76096,42 +76422,38 @@
 
     <ol><li>
 
-      <p>Set the <a href=#html-parser>HTML parser</a>'s <a href=#tokenization>tokenization</a>
-      stage's <a href=#content-model-flag>content model flag</a> according to the <var title="">context</var> element, as follows:</p>
+      <p>Set the state of the <a href=#html-parser>HTML parser</a>'s
+      <a href=#tokenization>tokenization</a> stage as follows:</p>
 
       <dl class=switch><dt>If it is a <code><a href=#the-title-element-0>title</a></code> or <code><a href=#the-textarea-element>textarea</a></code>
        element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       the RCDATA state.</dd>
+       <dd>Switch the tokenizer to the <a href=#rcdata-state>RCDATA state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#the-style-element>style</a></code>, <code><a href=#script>script</a></code>,
        <code><a href=#xmp>xmp</a></code>, <code><a href=#the-iframe-element>iframe</a></code>, <code><a href=#noembed>noembed</a></code>, or
        <code><a href=#noframes>noframes</a></code> element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       the RAWTEXT state.</dd>
+       <dd>Switch the tokenizer to the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#the-noscript-element>noscript</a></code> element</dt>
 
-       <dd>If the <a href=#scripting-flag>scripting flag</a> is enabled, set the
-       <a href=#content-model-flag>content model flag</a> to the RAWTEXT
-       state. Otherwise, set the <a href=#content-model-flag>content model flag</a> to the
-       PCDATA state.</dd>
+       <dd>If the <a href=#scripting-flag>scripting flag</a> is enabled, switch the
+       tokenizer to the <a href=#rawtext-state>RAWTEXT state</a>.  Otherwise,
+       leave the tokenizer in the <a href=#data-state>data state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#plaintext>plaintext</a></code> element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       PLAINTEXT.</dd>
+       <dd>Switch the tokenizer to the <a href=#plaintext-state>PLAINTEXT
+       state</a>.</dd>
 
 
        <dt>Otherwise</dt>
 
-       <dd>Leave the <a href=#content-model-flag>content model flag</a> in the PCDATA
-       state.</dd>
+       <dd>Leave the tokenizer in the <a href=#data-state>data state</a>.</dd>
 
       </dl></li>
 

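Note on the last hunk above: with the content model flag gone, the fragment case now picks an initial tokenizer state directly from the context element. The following is only a rough sketch of that selection as written in this revision, not code from the spec or from any implementation. The names TokenizerState and initial_tokenizer_state are hypothetical, and the element name is assumed to be passed as an already-lowercased local name.

```python
from enum import Enum


class TokenizerState(Enum):
    """Hypothetical labels for the four states named in the hunk."""
    DATA = "data state"
    RCDATA = "RCDATA state"
    RAWTEXT = "RAWTEXT state"
    PLAINTEXT = "PLAINTEXT state"


# Element groups copied from the <dt> entries in the hunk.
RCDATA_ELEMENTS = frozenset({"title", "textarea"})
RAWTEXT_ELEMENTS = frozenset(
    {"style", "script", "xmp", "iframe", "noembed", "noframes"}
)


def initial_tokenizer_state(context_element: str,
                            scripting_enabled: bool) -> TokenizerState:
    """Return the tokenizer state to start in for a given context element.

    Assumes context_element is a lowercase local name (an assumption of
    this sketch, not something the diff states).
    """
    if context_element in RCDATA_ELEMENTS:
        return TokenizerState.RCDATA
    if context_element in RAWTEXT_ELEMENTS:
        return TokenizerState.RAWTEXT
    if context_element == "noscript":
        # RAWTEXT only when the scripting flag is enabled; otherwise
        # the tokenizer is left in the data state.
        if scripting_enabled:
            return TokenizerState.RAWTEXT
        return TokenizerState.DATA
    if context_element == "plaintext":
        return TokenizerState.PLAINTEXT
    # "Otherwise": leave the tokenizer in the data state.
    return TokenizerState.DATA
```

In this sketch "script" maps to RAWTEXT because that is what the fragment-case hunk says at r4177, even though the tree construction hunk switches to the script data state when a script start tag is seen.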
Modified: index
===================================================================
--- index	2009-10-19 05:52:18 UTC (rev 4176)
+++ index	2009-10-19 11:00:31 UTC (rev 4177)
@@ -881,47 +881,65 @@
      <li><a href=#tokenization><span class=secno>9.2.4 </span>Tokenization</a>
       <ol>
        <li><a href=#data-state><span class=secno>9.2.4.1 </span>Data state</a></li>
-       <li><a href=#character-reference-in-data-state><span class=secno>9.2.4.2 </span>Character reference in data state</a></li>
-       <li><a href=#tag-open-state><span class=secno>9.2.4.3 </span>Tag open state</a></li>
-       <li><a href=#close-tag-open-state><span class=secno>9.2.4.4 </span>Close tag open state</a></li>
-       <li><a href=#tag-name-state><span class=secno>9.2.4.5 </span>Tag name state</a></li>
-       <li><a href=#before-attribute-name-state><span class=secno>9.2.4.6 </span>Before attribute name state</a></li>
-       <li><a href=#attribute-name-state><span class=secno>9.2.4.7 </span>Attribute name state</a></li>
-       <li><a href=#after-attribute-name-state><span class=secno>9.2.4.8 </span>After attribute name state</a></li>
-       <li><a href=#before-attribute-value-state><span class=secno>9.2.4.9 </span>Before attribute value state</a></li>
-       <li><a href=#attribute-value-(double-quoted)-state><span class=secno>9.2.4.10 </span>Attribute value (double-quoted) state</a></li>
-       <li><a href=#attribute-value-(single-quoted)-state><span class=secno>9.2.4.11 </span>Attribute value (single-quoted) state</a></li>
-       <li><a href=#attribute-value-(unquoted)-state><span class=secno>9.2.4.12 </span>Attribute value (unquoted) state</a></li>
-       <li><a href=#character-reference-in-attribute-value-state><span class=secno>9.2.4.13 </span>Character reference in attribute value state</a></li>
-       <li><a href=#after-attribute-value-(quoted)-state><span class=secno>9.2.4.14 </span>After attribute value (quoted) state</a></li>
-       <li><a href=#self-closing-start-tag-state><span class=secno>9.2.4.15 </span>Self-closing start tag state</a></li>
-       <li><a href=#bogus-comment-state><span class=secno>9.2.4.16 </span>Bogus comment state</a></li>
-       <li><a href=#markup-declaration-open-state><span class=secno>9.2.4.17 </span>Markup declaration open state</a></li>
-       <li><a href=#comment-start-state><span class=secno>9.2.4.18 </span>Comment start state</a></li>
-       <li><a href=#comment-start-dash-state><span class=secno>9.2.4.19 </span>Comment start dash state</a></li>
-       <li><a href=#comment-state><span class=secno>9.2.4.20 </span>Comment state</a></li>
-       <li><a href=#comment-end-dash-state><span class=secno>9.2.4.21 </span>Comment end dash state</a></li>
-       <li><a href=#comment-end-state><span class=secno>9.2.4.22 </span>Comment end state</a></li>
-       <li><a href=#comment-end-bang-state><span class=secno>9.2.4.23 </span>Comment end bang state</a></li>
-       <li><a href=#comment-end-space-state><span class=secno>9.2.4.24 </span>Comment end space state</a></li>
-       <li><a href=#doctype-state><span class=secno>9.2.4.25 </span>DOCTYPE state</a></li>
-       <li><a href=#before-doctype-name-state><span class=secno>9.2.4.26 </span>Before DOCTYPE name state</a></li>
-       <li><a href=#doctype-name-state><span class=secno>9.2.4.27 </span>DOCTYPE name state</a></li>
-       <li><a href=#after-doctype-name-state><span class=secno>9.2.4.28 </span>After DOCTYPE name state</a></li>
-       <li><a href=#after-doctype-public-keyword-state><span class=secno>9.2.4.29 </span>After DOCTYPE public keyword state</a></li>
-       <li><a href=#before-doctype-public-identifier-state><span class=secno>9.2.4.30 </span>Before DOCTYPE public identifier state</a></li>
-       <li><a href=#doctype-public-identifier-(double-quoted)-state><span class=secno>9.2.4.31 </span>DOCTYPE public identifier (double-quoted) state</a></li>
-       <li><a href=#doctype-public-identifier-(single-quoted)-state><span class=secno>9.2.4.32 </span>DOCTYPE public identifier (single-quoted) state</a></li>
-       <li><a href=#after-doctype-public-identifier-state><span class=secno>9.2.4.33 </span>After DOCTYPE public identifier state</a></li>
-       <li><a href=#between-doctype-public-and-system-identifiers-state><span class=secno>9.2.4.34 </span>Between DOCTYPE public and system identifiers state</a></li>
-       <li><a href=#after-doctype-system-keyword-state><span class=secno>9.2.4.35 </span>After DOCTYPE system keyword state</a></li>
-       <li><a href=#before-doctype-system-identifier-state><span class=secno>9.2.4.36 </span>Before DOCTYPE system identifier state</a></li>
-       <li><a href=#doctype-system-identifier-(double-quoted)-state><span class=secno>9.2.4.37 </span>DOCTYPE system identifier (double-quoted) state</a></li>
-       <li><a href=#doctype-system-identifier-(single-quoted)-state><span class=secno>9.2.4.38 </span>DOCTYPE system identifier (single-quoted) state</a></li>
-       <li><a href=#after-doctype-system-identifier-state><span class=secno>9.2.4.39 </span>After DOCTYPE system identifier state</a></li>
-       <li><a href=#bogus-doctype-state><span class=secno>9.2.4.40 </span>Bogus DOCTYPE state</a></li>
-       <li><a href=#cdata-section-state><span class=secno>9.2.4.41 </span>CDATA section state</a></li>
-       <li><a href=#tokenizing-character-references><span class=secno>9.2.4.42 </span>Tokenizing character references</a></ol></li>
+       <li><a href=#rcdata-state><span class=secno>9.2.4.2 </span>RCDATA state</a></li>
+       <li><a href=#rawtext-state><span class=secno>9.2.4.3 </span>RAWTEXT state</a></li>
+       <li><a href=#script-data-state><span class=secno>9.2.4.4 </span>Script data state</a></li>
+       <li><a href=#plaintext-state><span class=secno>9.2.4.5 </span>PLAINTEXT state</a></li>
+       <li><a href=#character-reference-in-data-state><span class=secno>9.2.4.6 </span>Character reference in data state</a></li>
+       <li><a href=#tag-open-state><span class=secno>9.2.4.7 </span>Tag open state</a></li>
+       <li><a href=#close-tag-open-state><span class=secno>9.2.4.8 </span>Close tag open state</a></li>
+       <li><a href=#tag-name-state><span class=secno>9.2.4.9 </span>Tag name state</a></li>
+       <li><a href=#rcdata-less-than-sign-state><span class=secno>9.2.4.10 </span>RCDATA less-than sign state</a></li>
+       <li><a href=#rcdata-end-tag-open-state><span class=secno>9.2.4.11 </span>RCDATA end tag open state</a></li>
+       <li><a href=#rcdata-end-tag-name-state><span class=secno>9.2.4.12 </span>RCDATA end tag name state</a></li>
+       <li><a href=#rawtext-less-than-sign-state><span class=secno>9.2.4.13 </span>RAWTEXT less-than sign state</a></li>
+       <li><a href=#rawtext-end-tag-open-state><span class=secno>9.2.4.14 </span>RAWTEXT end tag open state</a></li>
+       <li><a href=#rawtext-end-tag-name-state><span class=secno>9.2.4.15 </span>RAWTEXT end tag name state</a></li>
+       <li><a href=#script-data-less-than-sign-state><span class=secno>9.2.4.16 </span>Script data less-than sign state</a></li>
+       <li><a href=#script-data-end-tag-open-state><span class=secno>9.2.4.17 </span>Script data end tag open state</a></li>
+       <li><a href=#script-data-end-tag-name-state><span class=secno>9.2.4.18 </span>Script data end tag name state</a></li>
+       <li><a href=#script-data-escape-start-state><span class=secno>9.2.4.19 </span>Script data escape start state</a></li>
+       <li><a href=#script-data-escape-start-dash-state><span class=secno>9.2.4.20 </span>Script data escape start dash state</a></li>
+       <li><a href=#script-data-escaped-state><span class=secno>9.2.4.21 </span>Script data escaped state</a></li>
+       <li><a href=#script-data-escaped-dash-state><span class=secno>9.2.4.22 </span>Script data escaped dash state</a></li>
+       <li><a href=#script-data-escaped-dash-dash-state><span class=secno>9.2.4.23 </span>Script data escaped dash dash state</a></li>
+       <li><a href=#before-attribute-name-state><span class=secno>9.2.4.24 </span>Before attribute name state</a></li>
+       <li><a href=#attribute-name-state><span class=secno>9.2.4.25 </span>Attribute name state</a></li>
+       <li><a href=#after-attribute-name-state><span class=secno>9.2.4.26 </span>After attribute name state</a></li>
+       <li><a href=#before-attribute-value-state><span class=secno>9.2.4.27 </span>Before attribute value state</a></li>
+       <li><a href=#attribute-value-(double-quoted)-state><span class=secno>9.2.4.28 </span>Attribute value (double-quoted) state</a></li>
+       <li><a href=#attribute-value-(single-quoted)-state><span class=secno>9.2.4.29 </span>Attribute value (single-quoted) state</a></li>
+       <li><a href=#attribute-value-(unquoted)-state><span class=secno>9.2.4.30 </span>Attribute value (unquoted) state</a></li>
+       <li><a href=#character-reference-in-attribute-value-state><span class=secno>9.2.4.31 </span>Character reference in attribute value state</a></li>
+       <li><a href=#after-attribute-value-(quoted)-state><span class=secno>9.2.4.32 </span>After attribute value (quoted) state</a></li>
+       <li><a href=#self-closing-start-tag-state><span class=secno>9.2.4.33 </span>Self-closing start tag state</a></li>
+       <li><a href=#bogus-comment-state><span class=secno>9.2.4.34 </span>Bogus comment state</a></li>
+       <li><a href=#markup-declaration-open-state><span class=secno>9.2.4.35 </span>Markup declaration open state</a></li>
+       <li><a href=#comment-start-state><span class=secno>9.2.4.36 </span>Comment start state</a></li>
+       <li><a href=#comment-start-dash-state><span class=secno>9.2.4.37 </span>Comment start dash state</a></li>
+       <li><a href=#comment-state><span class=secno>9.2.4.38 </span>Comment state</a></li>
+       <li><a href=#comment-end-dash-state><span class=secno>9.2.4.39 </span>Comment end dash state</a></li>
+       <li><a href=#comment-end-state><span class=secno>9.2.4.40 </span>Comment end state</a></li>
+       <li><a href=#comment-end-bang-state><span class=secno>9.2.4.41 </span>Comment end bang state</a></li>
+       <li><a href=#comment-end-space-state><span class=secno>9.2.4.42 </span>Comment end space state</a></li>
+       <li><a href=#doctype-state><span class=secno>9.2.4.43 </span>DOCTYPE state</a></li>
+       <li><a href=#before-doctype-name-state><span class=secno>9.2.4.44 </span>Before DOCTYPE name state</a></li>
+       <li><a href=#doctype-name-state><span class=secno>9.2.4.45 </span>DOCTYPE name state</a></li>
+       <li><a href=#after-doctype-name-state><span class=secno>9.2.4.46 </span>After DOCTYPE name state</a></li>
+       <li><a href=#after-doctype-public-keyword-state><span class=secno>9.2.4.47 </span>After DOCTYPE public keyword state</a></li>
+       <li><a href=#before-doctype-public-identifier-state><span class=secno>9.2.4.48 </span>Before DOCTYPE public identifier state</a></li>
+       <li><a href=#doctype-public-identifier-(double-quoted)-state><span class=secno>9.2.4.49 </span>DOCTYPE public identifier (double-quoted) state</a></li>
+       <li><a href=#doctype-public-identifier-(single-quoted)-state><span class=secno>9.2.4.50 </span>DOCTYPE public identifier (single-quoted) state</a></li>
+       <li><a href=#after-doctype-public-identifier-state><span class=secno>9.2.4.51 </span>After DOCTYPE public identifier state</a></li>
+       <li><a href=#between-doctype-public-and-system-identifiers-state><span class=secno>9.2.4.52 </span>Between DOCTYPE public and system identifiers state</a></li>
+       <li><a href=#after-doctype-system-keyword-state><span class=secno>9.2.4.53 </span>After DOCTYPE system keyword state</a></li>
+       <li><a href=#before-doctype-system-identifier-state><span class=secno>9.2.4.54 </span>Before DOCTYPE system identifier state</a></li>
+       <li><a href=#doctype-system-identifier-(double-quoted)-state><span class=secno>9.2.4.55 </span>DOCTYPE system identifier (double-quoted) state</a></li>
+       <li><a href=#doctype-system-identifier-(single-quoted)-state><span class=secno>9.2.4.56 </span>DOCTYPE system identifier (single-quoted) state</a></li>
+       <li><a href=#after-doctype-system-identifier-state><span class=secno>9.2.4.57 </span>After DOCTYPE system identifier state</a></li>
+       <li><a href=#bogus-doctype-state><span class=secno>9.2.4.58 </span>Bogus DOCTYPE state</a></li>
+       <li><a href=#cdata-section-state><span class=secno>9.2.4.59 </span>CDATA section state</a></li>
+       <li><a href=#tokenizing-character-references><span class=secno>9.2.4.60 </span>Tokenizing character references</a></ol></li>
      <li><a href=#tree-construction><span class=secno>9.2.5 </span>Tree construction</a>
       <ol>
        <li><a href=#creating-and-inserting-elements><span class=secno>9.2.5.1 </span>Creating and inserting elements</a></li>
@@ -9614,9 +9632,9 @@
     <p>If <var title="">type</var> is <em>not</em> now an <a href=#ascii-case-insensitive>ASCII
     case-insensitive</a> match for the string
     "<code><a href=#text/html>text/html</a></code>", then act as if the tokenizer had emitted
-    a start tag token with the tag name "pre", then set the <a href=#html-parser>HTML
-    parser</a>'s <a href=#tokenization>tokenization</a> stage's <a href=#content-model-flag>content
-    model flag</a> to <i title="">PLAINTEXT</i>.</p>
+    a start tag token with the tag name "pre", then switch the
+    <a href=#html-parser>HTML parser</a>'s tokenizer to the <a href=#plaintext-state>PLAINTEXT
+    state</a>.</p>
 
     <!--
  http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E...%3Ciframe%3E%3C%2Fiframe%3E%3Cscript%3Eonload%20%3D%20function%20()%20%7B%20%0D%0A%20%20var%20d%20%3D%20document.getElementsByTagName('iframe')%5B0%5D.contentDocument%3B%0D%0A%20%20d.open('image%2Fsvg%2Bxml')%3B%0D%0A%20%20d.write(%22%3Cinput%20xmlns%3D'http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml'%20value%3D'(x)html'%2F%3E%22)%3B%0D%0A%20%20d.close()%3B%0D%0A%7D%3B%3C%2Fscript%3E
@@ -52917,9 +52935,9 @@
   context</a>, the user agent should <a href=#create-a-document-object>create a
   <code>Document</code> object</a>, mark it as being an <a href=#html-documents title="HTML documents">HTML document</a>, create an <a href=#html-parser>HTML
   parser</a>, associate it with the document, act as if the
-  tokenizer had emitted a start tag token with the tag name "pre", set
-  the <a href=#tokenization>tokenization</a> stage's <a href=#content-model-flag>content model
-  flag</a> to <i title="">PLAINTEXT</i>, and begin to pass the stream of
+  tokenizer had emitted a start tag token with the tag name "pre",
+  switch the <a href=#html-parser>HTML parser</a>'s tokenizer to the
+  <a href=#plaintext-state>PLAINTEXT state</a>, and begin to pass the stream of
   characters in the plain text document to that tokenizer.</p>
 
   <p>The rules for how to convert the bytes of the plain text document
@@ -61420,16 +61438,13 @@
   switches it to a new state (to consume the next character), or
   repeats the same state (to consume the next character). Some states
   have more complicated behavior and can consume several characters
-  before switching to another state.</p>
+  before switching to another state. In some cases, the tokenizer
+  state is also changed by the tree construction stage.</p>
 
-  <p>The exact behavior of certain states depends on a <dfn id=content-model-flag>content
-  model flag</dfn> that is set after certain tokens are emitted. The
-  flag has several states: <i title="">PCDATA</i>, <i title="">RCDATA</i>, <i title="">RAWTEXT</i>, and <i title="">PLAINTEXT</i>. Initially, it must be in the PCDATA
-  state. In the RCDATA and RAWTEXT states, a further <dfn id=escape-flag>escape
-  flag</dfn> is used to control the behavior of the tokenizer. It is
-  either true or false, and initially must be set to the false
-  state. The <a href=#insertion-mode>insertion mode</a> and the <a href=#stack-of-open-elements>stack of open
-  elements</a> also affects tokenization.</p>
+  <p>The exact behavior of certain states depends on the
+  <a href=#insertion-mode>insertion mode</a> and the <a href=#stack-of-open-elements>stack of open
+  elements</a>. Certain states also use a <dfn id=temporary-buffer><var>temporary
+  buffer</var></dfn> to track progress.</p>
 
   <p>The output of the tokenization step is a series of zero or more
   of the following tokens: DOCTYPE, start tag, end tag, comment,
@@ -61448,8 +61463,8 @@
 
   <p>When a token is emitted, it must immediately be handled by the
   <a href=#tree-construction>tree construction</a> stage. The tree construction stage
-  can affect the state of the <a href=#content-model-flag>content model flag</a>, and can
-  insert additional characters into the stream. (For example, the
+  can affect the state of the tokenization stage, and can insert
+  additional characters into the stream. (For example, the
   <code><a href=#script>script</a></code> element can result in scripts executing and
   using the <a href=#dynamic-markup-insertion>dynamic markup insertion</a> APIs to insert
   characters into the stream being tokenized.)</p>
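
Reader's note on the hunk above: with the content model flag gone, the tree construction stage now changes tokenizer behaviour by switching the tokenizer to a named state directly. The Python sketch below is illustrative only. The names State, Tokenizer, SWITCH_ON_START_TAG and handle_start_tag are hypothetical, and the tag-to-state table is an assumption for demonstration (this diff does not define which elements trigger which state).

```python
from enum import Enum


class State(Enum):
    """The five text-consuming states listed in the new table of contents."""
    DATA = "data state"
    RCDATA = "RCDATA state"
    RAWTEXT = "RAWTEXT state"
    SCRIPT_DATA = "script data state"
    PLAINTEXT = "PLAINTEXT state"


# Assumed, illustrative mapping from start tag names to the state the
# tree construction stage would switch the tokenizer to. Not taken from
# this diff.
SWITCH_ON_START_TAG = {
    "title": State.RCDATA,
    "textarea": State.RCDATA,
    "style": State.RAWTEXT,
    "script": State.SCRIPT_DATA,
    "plaintext": State.PLAINTEXT,
}


class Tokenizer:
    """Holds only the current state; character handling is omitted."""

    def __init__(self):
        self.state = State.DATA  # the tokenizer starts in the data state

    def switch_to(self, state):
        self.state = state


def handle_start_tag(tokenizer, tag_name):
    """Stand-in for the tree construction stage reacting to a start tag.

    Instead of setting a flag that every state must consult, it moves
    the tokenizer to a dedicated state.
    """
    new_state = SWITCH_ON_START_TAG.get(tag_name)
    if new_state is not None:
        tokenizer.switch_to(new_state)
    return tokenizer.state
```

Under this arrangement each state's rules can be read on their own, which is why the rewritten data state no longer needs "when the flag is set to ..." conditions.
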
@@ -61459,15 +61474,18 @@
   self-closing flag">acknowledged</dfn> when it is processed by the
   tree construction stage, that is a <a href=#parse-error>parse error</a>.</p>
 
-  <p>When an end tag token is emitted, the <a href=#content-model-flag>content model
-  flag</a> must be switched to the PCDATA state.</p>
-
   <p>When an end tag token is emitted with attributes, that is a
   <a href=#parse-error>parse error</a>.</p>
 
   <p>When an end tag token is emitted with its <i>self-closing
   flag</i> set, that is a <a href=#parse-error>parse error</a>.</p>
 
+  <p>An <dfn id=appropriate-end-tag-token>appropriate end tag token</dfn> is an end tag token whose
+  tag name matches the tag name of the last start tag to have been
+  emitted from this tokenizer, if any. If no start tag has been
+  emitted from this tokenizer, then no end tag token is
+  appropriate.</p>
+
   <p>Before each step of the tokenizer, the user agent must first
   check the <a href=#parser-pause-flag>parser pause flag</a>. If it is true, then the
   tokenizer must abort the processing of any nested invocations of the
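
Reader's note on the hunk above: the "appropriate end tag token" definition reduces to remembering the tag name of the last start tag emitted and comparing it with the end tag's name. A minimal Python sketch, assuming tag names have already been lowercased by the tokenizer; the class and method names are hypothetical.

```python
from typing import Optional


class EndTagMatcher:
    """Tracks the last start tag emitted by one tokenizer instance."""

    def __init__(self) -> None:
        # None means no start tag has been emitted by this tokenizer yet.
        self.last_start_tag_name: Optional[str] = None

    def note_start_tag_emitted(self, tag_name: str) -> None:
        self.last_start_tag_name = tag_name

    def is_appropriate_end_tag(self, tag_name: str) -> bool:
        # If no start tag has been emitted, no end tag token is appropriate.
        if self.last_start_tag_name is None:
            return False
        return tag_name == self.last_start_tag_name
```

For example, after a "textarea" start tag has been emitted, an end tag named "textarea" is appropriate and one named "b" is not, so the latter would fall through to the "anything else" entry in the RCDATA end tag name state.
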
@@ -61476,187 +61494,152 @@
   <p>The tokenizer state machine consists of the states defined in the
   following subsections.</p>
 
+
   <!-- Order of the lists below is supposed to be non-error then
   error, by unicode, then EOF, ending with "anything else" -->
 
+
   <h5 id=data-state><span class=secno>9.2.4.1 </span><dfn>Data state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0026 AMPERSAND (&)</dt>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to one of the
-   PCDATA or RCDATA states and the <a href=#escape-flag>escape flag</a> is
-   false: switch to the <a href=#character-reference-in-data-state>character reference in data
+   <dd>Switch to the <a href=#character-reference-in-data-state>character reference in data
    state</a>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
 
-   <dt>U+002D HYPHEN-MINUS (-)</dt>
-   <dd>
+   <dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#tag-open-state>tag open state</a>.</dd>
 
-    <p>If the <a href=#content-model-flag>content model flag</a> is set to either the
-    RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape flag</a>
-    is false, and there are at least three characters before this
-    one in the input stream, and the last four characters in the
-    input stream, including this one, are U+003C LESS-THAN SIGN,
-    U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D
-    HYPHEN-MINUS ("<!--"), then set the <a href=#escape-flag>escape flag</a>
-    to true.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <a href=#data-state>data state</a>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#data-state>data state</a>.</dd>
 
-   </dd>
+  </dl><h5 id=rcdata-state><span class=secno>9.2.4.2 </span><dfn>RCDATA state</dfn></h5>
 
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0026 AMPERSAND (&)</dt>
+   <dd>Switch to the <a href=#character-reference-in-data-state>character reference in data
+   state</a>.</dd>
+
    <dt>U+003C LESS-THAN SIGN (<)</dt>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to the PCDATA
-   state: switch to the <a href=#tag-open-state>tag open state</a>.</dd>
-   <dd>When the <a href=#content-model-flag>content model flag</a> is set to either the
-   RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape flag</a>
-   is false: switch to the <a href=#tag-open-state>tag open state</a>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
+   <dd>Switch to the <a href=#rcdata-less-than-sign-state>RCDATA less-than sign state</a>.</dd>
 
-   <dt>U+003E GREATER-THAN SIGN (>)</dt>
-   <dd>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>If the <a href=#content-model-flag>content model flag</a> is set to either the
-    RCDATA state or the RAWTEXT state, and the <a href=#escape-flag>escape
-    flag</a> is true, and the last three characters in the input
-    stream including this one are U+002D HYPHEN-MINUS, U+002D
-    HYPHEN-MINUS, U+003E GREATER-THAN SIGN ("-->"), set the
-    <a href=#escape-flag>escape flag</a> to false.</p> <!-- no need to check
-    that there are enough characters, since you can only run into
-    this if the flag is true in the first place, which requires four
-    characters. -->
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#rcdata-state>RCDATA state</a>.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <a href=#data-state>data state</a>.</p>
+  </dl><h5 id=rawtext-state><span class=secno>9.2.4.3 </span><dfn>RAWTEXT state</dfn></h5>
 
-   </dd>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
+  <dl class=switch><dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#rawtext-less-than-sign-state>RAWTEXT less-than sign state</a>.</dd>
+
    <dt>EOF</dt>
    <dd>Emit an end-of-file token.</dd>
 
    <dt>Anything else</dt>
-   <dd>Emit the input character as a character token. Stay in the
-   <a href=#data-state>data state</a>.</dd>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
 
-  </dl><h5 id=character-reference-in-data-state><span class=secno>9.2.4.2 </span><dfn>Character reference in data state</dfn></h5>
+  </dl><h5 id=script-data-state><span class=secno>9.2.4.4 </span><dfn>Script data state</dfn></h5>
 
-  <p><i>(This cannot happen if the <a href=#content-model-flag>content model flag</a>
-  is set to the RAWTEXT state.)</i></p>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>, with no
-  <a href=#additional-allowed-character>additional allowed character</a>.</p>
+  <dl class=switch><dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <a href=#script-data-less-than-sign-state>script data less-than sign state</a>.</dd>
 
-  <p>If nothing is returned, emit a U+0026 AMPERSAND character
-  token.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-  <p>Otherwise, emit the character token that was returned.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#script-data-state>script data state</a>.</dd>
 
-  <p>Finally, switch to the <a href=#data-state>data state</a>.</p>
+  </dl><h5 id=plaintext-state><span class=secno>9.2.4.5 </span><dfn>PLAINTEXT state</dfn></h5>
 
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <h5 id=tag-open-state><span class=secno>9.2.4.3 </span><dfn>Tag open state</dfn></h5>
+  <dl class=switch><dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-  <p>The behavior of this state depends on the <a href=#content-model-flag>content model
-  flag</a>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <a href=#current-input-character>current input character</a> as a character
+   token. Stay in the <a href=#plaintext-state>PLAINTEXT state</a>.</dd>
 
-  <dl><dt>If the <a href=#content-model-flag>content model flag</a> is set to the RCDATA
-   or RAWTEXT states</dt>
+  </dl><h5 id=character-reference-in-data-state><span class=secno>9.2.4.6 </span><dfn>Character reference in data state</dfn></h5>
 
-   <dd>
+  <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>, with no
+  <a href=#additional-allowed-character>additional allowed character</a>.</p>
 
-    <p>Consume the <a href=#next-input-character>next input character</a>. If it is a
-    U+002F SOLIDUS character (/), switch to the <a href=#close-tag-open-state>close tag open
-    state</a>. Otherwise, emit a U+003C LESS-THAN SIGN character
-    token and reconsume the <a href=#current-input-character>current input character</a> in the
-    <a href=#data-state>data state</a>.</p>
+  <p>If nothing is returned, emit a U+0026 AMPERSAND character
+  token.</p>
 
-   </dd>
+  <p>Otherwise, emit the character token that was returned.</p>
 
-   <dt>If the <a href=#content-model-flag>content model flag</a> is set to the PCDATA
-   state</dt>
+  <p>Finally, switch to the <a href=#data-state>data state</a>.</p>
 
-   <dd>
 
-    <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+  <h5 id=tag-open-state><span class=secno>9.2.4.7 </span><dfn>Tag open state</dfn></h5>
 
-    <dl class=switch><dt>U+0021 EXCLAMATION MARK (!)</dt>
-     <dd>Switch to the <a href=#markup-declaration-open-state>markup declaration open state</a>.</dd>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-     <dt>U+002F SOLIDUS (/)</dt>
-     <dd>Switch to the <a href=#close-tag-open-state>close tag open state</a>.</dd>
+  <dl class=switch><dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Switch to the <a href=#markup-declaration-open-state>markup declaration open state</a>.</dd>
 
-     <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the
-     lowercase version of the input character (add 0x0020 to the
-     character's code point), then switch to the <a href=#tag-name-state>tag name
-     state</a>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Switch to the <a href=#close-tag-open-state>close tag open state</a>.</dd>
 
-     <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the input
-     character, then switch to the <a href=#tag-name-state>tag name
-     state</a>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add 0x0020 to the
+   character's code point), then switch to the <a href=#tag-name-state>tag name
+   state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
-     <dt>U+003E GREATER-THAN SIGN (>)</dt>
-     <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
-     character token and a U+003E GREATER-THAN SIGN character
-     token. Switch to the <a href=#data-state>data state</a>.</dd>
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   <a href=#current-input-character>current input character</a>, then switch to the <a href=#tag-name-state>tag
+   name state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
-     <dt>U+003F QUESTION MARK (?)</dt>
-     <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
-     comment state</a>.</dd>
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
+   character token and a U+003E GREATER-THAN SIGN character
+   token. Switch to the <a href=#data-state>data state</a>.</dd>
 
-     <dt>Anything else</dt>
-     <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
-     character token and reconsume the <a href=#current-input-character>current input character</a> in the
-     <a href=#data-state>data state</a>.</dd>
+   <dt>U+003F QUESTION MARK (?)</dt>
+   <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
+   comment state</a>.</dd>
 
-    </dl></dd>
+   <dt>Anything else</dt>
+   <dd><a href=#parse-error>Parse error</a>. Emit a U+003C LESS-THAN SIGN
+   character token and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#data-state>data state</a>.</dd>
 
-  </dl><h5 id=close-tag-open-state><span class=secno>9.2.4.4 </span><dfn>Close tag open state</dfn></h5>
+  </dl><h5 id=close-tag-open-state><span class=secno>9.2.4.8 </span><dfn>Close tag open state</dfn></h5>
 
-  <p>If the <a href=#content-model-flag>content model flag</a> is set to the RCDATA or
-  RAWTEXT states but no start tag token has ever been emitted by this
-  instance of the tokenizer (<a href=#fragment-case>fragment case</a>), or, if the
-  <a href=#content-model-flag>content model flag</a> is set to the RCDATA or RAWTEXT states
-  and the next few characters do not match the tag name of the last
-  start tag token emitted (compared in an <a href=#ascii-case-insensitive>ASCII
-  case-insensitive</a> manner), or if they do but they are not
-  immediately followed by one of the following characters:</p>
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
-  <ul class=brief><li>U+0009 CHARACTER TABULATION</li>
-   <li>U+000A LINE FEED (LF)</li>
-   <li>U+000C FORM FEED (FF)</li>
-   <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
-   <li>U+0020 SPACE</li>
-   <li>U+003E GREATER-THAN SIGN (>)</li>
-   <li>U+002F SOLIDUS (/)</li>
-   <li>EOF</li>
-  </ul><p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F
-  SOLIDUS character token, and switch to the <a href=#data-state>data state</a>
-  to process the <a href=#next-input-character>next input character</a>.</p>
-
-  <p>Otherwise, if the <a href=#content-model-flag>content model flag</a> is set to the
-  PCDATA state, or if the next few characters <em>do</em> match that tag
-  name, consume the <a href=#next-input-character>next input character</a>:</p>
-
   <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new end tag token, set its tag name to the lowercase
-   version of the input character (add 0x0020 to the character's
-   code point), then switch to the <a href=#tag-name-state>tag name
+   version of the <a href=#current-input-character>current input character</a> (add 0x0020 to
+   the character's code point), then switch to the <a href=#tag-name-state>tag name
    state</a>. (Don't emit the token yet; further details will be
    filled in before it is emitted.)</dd>
 
    <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-   <dd>Create a new end tag token, set its tag name to the input
-   character, then switch to the <a href=#tag-name-state>tag name state</a>. (Don't
-   emit the token yet; further details will be filled in before it
-   is emitted.)</dd>
+   <dd>Create a new end tag token, set its tag name to the
+   <a href=#current-input-character>current input character</a>, then switch to the <a href=#tag-name-state>tag
+   name state</a>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#data-state>data
@@ -61671,7 +61654,7 @@
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#bogus-comment-state>bogus
    comment state</a>.</dd>
 
-  </dl><h5 id=tag-name-state><span class=secno>9.2.4.5 </span><dfn>Tag name state</dfn></h5>
+  </dl><h5 id=tag-name-state><span class=secno>9.2.4.9 </span><dfn>Tag name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61690,27 +61673,372 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point) to the current tag
-   token's tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Stay in the <a href=#tag-name-state>tag name
+   state</a>.</dd>
 
    <dt>EOF</dt>
    <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current tag token's
-   tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Stay in the <a href=#tag-name-state>tag name state</a>.</dd>
 
-  </dl><h5 id=before-attribute-name-state><span class=secno>9.2.4.6 </span><dfn>Before attribute name state</dfn></h5>
+  </dl><h5 id=rcdata-less-than-sign-state><span class=secno>9.2.4.10 </span><dfn>RCDATA less-than sign state</dfn></h5>
+  <!-- identical to the RAWTEXT less-than sign state, except s/RAWTEXT/RCDATA/g -->
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#rcdata-end-tag-open-state>RCDATA end tag open state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#rcdata-state>RCDATA
+   state</a>.</dd>
+
+  </dl><h5 id=rcdata-end-tag-open-state><span class=secno>9.2.4.11 </span><dfn>RCDATA end tag open state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag open state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#rcdata-state>RCDATA state</a>.</dd>
+
+  </dl><h5 id=rcdata-end-tag-name-state><span class=secno>9.2.4.12 </span><dfn>RCDATA end tag name state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag name state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
   <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
    <dt>U+000A LINE FEED (LF)</dt>
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rcdata-end-tag-name-state>RCDATA end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#rcdata-state>RCDATA state</a>.</dd>
+
+  </dl><h5 id=rawtext-less-than-sign-state><span class=secno>9.2.4.13 </span><dfn>RAWTEXT less-than sign state</dfn></h5>
+  <!-- identical to the RCDATA less-than sign state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#rawtext-end-tag-open-state>RAWTEXT end tag open state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#rawtext-state>RAWTEXT
+   state</a>.</dd>
+
+  </dl><h5 id=rawtext-end-tag-open-state><span class=secno>9.2.4.14 </span><dfn>RAWTEXT end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag open state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
+
+  </dl><h5 id=rawtext-end-tag-name-state><span class=secno>9.2.4.15 </span><dfn>RAWTEXT end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag name state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#rawtext-end-tag-name-state>RAWTEXT end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
+
+  </dl><h5 id=script-data-less-than-sign-state><span class=secno>9.2.4.16 </span><dfn>Script data less-than sign state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var><a href=#temporary-buffer>temporary buffer</a></var> to the empty string. Switch
+   to the <a href=#script-data-end-tag-open-state>script data end tag open state</a>.</dd>
+
+   <dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and a U+0021
+   EXCLAMATION MARK character token. Switch to the <a href=#script-data-escape-start-state>script data
+   escape start state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <a href=#current-input-character>current input character</a> in the <a href=#script-data-state>script data
+   state</a>.</dd>
+
+  </dl><h5 id=script-data-end-tag-open-state><span class=secno>9.2.4.17 </span><dfn>Script data end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag open state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add
+   0x0020 to the character's code point). Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#script-data-end-tag-name-state>script data end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <a href=#current-input-character>current input character</a>. Append the <a href=#current-input-character>current
+   input character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Finally,
+   switch to the <a href=#script-data-end-tag-name-state>script data end tag name state</a>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <a href=#current-input-character>current input
+   character</a> in the <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-end-tag-name-state><span class=secno>9.2.4.18 </span><dfn>Script data end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag name state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#before-attribute-name-state>before attribute name
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then switch to the <a href=#self-closing-start-tag-state>self-closing start tag
+   state</a>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <a href=#appropriate-end-tag-token>appropriate end tag
+   token</a>, then emit the current tag token and switch to the
+   <a href=#data-state>data state</a>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#script-data-end-tag-name-state>script data end tag name state</a>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   tag token's tag name. Append the <a href=#current-input-character>current input
+   character</a> to the <var><a href=#temporary-buffer>temporary buffer</a></var>. Stay in the
+   <a href=#script-data-end-tag-name-state>script data end tag name state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var><a href=#temporary-buffer>temporary buffer</a></var> (in the order they were added to
+   the buffer), and reconsume the <a href=#current-input-character>current input character</a>
+   in the <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escape-start-state><span class=secno>9.2.4.19 </span><dfn>Script data escape start state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escape-start-dash-state>script data escape start dash state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <a href=#current-input-character>current input character</a> in the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escape-start-dash-state><span class=secno>9.2.4.20 </span><dfn>Script data escape start dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <a href=#current-input-character>current input character</a> in the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-state><span class=secno>9.2.4.21 </span><dfn>Script data escaped state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-state>script data escaped dash state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Stay in
+   the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-dash-state><span class=secno>9.2.4.22 </span><dfn>Script data escaped dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Switch
+   to the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=script-data-escaped-dash-dash-state><span class=secno>9.2.4.23 </span><dfn>Script data escaped dash dash state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Stay in the
+   <a href=#script-data-escaped-dash-dash-state>script data escaped dash dash state</a>.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>Emit a U+003E GREATER-THAN SIGN character token. Switch to the
+   <a href=#script-data-state>script data state</a>.</dd>
+
+   <dt>EOF</dt>
+   <dd><a href=#parse-error>Parse error</a>. Reconsume the EOF character in the
+   <a href=#data-state>data state</a>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the current input character as a character token. Switch
+   to the <a href=#script-data-escaped-state>script data escaped state</a>.</dd>
+
+  </dl><h5 id=before-attribute-name-state><span class=secno>9.2.4.24 </span><dfn>Before attribute name state</dfn></h5>
+
+  <p>Consume the <a href=#next-input-character>next input character</a>:</p>
+
+  <dl class=switch><dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
    <dd>Stay in the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
    <dt>U+002F SOLIDUS (/)</dt>
@@ -61744,7 +62072,7 @@
    the empty string. Switch to the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
-  </dl><h5 id=attribute-name-state><span class=secno>9.2.4.7 </span><dfn>Attribute name state</dfn></h5>
+  </dl><h5 id=attribute-name-state><span class=secno>9.2.4.25 </span><dfn>Attribute name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61766,9 +62094,9 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point) to the current
-   attribute's name. Stay in the <a href=#attribute-name-state>attribute name
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current attribute's name. Stay in the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
@@ -61782,8 +62110,9 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   name. Stay in the <a href=#attribute-name-state>attribute name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's name. Stay in the <a href=#attribute-name-state>attribute name
+   state</a>.</dd>
 
   </dl><p>When the user agent leaves the attribute name state (and before
   emitting the tag token, if appropriate), the complete attribute's
@@ -61794,7 +62123,7 @@
   associated with it (if any).</p>
 
 
-  <h5 id=after-attribute-name-state><span class=secno>9.2.4.8 </span><dfn>After attribute name state</dfn></h5>
+  <h5 id=after-attribute-name-state><span class=secno>9.2.4.26 </span><dfn>After attribute name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61817,10 +62146,10 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the lowercase version of the <a href=#current-input-character>current input character</a>
-   (add 0x0020 to the character's code point), and its value to
-   the empty string. Switch to the <a href=#attribute-name-state>attribute name
-   state</a>.</dd>
+   attribute's name to the lowercase version of the <a href=#current-input-character>current
+   input character</a> (add 0x0020 to the character's code point),
+   and its value to the empty string. Switch to the <a href=#attribute-name-state>attribute
+   name state</a>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
    <dt>U+0027 APOSTROPHE (')</dt>
@@ -61834,11 +62163,11 @@
 
    <dt>Anything else</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the <a href=#current-input-character>current input character</a>, and its value to
-   the empty string. Switch to the <a href=#attribute-name-state>attribute name
+   attribute's name to the <a href=#current-input-character>current input character</a>, and
+   its value to the empty string. Switch to the <a href=#attribute-name-state>attribute name
    state</a>.</dd>
 
-  </dl><h5 id=before-attribute-value-state><span class=secno>9.2.4.9 </span><dfn>Before attribute value state</dfn></h5>
+  </dl><h5 id=before-attribute-value-state><span class=secno>9.2.4.27 </span><dfn>Before attribute value state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61854,7 +62183,7 @@
 
    <dt>U+0026 AMPERSAND (&)</dt>
    <dd>Switch to the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted) state</a>
-   and reconsume this input character.</dd>
+   and reconsume the <a href=#current-input-character>current input character</a>.</dd>
 
    <dt>U+0027 APOSTROPHE (')</dt>
    <dd>Switch to the <a href=#attribute-value-(single-quoted)-state>attribute value (single-quoted) state</a>.</dd>
@@ -61878,7 +62207,7 @@
    attribute's value. Switch to the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
    state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(double-quoted)-state><span class=secno>9.2.4.10 </span><dfn>Attribute value (double-quoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(double-quoted)-state><span class=secno>9.2.4.28 </span><dfn>Attribute value (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61896,11 +62225,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(double-quoted)-state>attribute value (double-quoted)
-   state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(double-quoted)-state>attribute value
+   (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(single-quoted)-state><span class=secno>9.2.4.11 </span><dfn>Attribute value (single-quoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(single-quoted)-state><span class=secno>9.2.4.29 </span><dfn>Attribute value (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61918,11 +62247,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(single-quoted)-state>attribute value (single-quoted)
-   state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(single-quoted)-state>attribute value
+   (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=attribute-value-(unquoted)-state><span class=secno>9.2.4.12 </span><dfn>Attribute value (unquoted) state</dfn></h5>
+  </dl><h5 id=attribute-value-(unquoted)-state><span class=secno>9.2.4.30 </span><dfn>Attribute value (unquoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61955,11 +62284,11 @@
    <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current attribute's
-   value. Stay in the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   attribute's value. Stay in the <a href=#attribute-value-(unquoted)-state>attribute value (unquoted)
    state</a>.</dd>
 
-  </dl><h5 id=character-reference-in-attribute-value-state><span class=secno>9.2.4.13 </span><dfn>Character reference in attribute value state</dfn></h5>
+  </dl><h5 id=character-reference-in-attribute-value-state><span class=secno>9.2.4.31 </span><dfn>Character reference in attribute value state</dfn></h5>
 
   <p>Attempt to <a href=#consume-a-character-reference>consume a character reference</a>.</p>
 
@@ -61973,7 +62302,7 @@
   in when were switched into this state.</p>
 
 
-  <h5 id=after-attribute-value-(quoted)-state><span class=secno>9.2.4.14 </span><dfn>After attribute value (quoted) state</dfn></h5>
+  <h5 id=after-attribute-value-(quoted)-state><span class=secno>9.2.4.32 </span><dfn>After attribute value (quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -61999,7 +62328,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the character in
    the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
-  </dl><h5 id=self-closing-start-tag-state><span class=secno>9.2.4.15 </span><dfn>Self-closing start tag state</dfn></h5>
+  </dl><h5 id=self-closing-start-tag-state><span class=secno>9.2.4.33 </span><dfn>Self-closing start tag state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62016,11 +62345,8 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the character in
    the <a href=#before-attribute-name-state>before attribute name state</a>.</dd>
 
-  </dl><h5 id=bogus-comment-state><span class=secno>9.2.4.16 </span><dfn>Bogus comment state</dfn></h5>
+  </dl><h5 id=bogus-comment-state><span class=secno>9.2.4.34 </span><dfn>Bogus comment state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to and including the first U+003E
   GREATER-THAN SIGN character (>) or the end of the file (EOF),
   whichever comes first. Emit a comment token whose data is the
@@ -62037,11 +62363,8 @@
   character.</p>
 
 
-  <h5 id=markup-declaration-open-state><span class=secno>9.2.4.17 </span><dfn>Markup declaration open state</dfn></h5>
+  <h5 id=markup-declaration-open-state><span class=secno>9.2.4.35 </span><dfn>Markup declaration open state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
   characters, consume those two characters, create a comment token
   whose data is the empty string, and switch to the <a href=#comment-start-state>comment
@@ -62065,7 +62388,7 @@
   comment.</p>
 
 
-  <h5 id=comment-start-state><span class=secno>9.2.4.18 </span><dfn>Comment start state</dfn></h5>
+  <h5 id=comment-start-state><span class=secno>9.2.4.36 </span><dfn>Comment start state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62082,10 +62405,10 @@
    the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's
-   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment
+   token's data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-start-dash-state><span class=secno>9.2.4.19 </span><dfn>Comment start dash state</dfn></h5>
+  </dl><h5 id=comment-start-dash-state><span class=secno>9.2.4.37 </span><dfn>Comment start dash state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62102,11 +62425,11 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <a href=#current-input-character>current input character</a> to the comment token's
+   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-state><span class=secno>9.2.4.20 </span><dfn id=comment>Comment state</dfn></h5>
+  </dl><h5 id=comment-state><span class=secno>9.2.4.38 </span><dfn id=comment>Comment state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62119,10 +62442,10 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Stay
-   in the <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment
+   token's data. Stay in the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-dash-state><span class=secno>9.2.4.21 </span><dfn>Comment end dash state</dfn></h5>
+  </dl><h5 id=comment-end-dash-state><span class=secno>9.2.4.39 </span><dfn>Comment end dash state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62135,11 +62458,11 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <a href=#comment-state>comment state</a>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <a href=#current-input-character>current input character</a> to the comment token's
+   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-state><span class=secno>9.2.4.22 </span><dfn>Comment end state</dfn></h5>
+  </dl><h5 id=comment-end-state><span class=secno>9.2.4.40 </span><dfn>Comment end state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62153,8 +62476,9 @@
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
    <dd><a href=#parse-error>Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <a href=#comment-end-space-state>comment end space state</a>.</dd>
+   characters and the <a href=#current-input-character>current input character</a> to the
+   comment token's data. Switch to the <a href=#comment-end-space-state>comment end space
+   state</a>.</dd>
 
    <dt>U+0021 EXCLAMATION MARK (!)</dt>
    <dd><a href=#parse-error>Parse error</a>. Switch to the <a href=#comment-end-bang-state>comment end bang
@@ -62175,10 +62499,11 @@
 
    <dt>Anything else</dt>
    <dd><a href=#parse-error>Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <a href=#comment-state>comment state</a>.</dd>
+   characters and the <a href=#current-input-character>current input character</a> to the
+   comment token's data. Switch to the <a href=#comment-state>comment
+   state</a>.</dd>
 
-  </dl><h5 id=comment-end-bang-state><span class=secno>9.2.4.23 </span><dfn>Comment end bang state</dfn></h5>
+  </dl><h5 id=comment-end-bang-state><span class=secno>9.2.4.41 </span><dfn>Comment end bang state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62198,11 +62523,11 @@
 
    <dt>Anything else</dt>
    <dd>Append two U+002D HYPHEN-MINUS (-) characters, a U+0021
-   EXCLAMATION MARK character (!), and the input character to the
-   comment token's data. Switch to the <a href=#comment-state>comment
-   state</a>.</dd>
+   EXCLAMATION MARK character (!), and the <a href=#current-input-character>current input
+   character</a> to the comment token's data. Switch to the
+   <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=comment-end-space-state><span class=secno>9.2.4.24 </span><dfn>Comment end space state</dfn></h5>
+  </dl><h5 id=comment-end-space-state><span class=secno>9.2.4.42 </span><dfn>Comment end space state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62211,7 +62536,7 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Append the input character to the comment token's data. Stay in
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment token's data. Stay in
    the <a href=#comment-end-space-state>comment end space state</a>.</dd>
 
    <dt>U+002D HYPHEN-MINUS (-)</dt>
@@ -62227,10 +62552,10 @@
    comment in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Switch
+   <dd>Append the <a href=#current-input-character>current input character</a> to the comment token's data. Switch
    to the <a href=#comment-state>comment state</a>.</dd>
 
-  </dl><h5 id=doctype-state><span class=secno>9.2.4.25 </span><dfn>DOCTYPE state</dfn></h5>
+  </dl><h5 id=doctype-state><span class=secno>9.2.4.43 </span><dfn>DOCTYPE state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62250,7 +62575,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current
    character in the <a href=#before-doctype-name-state>before DOCTYPE name state</a>.</dd>
 
-  </dl><h5 id=before-doctype-name-state><span class=secno>9.2.4.26 </span><dfn>Before DOCTYPE name state</dfn></h5>
+  </dl><h5 id=before-doctype-name-state><span class=secno>9.2.4.44 </span><dfn>Before DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62263,7 +62588,7 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new DOCTYPE token. Set the token's name to the
-   lowercase version of the input character (add 0x0020 to the
+   lowercase version of the <a href=#current-input-character>current input character</a> (add 0x0020 to the
    character's code point). Switch to the <a href=#doctype-name-state>DOCTYPE name
    state</a>.</dd>
 
@@ -62282,7 +62607,7 @@
    <a href=#current-input-character>current input character</a>. Switch to the <a href=#doctype-name-state>DOCTYPE name
    state</a>.</dd>
 
-  </dl><h5 id=doctype-name-state><span class=secno>9.2.4.27 </span><dfn>DOCTYPE name state</dfn></h5>
+  </dl><h5 id=doctype-name-state><span class=secno>9.2.4.45 </span><dfn>DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62298,9 +62623,10 @@
    state</a>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the input character (add 0x0020
-   to the character's code point) to the current DOCTYPE token's
-   name. Stay in the <a href=#doctype-name-state>DOCTYPE name state</a>.</dd>
+   <dd>Append the lowercase version of the <a href=#current-input-character>current input
+   character</a> (add 0x0020 to the character's code point) to the
+   current DOCTYPE token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name
+   state</a>.</dd>
 
    <dt>EOF</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -62308,10 +62634,11 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name state</a>.</dd>
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's name. Stay in the <a href=#doctype-name-state>DOCTYPE name
+   state</a>.</dd>
 
-  </dl><h5 id=after-doctype-name-state><span class=secno>9.2.4.28 </span><dfn>After DOCTYPE name state</dfn></h5>
+  </dl><h5 id=after-doctype-name-state><span class=secno>9.2.4.46 </span><dfn>After DOCTYPE name state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62351,7 +62678,7 @@
 
    </dd>
 
-  </dl><h5 id=after-doctype-public-keyword-state><span class=secno>9.2.4.29 </span><dfn>After DOCTYPE public keyword state</dfn></h5>
+  </dl><h5 id=after-doctype-public-keyword-state><span class=secno>9.2.4.47 </span><dfn>After DOCTYPE public keyword state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62372,7 +62699,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current character in
    the <a href=#before-doctype-public-identifier-state>before DOCTYPE public identifier state</a>.</dd>
 
-  </dl><h5 id=before-doctype-public-identifier-state><span class=secno>9.2.4.30 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>
+  </dl><h5 id=before-doctype-public-identifier-state><span class=secno>9.2.4.48 </span><dfn>Before DOCTYPE public identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62408,7 +62735,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=doctype-public-identifier-(double-quoted)-state><span class=secno>9.2.4.31 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-public-identifier-(double-quoted)-state><span class=secno>9.2.4.49 </span><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62430,7 +62757,7 @@
    token's public identifier. Stay in the <a href=#doctype-public-identifier-(double-quoted)-state>DOCTYPE public
    identifier (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=doctype-public-identifier-(single-quoted)-state><span class=secno>9.2.4.32 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-public-identifier-(single-quoted)-state><span class=secno>9.2.4.50 </span><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62452,7 +62779,7 @@
    token's public identifier. Stay in the <a href=#doctype-public-identifier-(single-quoted)-state>DOCTYPE public
    identifier (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=after-doctype-public-identifier-state><span class=secno>9.2.4.33 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
+  </dl><h5 id=after-doctype-public-identifier-state><span class=secno>9.2.4.51 </span><dfn>After DOCTYPE public identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62461,7 +62788,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Switch to the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers state</a>.</dd>
+   <dd>Switch to the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system
+   identifiers state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -62487,7 +62815,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=between-doctype-public-and-system-identifiers-state><span class=secno>9.2.4.34 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>
+  </dl><h5 id=between-doctype-public-and-system-identifiers-state><span class=secno>9.2.4.52 </span><dfn>Between DOCTYPE public and system identifiers state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62496,7 +62824,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers state</a>.</dd>
+   <dd>Stay in the <a href=#between-doctype-public-and-system-identifiers-state>between DOCTYPE public and system identifiers
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -62522,7 +62851,7 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=after-doctype-system-keyword-state><span class=secno>9.2.4.35 </span><dfn>After DOCTYPE system keyword state</dfn></h5>
+  </dl><h5 id=after-doctype-system-keyword-state><span class=secno>9.2.4.53 </span><dfn>After DOCTYPE system keyword state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62543,7 +62872,7 @@
    <dd><a href=#parse-error>Parse error</a>. Reconsume the current character in
    the <a href=#before-doctype-system-identifier-state>before DOCTYPE system identifier state</a>.</dd>
 
-  </dl><h5 id=before-doctype-system-identifier-state><span class=secno>9.2.4.36 </span><dfn>Before DOCTYPE system identifier state</dfn></h5>
+  </dl><h5 id=before-doctype-system-identifier-state><span class=secno>9.2.4.54 </span><dfn>Before DOCTYPE system identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62579,12 +62908,13 @@
    <i>force-quirks flag</i> to <i>on</i>. Switch to the <a href=#bogus-doctype-state>bogus
    DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=doctype-system-identifier-(double-quoted)-state><span class=secno>9.2.4.37 </span><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-system-identifier-(double-quoted)-state><span class=secno>9.2.4.55 </span><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0022 QUOTATION MARK (")</dt>
-   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -62597,16 +62927,17 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's system identifier. Stay in the <a href=#doctype-system-identifier-(double-quoted)-state>DOCTYPE system
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's system identifier. Stay in the <a href=#doctype-system-identifier-(double-quoted)-state>DOCTYPE system
    identifier (double-quoted) state</a>.</dd>
 
-  </dl><h5 id=doctype-system-identifier-(single-quoted)-state><span class=secno>9.2.4.38 </span><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
+  </dl><h5 id=doctype-system-identifier-(single-quoted)-state><span class=secno>9.2.4.56 </span><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
   <dl class=switch><dt>U+0027 APOSTROPHE (')</dt>
-   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Switch to the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><a href=#parse-error>Parse error</a>. Set the DOCTYPE token's
@@ -62619,11 +62950,11 @@
    Reconsume the EOF character in the <a href=#data-state>data state</a>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <a href=#current-input-character>current input character</a> to the current DOCTYPE
-   token's system identifier. Stay in the <a href=#doctype-system-identifier-(single-quoted)-state>DOCTYPE system
+   <dd>Append the <a href=#current-input-character>current input character</a> to the current
+   DOCTYPE token's system identifier. Stay in the <a href=#doctype-system-identifier-(single-quoted)-state>DOCTYPE system
    identifier (single-quoted) state</a>.</dd>
 
-  </dl><h5 id=after-doctype-system-identifier-state><span class=secno>9.2.4.39 </span><dfn>After DOCTYPE system identifier state</dfn></h5>
+  </dl><h5 id=after-doctype-system-identifier-state><span class=secno>9.2.4.57 </span><dfn>After DOCTYPE system identifier state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62632,7 +62963,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier state</a>.</dd>
+   <dd>Stay in the <a href=#after-doctype-system-identifier-state>after DOCTYPE system identifier
+   state</a>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <a href=#data-state>data
@@ -62648,7 +62980,7 @@
    state</a>. (This does <em>not</em> set the DOCTYPE token's
    <i>force-quirks flag</i> to <i>on</i>.)</dd>
 
-  </dl><h5 id=bogus-doctype-state><span class=secno>9.2.4.40 </span><dfn>Bogus DOCTYPE state</dfn></h5>
+  </dl><h5 id=bogus-doctype-state><span class=secno>9.2.4.58 </span><dfn>Bogus DOCTYPE state</dfn></h5>
 
   <p>Consume the <a href=#next-input-character>next input character</a>:</p>
 
@@ -62663,11 +62995,8 @@
    <dt>Anything else</dt>
    <dd>Stay in the <a href=#bogus-doctype-state>bogus DOCTYPE state</a>.</dd>
 
-  </dl><h5 id=cdata-section-state><span class=secno>9.2.4.41 </span><dfn>CDATA section state</dfn></h5>
+  </dl><h5 id=cdata-section-state><span class=secno>9.2.4.59 </span><dfn>CDATA section state</dfn></h5>
 
-  <p><i>(This can only happen if the <a href=#content-model-flag>content model
-  flag</a> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to the next occurrence of the three
   character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
   BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or the
@@ -62683,7 +63012,7 @@
 
 
 
-  <h5 id=tokenizing-character-references><span class=secno>9.2.4.42 </span>Tokenizing character references</h5>
+  <h5 id=tokenizing-character-references><span class=secno>9.2.4.60 </span>Tokenizing character references</h5>
 
   <p>This section defines how to <dfn id=consume-a-character-reference>consume a character
   reference</dfn>. This definition is used when parsing character
@@ -63130,11 +63459,10 @@
   <ol><li><p><a href=#insert-an-html-element>Insert an HTML element</a> for the token.</li>
 
    <li><p>If the algorithm that was invoked is the <a href=#generic-raw-text-element-parsing-algorithm>generic raw
-   text element parsing algorithm</a>, switch the tokenizer's
-   <a href=#content-model-flag>content model flag</a> to the RAWTEXT state; otherwise the
-   algorithm invoked was the <a href=#generic-rcdata-element-parsing-algorithm>generic RCDATA element parsing
-   algorithm</a>, switch the tokenizer's <a href=#content-model-flag>content model
-   flag</a> to the RCDATA state.</li>
+   text element parsing algorithm</a>, switch the tokenizer to the
+   <a href=#rawtext-state>RAWTEXT state</a>; otherwise the algorithm invoked
+   was the <a href=#generic-rcdata-element-parsing-algorithm>generic RCDATA element parsing algorithm</a>,
+   switch the tokenizer to the <a href=#rcdata-state>RCDATA state</a>.</li>
 
    <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the current
    <a href=#insertion-mode>insertion mode</a>.</p>
@@ -63648,8 +63976,8 @@
      and push it onto the <a href=#stack-of-open-elements>stack of open
      elements</a>.</li>
 
-     <li><p>Switch the tokenizer's <a href=#content-model-flag>content model flag</a> to
-     the RAWTEXT state.</li>
+     <li><p>Switch the tokenizer to the <a href=#script-data-state>script data
+     state</a>.</li>
 
      <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the current
      <a href=#insertion-mode>insertion mode</a>.</p>
@@ -64188,14 +64516,12 @@
 
     <p><a href=#insert-an-html-element>Insert an HTML element</a> for the token.</p>
 
-    <p>Switch the <a href=#content-model-flag>content model flag</a> to the PLAINTEXT
-    state.</p>
+    <p>Switch the tokenizer to the <a href=#plaintext-state>PLAINTEXT state</a>.</p>
 
-    <p class=note>Once a start tag with the tag name "plaintext"
-    has been seen, that will be the last token ever seen other
-    than character tokens (and the end-of-file token), because
-    there is no way to switch the <a href=#content-model-flag>content model flag</a>
-    out of the PLAINTEXT state.</p>
+    <p class=note>Once a start tag with the tag name "plaintext" has
+    been seen, that will be the last token ever seen other than
+    character tokens (and the end-of-file token), because there is no
+    way to switch out of the <a href=#plaintext-state>PLAINTEXT state</a>.</p>
 
    </dd>
 
@@ -64791,8 +65117,8 @@
      one. (Newlines at the start of <code><a href=#the-textarea-element>textarea</a></code> elements are
      ignored as an authoring convenience.)</li>
 
-     <li><p>Switch the tokenizer's <a href=#content-model-flag>content model flag</a> to
-     the RCDATA state.</li>
+     <li><p>Switch the tokenizer to the <a href=#rcdata-state>RCDATA
+     state</a>.</li>
 
      <li><p>Let the <a href=#original-insertion-mode>original insertion mode</a> be the
      current <a href=#insertion-mode>insertion mode</a>.</p>
@@ -67154,42 +67480,38 @@
 
     <ol><li>
 
-      <p>Set the <a href=#html-parser>HTML parser</a>'s <a href=#tokenization>tokenization</a>
-      stage's <a href=#content-model-flag>content model flag</a> according to the <var title="">context</var> element, as follows:</p>
+      <p>Set the state of the <a href=#html-parser>HTML parser</a>'s
+      <a href=#tokenization>tokenization</a> stage as follows:</p>
 
       <dl class=switch><dt>If it is a <code><a href=#the-title-element-0>title</a></code> or <code><a href=#the-textarea-element>textarea</a></code>
        element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       the RCDATA state.</dd>
+       <dd>Switch the tokenizer to the <a href=#rcdata-state>RCDATA state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#the-style-element>style</a></code>, <code><a href=#script>script</a></code>,
        <code><a href=#xmp>xmp</a></code>, <code><a href=#the-iframe-element>iframe</a></code>, <code><a href=#noembed>noembed</a></code>, or
        <code><a href=#noframes>noframes</a></code> element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       the RAWTEXT state.</dd>
+       <dd>Switch the tokenizer to the <a href=#rawtext-state>RAWTEXT state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#the-noscript-element>noscript</a></code> element</dt>
 
-       <dd>If the <a href=#scripting-flag>scripting flag</a> is enabled, set the
-       <a href=#content-model-flag>content model flag</a> to the RAWTEXT
-       state. Otherwise, set the <a href=#content-model-flag>content model flag</a> to the
-       PCDATA state.</dd>
+       <dd>If the <a href=#scripting-flag>scripting flag</a> is enabled, switch the
+       tokenizer to the <a href=#rawtext-state>RAWTEXT state</a>.  Otherwise,
+       leave the tokenizer in the <a href=#data-state>data state</a>.</dd>
 
 
        <dt>If it is a <code><a href=#plaintext>plaintext</a></code> element</dt>
 
-       <dd>Set the <a href=#content-model-flag>content model flag</a> to
-       PLAINTEXT.</dd>
+       <dd>Switch the tokenizer to the <a href=#plaintext-state>PLAINTEXT
+       state</a>.</dd>
 
 
        <dt>Otherwise</dt>
 
-       <dd>Leave the <a href=#content-model-flag>content model flag</a> in the PCDATA
-       state.</dd>
+       <dd>Leave the tokenizer in the <a href=#data-state>data state</a>.</dd>
 
       </dl></li>
 

Modified: source
===================================================================
--- source	2009-10-19 05:52:18 UTC (rev 4176)
+++ source	2009-10-19 11:00:31 UTC (rev 4177)
@@ -9981,9 +9981,9 @@
     <p>If <var title="">type</var> is <em>not</em> now an <span>ASCII
     case-insensitive</span> match for the string
     "<code>text/html</code>", then act as if the tokenizer had emitted
-    a start tag token with the tag name "pre", then set the <span>HTML
-    parser</span>'s <span>tokenization</span> stage's <span>content
-    model flag</span> to <i title="">PLAINTEXT</i>.</p>
+    a start tag token with the tag name "pre", then switch the
+    <span>HTML parser</span>'s tokenizer to the <span>PLAINTEXT
+    state</span>.</p>
 
     <!--
  http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E...%3Ciframe%3E%3C%2Fiframe%3E%3Cscript%3Eonload%20%3D%20function%20()%20%7B%20%0D%0A%20%20var%20d%20%3D%20document.getElementsByTagName('iframe')%5B0%5D.contentDocument%3B%0D%0A%20%20d.open('image%2Fsvg%2Bxml')%3B%0D%0A%20%20d.write(%22%3Cinput%20xmlns%3D'http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml'%20value%3D'(x)html'%2F%3E%22)%3B%0D%0A%20%20d.close()%3B%0D%0A%7D%3B%3C%2Fscript%3E
@@ -62932,9 +62932,9 @@
   <code>Document</code> object</span>, mark it as being an <span
   title="HTML documents">HTML document</span>, create an <span>HTML
   parser</span>, associate it with the document, act as if the
-  tokenizer had emitted a start tag token with the tag name "pre", set
-  the <span>tokenization</span> stage's <span>content model
-  flag</span> to <i title="">PLAINTEXT</i>, and begin to pass the stream of
+  tokenizer had emitted a start tag token with the tag name "pre",
+  switch the <span>HTML parser</span>'s tokenizer to the
+  <span>PLAINTEXT state</span>, and begin to pass the stream of
   characters in the plain text document to that tokenizer.</p>
 
   <p>The rules for how to convert the bytes of the plain text document
@@ -79210,18 +79210,13 @@
   switches it to a new state (to consume the next character), or
   repeats the same state (to consume the next character). Some states
   have more complicated behavior and can consume several characters
-  before switching to another state.</p>
+  before switching to another state. In some cases, the tokenizer
+  state is also changed by the tree construction stage.</p>
 
-  <p>The exact behavior of certain states depends on a <dfn>content
-  model flag</dfn> that is set after certain tokens are emitted. The
-  flag has several states: <i title="">PCDATA</i>, <i
-  title="">RCDATA</i>, <i title="">RAWTEXT</i>, and <i
-  title="">PLAINTEXT</i>. Initially, it must be in the PCDATA
-  state. In the RCDATA and RAWTEXT states, a further <dfn>escape
-  flag</dfn> is used to control the behavior of the tokenizer. It is
-  either true or false, and initially must be set to the false
-  state. The <span>insertion mode</span> and the <span>stack of open
-  elements</span> also affects tokenization.</p>
+  <p>The exact behavior of certain states depends on the
+  <span>insertion mode</span> and the <span>stack of open
+  elements</span>. Certain states also use a <dfn><var>temporary
+  buffer</var></dfn> to track progress.</p>
 
   <p>The output of the tokenization step is a series of zero or more
   of the following tokens: DOCTYPE, start tag, end tag, comment,
@@ -79240,8 +79235,8 @@
 
   <p>When a token is emitted, it must immediately be handled by the
   <span>tree construction</span> stage. The tree construction stage
-  can affect the state of the <span>content model flag</span>, and can
-  insert additional characters into the stream. (For example, the
+  can affect the state of the tokenization stage, and can insert
+  additional characters into the stream. (For example, the
   <code>script</code> element can result in scripts executing and
   using the <span>dynamic markup insertion</span> APIs to insert
   characters into the stream being tokenized.)</p>
@@ -79251,15 +79246,18 @@
   self-closing flag">acknowledged</dfn> when it is processed by the
   tree construction stage, that is a <span>parse error</span>.</p>
 
-  <p>When an end tag token is emitted, the <span>content model
-  flag</span> must be switched to the PCDATA state.</p>
-
   <p>When an end tag token is emitted with attributes, that is a
   <span>parse error</span>.</p>
 
   <p>When an end tag token is emitted with its <i>self-closing
   flag</i> set, that is a <span>parse error</span>.</p>
 
+  <p>An <dfn>appropriate end tag token</dfn> is an end tag token whose
+  tag name matches the tag name of the last start tag to have been
+  emitted from this tokenizer, if any. If no start tag has been
+  emitted from this tokenizer, then no end tag token is
+  appropriate.</p>
+
   <p>Before each step of the tokenizer, the user agent must first
   check the <span>parser pause flag</span>. If it is true, then the
   tokenizer must abort the processing of any nested invocations of the
@@ -79268,9 +79266,11 @@
   <p>The tokenizer state machine consists of the states defined in the
   following subsections.</p>
 
+
   <!-- Order of the lists below is supposed to be non-error then
   error, by unicode, then EOF, ending with "anything else" -->
 
+
   <h5><dfn>Data state</dfn></h5>
 
   <p>Consume the <span>next input character</span>:</p>
@@ -79278,196 +79278,172 @@
   <dl class="switch">
 
    <dt>U+0026 AMPERSAND (&)</dt>
-   <dd>When the <span>content model flag</span> is set to one of the
-   PCDATA or RCDATA states and the <span>escape flag</span> is
-   false: switch to the <span>character reference in data
+   <dd>Switch to the <span>character reference in data
    state</span>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
 
-   <dt>U+002D HYPHEN-MINUS (-)</dt>
-   <dd>
+   <dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <span>tag open state</span>.</dd>
 
-    <p>If the <span>content model flag</span> is set to either the
-    RCDATA state or the RAWTEXT state, and the <span>escape flag</span>
-    is false, and there are at least three characters before this
-    one in the input stream, and the last four characters in the
-    input stream, including this one, are U+003C LESS-THAN SIGN,
-    U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D
-    HYPHEN-MINUS ("<!--"), then set the <span>escape flag</span>
-    to true.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <span>data state</span>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>data state</span>.</dd>
 
-   </dd>
+  </dl>
 
+
+  <h5><dfn>RCDATA state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0026 AMPERSAND (&)</dt>
+   <dd>Switch to the <span>character reference in data
+   state</span>.</dd>
+
    <dt>U+003C LESS-THAN SIGN (<)</dt>
-   <dd>When the <span>content model flag</span> is set to the PCDATA
-   state: switch to the <span>tag open state</span>.</dd>
-   <dd>When the <span>content model flag</span> is set to either the
-   RCDATA state or the RAWTEXT state, and the <span>escape flag</span>
-   is false: switch to the <span>tag open state</span>.</dd>
-   <dd>Otherwise: treat it as per the "anything else" entry
-   below.</dd>
+   <dd>Switch to the <span>RCDATA less-than sign state</span>.</dd>
 
-   <dt>U+003E GREATER-THAN SIGN (>)</dt>
-   <dd>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>If the <span>content model flag</span> is set to either the
-    RCDATA state or the RAWTEXT state, and the <span>escape
-    flag</span> is true, and the last three characters in the input
-    stream including this one are U+002D HYPHEN-MINUS, U+002D
-    HYPHEN-MINUS, U+003E GREATER-THAN SIGN ("-->"), set the
-    <span>escape flag</span> to false.</p> <!-- no need to check
-    that there are enough characters, since you can only run into
-    this if the flag is true in the first place, which requires four
-    characters. -->
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>RCDATA state</span>.</dd>
 
-    <p>In any case, emit the input character as a character
-    token. Stay in the <span>data state</span>.</p>
+  </dl>
 
-   </dd>
 
+  <h5><dfn>RAWTEXT state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <span>RAWTEXT less-than sign state</span>.</dd>
+
    <dt>EOF</dt>
    <dd>Emit an end-of-file token.</dd>
 
    <dt>Anything else</dt>
-   <dd>Emit the input character as a character token. Stay in the
-   <span>data state</span>.</dd>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>RAWTEXT state</span>.</dd>
 
   </dl>
 
 
-  <h5><dfn>Character reference in data state</dfn></h5>
+  <h5><dfn>Script data state</dfn></h5>
 
-  <p><i>(This cannot happen if the <span>content model flag</span>
-  is set to the RAWTEXT state.)</i></p>
+  <p>Consume the <span>next input character</span>:</p>
 
-  <p>Attempt to <span>consume a character reference</span>, with no
-  <span>additional allowed character</span>.</p>
+  <dl class="switch">
 
-  <p>If nothing is returned, emit a U+0026 AMPERSAND character
-  token.</p>
+   <dt>U+003C LESS-THAN SIGN (<)</dt>
+   <dd>Switch to the <span>script data less-than sign state</span>.</dd>
 
-  <p>Otherwise, emit the character token that was returned.</p>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-  <p>Finally, switch to the <span>data state</span>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>script data state</span>.</dd>
 
+  </dl>
 
-  <h5><dfn>Tag open state</dfn></h5>
 
-  <p>The behavior of this state depends on the <span>content model
-  flag</span>.</p>
+  <h5><dfn>PLAINTEXT state</dfn></h5>
 
-  <dl>
+  <p>Consume the <span>next input character</span>:</p>
 
-   <dt>If the <span>content model flag</span> is set to the RCDATA
-   or RAWTEXT states</dt>
+  <dl class="switch">
 
-   <dd>
+   <dt>EOF</dt>
+   <dd>Emit an end-of-file token.</dd>
 
-    <p>Consume the <span>next input character</span>. If it is a
-    U+002F SOLIDUS character (/), switch to the <span>close tag open
-    state</span>. Otherwise, emit a U+003C LESS-THAN SIGN character
-    token and reconsume the <span>current input character</span> in the
-    <span>data state</span>.</p>
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>PLAINTEXT state</span>.</dd>
 
-   </dd>
+  </dl>
 
-   <dt>If the <span>content model flag</span> is set to the PCDATA
-   state</dt>
 
-   <dd>
+  <h5><dfn>Character reference in data state</dfn></h5>
 
-    <p>Consume the <span>next input character</span>:</p>
+  <p>Attempt to <span>consume a character reference</span>, with no
+  <span>additional allowed character</span>.</p>
 
-    <dl class="switch">
+  <p>If nothing is returned, emit a U+0026 AMPERSAND character
+  token.</p>
 
-     <dt>U+0021 EXCLAMATION MARK (!)</dt>
-     <dd>Switch to the <span>markup declaration open state</span>.</dd>
+  <p>Otherwise, emit the character token that was returned.</p>
 
-     <dt>U+002F SOLIDUS (/)</dt>
-     <dd>Switch to the <span>close tag open state</span>.</dd>
+  <p>Finally, switch to the <span>data state</span>.</p>
 
-     <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the
-     lowercase version of the input character (add 0x0020 to the
-     character's code point), then switch to the <span>tag name
-     state</span>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
 
-     <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-     <dd>Create a new start tag token, set its tag name to the input
-     character, then switch to the <span>tag name
-     state</span>. (Don't emit the token yet; further details will
-     be filled in before it is emitted.)</dd>
+  <h5><dfn>Tag open state</dfn></h5>
 
-     <dt>U+003E GREATER-THAN SIGN (>)</dt>
-     <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
-     character token and a U+003E GREATER-THAN SIGN character
-     token. Switch to the <span>data state</span>.</dd>
+  <p>Consume the <span>next input character</span>:</p>
 
-     <dt>U+003F QUESTION MARK (?)</dt>
-     <dd><span>Parse error</span>. Switch to the <span>bogus
-     comment state</span>.</dd>
+  <dl class="switch">
 
-     <dt>Anything else</dt>
-     <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
-     character token and reconsume the <span>current input character</span> in the
-     <span>data state</span>.</dd>
+   <dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Switch to the <span>markup declaration open state</span>.</dd>
 
-    </dl>
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Switch to the <span>close tag open state</span>.</dd>
 
-   </dd>
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   lowercase version of the <span>current input character</span> (add 0x0020 to the
+   character's code point), then switch to the <span>tag name
+   state</span>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
-  </dl>
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new start tag token, set its tag name to the
+   <span>current input character</span>, then switch to the <span>tag
+   name state</span>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
+   character token and a U+003E GREATER-THAN SIGN character
+   token. Switch to the <span>data state</span>.</dd>
 
-  <h5><dfn>Close tag open state</dfn></h5>
+   <dt>U+003F QUESTION MARK (?)</dt>
+   <dd><span>Parse error</span>. Switch to the <span>bogus
+   comment state</span>.</dd>
 
-  <p>If the <span>content model flag</span> is set to the RCDATA or
-  RAWTEXT states but no start tag token has ever been emitted by this
-  instance of the tokenizer (<span>fragment case</span>), or, if the
-  <span>content model flag</span> is set to the RCDATA or RAWTEXT states
-  and the next few characters do not match the tag name of the last
-  start tag token emitted (compared in an <span>ASCII
-  case-insensitive</span> manner), or if they do but they are not
-  immediately followed by one of the following characters:</p>
+   <dt>Anything else</dt>
+   <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
+   character token and reconsume the <span>current input
+   character</span> in the <span>data state</span>.</dd>
 
-  <ul class="brief">
-   <li>U+0009 CHARACTER TABULATION</li>
-   <li>U+000A LINE FEED (LF)</li>
-   <li>U+000C FORM FEED (FF)</li>
-   <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
-   <li>U+0020 SPACE</li>
-   <li>U+003E GREATER-THAN SIGN (>)</li>
-   <li>U+002F SOLIDUS (/)</li>
-   <li>EOF</li>
-  </ul>
+  </dl>
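
With the content model flag gone, the tag open state is a plain switch on one character. A minimal sketch of that switch follows, assuming a hypothetical `tag_open_step` helper that returns a dictionary describing the outcome and uses `None` for EOF; neither the helper nor its keys come from the specification.

```python
def tag_open_step(ch):
    """Describe what the tag open state does for one input character."""
    if ch == "!":
        return {"state": "markup declaration open"}
    if ch == "/":
        return {"state": "close tag open"}
    if ch is not None and ch.isascii() and ch.isalpha():
        # Uppercase letters are lowercased (code point + 0x0020);
        # the start tag token is created but not yet emitted.
        return {"state": "tag name", "start_tag": ch.lower()}
    if ch == ">":
        return {"state": "data", "parse_error": True, "emit": "<>"}
    if ch == "?":
        return {"state": "bogus comment", "parse_error": True}
    # Anything else, including EOF: emit "<" and reconsume in the data state.
    return {"state": "data", "parse_error": True, "emit": "<", "reconsume": True}
```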
 
-  <p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F
-  SOLIDUS character token, and switch to the <span>data state</span>
-  to process the <span>next input character</span>.</p>
 
-  <p>Otherwise, if the <span>content model flag</span> is set to the
-  PCDATA state, or if the next few characters <em>do</em> match that tag
-  name, consume the <span>next input character</span>:</p>
+  <h5><dfn>Close tag open state</dfn></h5>
 
+  <p>Consume the <span>next input character</span>:</p>
+
   <dl class="switch">
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new end tag token, set its tag name to the lowercase
-   version of the input character (add 0x0020 to the character's
-   code point), then switch to the <span>tag name
+   version of the <span>current input character</span> (add 0x0020 to
+   the character's code point), then switch to the <span>tag name
    state</span>. (Don't emit the token yet; further details will be
    filled in before it is emitted.)</dd>
 
    <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
-   <dd>Create a new end tag token, set its tag name to the input
-   character, then switch to the <span>tag name state</span>. (Don't
-   emit the token yet; further details will be filled in before it
-   is emitted.)</dd>
+   <dd>Create a new end tag token, set its tag name to the
+   <span>current input character</span>, then switch to the <span>tag
+   name state</span>. (Don't emit the token yet; further details will
+   be filled in before it is emitted.)</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><span>Parse error</span>. Switch to the <span>data
@@ -79506,21 +79482,436 @@
    state</span>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <span>current input character</span>
-   (add 0x0020 to the character's code point) to the current tag
-   token's tag name. Stay in the <span>tag name state</span>.</dd>
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Stay in the <span>tag name
+   state</span>.</dd>
 
    <dt>EOF</dt>
    <dd><span>Parse error</span>. Reconsume the EOF character in the
    <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current tag token's
-   tag name. Stay in the <span>tag name state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the current
+   tag token's tag name. Stay in the <span>tag name state</span>.</dd>
 
   </dl>
 
 
+  <h5><dfn>RCDATA less-than sign state</dfn></h5>
+  <!-- identical to the RAWTEXT less-than sign state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var>temporary buffer</var> to the empty string. Switch
+   to the <span>RCDATA end tag open state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <span>current input character</span> in the <span>RCDATA
+   state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>RCDATA end tag open state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag open state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <span>current input character</span> (add
+   0x0020 to the character's code point). Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>RCDATA end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <span>current input character</span>. Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>RCDATA end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <span>current input
+   character</span> in the <span>RCDATA state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>RCDATA end tag name state</dfn></h5>
+  <!-- identical to the RAWTEXT (and Script data) end tag name state, except s/RAWTEXT/RCDATA/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>before attribute name
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>self-closing start tag
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then emit the current tag token and switch to the
+   <span>data state</span>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>RCDATA end tag name state</span>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <span>current input character</span> to the current
+   tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>RCDATA end tag name state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var>temporary buffer</var> (in the order they were added to
+   the buffer), and reconsume the <span>current input character</span>
+   in the <span>RCDATA state</span>.</dd>
+
+  </dl>
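
The three RCDATA states above replace the old lookahead against the last start tag name with a temporary buffer that is replayed when the end tag turns out not to be appropriate. The sketch below illustrates that mechanism under several assumptions: `scan_rcdata` is a hypothetical helper, the RCDATA state itself (not shown in this part of the diff) is modelled as "emit everything except `<`", character references are ignored, and an "appropriate end tag token" is taken to mean one whose name equals `last_start_tag`.

```python
WHITESPACE = "\t\n\f "


def scan_rcdata(text, last_start_tag):
    """Return (tokens, next_state, pos) for RCDATA content."""
    tokens, state, name, buffer, i = [], "rcdata", "", "", 0
    while True:
        ch = text[i] if i < len(text) else None  # None stands for EOF
        letter = ch is not None and ch.isascii() and ch.isalpha()
        if state == "rcdata":  # simplified stand-in for the RCDATA state
            if ch is None:
                tokens.append(("eof",))
                return tokens, None, i
            if ch == "<":
                state = "less-than"
            else:
                tokens.append(("char", ch))
            i += 1
        elif state == "less-than":
            if ch == "/":
                buffer, state = "", "end-tag-open"
                i += 1
            else:
                tokens.append(("char", "<"))
                state = "rcdata"  # reconsume: i is not advanced
        elif state == "end-tag-open":
            if letter:
                name, buffer, state = ch.lower(), ch, "end-tag-name"
                i += 1
            else:
                tokens += [("char", "<"), ("char", "/")]
                state = "rcdata"  # reconsume
        else:  # end-tag-name
            appropriate = name == last_start_tag
            if letter:
                name += ch.lower()  # tag name is lowercased
                buffer += ch        # buffer keeps the original case
                i += 1
            elif appropriate and ch == ">":
                tokens.append(("end_tag", name))
                return tokens, "data", i + 1
            elif appropriate and ch is not None and ch in WHITESPACE:
                # A full tokenizer would carry the pending token along.
                return tokens, "before attribute name", i + 1
            elif appropriate and ch == "/":
                return tokens, "self-closing start tag", i + 1
            else:
                # Not an end tag after all: replay "</" plus the buffer.
                tokens += [("char", c) for c in "</" + buffer]
                state = "rcdata"  # reconsume
```

For example, scanning `a</b>c</title>x` after a `title` start tag emits `</b>` as ordinary characters and only treats `</title>` as an end tag. The RAWTEXT and script data variants differ only in the state they fall back to.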
+
+
+  <h5><dfn>RAWTEXT less-than sign state</dfn></h5>
+  <!-- identical to the RCDATA less-than sign state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var>temporary buffer</var> to the empty string. Switch
+   to the <span>RAWTEXT end tag open state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <span>current input character</span> in the <span>RAWTEXT
+   state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>RAWTEXT end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag open state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <span>current input character</span> (add
+   0x0020 to the character's code point). Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>RAWTEXT end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <span>current input character</span>. Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>RAWTEXT end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <span>current input
+   character</span> in the <span>RAWTEXT state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>RAWTEXT end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and Script data) end tag name state, except s/RCDATA/RAWTEXT/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>before attribute name
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>self-closing start tag
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then emit the current tag token and switch to the
+   <span>data state</span>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>RAWTEXT end tag name state</span>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <span>current input character</span> to the current
+   tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>RAWTEXT end tag name state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var>temporary buffer</var> (in the order they were added to
+   the buffer), and reconsume the <span>current input character</span>
+   in the <span>RAWTEXT state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data less-than sign state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>Set the <var>temporary buffer</var> to the empty string. Switch
+   to the <span>script data end tag open state</span>.</dd>
+
+   <dt>U+0021 EXCLAMATION MARK (!)</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and a U+0021
+   EXCLAMATION MARK character token. Switch to the <span>script data
+   escape start state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token and reconsume the
+   <span>current input character</span> in the <span>script data
+   state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data end tag open state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag open state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   lowercase version of the <span>current input character</span> (add
+   0x0020 to the character's code point). Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>script data end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Create a new end tag token, and set its tag name to the
+   <span>current input character</span>. Append the <span>current
+   input character</span> to the <var>temporary buffer</var>. Finally,
+   switch to the <span>script data end tag name state</span>. (Don't emit
+   the token yet; further details will be filled in before it is
+   emitted.)</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, and reconsume the <span>current input
+   character</span> in the <span>script data state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data end tag name state</dfn></h5>
+  <!-- identical to the RCDATA (and RAWTEXT) end tag name state, except s/RCDATA/Script data/g -->
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+0009 CHARACTER TABULATION</dt>
+   <dt>U+000A LINE FEED (LF)</dt>
+   <dt>U+000C FORM FEED (FF)</dt>
+   <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+   <dt>U+0020 SPACE</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>before attribute name
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+002F SOLIDUS (/)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then switch to the <span>self-closing start tag
+   state</span>. Otherwise, treat it as per the "anything else" entry
+   below.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>If the current end tag token is an <span>appropriate end tag
+   token</span>, then emit the current tag token and switch to the
+   <span>data state</span>. Otherwise, treat it as per the "anything
+   else" entry below.</dd>
+
+   <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>script data end tag name state</span>.</dd>
+
+   <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+   <dd>Append the <span>current input character</span> to the current
+   tag token's tag name. Append the <span>current input
+   character</span> to the <var>temporary buffer</var>. Stay in the
+   <span>script data end tag name state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+   character token, a character token for each of the characters in
+   the <var>temporary buffer</var> (in the order they were added to
+   the buffer), and reconsume the <span>current input character</span>
+   in the <span>script data state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data escape start state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <span>script data escape start dash state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <span>current input character</span> in the
+   <span>script data state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data escape start dash state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <span>script data escaped dash dash state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Reconsume the <span>current input character</span> in the
+   <span>script data state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data escaped state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <span>script data escaped dash state</span>.</dd>
+
+   <dt>EOF</dt>
+   <dd><span>Parse error</span>. Reconsume the EOF character in the
+   <span>data state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Stay in the <span>script data escaped state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data escaped dash state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Switch to the
+   <span>script data escaped dash dash state</span>.</dd>
+
+   <dt>EOF</dt>
+   <dd><span>Parse error</span>. Reconsume the EOF character in the
+   <span>data state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Switch to the <span>script data escaped state</span>.</dd>
+
+  </dl>
+
+
+  <h5><dfn>Script data escaped dash dash state</dfn></h5>
+
+  <p>Consume the <span>next input character</span>:</p>
+
+  <dl class="switch">
+
+   <dt>U+002D HYPHEN-MINUS (-)</dt>
+   <dd>Emit a U+002D HYPHEN-MINUS character token. Stay in the
+   <span>script data escaped dash dash state</span>.</dd>
+
+   <dt>U+003E GREATER-THAN SIGN (>)</dt>
+   <dd>Emit a U+003E GREATER-THAN SIGN character token. Switch to the
+   <span>script data state</span>.</dd>
+
+   <dt>EOF</dt>
+   <dd><span>Parse error</span>. Reconsume the EOF character in the
+   <span>data state</span>.</dd>
+
+   <dt>Anything else</dt>
+   <dd>Emit the <span>current input character</span> as a character
+   token. Switch to the <span>script data escaped state</span>.</dd>
+
+  </dl>
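
The five escape-related states above form a small automaton over hyphens and `>`. The following sketch encodes exactly the transitions listed in this revision; the short state names, the `escape_step` and `run_escape` helpers, and the use of `None` for EOF are assumptions for illustration only.

```python
ESCAPE_STATES = {
    "escape start", "escape start dash",
    "escaped", "escaped dash", "escaped dash dash",
}


def escape_step(state, ch):
    """Return (emitted_chars, next_state, reconsume) for one character."""
    if state in ("escape start", "escape start dash"):
        if ch == "-":
            if state == "escape start":
                return ["-"], "escape start dash", False
            return ["-"], "escaped dash dash", False
        # Anything else (including EOF) goes back to the script data state.
        return [], "script data", True
    if ch is None:
        # Parse error; EOF is reconsumed in the data state.
        return [], "data", True
    if ch == "-":
        nxt = "escaped dash" if state == "escaped" else "escaped dash dash"
        return ["-"], nxt, False
    if ch == ">" and state == "escaped dash dash":
        return [">"], "script data", False
    return [ch], "escaped", False


def run_escape(text, state="escape start"):
    """Feed characters until the automaton leaves the escape states."""
    out = []
    for ch in text:
        emitted, state, _reconsume = escape_step(state, ch)
        out.extend(emitted)
        if state not in ESCAPE_STATES:
            break
    return "".join(out), state
```

Starting in the escape start state (that is, just after `<!` has been emitted), `run_escape("--a-->")` walks through the escaped states and returns to the script data state on the closing `-->`.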
+
+
   <h5><dfn>Before attribute name state</dfn></h5>
 
   <p>Consume the <span>next input character</span>:</p>
@@ -79592,9 +79983,9 @@
    state</span>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the <span>current input character</span>
-   (add 0x0020 to the character's code point) to the current
-   attribute's name. Stay in the <span>attribute name
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current attribute's name. Stay in the <span>attribute name
    state</span>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
@@ -79608,8 +79999,9 @@
    <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current attribute's
-   name. Stay in the <span>attribute name state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the current
+   attribute's name. Stay in the <span>attribute name
+   state</span>.</dd>
 
   </dl>
 
@@ -79647,10 +80039,10 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the lowercase version of the <span>current input character</span>
-   (add 0x0020 to the character's code point), and its value to
-   the empty string. Switch to the <span>attribute name
-   state</span>.</dd>
+   attribute's name to the lowercase version of the <span>current
+   input character</span> (add 0x0020 to the character's code point),
+   and its value to the empty string. Switch to the <span>attribute
+   name state</span>.</dd>
 
    <dt>U+0022 QUOTATION MARK (")</dt>
    <dt>U+0027 APOSTROPHE (')</dt>
@@ -79664,8 +80056,8 @@
 
    <dt>Anything else</dt>
    <dd>Start a new attribute in the current tag token. Set that
-   attribute's name to the <span>current input character</span>, and its value to
-   the empty string. Switch to the <span>attribute name
+   attribute's name to the <span>current input character</span>, and
+   its value to the empty string. Switch to the <span>attribute name
    state</span>.</dd>
 
   </dl>
@@ -79689,7 +80081,7 @@
 
    <dt>U+0026 AMPERSAND (&amp;)</dt>
    <dd>Switch to the <span>attribute value (unquoted) state</span>
-   and reconsume this input character.</dd>
+   and reconsume the <span>current input character</span>.</dd>
 
    <dt>U+0027 APOSTROPHE (')</dt>
    <dd>Switch to the <span>attribute value (single-quoted) state</span>.</dd>
@@ -79736,9 +80128,9 @@
    <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current attribute's
-   value. Stay in the <span>attribute value (double-quoted)
-   state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the current
+   attribute's value. Stay in the <span>attribute value
+   (double-quoted) state</span>.</dd>
 
   </dl>
 
@@ -79763,9 +80155,9 @@
    <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current attribute's
-   value. Stay in the <span>attribute value (single-quoted)
-   state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the current
+   attribute's value. Stay in the <span>attribute value
+   (single-quoted) state</span>.</dd>
 
   </dl>
 
@@ -79805,8 +80197,8 @@
    <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current attribute's
-   value. Stay in the <span>attribute value (unquoted)
+   <dd>Append the <span>current input character</span> to the current
+   attribute's value. Stay in the <span>attribute value (unquoted)
    state</span>.</dd>
 
   </dl>
@@ -79881,9 +80273,6 @@
 
   <h5><dfn>Bogus comment state</dfn></h5>
 
-  <p><i>(This can only happen if the <span>content model
-  flag</span> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to and including the first U+003E
   GREATER-THAN SIGN character (>) or the end of the file (EOF),
   whichever comes first. Emit a comment token whose data is the
@@ -79902,9 +80291,6 @@
 
   <h5><dfn>Markup declaration open state</dfn></h5>
 
-  <p><i>(This can only happen if the <span>content model
-  flag</span> is set to the PCDATA state.)</i></p>
-
   <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
   characters, consume those two characters, create a comment token
   whose data is the empty string, and switch to the <span>comment
@@ -79948,8 +80334,8 @@
    the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's
-   data. Switch to the <span>comment state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the comment
+   token's data. Switch to the <span>comment state</span>.</dd>
 
   </dl>
 
@@ -79973,9 +80359,9 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <span>comment state</span>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <span>current input character</span> to the comment token's
+   data. Switch to the <span>comment state</span>.</dd>
 
   </dl>
 
@@ -79995,8 +80381,8 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Stay
-   in the <span>comment state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the comment
+   token's data. Stay in the <span>comment state</span>.</dd>
 
   </dl>
 
@@ -80016,9 +80402,9 @@
    in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append a U+002D HYPHEN-MINUS character (-) and the input
-   character to the comment token's data. Switch to the
-   <span>comment state</span>.</dd>
+   <dd>Append a U+002D HYPHEN-MINUS character (-) and the
+   <span>current input character</span> to the comment token's
+   data. Switch to the <span>comment state</span>.</dd>
 
   </dl>
 
@@ -80039,8 +80425,9 @@
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
    <dd><span>Parse error</span>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <span>comment end space state</span>.</dd>
+   characters and the <span>current input character</span> to the
+   comment token's data. Switch to the <span>comment end space
+   state</span>.</dd>
 
    <dt>U+0021 EXCLAMATION MARK (!)</dt>
    <dd><span>Parse error</span>. Switch to the <span>comment end bang
@@ -80061,8 +80448,9 @@
 
    <dt>Anything else</dt>
    <dd><span>Parse error</span>. Append two U+002D HYPHEN-MINUS (-)
-   characters and the input character to the comment token's
-   data. Switch to the <span>comment state</span>.</dd>
+   characters and the <span>current input character</span> to the
+   comment token's data. Switch to the <span>comment
+   state</span>.</dd>
 
   </dl>
 
@@ -80089,9 +80477,9 @@
 
    <dt>Anything else</dt>
    <dd>Append two U+002D HYPHEN-MINUS (-) characters, a U+0021
-   EXCLAMATION MARK character (!), and the input character to the
-   comment token's data. Switch to the <span>comment
-   state</span>.</dd>
+   EXCLAMATION MARK character (!), and the <span>current input
+   character</span> to the comment token's data. Switch to the
+   <span>comment state</span>.</dd>
 
   </dl>
 
@@ -80107,7 +80495,7 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Append the input character to the comment token's data. Stay in
+   <dd>Append the <span>current input character</span> to the comment token's data. Stay in
    the <span>comment end space state</span>.</dd>
 
    <dt>U+002D HYPHEN-MINUS (-)</dt>
@@ -80123,7 +80511,7 @@
    comment in comment end state -->
 
    <dt>Anything else</dt>
-   <dd>Append the input character to the comment token's data. Switch
+   <dd>Append the <span>current input character</span> to the comment token's data. Switch
    to the <span>comment state</span>.</dd>
 
   </dl>
@@ -80169,7 +80557,7 @@
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
    <dd>Create a new DOCTYPE token. Set the token's name to the
-   lowercase version of the input character (add 0x0020 to the
+   lowercase version of the <span>current input character</span> (add 0x0020 to the
    character's code point). Switch to the <span>DOCTYPE name
    state</span>.</dd>
 
@@ -80209,9 +80597,10 @@
    state</span>.</dd>
 
    <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
-   <dd>Append the lowercase version of the input character (add 0x0020
-   to the character's code point) to the current DOCTYPE token's
-   name. Stay in the <span>DOCTYPE name state</span>.</dd>
+   <dd>Append the lowercase version of the <span>current input
+   character</span> (add 0x0020 to the character's code point) to the
+   current DOCTYPE token's name. Stay in the <span>DOCTYPE name
+   state</span>.</dd>
 
    <dt>EOF</dt>
    <dd><span>Parse error</span>. Set the DOCTYPE token's
@@ -80219,8 +80608,9 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current DOCTYPE
-   token's name. Stay in the <span>DOCTYPE name state</span>.</dd>
+   <dd>Append the <span>current input character</span> to the current
+   DOCTYPE token's name. Stay in the <span>DOCTYPE name
+   state</span>.</dd>
 
   </dl>
 
@@ -80402,7 +80792,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Switch to the <span>between DOCTYPE public and system identifiers state</span>.</dd>
+   <dd>Switch to the <span>between DOCTYPE public and system
+   identifiers state</span>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <span>data
@@ -80442,7 +80833,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <span>between DOCTYPE public and system identifiers state</span>.</dd>
+   <dd>Stay in the <span>between DOCTYPE public and system identifiers
+   state</span>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <span>data
@@ -80545,7 +80937,8 @@
   <dl class="switch">
 
    <dt>U+0022 QUOTATION MARK (")</dt>
-   <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
+   <dd>Switch to the <span>after DOCTYPE system identifier
+   state</span>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><span>Parse error</span>. Set the DOCTYPE token's
@@ -80558,8 +80951,8 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current DOCTYPE
-   token's system identifier. Stay in the <span>DOCTYPE system
+   <dd>Append the <span>current input character</span> to the current
+   DOCTYPE token's system identifier. Stay in the <span>DOCTYPE system
    identifier (double-quoted) state</span>.</dd>
 
   </dl>
@@ -80572,7 +80965,8 @@
   <dl class="switch">
 
    <dt>U+0027 APOSTROPHE (')</dt>
-   <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
+   <dd>Switch to the <span>after DOCTYPE system identifier
+   state</span>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd><span>Parse error</span>. Set the DOCTYPE token's
@@ -80585,8 +80979,8 @@
    Reconsume the EOF character in the <span>data state</span>.</dd>
 
    <dt>Anything else</dt>
-   <dd>Append the <span>current input character</span> to the current DOCTYPE
-   token's system identifier. Stay in the <span>DOCTYPE system
+   <dd>Append the <span>current input character</span> to the current
+   DOCTYPE token's system identifier. Stay in the <span>DOCTYPE system
    identifier (single-quoted) state</span>.</dd>
 
   </dl>
@@ -80603,7 +80997,8 @@
    <dt>U+000C FORM FEED (FF)</dt>
    <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
    <dt>U+0020 SPACE</dt>
-   <dd>Stay in the <span>after DOCTYPE system identifier state</span>.</dd>
+   <dd>Stay in the <span>after DOCTYPE system identifier
+   state</span>.</dd>
 
    <dt>U+003E GREATER-THAN SIGN (>)</dt>
    <dd>Emit the current DOCTYPE token. Switch to the <span>data
@@ -80644,9 +81039,6 @@
 
   <h5><dfn>CDATA section state</dfn></h5>
 
-  <p><i>(This can only happen if the <span>content model
-  flag</span> is set to the PCDATA state.)</i></p>
-
   <p>Consume every character up to the next occurrence of the three
   character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
   BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or the
@@ -81162,11 +81554,10 @@
    <li><p><span>Insert an HTML element</span> for the token.</p></li>
 
    <li><p>If the algorithm that was invoked is the <span>generic raw
-   text element parsing algorithm</span>, switch the tokenizer's
-   <span>content model flag</span> to the RAWTEXT state; otherwise the
-   algorithm invoked was the <span>generic RCDATA element parsing
-   algorithm</span>, switch the tokenizer's <span>content model
-   flag</span> to the RCDATA state.</p></li>
+   text element parsing algorithm</span>, switch the tokenizer to the
+   <span>RAWTEXT state</span>; otherwise the algorithm invoked
+   was the <span>generic RCDATA element parsing algorithm</span>,
+   switch the tokenizer to the <span>RCDATA state</span>.</p></li>
 
    <li><p>Let the <span>original insertion mode</span> be the current
    <span>insertion mode</span>.</p>
@@ -81744,8 +82135,8 @@
      and push it onto the <span>stack of open
      elements</span>.</p></li>
 
-     <li><p>Switch the tokenizer's <span>content model flag</span> to
-     the RAWTEXT state.</p></li>
+     <li><p>Switch the tokenizer to the <span>script data
+     state</span>.</p></li>
 
      <li><p>Let the <span>original insertion mode</span> be the current
      <span>insertion mode</span>.</p>
@@ -82328,14 +82719,12 @@
 
     <p><span>Insert an HTML element</span> for the token.</p>
 
-    <p>Switch the <span>content model flag</span> to the PLAINTEXT
-    state.</p>
+    <p>Switch the tokenizer to the <span>PLAINTEXT state</span>.</p>
 
-    <p class="note">Once a start tag with the tag name "plaintext"
-    has been seen, that will be the last token ever seen other
-    than character tokens (and the end-of-file token), because
-    there is no way to switch the <span>content model flag</span>
-    out of the PLAINTEXT state.</p>
+    <p class="note">Once a start tag with the tag name "plaintext" has
+    been seen, that will be the last token ever seen other than
+    character tokens (and the end-of-file token), because there is no
+    way to switch out of the <span>PLAINTEXT state</span>.</p>
 
    </dd>
 
@@ -82990,8 +83379,8 @@
      one. (Newlines at the start of <code>textarea</code> elements are
      ignored as an authoring convenience.)</p></li>
 
-     <li><p>Switch the tokenizer's <span>content model flag</span> to
-     the RCDATA state.</p></li>
+     <li><p>Switch the tokenizer to the <span>RCDATA
+     state</span>.</p></li>
 
      <li><p>Let the <span>original insertion mode</span> be the
      current <span>insertion mode</span>.</p>
@@ -85633,45 +86022,40 @@
 
      <li>
 
-      <p>Set the <span>HTML parser</span>'s <span>tokenization</span>
-      stage's <span>content model flag</span> according to the <var
-      title="">context</var> element, as follows:</p>
+      <p>Set the state of the <span>HTML parser</span>'s
+      <span>tokenization</span> stage as follows:</p>
 
       <dl class="switch">
 
        <dt>If it is a <code>title</code> or <code>textarea</code>
        element</dt>
 
-       <dd>Set the <span>content model flag</span> to
-       the RCDATA state.</dd>
+       <dd>Switch the tokenizer to the <span>RCDATA state</span>.</dd>
 
 
        <dt>If it is a <code>style</code>, <code>script</code>,
        <code>xmp</code>, <code>iframe</code>, <code>noembed</code>, or
        <code>noframes</code> element</dt>
 
-       <dd>Set the <span>content model flag</span> to
-       the RAWTEXT state.</dd>
+       <dd>Switch the tokenizer to the <span>RAWTEXT state</span>.</dd>
 
 
        <dt>If it is a <code>noscript</code> element</dt>
 
-       <dd>If the <span>scripting flag</span> is enabled, set the
-       <span>content model flag</span> to the RAWTEXT
-       state. Otherwise, set the <span>content model flag</span> to the
-       PCDATA state.</dd>
+       <dd>If the <span>scripting flag</span> is enabled, switch the
+       tokenizer to the <span>RAWTEXT state</span>.  Otherwise,
+       leave the tokenizer in the <span>data state</span>.</dd>
 
 
        <dt>If it is a <code>plaintext</code> element</dt>
 
-       <dd>Set the <span>content model flag</span> to
-       PLAINTEXT.</dd>
+       <dd>Switch the tokenizer to the <span>PLAINTEXT
+       state</span>.</dd>
 
 
        <dt>Otherwise</dt>
 
-       <dd>Leave the <span>content model flag</span> in the PCDATA
-       state.</dd>
+       <dd>Leave the tokenizer in the <span>data state</span>.</dd>
 
       </dl>