[html5] r1906 - [e] (0) Make the tokeniser states into sections for easier navigation. (Bug 5881 [...]
whatwg at whatwg.org
Tue Jul 22 18:04:17 PDT 2008
Author: ianh
Date: 2008-07-22 18:04:16 -0700 (Tue, 22 Jul 2008)
New Revision: 1906
Modified:
index
source
Log:
[e] (0) Make the tokeniser states into sections for easier navigation. (Bug 5881) (credit: as)
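For readers skimming the hunks below, here is a rough sketch of the first state this revision turns into its own section, the data state. It models only the two entries visible in this diff (U+0026 AMPERSAND and U+002D HYPHEN-MINUS). The function name, the string state labels, and the choice to stay in "data" for every other case are assumptions made for illustration; they are not taken from the spec text.

```python
from enum import Enum


class ContentModel(Enum):
    """Content model flag values named in the excerpt."""
    PCDATA = "PCDATA"
    RCDATA = "RCDATA"
    CDATA = "CDATA"


def data_state_step(stream, pos, content_model, escape_flag):
    """Consume stream[pos] in the data state.

    Returns (next_state, escape_flag). Staying in "data" for anything
    not covered by the two entries below is an assumption of this
    sketch, not something the excerpt states.
    """
    ch = stream[pos]

    if ch == "&":
        # U+0026: switch state only for PCDATA/RCDATA with the escape flag unset.
        if content_model in (ContentModel.PCDATA, ContentModel.RCDATA) and not escape_flag:
            return "character reference data", escape_flag
        # Otherwise handled like "anything else" (assumed to stay in data).
        return "data", escape_flag

    if ch == "-":
        # U+002D: the last four characters, including this one, must be "<!--",
        # which also implies at least three characters precede this one.
        if (
            content_model in (ContentModel.RCDATA, ContentModel.CDATA)
            and not escape_flag
            and pos >= 3
            and stream[pos - 3 : pos + 1] == "<!--"
        ):
            escape_flag = True
        return "data", escape_flag

    return "data", escape_flag
```

For example, `data_state_step("<!--", 3, ContentModel.RCDATA, False)` sets the escape flag, while the same input under PCDATA leaves it unchanged.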
Modified: index
===================================================================
--- index 2008-07-23 00:56:52 UTC (rev 1905)
+++ index 2008-07-23 01:04:16 UTC (rev 1906)
@@ -1758,7 +1758,112 @@
<li><a href="#tokenization"><span class=secno>8.2.4
</span>Tokenization</a>
<ul class=toc>
- <li><a href="#tokenizing"><span class=secno>8.2.4.1.
+ <li><a href="#data-state"><span class=secno>8.2.4.1. </span>Data
+ state</a>
+
+ <li><a href="#character1"><span class=secno>8.2.4.2.
+ </span>Character reference data state</a>
+
+ <li><a href="#tag-open"><span class=secno>8.2.4.3. </span>Tag open
+ state</a>
+
+ <li><a href="#close"><span class=secno>8.2.4.4. </span>Close tag
+ open state</a>
+
+ <li><a href="#tag-name"><span class=secno>8.2.4.5. </span>Tag name
+ state</a>
+
+ <li><a href="#before"><span class=secno>8.2.4.6. </span>Before
+ attribute name state</a>
+
+ <li><a href="#attribute"><span class=secno>8.2.4.7. </span>Attribute
+ name state</a>
+
+ <li><a href="#after"><span class=secno>8.2.4.8. </span>After
+ attribute name state</a>
+
+ <li><a href="#before0"><span class=secno>8.2.4.9. </span>Before
+ attribute value state</a>
+
+ <li><a href="#attribute0"><span class=secno>8.2.4.10.
+ </span>Attribute value (double-quoted) state</a>
+
+ <li><a href="#attribute1"><span class=secno>8.2.4.11.
+ </span>Attribute value (single-quoted) state</a>
+
+ <li><a href="#attribute2"><span class=secno>8.2.4.12.
+ </span>Attribute value (unquoted) state</a>
+
+ <li><a href="#character2"><span class=secno>8.2.4.13.
+ </span>Character reference in attribute value state</a>
+
+ <li><a href="#after0"><span class=secno>8.2.4.14. </span>After
+ attribute value (quoted) state</a>
+
+ <li><a href="#self-closing"><span class=secno>8.2.4.15.
+ </span>Self-closing start tag state</a>
+
+ <li><a href="#bogus"><span class=secno>8.2.4.16. </span>Bogus
+ comment state</a>
+
+ <li><a href="#markup"><span class=secno>8.2.4.17. </span>Markup
+ declaration open state</a>
+
+ <li><a href="#comment0"><span class=secno>8.2.4.18. </span>Comment
+ start state</a>
+
+ <li><a href="#comment1"><span class=secno>8.2.4.19. </span>Comment
+ start dash state</a>
+
+ <li><a href="#comment2"><span class=secno>8.2.4.20. </span>Comment
+ end dash state</a>
+
+ <li><a href="#comment3"><span class=secno>8.2.4.21. </span>Comment
+ end state</a>
+
+ <li><a href="#doctype"><span class=secno>8.2.4.22. </span>DOCTYPE
+ state</a>
+
+ <li><a href="#before1"><span class=secno>8.2.4.23. </span>Before
+ DOCTYPE name state</a>
+
+ <li><a href="#doctype0"><span class=secno>8.2.4.24. </span>DOCTYPE
+ name state</a>
+
+ <li><a href="#after1"><span class=secno>8.2.4.25. </span>After
+ DOCTYPE name state</a>
+
+ <li><a href="#before2"><span class=secno>8.2.4.26. </span>Before
+ DOCTYPE public identifier state</a>
+
+ <li><a href="#doctype1"><span class=secno>8.2.4.27. </span>DOCTYPE
+ public identifier (double-quoted) state</a>
+
+ <li><a href="#doctype2"><span class=secno>8.2.4.28. </span>DOCTYPE
+ public identifier (single-quoted) state</a>
+
+ <li><a href="#after2"><span class=secno>8.2.4.29. </span>After
+ DOCTYPE public identifier state</a>
+
+ <li><a href="#before3"><span class=secno>8.2.4.30. </span>Before
+ DOCTYPE system identifier state</a>
+
+ <li><a href="#doctype3"><span class=secno>8.2.4.31. </span>DOCTYPE
+ system identifier (double-quoted) state</a>
+
+ <li><a href="#doctype4"><span class=secno>8.2.4.32. </span>DOCTYPE
+ system identifier (single-quoted) state</a>
+
+ <li><a href="#after3"><span class=secno>8.2.4.33. </span>After
+ DOCTYPE system identifier state</a>
+
+ <li><a href="#bogus0"><span class=secno>8.2.4.34. </span>Bogus
+ DOCTYPE state</a>
+
+ <li><a href="#cdata0"><span class=secno>8.2.4.35. </span>CDATA
+ section state</a>
+
+ <li><a href="#tokenizing"><span class=secno>8.2.4.36.
</span>Tokenizing character references</a>
</ul>
@@ -2748,7 +2853,7 @@
<li>
<p>The <a href="#url">URL</a> is a valid IRI reference and the <a
- href="#character1" title="document's character encoding">character
+ href="#character3" title="document's character encoding">character
encoding</a> of the URL's <code>Document</code> is UTF-8 or UTF-16. <a
href="#refsRFC3987">[RFC3987]</a>
</ul>
@@ -2963,7 +3068,7 @@
href="#urldoc">associated with</a> <var title="">url</var>.
<li>
- <p>Let <var title="">encoding</var> be the <a href="#character1"
+ <p>Let <var title="">encoding</var> be the <a href="#character3"
title="document's character encoding">character encoding</a> of <var
title="">document</var>.
@@ -6615,7 +6720,7 @@
<a href="#htmldocument">HTMLDocument</a> <a href="#open" title=dom-document-open>open</a>(in DOMString type, in DOMString replace);
<a href="#window">Window</a> <a href="#open" title=dom-document-open>open</a>(in DOMString url, in DOMString name, in DOMString features);
<a href="#window">Window</a> <a href="#open" title=dom-document-open>open</a>(in DOMString url, in DOMString name, in DOMString features, in boolean replace);
- void <a href="#close" title=dom-document-close>close</a>();
+ void <a href="#close0" title=dom-document-close>close</a>();
void <a href="#document.write" title=dom-document-write>write</a>(in DOMString text);
void <a href="#document.writeln..." title=dom-document-writeln>writeln</a>(in DOMString text);
@@ -6788,9 +6893,9 @@
</ul>
</div>
- <p>Documents have an associated <dfn id=character1 title="document's
+ <p>Documents have an associated <dfn id=character3 title="document's
character encoding">character encoding</dfn>. When a <code>Document</code>
- object is created, the <a href="#character1">document's character
+ object is created, the <a href="#character3">document's character
encoding</a> must be initialized to UTF-16. Various algorithms during page
loading affect this value, as does the <code title=dom-document-charset><a
href="#charset0">charset</a></code> setter. <a
@@ -6800,15 +6905,15 @@
<p>The <dfn id=charset0
title=dom-document-charset><code>charset</code></dfn> DOM attribute must,
on getting, return the preferred MIME name of the <a
- href="#character1">document's character encoding</a>. On setting, if the
+ href="#character3">document's character encoding</a>. On setting, if the
new value is an IANA-registered alias for a character encoding, the <a
- href="#character1">document's character encoding</a> must be set to that
+ href="#character3">document's character encoding</a> must be set to that
character encoding. (Otherwise, nothing happens.)
<p>The <dfn id=characterset
title=dom-document-characterSet><code>characterSet</code></dfn> DOM
attribute must, on getting, return the preferred MIME name of the <a
- href="#character1">document's character encoding</a>.
+ href="#character3">document's character encoding</a>.
<p>The <dfn id=defaultcharset
title=dom-document-defaultCharset><code>defaultCharset</code></dfn> DOM
@@ -8312,7 +8417,7 @@
<p>Remove all child nodes of the document.
<li>
- <p>Change the <a href="#character1">document's character encoding</a> to
+ <p>Change the <a href="#character3">document's character encoding</a> to
UTF-16.
<li>
@@ -8321,9 +8426,9 @@
parser</dfn> (meaning that it can be closed by the <code
title=dom-document-open><a href="#open">document.open()</a></code> and
<code title=dom-document-close><a
- href="#close">document.close()</a></code> methods, and that the
+ href="#close0">document.close()</a></code> methods, and that the
tokeniser will wait for an explicit call to <code
- title=dom-document-close><a href="#close">document.close()</a></code>
+ title=dom-document-close><a href="#close0">document.close()</a></code>
before emitting an end-of-file token).
<li>Mark the document as being an <a href="#html-" title="HTML
@@ -8389,7 +8494,7 @@
href="#htmldocument">HTMLDocument</a></code> object is null, then the
method must raise an <code>INVALID_ACCESS_ERR</code> exception.
- <p>The <dfn id=close title=dom-document-close><code>close()</code></dfn>
+ <p>The <dfn id=close0 title=dom-document-close><code>close()</code></dfn>
method must do nothing if there is no <a
href="#script-created">script-created parser</a> associated with the
document. If there is such a parser, then, when the method is called, the
@@ -9321,7 +9426,7 @@
document-level metadata with the <code title=attr-meta-name><a
href="#name">name</a></code> attribute, pragma directives with the <code
title=attr-meta-http-equiv><a href="#http-equiv">http-equiv</a></code>
- attribute, and the file's <a href="#character2">character encoding
+ attribute, and the file's <a href="#character4">character encoding
declaration</a> when an HTML document is serialized to string form (e.g.
for transmission over the network or for disk storage) with the <code
title=attr-meta-charset><a href="#charset1">charset</a></code> attribute.
@@ -9340,7 +9445,7 @@
<p>The <dfn id=charset1 title=attr-meta-charset><code>charset</code></dfn>
attribute specifies the character encoding used by the document. This is
- called a <a href="#character2">character encoding declaration</a>.
+ called a <a href="#character4">character encoding declaration</a>.
<p>The <code title=attr-meta-charset><a href="#charset1">charset</a></code>
attribute may be specified in <a href="#html5" title=HTML5>HTML
@@ -9621,7 +9726,7 @@
user agent requirements are all handled by the parsing section of the
specification. The state is just an alternative form of setting the
<code title=meta-charset>charset</code> attribute: it is a <a
- href="#character2">character encoding declaration</a>.</p>
+ href="#character4">character encoding declaration</a>.</p>
<p>For <code><a href="#meta0">meta</a></code> elements in the <a
href="#encoding" title=attr-meta-http-equiv-content-type>Encoding
@@ -9831,7 +9936,7 @@
though if we do then we have to duplicate the requirements in the
parsing section for conformance checkers -->
- <p>A <dfn id=character2>character encoding declaration</dfn> is a mechanism
+ <p>A <dfn id=character4>character encoding declaration</dfn> is a mechanism
by which the character encoding used to store or transmit a document is
specified.
@@ -9847,7 +9952,7 @@
http://www.iana.org/assignments/character-sets -->
<li>The character encoding declaration must be serialized without the use
- of <a href="#character3" title=syntax-charref>character references</a> or
+ of <a href="#character5" title=syntax-charref>character references</a> or
character escapes of any kind.
</ul>
@@ -25144,7 +25249,7 @@
<p>Otherwise, let <var><a href="#the-scripts0">the script's character
encoding</a></var> for this <code><a href="#script1">script</a></code>
- element be the same as <a href="#character1" title="document's character
+ element be the same as <a href="#character3" title="document's character
encoding">the encoding of the document itself</a>.</p>
<li>
@@ -30279,7 +30384,7 @@
XXXDOCURL -->
is <code>about:blank</code><!-- XXX xref -->, which is marked as being an
<a href="#html-" title="HTML documents">HTML document</a>, and whose <a
- href="#character1" title="document's character encoding">character
+ href="#character3" title="document's character encoding">character
encoding</a> is UTF-8. The <code>Document</code> must have a single child
<code><a href="#html">html</a></code> node, which itself has a single
child <code><a href="#body0">body</a></code> node. If the <a
@@ -35129,7 +35234,7 @@
or implied by the algorithms given in this specification, are the ones
that must be used when determining the character encoding according to the
rules given in the above specifications. Once the character encoding is
- established, the <a href="#character1">document's character encoding</a>
+ established, the <a href="#character3">document's character encoding</a>
must be set to that character encoding.
<p>If the root element, as parsed according to the XML specifications cited
@@ -35194,7 +35299,7 @@
versions thereof. <a href="#refsRFC2046">[RFC2046]</a> <a
href="#refsRFC2646">[RFC2646]</a>
- <p>The <a href="#character1">document's character encoding</a> must be set
+ <p>The <a href="#character3">document's character encoding</a> must be set
to the character encoding used to decode the document.
<p>Upon creation of the <code>Document</code> object, the user agent must
@@ -41795,10 +41900,10 @@
<p>The <dfn id=disconnect
title=dom-WebSocket-disconnect><code>disconnect()</code></dfn> method must
- <a href="#close1">close the Web Socket connection</a> or connection
+ <a href="#close2">close the Web Socket connection</a> or connection
attempt, if any. If the connection is already closed, it must do nothing.
Closing the connection causes a <code title=event-WebSocket-close><a
- href="#close0">close</a></code> event to be fired and the <code
+ href="#close1">close</a></code> event to be fired and the <code
title=dom-WebSocket-readyState><a
href="#readystate1">readyState</a></code> attribute's value to change, as
<a href="#closeWebSocket">described below</a>.
@@ -41809,7 +41914,7 @@
event is fired when the <a href="#web-socket">Web Socket connection is
established</a>.
- <p>The <dfn id=close0 title=event-WebSocket-close><code>close</code></dfn>
+ <p>The <dfn id=close1 title=event-WebSocket-close><code>close</code></dfn>
event is fired when the connection is closed (whether by the author,
calling the <code title=dom-WebSocket-disconnect><a
href="#disconnect">disconnect()</a></code> method, or by the server, or by
@@ -41859,7 +41964,7 @@
<dd>
<p>Must be invoked whenever an <code title=event-WebSocket-close><a
- href="#close0">close</a></code> event is targeted at or bubbles through
+ href="#close1">close</a></code> event is targeted at or bubbles through
the <code><a href="#websocket0">WebSocket</a></code> object.
</dl>
@@ -42236,7 +42341,7 @@
</ol>
<p>To <dfn id=fail-the>fail the Web Socket connection</dfn>, the user agent
- must <a href="#close1">close the Web Socket connection</a>, and may report
+ must <a href="#close2">close the Web Socket connection</a>, and may report
the problem to the user (which would be especially useful for developers).
However, user agents must not convey the failure information to the script
in a way distinguishable from the Web Socket being closed normally.
@@ -42530,16 +42635,16 @@
<h5 id=closing0><span class=secno>7.3.4.3. </span>Closing the connection</h5>
- <p>To <dfn id=close1>close the Web Socket connection</dfn>, either the user
+ <p>To <dfn id=close2>close the Web Socket connection</dfn>, either the user
agent or the server closes the TCP/IP connection. There is no closing
handshake. Whether the user agent or the server closes the connection, it
is said that the <dfn id=web-socket0>Web Socket connection is
closed</dfn>.
- <p>Servers may <a href="#close1">close the Web Socket connection</a>
+ <p>Servers may <a href="#close2">close the Web Socket connection</a>
whenever desired.
- <p>User agents should not <a href="#close1">close the Web Socket
+ <p>User agents should not <a href="#close2">close the Web Socket
connection</a> arbitrarily.
<p id=closeWebSocket>When the <a href="#web-socket0">Web Socket connection
@@ -42548,7 +42653,7 @@
changed to <code title=dom-WebSocket-CLOSED><a
href="#closed">CLOSED</a></code> (2), and the user agent must <a
href="#firing2">fire a simple event</a> named <code
- title=event-WebSocket-close><a href="#close0">close</a></code> at the
+ title=event-WebSocket-close><a href="#close1">close</a></code> at the
<code><a href="#websocket0">WebSocket</a></code> object.
<h3 id=crossDocumentMessages><span class=secno>7.4 </span><dfn
@@ -42884,7 +42989,7 @@
readonly attribute boolean <a href="#active0" title=dom-MessagePort-active>active</a>;
boolean <a href="#postmessage1" title=dom-MessagePort-postMessage>postMessage</a>(in DOMString message);
boolean <a href="#postmessage1" title=dom-MessagePort-postMessage>postMessage</a>(in DOMString message, in <a href="#messageport0">MessagePort</a> messagePort);
- void <a href="#close2" title=dom-MessagePort-close>close</a>();
+ void <a href="#close3" title=dom-MessagePort-close>close</a>();
// event handler attributes
attribute <span>EventListener</span> <a href="#onmessage1" title=handler-MessagePort-onmessage>onmessage</a>;
@@ -43108,7 +43213,7 @@
<hr>
- <p>The <dfn id=close2
+ <p>The <dfn id=close3
title=dom-MessagePort-close><code>close()</code></dfn> method, when called
on a port <var title="">local port</var> that is entangled with another
port, must cause the user agents to run the following steps:
@@ -43258,7 +43363,7 @@
<li>Any number of <a href="#comments0" title=syntax-comments>comments</a>
and <a href="#space" title="space character">space characters</a>.
- <li>A <a href="#doctype" title=syntax-doctype>DOCTYPE</a>.
+ <li>A <a href="#doctype5" title=syntax-doctype>DOCTYPE</a>.
<li>Any number of <a href="#comments0" title=syntax-comments>comments</a>
and <a href="#space" title="space character">space characters</a>.
@@ -43298,7 +43403,7 @@
<h4 id=the-doctype><span class=secno>8.1.1 </span>The DOCTYPE</h4>
- <p>A <dfn id=doctype title=syntax-doctype>DOCTYPE</dfn> is a mostly
+ <p>A <dfn id=doctype5 title=syntax-doctype>DOCTYPE</dfn> is a mostly
useless, but required, header.
<p class=note>DOCTYPEs are required for legacy reasons. When omitted,
@@ -43434,7 +43539,7 @@
described below.
<p>RCDATA elements can have <a href="#text2" title=syntax-text>text</a> and
- <a href="#character3" title=syntax-charref>character references</a>, but
+ <a href="#character5" title=syntax-charref>character references</a>, but
the text must not contain an <a href="#ambiguous"
title=syntax-ambiguous-ampersand>ambiguous ampersand</a>. There are also
<a href="#cdata-rcdata-restrictions">further restrictions</a> described
@@ -43444,8 +43549,8 @@
any contents (since, again, as there's no end tag, no content can be put
between the start tag and the end tag). Foreign elements whose start tag
is <em>not</em> marked as self-closing can have <a href="#text2"
- title=syntax-text>text</a>, <a href="#character3"
- title=syntax-charref>character references</a>, <a href="#cdata0"
+ title=syntax-text>text</a>, <a href="#character5"
+ title=syntax-charref>character references</a>, <a href="#cdata1"
title=syntax-cdata>CDATA sections</a>, other <a href="#elements3"
title=syntax-elements>elements</a>, and <a href="#comments0"
title=syntax-comments>comments</a>, but the text must not contain the
@@ -43454,7 +43559,7 @@
ampersand</a>.
<p>Normal elements can have <a href="#text2" title=syntax-text>text</a>, <a
- href="#character3" title=syntax-charref>character references</a>, other <a
+ href="#character5" title=syntax-charref>character references</a>, other <a
href="#elements3" title=syntax-elements>elements</a>, and <a
href="#comments0" title=syntax-comments>comments</a>, but the text must
not contain the character U+003C LESS-THAN SIGN (<code>&lt;</code>) or an
@@ -43465,7 +43570,7 @@
model and those described in this paragraph. Those restrictions are
described below.
- <p>Tags contain a <dfn id=tag-name title=syntax-tag-name>tag name</dfn>,
+ <p>Tags contain a <dfn id=tag-name0 title=syntax-tag-name>tag name</dfn>,
giving the element's name. HTML elements all have names that only use
characters in the range U+0030 DIGIT ZERO .. U+0039 DIGIT NINE, U+0061
LATIN SMALL LETTER A .. U+007A LATIN SMALL LETTER Z, U+0041 LATIN CAPITAL
@@ -43484,7 +43589,7 @@
(<code>&lt;</code>).
<li>The next few characters of a start tag must be the element's <a
- href="#tag-name" title=syntax-tag-name>tag name</a>.
+ href="#tag-name0" title=syntax-tag-name>tag name</a>.
<li>If there are to be any attributes in the next step, there must first
be one or more <a href="#space" title="space character">space
@@ -43522,7 +43627,7 @@
(<code>/</code>).
<li>The next few characters of an end tag must be the element's <a
- href="#tag-name" title=syntax-tag-name>tag name</a>.
+ href="#tag-name0" title=syntax-tag-name>tag name</a>.
<li>After the tag name, there may be one or more <a href="#space"
title="space character">space characters</a>.
@@ -43536,7 +43641,7 @@
<p><dfn id=attributes2 title=syntax-attributes>Attributes</dfn> for an
element are expressed inside the element's start tag.
- <p>Attributes have a name and a value. <dfn id=attribute
+ <p>Attributes have a name and a value. <dfn id=attribute3
title=syntax-attribute-name>Attribute names</dfn> must consist of one or
more characters other than the <a href="#space" title="space
character">space characters</a>, U+0000 NULL, U+0022 QUOTATION MARK
@@ -43548,9 +43653,9 @@
all-lowercase<!-- ASCII case-insensitive -->, matches the attribute's
name; attribute names are case-insensitive.
- <p><dfn id=attribute0 title=syntax-attribute-value>Attribute values</dfn>
+ <p><dfn id=attribute4 title=syntax-attribute-value>Attribute values</dfn>
are a mixture of <a href="#text2" title=syntax-text>text</a> and <a
- href="#character3" title=syntax-charref>character references</a>, except
+ href="#character5" title=syntax-charref>character references</a>, except
with the additional restriction that the text cannot contain an <a
href="#ambiguous" title=syntax-ambiguous-ampersand>ambiguous
ampersand</a>.
@@ -43561,7 +43666,7 @@
<dt>Empty attribute syntax
<dd>
- <p>Just the <a href="#attribute" title=syntax-attribute-name>attribute
+ <p>Just the <a href="#attribute3" title=syntax-attribute-name>attribute
name</a>.</p>
<div class=example>
@@ -43579,11 +43684,11 @@
<dt>Unquoted attribute value syntax
<dd>
- <p>The <a href="#attribute" title=syntax-attribute-name>attribute
+ <p>The <a href="#attribute3" title=syntax-attribute-name>attribute
name</a>, followed by zero or more <a href="#space" title="space
character">space characters</a>, followed by a single U+003D EQUALS SIGN
character, followed by zero or more <a href="#space" title="space
- character">space characters</a>, followed by the <a href="#attribute0"
+ character">space characters</a>, followed by the <a href="#attribute4"
title=syntax-attribute-value>attribute value</a>, which, in addition to
the requirements given above for attribute values, must not contain any
literal <a href="#space" title="space character">space characters</a>, a
@@ -43609,12 +43714,12 @@
<dt>Single-quoted attribute value syntax
<dd>
- <p>The <a href="#attribute" title=syntax-attribute-name>attribute
+ <p>The <a href="#attribute3" title=syntax-attribute-name>attribute
name</a>, followed by zero or more <a href="#space" title="space
character">space characters</a>, followed by a single U+003D EQUALS SIGN
character, followed by zero or more <a href="#space" title="space
character">space characters</a>, followed by a single U+0027 APOSTROPHE
- (<code>'</code>) character, followed by the <a href="#attribute0"
+ (<code>'</code>) character, followed by the <a href="#attribute4"
title=syntax-attribute-value>attribute value</a>, which, in addition to
the requirements given above for attribute values, must not contain any
literal U+0027 APOSTROPHE (<code>'</code>) characters, and finally
@@ -43635,12 +43740,12 @@
<dt>Double-quoted attribute value syntax
<dd>
- <p>The <a href="#attribute" title=syntax-attribute-name>attribute
+ <p>The <a href="#attribute3" title=syntax-attribute-name>attribute
name</a>, followed by zero or more <a href="#space" title="space
character">space characters</a>, followed by a single U+003D EQUALS SIGN
character, followed by zero or more <a href="#space" title="space
character">space characters</a>, followed by a single U+0022 QUOTATION
- MARK (<code>"</code>) character, followed by the <a href="#attribute0"
+ MARK (<code>"</code>) character, followed by the <a href="#attribute4"
title=syntax-attribute-value>attribute value</a>, which, in addition to
the requirements given above for attribute values, must not contain any
literal U+0022 QUOTATION MARK (<code>"</code>) characters, and finally
@@ -43910,7 +44015,7 @@
that is not itself in an <a href="#escaping" title=syntax-escape>escaping
text span</a>, and ends at the next <a href="#escaping1"
title=syntax-escape-end>escaping text span end</a>. There cannot be any <a
- href="#character3" title=syntax-charref>character references</a> inside an
+ href="#character5" title=syntax-charref>character references</a> inside an
<a href="#escaping" title=syntax-escape>escaping text span</a>.
<p>An <dfn id=escaping0 title=syntax-escape-start>escaping text span
@@ -43955,7 +44060,7 @@
<h4 id=character><span class=secno>8.1.4 </span>Character references</h4>
<p>In certain cases described in other sections, <a href="#text2"
- title=syntax-text>text</a> may be mixed with <dfn id=character3
+ title=syntax-text>text</a> may be mixed with <dfn id=character5
title=syntax-charref>character references</dfn>. These can be used to
escape characters that couldn't otherwise legally be included in <a
href="#text2" title=syntax-text>text</a>.
@@ -44004,7 +44109,7 @@
<h4 id=cdata><span class=secno>8.1.5 </span>CDATA sections</h4>
- <p><dfn id=cdata0 title=syntax-cdata>CDATA sections</dfn> must start with
+ <p><dfn id=cdata1 title=syntax-cdata>CDATA sections</dfn> must start with
the character sequence U+003C LESS-THAN SIGN, U+0021 EXCLAMATION MARK,
U+005B LEFT SQUARE BRACKET, U+0043 LATIN CAPITAL LETTER C, U+0044 LATIN
CAPITAL LETTER D, U+0041 LATIN CAPITAL LETTER A, U+0054 LATIN CAPITAL
@@ -44566,7 +44671,7 @@
heuristically decide which to use as a default.
</ol>
- <p>The <a href="#character1">document's character encoding</a> must
+ <p>The <a href="#character3">document's character encoding</a> must
immediately be set to the value returned from this algorithm, at the same
time as the user agent uses the returned value to select the decoder to
use for the input stream.
@@ -44789,7 +44894,7 @@
parser is a <a href="#script-created">script-created parser</a>, then the
end of the <a href="#input0">input stream</a> is reached when an <dfn
id=explicit0>explicit "EOF" character</dfn> (inserted by the <code
- title=dom-document-close><a href="#close">document.close()</a></code>
+ title=dom-document-close><a href="#close0">document.close()</a></code>
method) is consumed. Otherwise, the "EOF" character is not a real
character in the stream, but rather the lack of any further characters.
@@ -44819,7 +44924,7 @@
have the same Unicode interpretations in both the current encoding and
the new encoding, and if the user agent supports changing the converter
on the fly, then the user agent may change to the new converter for the
- encoding on the fly. Set the <a href="#character1">document's character
+ encoding on the fly. Set the <a href="#character3">document's character
encoding</a> and the encoding used to convert the input stream to the new
encoding, set the <a href="#confidence"
title=concept-encoding-confidence>confidence</a> to <i>confident</i>, and
@@ -44842,11 +44947,11 @@
<p>Initially the <span>insertion mode</span> is "<a href="#initial"
title="insertion mode: initial">initial</a>". It can change to "<a
- href="#before4" title="insertion mode: before html">before html</a>", "<a
- href="#before5" title="insertion mode: before head">before head</a>", "<a
+ href="#before9" title="insertion mode: before html">before html</a>", "<a
+ href="#before10" title="insertion mode: before head">before head</a>", "<a
href="#in-head" title="insertion mode: in head">in head</a>", "<a
href="#in-head0" title="insertion mode: in head noscript">in head
- noscript</a>", "<a href="#after4" title="insertion mode: after head">after
+ noscript</a>", "<a href="#after9" title="insertion mode: after head">after
head</a>", "<a href="#in-body" title="insertion mode: in body">in
body</a>", "<a href="#in-table" title="insertion mode: in table">in
table</a>", "<a href="#in-caption" title="insertion mode: in caption">in
@@ -44858,13 +44963,13 @@
select">in select</a>", "<a href="#in-select0" title="insertion mode: in
select in table">in select in table</a>", "<a href="#in-foreign"
title="insertion mode: in foreign content">in foreign content</a>", "<a
- href="#after5" title="insertion mode: after body">after body</a>", "<a
+ href="#after10" title="insertion mode: after body">after body</a>", "<a
href="#in-frameset" title="insertion mode: in frameset">in frameset</a>",
- "<a href="#after6" title="insertion mode: after frameset">after
- frameset</a>", "<a href="#after7" title="insertion mode: after after
- body">after after body</a>", and "<a href="#after8" title="insertion mode:
- after after frameset">after after frameset</a>" during the course of the
- parsing, as described in the <a href="#tree-construction0">tree
+ "<a href="#after11" title="insertion mode: after frameset">after
+ frameset</a>", "<a href="#after12" title="insertion mode: after after
+ body">after after body</a>", and "<a href="#after13" title="insertion
+ mode: after after frameset">after after frameset</a>" during the course of
+ the parsing, as described in the <a href="#tree-construction0">tree
construction</a> stage. The insertion mode affects how tokens are
processed and whether CDATA sections are supported.
@@ -44983,9 +45088,9 @@
<li>If <var title="">node</var> is an <code><a
href="#html">html</a></code> element, then: if the <a
href="#head-element"><code title="">head</code> element pointer</a> is
- null, switch the <span>insertion mode</span> to "<a href="#before5"
+ null, switch the <span>insertion mode</span> to "<a href="#before10"
title="insertion mode: before head">before head</a>", otherwise, switch
- the <span>insertion mode</span> to "<a href="#after4" title="insertion
+ the <span>insertion mode</span> to "<a href="#after9" title="insertion
mode: after head">after head</a>". In either case, abort these steps. (<a
href="#fragment">fragment case</a>)</li>
<!-- XXX
@@ -45012,7 +45117,7 @@
manipulated in a random access fashion as part of <a
href="#adoptionAgency">the handling for misnested tags</a>).
- <p>The "<a href="#before4" title="insertion mode: before html">before
+ <p>The "<a href="#before9" title="insertion mode: before html">before
html</a>" <span>insertion mode</span> creates the <code><a
href="#html">html</a></code> root element node, which is then added to the
stack.
@@ -45022,7 +45127,7 @@
href="#html">html</a></code> element that is created as part of <a
href="#html-fragment0" title="html fragment parsing algorithm">that
algorithm</a>. (The <a href="#fragment">fragment case</a> skips the "<a
- href="#before4" title="insertion mode: before html">before html</a>"
+ href="#before9" title="insertion mode: before html">before html</a>"
<span>insertion mode</span>.)
<p>The <code><a href="#html">html</a></code> node, however it is created,
@@ -45339,12 +45444,13 @@
<p>Implementations must act as if they used the following state machine to
tokenise HTML. The state machine must start in the <a
- href="#data-state">data state</a>. Most states consume a single character,
- which may have various side-effects, and either switches the state machine
- to a new state to <em>reconsume</em> the same character, or switches it to
- a new state (to consume the next character), or repeats the same state (to
- consume the next character). Some states have more complicated behavior
- and can consume several characters before switching to another state.
+ href="#data-state0">data state</a>. Most states consume a single
+ character, which may have various side-effects, and either switches the
+ state machine to a new state to <em>reconsume</em> the same character, or
+ switches it to a new state (to consume the next character), or repeats the
+ same state (to consume the next character). Some states have more
+ complicated behavior and can consume several characters before switching
+ to another state.
<p>The exact behavior of certain states depends on a <dfn
id=content3>content model flag</dfn> that is set after certain tokens are
@@ -45399,1378 +45505,1350 @@
be <a href="#executing0" title="executing a script block">executed</a> and
removed from its list.
- <p>The tokeniser state machine is as follows:</p>
+ <p>The tokeniser state machine consists of the states defined in the
+ following subsections.</p>
<!-- XXX should go through these reordering the entries so that
they're in some consistent order, like, by Unicode, errors last, or
something -->
- <dl>
- <dt><dfn id=data-state>Data state</dfn>
+ <h5 id=data-state><span class=secno>8.2.4.1. </span><dfn
+ id=data-state0>Data state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0026 AMPERSAND (&)
+ <dl class=switch>
+ <dt>U+0026 AMPERSAND (&)
- <dd>When the <a href="#content3">content model flag</a> is set to one of
- the PCDATA or RCDATA states and the <a href="#escape">escape flag</a>
- is false: switch to the <a href="#character4">character reference data
- state</a>.
+ <dd>When the <a href="#content3">content model flag</a> is set to one of
+ the PCDATA or RCDATA states and the <a href="#escape">escape flag</a> is
+ false: switch to the <a href="#character6">character reference data
+ state</a>.
- <dd>Otherwise: treat it as per the "anything else" entry below.
+ <dd>Otherwise: treat it as per the "anything else" entry below.
- <dt>U+002D HYPHEN-MINUS (-)
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd>
- <p>If the <a href="#content3">content model flag</a> is set to either
- the RCDATA state or the CDATA state, and the <a href="#escape">escape
- flag</a> is false, and there are at least three characters before this
- one in the input stream, and the last four characters in the input
- stream, including this one, are U+003C LESS-THAN SIGN, U+0021
- EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D HYPHEN-MINUS
- ("&lt;!--"), then set the <a href="#escape">escape flag</a> to true.</p>
+ <dd>
+ <p>If the <a href="#content3">content model flag</a> is set to either the
+ RCDATA state or the CDATA state, and the <a href="#escape">escape
+ flag</a> is false, and there are at least three characters before this
+ one in the input stream, and the last four characters in the input
+ stream, including this one, are U+003C LESS-THAN SIGN, U+0021
+ EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D HYPHEN-MINUS
+ ("&lt;!--"), then set the <a href="#escape">escape flag</a> to true.</p>
- <p>In any case, emit the input character as a character token. Stay in
- the <a href="#data-state">data state</a>.</p>
+ <p>In any case, emit the input character as a character token. Stay in
+ the <a href="#data-state0">data state</a>.</p>
- <dt>U+003C LESS-THAN SIGN (&lt;)
+ <dt>U+003C LESS-THAN SIGN (&lt;)
- <dd>When the <a href="#content3">content model flag</a> is set to the
- PCDATA state: switch to the <a href="#tag-open">tag open state</a>.
+ <dd>When the <a href="#content3">content model flag</a> is set to the
+ PCDATA state: switch to the <a href="#tag-open0">tag open state</a>.
- <dd>When the <a href="#content3">content model flag</a> is set to either
- the RCDATA state or the CDATA state and the <a href="#escape">escape
- flag</a> is false: switch to the <a href="#tag-open">tag open
- state</a>.
+ <dd>When the <a href="#content3">content model flag</a> is set to either
+ the RCDATA state or the CDATA state and the <a href="#escape">escape
+ flag</a> is false: switch to the <a href="#tag-open0">tag open state</a>.
- <dd>Otherwise: treat it as per the "anything else" entry below.
+ <dd>Otherwise: treat it as per the "anything else" entry below.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>
- <p>If the <a href="#content3">content model flag</a> is set to either
- the RCDATA state or the CDATA state, and the <a href="#escape">escape
- flag</a> is true, and the last three characters in the input stream
- including this one are U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS,
- U+003E GREATER-THAN SIGN ("-->"), set the <a href="#escape">escape
- flag</a> to false.</p>
- <!-- no need to check
- that there are enough characters, since you can only run into
- this if the flag is true in the first place, which requires four
- characters. -->
-
- <p>In any case, emit the input character as a character token. Stay in
- the <a href="#data-state">data state</a>.</p>
+ <dd>
+ <p>If the <a href="#content3">content model flag</a> is set to either the
+ RCDATA state or the CDATA state, and the <a href="#escape">escape
+ flag</a> is true, and the last three characters in the input stream
+ including this one are U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E
+ GREATER-THAN SIGN ("-->"), set the <a href="#escape">escape flag</a>
+ to false.</p>
+ <!-- no need to check
+ that there are enough characters, since you can only run into
+ this if the flag is true in the first place, which requires four
+ characters. -->
+
+ <p>In any case, emit the input character as a character token. Stay in
+ the <a href="#data-state0">data state</a>.</p>
- <dt>EOF
+ <dt>EOF
- <dd>Emit an end-of-file token.
+ <dd>Emit an end-of-file token.
- <dt>Anything else
+ <dt>Anything else
- <dd>Emit the input character as a character token. Stay in the <a
- href="#data-state">data state</a>.
- </dl>
+ <dd>Emit the input character as a character token. Stay in the <a
+ href="#data-state0">data state</a>.
+ </dl>
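Annotation (not part of the diff): the data state rules above can be read as a per-character decision function. The following is a minimal sketch under stated assumptions, not the specification's own algorithm text. The function name `data_state_step`, the tuple-shaped actions, and the string constants for the content model flag are all hypothetical names chosen for illustration.

```python
PCDATA, RCDATA, CDATA = "PCDATA", "RCDATA", "CDATA"  # hypothetical flag values


def data_state_step(stream, pos, content_model, escape):
    """Apply the data state to the character at stream[pos].

    Returns (action, escape), where action is one of
    ("emit", ch), ("switch", state_name) or ("eof",).
    """
    if pos >= len(stream):
        return ("eof",), escape  # emit an end-of-file token
    ch = stream[pos]
    if ch == "&":
        if content_model in (PCDATA, RCDATA) and not escape:
            return ("switch", "character reference data state"), escape
        return ("emit", ch), escape  # otherwise: "anything else"
    if ch == "-":
        # Needs at least three characters before this one, spelling "<!--".
        if (content_model in (RCDATA, CDATA) and not escape
                and pos >= 3 and stream[pos - 3:pos + 1] == "<!--"):
            escape = True
        return ("emit", ch), escape
    if ch == "<":
        if content_model == PCDATA:
            return ("switch", "tag open state"), escape
        if content_model in (RCDATA, CDATA) and not escape:
            return ("switch", "tag open state"), escape
        return ("emit", ch), escape
    if ch == ">":
        if (content_model in (RCDATA, CDATA) and escape
                and pos >= 2 and stream[pos - 2:pos + 1] == "-->"):
            escape = False
        return ("emit", ch), escape
    return ("emit", ch), escape
```

The escape flag is passed in and returned rather than stored globally, so a caller can thread it through successive characters.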
- <dt><dfn id=character4>Character reference data state</dfn>
+ <h5 id=character1><span class=secno>8.2.4.2. </span><dfn
+ id=character6>Character reference data state</dfn></h5>
- <dd>
- <p><em>(This cannot happen if the <a href="#content3">content model
- flag</a> is set to the CDATA state.)</em></p>
+ <p><em>(This cannot happen if the <a href="#content3">content model
+ flag</a> is set to the CDATA state.)</em>
- <p>Attempt to <a href="#consume">consume a character reference</a>, with
- no <a href="#additional">additional allowed character</a>.</p>
+ <p>Attempt to <a href="#consume">consume a character reference</a>, with no
+ <a href="#additional">additional allowed character</a>.
- <p>If nothing is returned, emit a U+0026 AMPERSAND character token.</p>
+ <p>If nothing is returned, emit a U+0026 AMPERSAND character token.
- <p>Otherwise, emit the character token that was returned.</p>
+ <p>Otherwise, emit the character token that was returned.
- <p>Finally, switch to the <a href="#data-state">data state</a>.</p>
+ <p>Finally, switch to the <a href="#data-state0">data state</a>.
- <dt><dfn id=tag-open>Tag open state</dfn>
+ <h5 id=tag-open><span class=secno>8.2.4.3. </span><dfn id=tag-open0>Tag
+ open state</dfn></h5>
- <dd>
- <p>The behavior of this state depends on the <a href="#content3">content
- model flag</a>.</p>
+ <p>The behavior of this state depends on the <a href="#content3">content
+ model flag</a>.
- <dl>
- <dt>If the <a href="#content3">content model flag</a> is set to the
- RCDATA or CDATA states
+ <dl>
+ <dt>If the <a href="#content3">content model flag</a> is set to the RCDATA
+ or CDATA states
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>. If it is
- a U+002F SOLIDUS (/) character, switch to the <a href="#close3">close
- tag open state</a>. Otherwise, emit a U+003C LESS-THAN SIGN character
- token and reconsume the current input character in the <a
- href="#data-state">data state</a>.</p>
+ <dd>
+ <p>Consume the <a href="#next-input">next input character</a>. If it is a
+ U+002F SOLIDUS (/) character, switch to the <a href="#close4">close tag
+ open state</a>. Otherwise, emit a U+003C LESS-THAN SIGN character token
+ and reconsume the current input character in the <a
+ href="#data-state0">data state</a>.</p>
- <dt>If the <a href="#content3">content model flag</a> is set to the
- PCDATA state
+ <dt>If the <a href="#content3">content model flag</a> is set to the PCDATA
+ state
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <dd>
+ <p>Consume the <a href="#next-input">next input character</a>:</p>
- <dl class=switch>
- <dt>U+0021 EXCLAMATION MARK (!)
+ <dl class=switch>
+ <dt>U+0021 EXCLAMATION MARK (!)
- <dd>Switch to the <a href="#markup">markup declaration open state</a>.
+ <dd>Switch to the <a href="#markup0">markup declaration open state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#close3">close tag open state</a>.
+ <dd>Switch to the <a href="#close4">close tag open state</a>.
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL
- LETTER Z
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
+ Z
- <dd>Create a new start tag token, set its tag name to the lowercase
- version of the input character (add 0x0020 to the character's code
- point), then switch to the <a href="#tag-name0">tag name state</a>.
- (Don't emit the token yet; further details will be filled in before
- it is emitted.)
+ <dd>Create a new start tag token, set its tag name to the lowercase
+ version of the input character (add 0x0020 to the character's code
+ point), then switch to the <a href="#tag-name1">tag name state</a>.
+ (Don't emit the token yet; further details will be filled in before it
+ is emitted.)
- <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
+ <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
- <dd>Create a new start tag token, set its tag name to the input
- character, then switch to the <a href="#tag-name0">tag name
- state</a>. (Don't emit the token yet; further details will be filled
- in before it is emitted.)
+ <dd>Create a new start tag token, set its tag name to the input
+ character, then switch to the <a href="#tag-name1">tag name state</a>.
+ (Don't emit the token yet; further details will be filled in before it
+ is emitted.)
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
- character token and a U+003E GREATER-THAN SIGN character token.
- Switch to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
+ character token and a U+003E GREATER-THAN SIGN character token. Switch
+ to the <a href="#data-state0">data state</a>.
- <dt>U+003F QUESTION MARK (?)
+ <dt>U+003F QUESTION MARK (?)
- <dd><a href="#parse2">Parse error</a>. Switch to the <a
- href="#bogus">bogus comment state</a>.
+ <dd><a href="#parse2">Parse error</a>. Switch to the <a
+ href="#bogus1">bogus comment state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
- character token and reconsume the current input character in the <a
- href="#data-state">data state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
+ character token and reconsume the current input character in the <a
+ href="#data-state0">data state</a>.
</dl>
+ </dl>
- <dt><dfn id=close3>Close tag open state</dfn>
+ <h5 id=close><span class=secno>8.2.4.4. </span><dfn id=close4>Close tag
+ open state</dfn></h5>
- <dd>
- <p>If the <a href="#content3">content model flag</a> is set to the RCDATA
- or CDATA states but no start tag token has ever been emitted by this
- instance of the tokeniser (<a href="#fragment">fragment case</a>), or,
- if the <a href="#content3">content model flag</a> is set to the RCDATA
- or CDATA states and the next few characters do not match the tag name of
- the last start tag token emitted (case insensitively), or if they do but
- they are not immediately followed by one of the following characters:</p>
+ <p>If the <a href="#content3">content model flag</a> is set to the RCDATA
+ or CDATA states but no start tag token has ever been emitted by this
+ instance of the tokeniser (<a href="#fragment">fragment case</a>), or, if
+ the <a href="#content3">content model flag</a> is set to the RCDATA or
+ CDATA states and the next few characters do not match the tag name of the
+ last start tag token emitted (case insensitively), or if they do but they
+ are not immediately followed by one of the following characters:
- <ul class=brief>
- <li>U+0009 CHARACTER TABULATION
+ <ul class=brief>
+ <li>U+0009 CHARACTER TABULATION
- <li>U+000A LINE FEED (LF)
+ <li>U+000A LINE FEED (LF)
- <li>U+000C FORM FEED (FF)</li>
- <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
+ <li>U+000C FORM FEED (FF)</li>
+ <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
- <li>U+0020 SPACE
+ <li>U+0020 SPACE
- <li>U+003E GREATER-THAN SIGN (>)
+ <li>U+003E GREATER-THAN SIGN (>)
- <li>U+002F SOLIDUS (/)
+ <li>U+002F SOLIDUS (/)
- <li>EOF
- </ul>
+ <li>EOF
+ </ul>
- <p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
- character token, and switch to the <a href="#data-state">data state</a>
- to process the <a href="#next-input">next input character</a>.</p>
+ <p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F SOLIDUS
+ character token, and switch to the <a href="#data-state0">data state</a>
+ to process the <a href="#next-input">next input character</a>.
- <p>Otherwise, if the <a href="#content3">content model flag</a> is set to
- the PCDATA state, or if the next few characters <em>do</em> match that
- tag name, consume the <a href="#next-input">next input character</a>:</p>
+ <p>Otherwise, if the <a href="#content3">content model flag</a> is set to
+ the PCDATA state, or if the next few characters <em>do</em> match that tag
+ name, consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
- Z
+ <dl class=switch>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z
- <dd>Create a new end tag token, set its tag name to the lowercase
- version of the input character (add 0x0020 to the character's code
- point), then switch to the <a href="#tag-name0">tag name state</a>.
- (Don't emit the token yet; further details will be filled in before it
- is emitted.)
+ <dd>Create a new end tag token, set its tag name to the lowercase version
+ of the input character (add 0x0020 to the character's code point), then
+ switch to the <a href="#tag-name1">tag name state</a>. (Don't emit the
+ token yet; further details will be filled in before it is emitted.)
- <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
+ <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
- <dd>Create a new end tag token, set its tag name to the input character,
- then switch to the <a href="#tag-name0">tag name state</a>. (Don't emit
- the token yet; further details will be filled in before it is emitted.)
+ <dd>Create a new end tag token, set its tag name to the input character,
+ then switch to the <a href="#tag-name1">tag name state</a>. (Don't emit
+ the token yet; further details will be filled in before it is emitted.)
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Switch to the <a
- href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
- character token and a U+002F SOLIDUS character token. Reconsume the EOF
- character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit a U+003C LESS-THAN SIGN
+ character token and a U+002F SOLIDUS character token. Reconsume the EOF
+ character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Switch to the <a
- href="#bogus">bogus comment state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Switch to the <a
+ href="#bogus1">bogus comment state</a>.
+ </dl>
- <dt><dfn id=tag-name0>Tag name state</dfn>
+ <h5 id=tag-name><span class=secno>8.2.4.5. </span><dfn id=tag-name1>Tag
+ name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#before">before attribute name state</a>.
+ <dd>Switch to the <a href="#before4">before attribute name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
- Z
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z
- <dd>Append the lowercase version of the current input character (add
- 0x0020 to the character's code point) to the current tag token's tag
- name. Stay in the <a href="#tag-name0">tag name state</a>.
+ <dd>Append the lowercase version of the current input character (add
+ 0x0020 to the character's code point) to the current tag token's tag
+ name. Stay in the <a href="#tag-name1">tag name state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#self-closing">self-closing start tag
- state</a>.
+ <dd>Switch to the <a href="#self-closing0">self-closing start tag
+ state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current tag token's tag
- name. Stay in the <a href="#tag-name0">tag name state</a>.
- </dl>
+ <dd>Append the current input character to the current tag token's tag
+ name. Stay in the <a href="#tag-name1">tag name state</a>.
+ </dl>
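Annotation (not part of the diff): the tag name state is a simple loop that accumulates characters until a delimiter or EOF. Below is a rough sketch, assuming the input is a Python string and that running off its end stands for EOF. `scan_tag_name` and its return tuple are hypothetical. In the specification the first letter is consumed by the tag open or close tag open state, so a caller would normally pass that letter in via `name`.

```python
WHITESPACE = "\t\n\f "  # TAB, LF, FF, SPACE (CR is commented out in the source)


def scan_tag_name(stream, pos, name=""):
    """Run the tag name state from stream[pos].

    Returns (tag_name, next_state, new_pos, parse_error).
    """
    while True:
        if pos >= len(stream):
            # EOF: parse error, emit the tag token, reconsume EOF in the data state.
            return name, "data state", pos, True
        ch = stream[pos]
        pos += 1
        if ch in WHITESPACE:
            return name, "before attribute name state", pos, False
        if ch == ">":
            # Emit the current tag token and go back to the data state.
            return name, "data state", pos, False
        if ch == "/":
            return name, "self-closing start tag state", pos, False
        if "A" <= ch <= "Z":
            # Lowercase by adding 0x0020 to the code point, as the text says.
            ch = chr(ord(ch) + 0x20)
        name += ch
```

For example, `scan_tag_name("DiV class", 0)` accumulates the lowercased name and stops just after the space.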
- <dt><dfn id=before>Before attribute name state</dfn>
+ <h5 id=before><span class=secno>8.2.4.6. </span><dfn id=before4>Before
+ attribute name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#before">before attribute name state</a>.
+ <dd>Stay in the <a href="#before4">before attribute name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
- Z
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z
- <dd>Start a new attribute in the current tag token. Set that attribute's
- name to the lowercase version of the current input character (add
- 0x0020 to the character's code point), and its value to the empty
- string. Switch to the <a href="#attribute1">attribute name state</a>.
+ <dd>Start a new attribute in the current tag token. Set that attribute's
+ name to the lowercase version of the current input character (add 0x0020
+ to the character's code point), and its value to the empty string. Switch
+ to the <a href="#attribute5">attribute name state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#self-closing">self-closing start tag
- state</a>.
+ <dd>Switch to the <a href="#self-closing0">self-closing start tag
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dt>U+003D EQUALS SIGN (=)
+ <dt>U+003D EQUALS SIGN (=)
- <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything
- else" entry below.
+ <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything else"
+ entry below.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Start a new attribute in the current tag token. Set that attribute's
- name to the current input character, and its value to the empty string.
- Switch to the <a href="#attribute1">attribute name state</a>.
- </dl>
+ <dd>Start a new attribute in the current tag token. Set that attribute's
+ name to the current input character, and its value to the empty string.
+ Switch to the <a href="#attribute5">attribute name state</a>.
+ </dl>
- <dt><dfn id=attribute1>Attribute name state</dfn>
+ <h5 id=attribute><span class=secno>8.2.4.7. </span><dfn
+ id=attribute5>Attribute name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#after">after attribute name state</a>.
+ <dd>Switch to the <a href="#after4">after attribute name state</a>.
- <dt>U+003D EQUALS SIGN (=)
+ <dt>U+003D EQUALS SIGN (=)
- <dd>Switch to the <a href="#before0">before attribute value state</a>.
+ <dd>Switch to the <a href="#before5">before attribute value state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
- Z
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z
- <dd>Append the lowercase version of the current input character (add
- 0x0020 to the character's code point) to the current attribute's name.
- Stay in the <a href="#attribute1">attribute name state</a>.
+ <dd>Append the lowercase version of the current input character (add
+ 0x0020 to the character's code point) to the current attribute's name.
+ Stay in the <a href="#attribute5">attribute name state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#self-closing">self-closing start tag
- state</a>.
+ <dd>Switch to the <a href="#self-closing0">self-closing start tag
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything
- else" entry below.
+ <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything else"
+ entry below.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current attribute's name.
- Stay in the <a href="#attribute1">attribute name state</a>.
- </dl>
+ <dd>Append the current input character to the current attribute's name.
+ Stay in the <a href="#attribute5">attribute name state</a>.
+ </dl>
- <p>When the user agent leaves the attribute name state (and before
- emitting the tag token, if appropriate), the complete attribute's name
- must be compared to the other attributes on the same token; if there is
- already an attribute on the token with the exact same name, then this is
- a <a href="#parse2">parse error</a> and the new attribute must be
- dropped, along with the value that gets associated with it (if any).</p>
+ <p>When the user agent leaves the attribute name state (and before emitting
+ the tag token, if appropriate), the complete attribute's name must be
+ compared to the other attributes on the same token; if there is already an
+ attribute on the token with the exact same name, then this is a <a
+ href="#parse2">parse error</a> and the new attribute must be dropped,
+ along with the value that gets associated with it (if any).
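Annotation (not part of the diff): the duplicate-attribute rule above means the first occurrence of a name wins and each later duplicate counts as one parse error. This can be sketched as follows, assuming the names have already been lowercased by the attribute name state. The helper `collect_attributes` and its list-of-pairs input are hypothetical simplifications, since a real tokeniser would make the comparison incrementally when it leaves the state.

```python
def collect_attributes(pairs):
    """Keep the first attribute for each name and drop later duplicates.

    pairs: iterable of (name, value) tuples in source order.
    Returns (kept_attributes, parse_error_count).
    """
    kept = {}
    parse_errors = 0
    for name, value in pairs:
        if name in kept:
            # Exact same name already on the token: parse error, and the
            # new attribute is dropped together with its value.
            parse_errors += 1
            continue
        kept[name] = value
    return kept, parse_errors
```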
- <dt><dfn id=after>After attribute name state</dfn>
+ <h5 id=after><span class=secno>8.2.4.8. </span><dfn id=after4>After
+ attribute name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#after">after attribute name state</a>.
+ <dd>Stay in the <a href="#after4">after attribute name state</a>.
- <dt>U+003D EQUALS SIGN (=)
+ <dt>U+003D EQUALS SIGN (=)
- <dd>Switch to the <a href="#before0">before attribute value state</a>.
+ <dd>Switch to the <a href="#before5">before attribute value state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER
- Z
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z
- <dd>Start a new attribute in the current tag token. Set that attribute's
- name to the lowercase version of the current input character (add
- 0x0020 to the character's code point), and its value to the empty
- string. Switch to the <a href="#attribute1">attribute name state</a>.
+ <dd>Start a new attribute in the current tag token. Set that attribute's
+ name to the lowercase version of the current input character (add 0x0020
+ to the character's code point), and its value to the empty string. Switch
+ to the <a href="#attribute5">attribute name state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#self-closing">self-closing start tag
- state</a>.
+ <dd>Switch to the <a href="#self-closing0">self-closing start tag
+ state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Start a new attribute in the current tag token. Set that attribute's
- name to the current input character, and its value to the empty string.
- Switch to the <a href="#attribute1">attribute name state</a>.
- </dl>
+ <dd>Start a new attribute in the current tag token. Set that attribute's
+ name to the current input character, and its value to the empty string.
+ Switch to the <a href="#attribute5">attribute name state</a>.
+ </dl>
- <dt><dfn id=before0>Before attribute value state</dfn>
+ <h5 id=before0><span class=secno>8.2.4.9. </span><dfn id=before5>Before
+ attribute value state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#before0">before attribute value state</a>.
+ <dd>Stay in the <a href="#before5">before attribute value state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Switch to the <a href="#attribute2">attribute value (double-quoted)
- state</a>.
+ <dd>Switch to the <a href="#attribute6">attribute value (double-quoted)
+ state</a>.
- <dt>U+0026 AMPERSAND (&)
+ <dt>U+0026 AMPERSAND (&)
- <dd>Switch to the <a href="#attribute4">attribute value (unquoted)
- state</a> and reconsume this input character.
+ <dd>Switch to the <a href="#attribute8">attribute value (unquoted)
+ state</a> and reconsume this input character.
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dd>Switch to the <a href="#attribute3">attribute value (single-quoted)
- state</a>.
+ <dd>Switch to the <a href="#attribute7">attribute value (single-quoted)
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+003D EQUALS SIGN (=)
+ <dt>U+003D EQUALS SIGN (=)
- <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything
- else" entry below.
+ <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything else"
+ entry below.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current attribute's value.
- Switch to the <a href="#attribute4">attribute value (unquoted)
- state</a>.
- </dl>
+ <dd>Append the current input character to the current attribute's value.
+ Switch to the <a href="#attribute8">attribute value (unquoted) state</a>.
+ </dl>
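Annotation (not part of the diff): the before attribute value state is a pure dispatch on one character, which makes it easy to tabulate. A minimal sketch follows, assuming `None` stands for EOF. The `step` helper and the dictionary keys are hypothetical names for illustration only.

```python
EOF = None  # assumption: EOF is represented as None


def step(state, parse_error=False, emit_tag=False, reconsume=False, append=""):
    """Bundle the outcome of processing one character."""
    return dict(state=state, parse_error=parse_error, emit_tag=emit_tag,
                reconsume=reconsume, append=append)


def before_attribute_value(ch):
    """Return the outcome of the before attribute value state for ch."""
    if ch is EOF:
        return step("data state", parse_error=True, emit_tag=True, reconsume=True)
    if ch in ("\t", "\n", "\f", " "):
        return step("before attribute value state")
    if ch == '"':
        return step("attribute value (double-quoted) state")
    if ch == "&":
        # The ampersand is reconsumed by the unquoted state, not appended here.
        return step("attribute value (unquoted) state", reconsume=True)
    if ch == "'":
        return step("attribute value (single-quoted) state")
    if ch == ">":
        return step("data state", emit_tag=True)
    # "=" is a parse error but is otherwise handled as "anything else".
    return step("attribute value (unquoted) state",
                parse_error=(ch == "="), append=ch)
```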
- <dt><dfn id=attribute2>Attribute value (double-quoted) state</dfn>
+ <h5 id=attribute0><span class=secno>8.2.4.10. </span><dfn
+ id=attribute6>Attribute value (double-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0022 QUOTATION MARK (")
+ <dl class=switch>
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Switch to the <a href="#after0">after attribute value (quoted)
- state</a>.
+ <dd>Switch to the <a href="#after5">after attribute value (quoted)
+ state</a>.
- <dt>U+0026 AMPERSAND (&)
+ <dt>U+0026 AMPERSAND (&)
- <dd>Switch to the <a href="#character5">character reference in attribute
- value state</a>, with the <a href="#additional">additional allowed
- character</a> being U+0022 QUOTATION MARK (").
+ <dd>Switch to the <a href="#character7">character reference in attribute
+ value state</a>, with the <a href="#additional">additional allowed
+ character</a> being U+0022 QUOTATION MARK (").
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current attribute's value.
- Stay in the <a href="#attribute2">attribute value (double-quoted)
- state</a>.
- </dl>
+ <dd>Append the current input character to the current attribute's value.
+ Stay in the <a href="#attribute6">attribute value (double-quoted)
+ state</a>.
+ </dl>
- <dt><dfn id=attribute3>Attribute value (single-quoted) state</dfn>
+ <h5 id=attribute1><span class=secno>8.2.4.11. </span><dfn
+ id=attribute7>Attribute value (single-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0027 APOSTROPHE (')
+ <dl class=switch>
+ <dt>U+0027 APOSTROPHE (')
- <dd>Switch to the <a href="#after0">after attribute value (quoted)
- state</a>.
+ <dd>Switch to the <a href="#after5">after attribute value (quoted)
+ state</a>.
- <dt>U+0026 AMPERSAND (&)
+ <dt>U+0026 AMPERSAND (&)
- <dd>Switch to the <a href="#character5">character reference in attribute
- value state</a>, with the <a href="#additional">additional allowed
- character</a> being U+0027 APOSTROPHE (').
+ <dd>Switch to the <a href="#character7">character reference in attribute
+ value state</a>, with the <a href="#additional">additional allowed
+ character</a> being U+0027 APOSTROPHE (').
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current attribute's value.
- Stay in the <a href="#attribute3">attribute value (single-quoted)
- state</a>.
- </dl>
+ <dd>Append the current input character to the current attribute's value.
+ Stay in the <a href="#attribute7">attribute value (single-quoted)
+ state</a>.
+ </dl>
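Reviewer's note: the two quoted attribute value states differ only in the closing quote character, so they can be sketched as one function. This is a minimal sketch, not the spec's algorithm verbatim. The names `attribute_value_quoted` and `EOF`, and the returned tuple shape, are hypothetical. Character references are NOT decoded: the sketch assumes "consume a character reference" always returns nothing, so an ampersand is appended literally.

```python
EOF = None  # hypothetical sentinel standing in for the end of the input


def attribute_value_quoted(text, pos, quote):
    """Sketch of the attribute value (double-/single-quoted) states.

    `quote` is '"' or "'". Returns (value, next_state, pos, parse_errors).
    Assumption: character references are never decoded, so '&' is kept as-is.
    """
    value = []
    while True:
        ch = text[pos] if pos < len(text) else EOF
        if ch is EOF:
            # Parse error: the tag token is emitted and EOF is reconsumed
            # in the data state, so pos is left pointing at the end.
            return "".join(value), "data", pos, 1
        pos += 1
        if ch == quote:
            return "".join(value), "after attribute value (quoted)", pos, 0
        # "Anything else" (and, by the assumption above, '&'):
        # append to the attribute's value and stay in this state.
        value.append(ch)


result = attribute_value_quoted('abc" x', 0, '"')
```

The other quote character is ordinary data inside the value, which is why `quote` is passed in rather than checking for both.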
- <dt><dfn id=attribute4>Attribute value (unquoted) state</dfn>
+ <h5 id=attribute2><span class=secno>8.2.4.12. </span><dfn
+ id=attribute8>Attribute value (unquoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#before">before attribute name state</a>.
+ <dd>Switch to the <a href="#before4">before attribute name state</a>.
- <dt>U+0026 AMPERSAND (&)
+ <dt>U+0026 AMPERSAND (&)
- <dd>Switch to the <a href="#character5">character reference in attribute
- value state</a>, with no <a href="#additional">additional allowed
- character</a>.
+ <dd>Switch to the <a href="#character7">character reference in attribute
+ value state</a>, with no <a href="#additional">additional allowed
+ character</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dt>U+003D EQUALS SIGN (=)
+ <dt>U+003D EQUALS SIGN (=)
- <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything
- else" entry below.
+ <dd><a href="#parse2">Parse error</a>. Treat it as per the "anything else"
+ entry below.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current attribute's value.
- Stay in the <a href="#attribute4">attribute value (unquoted) state</a>.
- </dl>
+ <dd>Append the current input character to the current attribute's value.
+ Stay in the <a href="#attribute8">attribute value (unquoted) state</a>.
+ </dl>
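Reviewer's note: the unquoted state has more exits than the quoted ones, so a separate sketch may help. The function name, the `WHITESPACE` set and the return shape are hypothetical, and as a simplifying assumption character references are not decoded (an ampersand is appended literally).

```python
WHITESPACE = {"\t", "\n", "\f", " "}  # CR is commented out in the spec text


def attribute_value_unquoted(text, pos):
    """Sketch of the attribute value (unquoted) state.

    Returns (value, next_state, pos, parse_errors).
    Assumption: '&' is kept literally instead of entering the
    character reference in attribute value state.
    """
    value = []
    errors = 0
    while True:
        if pos >= len(text):
            # EOF: parse error, emit the tag, reconsume EOF in the data state.
            return "".join(value), "data", pos, errors + 1
        ch = text[pos]
        pos += 1
        if ch in WHITESPACE:
            return "".join(value), "before attribute name", pos, errors
        if ch == ">":
            # Emit the current tag token and go back to the data state.
            return "".join(value), "data", pos, errors
        if ch in ('"', "'", "="):
            # Parse error, then treated as "anything else".
            errors += 1
        value.append(ch)


result = attribute_value_unquoted("a=b>", 0)
```

Note that a stray quote or equals sign is reported as a parse error but still ends up in the value, exactly as the "anything else" entry describes.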
- <dt><dfn id=character5>Character reference in attribute value state</dfn>
+ <h5 id=character2><span class=secno>8.2.4.13. </span><dfn
+ id=character7>Character reference in attribute value state</dfn></h5>
- <dd>
- <p>Attempt to <a href="#consume">consume a character reference</a>.</p>
+ <p>Attempt to <a href="#consume">consume a character reference</a>.
- <p>If nothing is returned, append a U+0026 AMPERSAND character to the
- current attribute's value.</p>
+ <p>If nothing is returned, append a U+0026 AMPERSAND character to the
+ current attribute's value.
- <p>Otherwise, append the returned character token to the current
- attribute's value.</p>
+ <p>Otherwise, append the returned character token to the current
+ attribute's value.
- <p>Finally, switch back to the attribute value state that you were in
- when you were switched into this state.</p>
+ <p>Finally, switch back to the attribute value state that you were in when
+ you were switched into this state.
- <dt><dfn id=after0>After attribute value (quoted) state</dfn>
+ <h5 id=after0><span class=secno>8.2.4.14. </span><dfn id=after5>After
+ attribute value (quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#before">before attribute name state</a>.
+ <dd>Switch to the <a href="#before4">before attribute name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current tag token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the current tag token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+002F SOLIDUS (/)
+ <dt>U+002F SOLIDUS (/)
- <dd>Switch to the <a href="#self-closing">self-closing start tag
- state</a>.
+ <dd>Switch to the <a href="#self-closing0">self-closing start tag
+ state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Reconsume the character in the <a
- href="#before">before attribute name state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Reconsume the character in the <a
+ href="#before4">before attribute name state</a>.
+ </dl>
- <dt><dfn id=self-closing>Self-closing start tag state</dfn>
+ <h5 id=self-closing><span class=secno>8.2.4.15. </span><dfn
+ id=self-closing0>Self-closing start tag state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dl class=switch>
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Set the <i>self-closing flag</i> of the current tag token. Emit the
- current tag token. Switch to the <a href="#data-state">data state</a>.
+ <dd>Set the <i>self-closing flag</i> of the current tag token. Emit the
+ current tag token. Switch to the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the current tag token.
+ Reconsume the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Reconsume the character in the <a
- href="#before">before attribute name state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Reconsume the character in the <a
+ href="#before4">before attribute name state</a>.
+ </dl>
- <dt><dfn id=bogus>Bogus comment state</dfn>
+ <h5 id=bogus><span class=secno>8.2.4.16. </span><dfn id=bogus1>Bogus
+ comment state</dfn></h5>
- <dd>
- <p><em>(This can only happen if the <a href="#content3">content model
- flag</a> is set to the PCDATA state.)</em></p>
+ <p><em>(This can only happen if the <a href="#content3">content model
+ flag</a> is set to the PCDATA state.)</em>
- <p>Consume every character up to and including the first U+003E
- GREATER-THAN SIGN character (>) or the end of the file (EOF),
- whichever comes first. Emit a comment token whose data is the
- concatenation of all the characters starting from and including the
- character that caused the state machine to switch into the bogus comment
- state, up to and including the character immediately before the last
- consumed character (i.e. up to the character just before the U+003E or
- EOF character). (If the comment was started by the end of the file
- (EOF), the token is empty.)</p>
+ <p>Consume every character up to and including the first U+003E
+ GREATER-THAN SIGN character (>) or the end of the file (EOF), whichever
+ comes first. Emit a comment token whose data is the concatenation of all
+ the characters starting from and including the character that caused the
+ state machine to switch into the bogus comment state, up to and including
+ the character immediately before the last consumed character (i.e. up to
+ the character just before the U+003E or EOF character). (If the comment
+ was started by the end of the file (EOF), the token is empty.)
- <p>Switch to the <a href="#data-state">data state</a>.</p>
+ <p>Switch to the <a href="#data-state0">data state</a>.
- <p>If the end of the file was reached, reconsume the EOF character.</p>
+ <p>If the end of the file was reached, reconsume the EOF character.
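Reviewer's note: the bogus comment state is a single scan rather than a per-character switch. A minimal sketch follows, assuming `pos` is the index of the character that caused the switch into this state. The function name and return shape are hypothetical.

```python
def bogus_comment(text, pos):
    """Sketch of the bogus comment state.

    Returns (comment_data, new_pos, reconsume_eof). The comment data runs
    from text[pos] up to, but not including, the first '>' or the end of
    the input, whichever comes first.
    """
    end = text.find(">", pos)
    if end == -1:
        # No '>' before EOF: everything left is comment data, and the
        # EOF character is to be reconsumed in the data state.
        return text[pos:], len(text), True
    # The '>' itself is consumed but is not part of the comment data.
    return text[pos:end], end + 1, False


data, new_pos, at_eof = bogus_comment("?xml version>rest", 0)
```

When the state is entered at the end of the file, the slice is empty, which matches the note that the token is empty in that case.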
- <dt><dfn id=markup>Markup declaration open state</dfn>
+ <h5 id=markup><span class=secno>8.2.4.17. </span><dfn id=markup0>Markup
+ declaration open state</dfn></h5>
- <dd>
- <p><em>(This can only happen if the <a href="#content3">content model
- flag</a> is set to the PCDATA state.)</em></p>
+ <p><em>(This can only happen if the <a href="#content3">content model
+ flag</a> is set to the PCDATA state.)</em>
- <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
- characters, consume those two characters, create a comment token whose
- data is the empty string, and switch to the <a href="#comment0">comment
- start state</a>.</p>
+ <p>If the next two characters are both U+002D HYPHEN-MINUS (-) characters,
+ consume those two characters, create a comment token whose data is the
+ empty string, and switch to the <a href="#comment4">comment start
+ state</a>.
- <p>Otherwise, if the next seven characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match for the
- word "DOCTYPE", then consume those characters and switch to the <a
- href="#doctype0">DOCTYPE state</a>.</p>
+ <p>Otherwise, if the next seven characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match for the
+ word "DOCTYPE", then consume those characters and switch to the <a
+ href="#doctype6">DOCTYPE state</a>.
- <p>Otherwise, if the <span>insertion mode</span> is "<a
- href="#in-foreign" title="insertion mode: in foreign content">in foreign
- content</a>" and the <a href="#current5">current node</a> is not an
- element in the <a href="#html-namespace0">HTML namespace</a> and the
- next seven characters are a
- <span>case-sensitive</span><!-- XXX xref, ascii
- only --> match for
- the string "[CDATA[" (the five uppercase letters "CDATA" with a U+005B
- LEFT SQUARE BRACKET character before and after), then consume those
- characters and switch to the <a href="#cdata1">CDATA section state</a>
- (which is unrelated to the <a href="#content3">content model flag</a>'s
- CDATA state).</p>
+ <p>Otherwise, if the <span>insertion mode</span> is "<a href="#in-foreign"
+ title="insertion mode: in foreign content">in foreign content</a>" and the
+ <a href="#current5">current node</a> is not an element in the <a
+ href="#html-namespace0">HTML namespace</a> and the next seven characters
+ are a <span>case-sensitive</span><!-- XXX xref, ascii
+ only --> match for
+ the string "[CDATA[" (the five uppercase letters "CDATA" with a U+005B
+ LEFT SQUARE BRACKET character before and after), then consume those
+ characters and switch to the <a href="#cdata2">CDATA section state</a>
+ (which is unrelated to the <a href="#content3">content model flag</a>'s
+ CDATA state).
- <p>Otherwise, this is a <a href="#parse2">parse error</a>. Switch to the
- <a href="#bogus">bogus comment state</a>. The next character that is
- consumed, if any, is the first character that will be in the comment.</p>
+ <p>Otherwise, this is a <a href="#parse2">parse error</a>. Switch to the <a
+ href="#bogus1">bogus comment state</a>. The next character that is
+ consumed, if any, is the first character that will be in the comment.
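Reviewer's note: the markup declaration open state is a lookahead dispatch, which can be sketched as below. All names are hypothetical. The `in_foreign_non_html` flag is an assumption that collapses the two tree-construction conditions (insertion mode is "in foreign content" and the current node is not in the HTML namespace) into one boolean supplied by the caller.

```python
def ascii_upper(s):
    """Uppercase only a-z, so the match is ASCII case-insensitive."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


def markup_declaration_open(text, pos, in_foreign_non_html=False):
    """Sketch of the markup declaration open state.

    Returns (next_state, new_pos, parse_error).
    """
    if text.startswith("--", pos):
        return "comment start", pos + 2, False
    if ascii_upper(text[pos:pos + 7]) == "DOCTYPE":
        return "DOCTYPE", pos + 7, False
    # The CDATA check is case-sensitive and only applies in foreign content.
    if in_foreign_non_html and text.startswith("[CDATA[", pos):
        return "CDATA section", pos + 7, False
    # Nothing is consumed: the next character becomes the first
    # character of the bogus comment.
    return "bogus comment", pos, True


state, new_pos, error = markup_declaration_open("doctype html>", 0)
```

The order of the checks mirrors the order of the paragraphs above; only the last branch is a parse error.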
- <dt><dfn id=comment0>Comment start state</dfn>
+ <h5 id=comment0><span class=secno>8.2.4.18. </span><dfn id=comment4>Comment
+ start state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+002D HYPHEN-MINUS (-)
+ <dl class=switch>
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd>Switch to the <a href="#comment1">comment start dash state</a>.
+ <dd>Switch to the <a href="#comment5">comment start dash state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Switch to
- the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the input character to the comment token's data. Switch to
- the <a href="#comment">comment state</a>.
- </dl>
+ <dd>Append the input character to the comment token's data. Switch to the
+ <a href="#comment">comment state</a>.
+ </dl>
- <dt><dfn id=comment1>Comment start dash state</dfn>
+ <h5 id=comment1><span class=secno>8.2.4.19. </span><dfn id=comment5>Comment
+ start dash state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+002D HYPHEN-MINUS (-)
+ <dl class=switch>
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd>Switch to the <a href="#comment3">comment end state</a>.
+ <dd>Switch to the <a href="#comment7">comment end state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Switch to
- the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append a U+002D HYPHEN-MINUS (-) character and the input character
- to the comment token's data. Switch to the <a href="#comment">comment
- state</a>.
- </dl>
+ <dd>Append a U+002D HYPHEN-MINUS (-) character and the input character to
+ the comment token's data. Switch to the <a href="#comment">comment
+ state</a>.
+ </dl>
+ <dl>
<dt><dfn id=comment>Comment state</dfn>
+ </dl>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+002D HYPHEN-MINUS (-)
+ <dl class=switch>
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd>Switch to the <a href="#comment2">comment end dash state</a>.
+ <dd>Switch to the <a href="#comment6">comment end dash state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.</dd>
- <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.</dd>
+ <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the input character to the comment token's data. Stay in the
- <a href="#comment">comment state</a>.
- </dl>
+ <dd>Append the input character to the comment token's data. Stay in the <a
+ href="#comment">comment state</a>.
+ </dl>
- <dt><dfn id=comment2>Comment end dash state</dfn>
+ <h5 id=comment2><span class=secno>8.2.4.20. </span><dfn id=comment6>Comment
+ end dash state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+002D HYPHEN-MINUS (-)
+ <dl class=switch>
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd>Switch to the <a href="#comment3">comment end state</a>.
+ <dd>Switch to the <a href="#comment7">comment end state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.</dd>
- <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.</dd>
+ <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <dt>Anything else
+ <dt>Anything else
- <dd>Append a U+002D HYPHEN-MINUS (-) character and the input character
- to the comment token's data. Switch to the <a href="#comment">comment
- state</a>.
- </dl>
+ <dd>Append a U+002D HYPHEN-MINUS (-) character and the input character to
+ the comment token's data. Switch to the <a href="#comment">comment
+ state</a>.
+ </dl>
- <dt><dfn id=comment3>Comment end state</dfn>
+ <h5 id=comment3><span class=secno>8.2.4.21. </span><dfn id=comment7>Comment
+ end state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dl class=switch>
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the comment token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the comment token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>U+002D HYPHEN-MINUS (-)
+ <dt>U+002D HYPHEN-MINUS (-)
- <dd><a href="#parse2">Parse error</a>. Append a U+002D HYPHEN-MINUS (-)
- character to the comment token's data. Stay in the <a
- href="#comment3">comment end state</a>.
+ <dd><a href="#parse2">Parse error</a>. Append a U+002D HYPHEN-MINUS (-)
+ character to the comment token's data. Stay in the <a
+ href="#comment7">comment end state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.</dd>
- <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <dd><a href="#parse2">Parse error</a>. Emit the comment token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.</dd>
+ <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Append two U+002D HYPHEN-MINUS
- (-) characters and the input character to the comment token's data.
- Switch to the <a href="#comment">comment state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Append two U+002D HYPHEN-MINUS (-)
+ characters and the input character to the comment token's data. Switch to
+ the <a href="#comment">comment state</a>.
+ </dl>
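Reviewer's note: the five comment states (comment start, comment start dash, comment, comment end dash, comment end) form a small state machine that is easier to follow end to end. The sketch below assumes `pos` points just past the opening `<!--`. The function name and return shape are hypothetical.

```python
def tokenize_comment(text, pos=0):
    """Sketch of the comment start .. comment end states.

    Returns (comment_data, new_pos, parse_errors).
    """
    state = "comment start"
    data = []
    errors = 0
    while True:
        if pos >= len(text):
            # EOF is a parse error in every comment state; the comment
            # token is emitted and EOF is reconsumed in the data state.
            return "".join(data), pos, errors + 1
        ch = text[pos]
        pos += 1
        if state == "comment start":
            if ch == "-":
                state = "comment start dash"
            elif ch == ">":
                return "".join(data), pos, errors + 1
            else:
                data.append(ch)
                state = "comment"
        elif state == "comment start dash":
            if ch == "-":
                state = "comment end"
            elif ch == ">":
                return "".join(data), pos, errors + 1
            else:
                data.append("-" + ch)
                state = "comment"
        elif state == "comment":
            if ch == "-":
                state = "comment end dash"
            else:
                data.append(ch)
        elif state == "comment end dash":
            if ch == "-":
                state = "comment end"
            else:
                data.append("-" + ch)
                state = "comment"
        else:  # comment end
            if ch == ">":
                return "".join(data), pos, errors
            errors += 1  # both remaining branches are parse errors
            if ch == "-":
                data.append("-")
            else:
                data.append("--" + ch)
                state = "comment"


data, new_pos, errors = tokenize_comment("a--b-->")
```

In the last call, the `--` in the middle of the comment is kept in the data but counted as one parse error.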
- <dt><dfn id=doctype0>DOCTYPE state</dfn>
+ <h5 id=doctype><span class=secno>8.2.4.22. </span><dfn id=doctype6>DOCTYPE
+ state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#before1">before DOCTYPE name state</a>.
+ <dd>Switch to the <a href="#before6">before DOCTYPE name state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Reconsume the current character
- in the <a href="#before1">before DOCTYPE name state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Reconsume the current character in
+ the <a href="#before6">before DOCTYPE name state</a>.
+ </dl>
- <dt><dfn id=before1>Before DOCTYPE name state</dfn>
+ <h5 id=before1><span class=secno>8.2.4.23. </span><dfn id=before6>Before
+ DOCTYPE name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#before1">before DOCTYPE name state</a>.
+ <dd>Stay in the <a href="#before6">before DOCTYPE name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Create a new DOCTYPE token. Set
- its <i>force-quirks flag</i> to <i>on</i>. Emit the token. Switch to
- the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Create a new DOCTYPE token. Set its
+ <i>force-quirks flag</i> to <i>on</i>. Emit the token. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Create a new DOCTYPE token. Set
- its <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume
- the EOF character in the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Create a new DOCTYPE token. Set its
+ <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume the EOF
+ character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Create a new DOCTYPE token. Set the token's name to the current
- input character. Switch to the <a href="#doctype1">DOCTYPE name
- state</a>.
- </dl>
+ <dd>Create a new DOCTYPE token. Set the token's name to the current input
+ character. Switch to the <a href="#doctype7">DOCTYPE name state</a>.
+ </dl>
- <dt><dfn id=doctype1>DOCTYPE name state</dfn>
+ <h5 id=doctype0><span class=secno>8.2.4.24. </span><dfn id=doctype7>DOCTYPE
+ name state</dfn></h5>
- <dd>
- <p>First, consume the <a href="#next-input">next input character</a>:</p>
+ <p>First, consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Switch to the <a href="#after1">after DOCTYPE name state</a>.
+ <dd>Switch to the <a href="#after6">after DOCTYPE name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current DOCTYPE token. Switch to the <a
- href="#data-state">data state</a>.
+ <dd>Emit the current DOCTYPE token. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current DOCTYPE token's
- name. Stay in the <a href="#doctype1">DOCTYPE name state</a>.
- </dl>
+ <dd>Append the current input character to the current DOCTYPE token's
+ name. Stay in the <a href="#doctype7">DOCTYPE name state</a>.
+ </dl>
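Reviewer's note: the DOCTYPE state, before DOCTYPE name state and DOCTYPE name state can be sketched together as one pass over the input. The sketch assumes `pos` points just past the word "DOCTYPE". The function name, `WHITESPACE` and the return shape are hypothetical, and a name of `None` stands for a token whose name was never set.

```python
WHITESPACE = {"\t", "\n", "\f", " "}  # CR is commented out in the spec text


def tokenize_doctype_name(text, pos=0):
    """Sketch of the DOCTYPE, before DOCTYPE name and DOCTYPE name states.

    Returns (name, force_quirks, next_state, new_pos, parse_errors).
    """
    errors = 0
    # DOCTYPE state: whitespace is consumed; anything else is a parse
    # error and is reconsumed in the before DOCTYPE name state.
    if pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    else:
        errors += 1
    # Before DOCTYPE name state: skip any further whitespace.
    while pos < len(text) and text[pos] in WHITESPACE:
        pos += 1
    if pos >= len(text):
        return None, True, "data", pos, errors + 1
    if text[pos] == ">":
        return None, True, "data", pos + 1, errors + 1
    name = [text[pos]]
    pos += 1
    # DOCTYPE name state: accumulate until whitespace, '>' or EOF.
    while True:
        if pos >= len(text):
            return "".join(name), True, "data", pos, errors + 1
        ch = text[pos]
        pos += 1
        if ch in WHITESPACE:
            return "".join(name), False, "after DOCTYPE name", pos, errors
        if ch == ">":
            return "".join(name), False, "data", pos, errors
        name.append(ch)


result = tokenize_doctype_name(" html>")
```

The name is kept exactly as written, since the text above sets and appends the current input character without any case conversion.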
- <dt><dfn id=after1>After DOCTYPE name state</dfn>
+ <h5 id=after1><span class=secno>8.2.4.25. </span><dfn id=after6>After
+ DOCTYPE name state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#after1">after DOCTYPE name state</a>.
+ <dd>Stay in the <a href="#after6">after DOCTYPE name state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current DOCTYPE token. Switch to the <a
- href="#data-state">data state</a>.
+ <dd>Emit the current DOCTYPE token. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>
- <p>If the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match for
- the word "PUBLIC", then consume those characters and switch to the <a
- href="#before2">before DOCTYPE public identifier state</a>.</p>
+ <dd>
+ <p>If the next six characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match for the
+ word "PUBLIC", then consume those characters and switch to the <a
+ href="#before7">before DOCTYPE public identifier state</a>.</p>
- <p>Otherwise, if the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match for
- the word "SYSTEM", then consume those characters and switch to the <a
- href="#before3">before DOCTYPE system identifier state</a>.</p>
+ <p>Otherwise, if the next six characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match for the
+ word "SYSTEM", then consume those characters and switch to the <a
+ href="#before8">before DOCTYPE system identifier state</a>.</p>
- <p>Otherwise, this is a <a href="#parse2">parse error</a>. Set the
- DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the
- <a href="#bogus0">bogus DOCTYPE state</a>.</p>
- </dl>
+ <p>Otherwise, this is a <a href="#parse2">parse error</a>. Set the
+ DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
+ href="#bogus2">bogus DOCTYPE state</a>.</p>
+ </dl>
- <dt><dfn id=before2>Before DOCTYPE public identifier state</dfn>
+ <h5 id=before2><span class=secno>8.2.4.26. </span><dfn id=before7>Before
+ DOCTYPE public identifier state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#before2">before DOCTYPE public identifier
- state</a>.
+ <dd>Stay in the <a href="#before7">before DOCTYPE public identifier
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Set the DOCTYPE token's public identifier to the empty string (not
- missing), then switch to the <a href="#doctype2">DOCTYPE public
- identifier (double-quoted) state</a>.
+ <dd>Set the DOCTYPE token's public identifier to the empty string (not
+ missing), then switch to the <a href="#doctype8">DOCTYPE public
+ identifier (double-quoted) state</a>.
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dd>Set the DOCTYPE token's public identifier to the empty string (not
- missing), then switch to the <a href="#doctype3">DOCTYPE public
- identifier (single-quoted) state</a>.
+ <dd>Set the DOCTYPE token's public identifier to the empty string (not
+ missing), then switch to the <a href="#doctype9">DOCTYPE public
+ identifier (single-quoted) state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
- href="#bogus0">bogus DOCTYPE state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
+ href="#bogus2">bogus DOCTYPE state</a>.
+ </dl>
- <dt><dfn id=doctype2>DOCTYPE public identifier (double-quoted) state</dfn>
+ <h5 id=doctype1><span class=secno>8.2.4.27. </span><dfn id=doctype8>DOCTYPE
+ public identifier (double-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0022 QUOTATION MARK (")
+ <dl class=switch>
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Switch to the <a href="#after2">after DOCTYPE public identifier
- state</a>.
+ <dd>Switch to the <a href="#after7">after DOCTYPE public identifier
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current DOCTYPE token's
- public identifier. Stay in the <a href="#doctype2">DOCTYPE public
- identifier (double-quoted) state</a>.
- </dl>
+ <dd>Append the current input character to the current DOCTYPE token's
+ public identifier. Stay in the <a href="#doctype8">DOCTYPE public
+ identifier (double-quoted) state</a>.
+ </dl>
- <dt><dfn id=doctype3>DOCTYPE public identifier (single-quoted) state</dfn>
+ <h5 id=doctype2><span class=secno>8.2.4.28. </span><dfn id=doctype9>DOCTYPE
+ public identifier (single-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0027 APOSTROPHE (')
+ <dl class=switch>
+ <dt>U+0027 APOSTROPHE (')
- <dd>Switch to the <a href="#after2">after DOCTYPE public identifier
- state</a>.
+ <dd>Switch to the <a href="#after7">after DOCTYPE public identifier
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current DOCTYPE token's
- public identifier. Stay in the <a href="#doctype3">DOCTYPE public
- identifier (single-quoted) state</a>.
- </dl>
+ <dd>Append the current input character to the current DOCTYPE token's
+ public identifier. Stay in the <a href="#doctype9">DOCTYPE public
+ identifier (single-quoted) state</a>.
+ </dl>
- <dt><dfn id=after2>After DOCTYPE public identifier state</dfn>
+ <h5 id=after2><span class=secno>8.2.4.29. </span><dfn id=after7>After
+ DOCTYPE public identifier state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#after2">after DOCTYPE public identifier
- state</a>.
+ <dd>Stay in the <a href="#after7">after DOCTYPE public identifier
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Set the DOCTYPE token's system identifier to the empty string (not
- missing), then switch to the <a href="#doctype4">DOCTYPE system
- identifier (double-quoted) state</a>.
+ <dd>Set the DOCTYPE token's system identifier to the empty string (not
+ missing), then switch to the <a href="#doctype10">DOCTYPE system
+ identifier (double-quoted) state</a>.
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dd>Set the DOCTYPE token's system identifier to the empty string (not
- missing), then switch to the <a href="#doctype5">DOCTYPE system
- identifier (single-quoted) state</a>.
+ <dd>Set the DOCTYPE token's system identifier to the empty string (not
+ missing), then switch to the <a href="#doctype11">DOCTYPE system
+ identifier (single-quoted) state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current DOCTYPE token. Switch to the <a
- href="#data-state">data state</a>.
+ <dd>Emit the current DOCTYPE token. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
- href="#bogus0">bogus DOCTYPE state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
+ href="#bogus2">bogus DOCTYPE state</a>.
+ </dl>
- <dt><dfn id=before3>Before DOCTYPE system identifier state</dfn>
+ <h5 id=before3><span class=secno>8.2.4.30. </span><dfn id=before8>Before
+ DOCTYPE system identifier state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#before3">before DOCTYPE system identifier
- state</a>.
+ <dd>Stay in the <a href="#before8">before DOCTYPE system identifier
+ state</a>.
- <dt>U+0022 QUOTATION MARK (")
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Set the DOCTYPE token's system identifier to the empty string (not
- missing), then switch to the <a href="#doctype4">DOCTYPE system
- identifier (double-quoted) state</a>.
+ <dd>Set the DOCTYPE token's system identifier to the empty string (not
+ missing), then switch to the <a href="#doctype10">DOCTYPE system
+ identifier (double-quoted) state</a>.
- <dt>U+0027 APOSTROPHE (')
+ <dt>U+0027 APOSTROPHE (')
- <dd>Set the DOCTYPE token's system identifier to the empty string (not
- missing), then switch to the <a href="#doctype5">DOCTYPE system
- identifier (single-quoted) state</a>.
+ <dd>Set the DOCTYPE token's system identifier to the empty string (not
+ missing), then switch to the <a href="#doctype11">DOCTYPE system
+ identifier (single-quoted) state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
- href="#bogus0">bogus DOCTYPE state</a>.
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <a
+ href="#bogus2">bogus DOCTYPE state</a>.
+ </dl>
- <dt><dfn id=doctype4>DOCTYPE system identifier (double-quoted) state</dfn>
+ <h5 id=doctype3><span class=secno>8.2.4.31. </span><dfn
+ id=doctype10>DOCTYPE system identifier (double-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0022 QUOTATION MARK (")
+ <dl class=switch>
+ <dt>U+0022 QUOTATION MARK (")
- <dd>Switch to the <a href="#after3">after DOCTYPE system identifier
- state</a>.
+ <dd>Switch to the <a href="#after8">after DOCTYPE system identifier
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current DOCTYPE token's
- system identifier. Stay in the <a href="#doctype4">DOCTYPE system
- identifier (double-quoted) state</a>.
- </dl>
+ <dd>Append the current input character to the current DOCTYPE token's
+ system identifier. Stay in the <a href="#doctype10">DOCTYPE system
+ identifier (double-quoted) state</a>.
+ </dl>
- <dt><dfn id=doctype5>DOCTYPE system identifier (single-quoted) state</dfn>
+ <h5 id=doctype4><span class=secno>8.2.4.32. </span><dfn
+ id=doctype11>DOCTYPE system identifier (single-quoted) state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0027 APOSTROPHE (')
+ <dl class=switch>
+ <dt>U+0027 APOSTROPHE (')
- <dd>Switch to the <a href="#after3">after DOCTYPE system identifier
- state</a>.
+ <dd>Switch to the <a href="#after8">after DOCTYPE system identifier
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch
- to the <a href="#data-state">data state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Switch to
+ the <a href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Append the current input character to the current DOCTYPE token's
- system identifier. Stay in the <a href="#doctype5">DOCTYPE system
- identifier (single-quoted) state</a>.
- </dl>
+ <dd>Append the current input character to the current DOCTYPE token's
+ system identifier. Stay in the <a href="#doctype11">DOCTYPE system
+ identifier (single-quoted) state</a>.
+ </dl>
- <dt><dfn id=after3>After DOCTYPE system identifier state</dfn>
+ <h5 id=after3><span class=secno>8.2.4.33. </span><dfn id=after8>After
+ DOCTYPE system identifier state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+0009 CHARACTER TABULATION
+ <dl class=switch>
+ <dt>U+0009 CHARACTER TABULATION
- <dt>U+000A LINE FEED (LF)
+ <dt>U+000A LINE FEED (LF)
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE
+ <dt>U+0020 SPACE
- <dd>Stay in the <a href="#after3">after DOCTYPE system identifier
- state</a>.
+ <dd>Stay in the <a href="#after8">after DOCTYPE system identifier
+ state</a>.
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the current DOCTYPE token. Switch to the <a
- href="#data-state">data state</a>.
+ <dd>Emit the current DOCTYPE token. Switch to the <a
+ href="#data-state0">data state</a>.
- <dt>EOF
+ <dt>EOF
- <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <a href="#data-state">data
- state</a>.
+ <dd><a href="#parse2">Parse error</a>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token. Reconsume
+ the EOF character in the <a href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd><a href="#parse2">Parse error</a>. Switch to the <a
- href="#bogus0">bogus DOCTYPE state</a>. (This does <em>not</em> set the
- DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>.)
- </dl>
+ <dd><a href="#parse2">Parse error</a>. Switch to the <a
+ href="#bogus2">bogus DOCTYPE state</a>. (This does <em>not</em> set the
+ DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>.)
+ </dl>
- <dt><dfn id=bogus0>Bogus DOCTYPE state</dfn>
+ <h5 id=bogus0><span class=secno>8.2.4.34. </span><dfn id=bogus2>Bogus
+ DOCTYPE state</dfn></h5>
- <dd>
- <p>Consume the <a href="#next-input">next input character</a>:</p>
+ <p>Consume the <a href="#next-input">next input character</a>:
- <dl class=switch>
- <dt>U+003E GREATER-THAN SIGN (>)
+ <dl class=switch>
+ <dt>U+003E GREATER-THAN SIGN (>)
- <dd>Emit the DOCTYPE token. Switch to the <a href="#data-state">data
- state</a>.
+ <dd>Emit the DOCTYPE token. Switch to the <a href="#data-state0">data
+ state</a>.
- <dt>EOF
+ <dt>EOF
- <dd>Emit the DOCTYPE token. Reconsume the EOF character in the <a
- href="#data-state">data state</a>.
+ <dd>Emit the DOCTYPE token. Reconsume the EOF character in the <a
+ href="#data-state0">data state</a>.
- <dt>Anything else
+ <dt>Anything else
- <dd>Stay in the <a href="#bogus0">bogus DOCTYPE state</a>.
- </dl>
+ <dd>Stay in the <a href="#bogus2">bogus DOCTYPE state</a>.
+ </dl>
- <dt><dfn id=cdata1>CDATA section state</dfn>
+ <h5 id=cdata0><span class=secno>8.2.4.35. </span><dfn id=cdata2>CDATA
+ section state</dfn></h5>
- <dd>
- <p><em>(This can only happen if the <a href="#content3">content model
- flag</a> is set to the PCDATA state, and is unrelated to the <a
- href="#content3">content model flag</a>'s CDATA state.)</em></p>
+ <p><em>(This can only happen if the <a href="#content3">content model
+ flag</a> is set to the PCDATA state, and is unrelated to the <a
+ href="#content3">content model flag</a>'s CDATA state.)</em>
- <p>Consume every character up to the next occurrence of the three
- character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
- BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or the end
- of the file (EOF), whichever comes first. Emit a series of text tokens
- consisting of all the characters consumed except the matching three
- character sequence at the end (if one was found before the end of the
- file).</p>
+ <p>Consume every character up to the next occurrence of the three character
+ sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE BRACKET U+003E
+ GREATER-THAN SIGN (<code title="">]]></code>), or the end of the file
+ (EOF), whichever comes first. Emit a series of text tokens consisting of
+ all the characters consumed except the matching three character sequence
+ at the end (if one was found before the end of the file).
- <p>Switch to the <a href="#data-state">data state</a>.</p>
+ <p>Switch to the <a href="#data-state0">data state</a>.
- <p>If the end of the file was reached, reconsume the EOF character.</p>
- </dl>
+ <p>If the end of the file was reached, reconsume the EOF character.
- <h5 id=tokenizing><span class=secno>8.2.4.1. </span>Tokenizing character
+ <h5 id=tokenizing><span class=secno>8.2.4.36. </span>Tokenizing character
references</h5>
<p>This section defines how to <dfn id=consume>consume a character
reference</dfn>. This definition is used when parsing character references
- <a href="#character4" title="character reference data state">in text</a>
- and <a href="#character5" title="character reference in attribute value
+ <a href="#character6" title="character reference data state">in text</a>
+ and <a href="#character7" title="character reference in attribute value
state">in attributes</a>.
<p>The behavior depends on the identity of the next character (the one
@@ -47125,7 +47203,7 @@
<p>If the last character matched is not a U+003B SEMICOLON (<code
title="">;</code>), there is a <a href="#parse2">parse error</a>.</p>
- <p>If the character reference is being consumed <a href="#character5"
+ <p>If the character reference is being consumed <a href="#character7"
title="character reference in attribute value state">as part of an
attribute</a>, and the last character matched is not a U+003B SEMICOLON
(<code title="">;</code>), and the next character is in the range U+0030
@@ -47840,7 +47918,7 @@
is the empty string is not considered missing for the purposes of the
conditions above.</p>
- <p>Then, switch the <span>insertion mode</span> to "<a href="#before4"
+ <p>Then, switch the <span>insertion mode</span> to "<a href="#before9"
title="insertion mode: before html">before html</a>".</p>
<dt>Anything else
@@ -47850,15 +47928,15 @@
<p>Set the document to <a href="#quirks">quirks mode</a>.</p>
- <p>Switch the <span>insertion mode</span> to "<a href="#before4"
+ <p>Switch the <span>insertion mode</span> to "<a href="#before9"
title="insertion mode: before html">before html</a>", then reprocess the
current token.</p>
</dl>
- <h5 id=the-before><span class=secno>8.2.5.5. </span>The "<dfn id=before4
+ <h5 id=the-before><span class=secno>8.2.5.5. </span>The "<dfn id=before9
title="insertion mode: before html">before html</dfn>" insertion mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#before4"
+ <p>When the <span>insertion mode</span> is "<a href="#before9"
title="insertion mode: before html">before html</a>", tokens must be
handled as follows:
@@ -47901,7 +47979,7 @@
title=concept-appcache-init-no-attribute>application cache selection
algorithm</a> with no manifest.</p>
- <p>Switch the <span>insertion mode</span> to "<a href="#before5"
+ <p>Switch the <span>insertion mode</span> to "<a href="#before10"
title="insertion mode: before head">before head</a>".</p>
<dt>Anything else
@@ -47917,7 +47995,7 @@
title=concept-appcache-init-no-attribute>application cache selection
algorithm</a> with no manifest.</p>
- <p>Switch the <span>insertion mode</span> to "<a href="#before5"
+ <p>Switch the <span>insertion mode</span> to "<a href="#before10"
title="insertion mode: before head">before head</a>", then reprocess the
current token.</p>
@@ -47931,10 +48009,10 @@
content continues being appended to the nodes as described in the next
section.
- <h5 id=the-before0><span class=secno>8.2.5.6. </span>The "<dfn id=before5
+ <h5 id=the-before0><span class=secno>8.2.5.6. </span>The "<dfn id=before10
title="insertion mode: before head">before head</dfn>" insertion mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#before5"
+ <p>When the <span>insertion mode</span> is "<a href="#before10"
title="insertion mode: before head">before head</a>", tokens must be
handled as follows:
@@ -47997,7 +48075,7 @@
<p class=note>This will result in an empty <code><a
href="#head">head</a></code> element being generated, with the current
- token being reprocessed in the "<a href="#after4" title="insertion mode:
+ token being reprocessed in the "<a href="#after9" title="insertion mode:
after head">after head</a>" <span>insertion mode</span>.</p>
</dl>
@@ -48221,7 +48299,7 @@
<code><a href="#head">head</a></code> element) off the <a
href="#stack">stack of open elements</a>.</p>
- <p>Switch the <span>insertion mode</span> to "<a href="#after4"
+ <p>Switch the <span>insertion mode</span> to "<a href="#after9"
title="insertion mode: after head">after head</a>".</p>
<dt>An end tag whose tag name is "br"
@@ -48314,10 +48392,10 @@
name "noscript" had been seen and reprocess the current token.</p>
</dl>
- <h5 id=the-after><span class=secno>8.2.5.9. </span>The "<dfn id=after4
+ <h5 id=the-after><span class=secno>8.2.5.9. </span>The "<dfn id=after9
title="insertion mode: after head">after head</dfn>" insertion mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#after4"
+ <p>When the <span>insertion mode</span> is "<a href="#after9"
title="insertion mode: after head">after head</a>", tokens must be handled
as follows:
@@ -48502,7 +48580,7 @@
href="#parse2">parse error</a>.</p>
<!-- (some of those are fragment cases) -->
<!-- the insertion mode here is forcibly "in body". -->
- <p>Switch the <span>insertion mode</span> to "<a href="#after5"
+ <p>Switch the <span>insertion mode</span> to "<a href="#after10"
title="insertion mode: after body">after body</a>". Otherwise, ignore
the token.</p>
@@ -50110,7 +50188,7 @@
is a <a href="#parse2">parse error</a>; ignore the token. (<a
href="#fragment">fragment case</a>)</p>
- <p>Otherwise, <a href="#close4">close the cell</a> (see below) and
+ <p>Otherwise, <a href="#close5">close the cell</a> (see below) and
reprocess the current token.</p>
<dt>An end tag whose tag name is one of: "body", "caption", "col",
@@ -50130,7 +50208,7 @@
href="#fragment">fragment case</a>), then this is a <a
href="#parse2">parse error</a> and the token must be ignored.</p>
- <p>Otherwise, <a href="#close4">close the cell</a> (see below) and
+ <p>Otherwise, <a href="#close5">close the cell</a> (see below) and
reprocess the current token.</p>
<dt>Anything else
@@ -50141,7 +50219,7 @@
<span>insertion mode</span>.</p>
</dl>
- <p>Where the steps above say to <dfn id=close4>close the cell</dfn>, they
+ <p>Where the steps above say to <dfn id=close5>close the cell</dfn>, they
mean to run the following algorithm:
<ol>
@@ -50501,10 +50579,10 @@
</dl>
<h5 id=parsing-main-afterbody><span class=secno>8.2.5.20. </span>The "<dfn
- id=after5 title="insertion mode: after body">after body</dfn>" insertion
+ id=after10 title="insertion mode: after body">after body</dfn>" insertion
mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#after5"
+ <p>When the <span>insertion mode</span> is "<a href="#after10"
title="insertion mode: after body">after body</a>", tokens must be handled
as follows:
@@ -50548,7 +50626,7 @@
href="#fragment">fragment case</a>)</p>
<!-- can only happen for <html>'s own innerHTML -->
<p>Otherwise, switch the <span>insertion mode</span> to "<a
- href="#after7" title="insertion mode: after after body">after after
+ href="#after12" title="insertion mode: after after body">after after
body</a>".</p>
<dt>An end-of-file token
@@ -50620,8 +50698,8 @@
href="#html-fragment0">HTML fragment parsing algorithm</a> (<a
href="#fragment">fragment case</a>), and the <a href="#current5">current
node</a> is no longer a <code>frameset</code> element, then switch the
- <span>insertion mode</span> to "<a href="#after6" title="insertion mode:
- after frameset">after frameset</a>".</p>
+ <span>insertion mode</span> to "<a href="#after11" title="insertion
+ mode: after frameset">after frameset</a>".</p>
<dt>A start tag whose tag name is "frame"
@@ -50660,10 +50738,10 @@
</dl>
<h5 id=parsing-main-afterframeset><span class=secno>8.2.5.22. </span>The
- "<dfn id=after6 title="insertion mode: after frameset">after
+ "<dfn id=after11 title="insertion mode: after frameset">after
frameset</dfn>" insertion mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#after6"
+ <p>When the <span>insertion mode</span> is "<a href="#after11"
title="insertion mode: after frameset">after frameset</a>", tokens must be
handled as follows:</p>
<!-- due to rules in the "in frameset" mode, this can't be entered in the fragment case -->
@@ -50699,7 +50777,7 @@
<dt>An end tag whose tag name is "html"
<dd>
- <p>Switch the <span>insertion mode</span> to "<a href="#after8"
+ <p>Switch the <span>insertion mode</span> to "<a href="#after13"
title="insertion mode: after after frameset">after after frameset</a>".</p>
<dt>A start tag whose tag name is "noframes"
@@ -50724,11 +50802,11 @@
that do support frames but want to show the NOFRAMES content. Supporting
the former is easy; supporting the latter is harder.
- <h5 id=the-after0><span class=secno>8.2.5.23. </span>The "<dfn id=after7
+ <h5 id=the-after0><span class=secno>8.2.5.23. </span>The "<dfn id=after12
title="insertion mode: after after body">after after body</dfn>" insertion
mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#after7"
+ <p>When the <span>insertion mode</span> is "<a href="#after12"
title="insertion mode: after after body">after after body</a>", tokens
must be handled as follows:
@@ -50766,11 +50844,11 @@
body</a>" and reprocess the token.</p>
</dl>
- <h5 id=the-after1><span class=secno>8.2.5.24. </span>The "<dfn id=after8
+ <h5 id=the-after1><span class=secno>8.2.5.24. </span>The "<dfn id=after13
title="insertion mode: after after frameset">after after frameset</dfn>"
insertion mode</h5>
- <p>When the <span>insertion mode</span> is "<a href="#after8"
+ <p>When the <span>insertion mode</span> is "<a href="#after13"
title="insertion mode: after after frameset">after after frameset</a>",
tokens must be handled as follows:
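
Reader's note on the DOCTYPE identifier states renumbered above (8.2.4.26 to 8.2.4.34): the transitions can be read as a small state machine. The following is a minimal sketch in Python, not part of the commit and not the specification's own code. It assumes scanning starts just after the PUBLIC keyword has been matched, and it leaves out the "before DOCTYPE system identifier state" that is entered after a SYSTEM keyword. The names `Doctype`, `scan_doctype_ids`, and `parse_errors` are hypothetical, and `None` stands in for both a "missing" identifier and the EOF character.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# U+0009, U+000A, U+000C, U+0020 (CR is commented out in the spec text)
WHITESPACE = "\t\n\f "


@dataclass
class Doctype:
    public_id: Optional[str] = None  # None models "missing"
    system_id: Optional[str] = None
    force_quirks: bool = False
    parse_errors: int = 0            # hypothetical counter, for illustration


def scan_doctype_ids(text: str) -> Tuple[Doctype, int]:
    """Run the states from "before DOCTYPE public identifier" onward.

    Returns the emitted token and the number of characters consumed.
    """
    tok = Doctype()
    state = "before_public"
    i = 0
    while True:
        ch = text[i] if i < len(text) else None  # None stands in for EOF
        if ch is None:
            # EOF emits the token in every state; all but bogus flag an error.
            if state != "bogus":
                tok.parse_errors += 1
                tok.force_quirks = True
            return tok, i  # EOF is reconsumed in the data state
        i += 1
        if state in ("before_public", "after_public", "after_system"):
            if ch in WHITESPACE:
                continue
            if ch == ">":
                if state == "before_public":
                    tok.parse_errors += 1
                    tok.force_quirks = True
                return tok, i
            if ch in "\"'" and state != "after_system":
                quote = "dq" if ch == '"' else "sq"
                if state == "before_public":
                    tok.public_id = ""  # empty string, not missing
                    state = "public_" + quote
                else:
                    tok.system_id = ""
                    state = "system_" + quote
                continue
            # "Anything else": parse error, then the bogus DOCTYPE state.
            tok.parse_errors += 1
            if state != "after_system":  # after_system does not set the flag
                tok.force_quirks = True
            state = "bogus"
        elif state == "bogus":
            if ch == ">":
                return tok, i
        else:  # one of the four quoted-identifier states
            kind, quote = state.split("_")
            closing = '"' if quote == "dq" else "'"
            if ch == closing:
                state = "after_" + kind
            elif ch == ">":
                tok.parse_errors += 1
                tok.force_quirks = True
                return tok, i
            elif kind == "public":
                tok.public_id += ch
            else:
                tok.system_id += ch
```

Folding the three whitespace-skipping states into one branch, and the four quoted states into another, is a choice made for brevity here. The specification text keeps each state separate, and an implementation that wants to map one-to-one onto section numbers would do the same.
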
Modified: source
===================================================================
--- source 2008-07-23 00:56:52 UTC (rev 1905)
+++ source 2008-07-23 01:04:16 UTC (rev 1906)
@@ -42984,1336 +42984,1226 @@
must be <span title="executing a script block">executed</span> and
removed from its list.</p>
- <p>The tokeniser state machine is as follows:</p>
+ <p>The tokeniser state machine consists of the states defined in the
+ following subsections.</p>
<!-- XXX should go through these reordering the entries so that
they're in some consistent order, like, by Unicode, errors last, or
something -->
- <dl>
+ <h5><dfn>Data state</dfn></h5>
- <dt><dfn>Data state</dfn></dt>
+ <p>Consume the <span>next input character</span>:</p>
- <dd>
+ <dl class="switch">
- <p>Consume the <span>next input character</span>:</p>
+ <dt>U+0026 AMPERSAND (&)</dt>
+ <dd>When the <span>content model flag</span> is set to one of the
+ PCDATA or RCDATA states and the <span>escape flag</span> is
+ false: switch to the <span>character reference data
+ state</span>.</dd> <dd>Otherwise: treat it as per the "anything
+ else" entry below.</dd>
- <dl class="switch">
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd>
- <dt>U+0026 AMPERSAND (&)</dt>
- <dd>When the <span>content model flag</span> is set to one of the
- PCDATA or RCDATA states and the <span>escape flag</span> is
- false: switch to the <span>character reference data
- state</span>.</dd> <dd>Otherwise: treat it as per the "anything
- else" entry below.</dd>
+ <p>If the <span>content model flag</span> is set to either the
+ RCDATA state or the CDATA state, and the <span>escape flag</span>
+ is false, and there are at least three characters before this
+ one in the input stream, and the last four characters in the
+ input stream, including this one, are U+003C LESS-THAN SIGN,
+ U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D
+ HYPHEN-MINUS ("<!--"), then set the <span>escape flag</span>
+ to true.</p>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>
+ <p>In any case, emit the input character as a character
+ token. Stay in the <span>data state</span>.</p>
- <p>If the <span>content model flag</span> is set to either the
- RCDATA state or the CDATA state, and the <span>escape flag</span>
- is false, and there are at least three characters before this
- one in the input stream, and the last four characters in the
- input stream, including this one, are U+003C LESS-THAN SIGN,
- U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, and U+002D
- HYPHEN-MINUS ("<!--"), then set the <span>escape flag</span>
- to true.</p>
-
- <p>In any case, emit the input character as a character
- token. Stay in the <span>data state</span>.</p>
-
- </dd>
-
- <dt>U+003C LESS-THAN SIGN (<)</dt>
- <dd>When the <span>content model flag</span> is set to the PCDATA
- state: switch to the <span>tag open state</span>.</dd>
- <dd>When the <span>content model flag</span> is set to either the
- RCDATA state or the CDATA state and the <span>escape flag</span>
- is false: switch to the <span>tag open state</span>.</dd>
- <dd>Otherwise: treat it as per the "anything else" entry
- below.</dd>
-
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>
-
- <p>If the <span>content model flag</span> is set to either the
- RCDATA state or the CDATA state, and the <span>escape
- flag</span> is true, and the last three characters in the input
- stream including this one are U+002D HYPHEN-MINUS, U+002D
- HYPHEN-MINUS, U+003E GREATER-THAN SIGN ("-->"), set the
- <span>escape flag</span> to false.</p> <!-- no need to check
- that there are enough characters, since you can only run into
- this if the flag is true in the first place, which requires four
- characters. -->
-
- <p>In any case, emit the input character as a character
- token. Stay in the <span>data state</span>.</p>
-
- </dd>
-
- <dt>EOF</dt>
- <dd>Emit an end-of-file token.</dd>
-
- <dt>Anything else</dt>
- <dd>Emit the input character as a character token. Stay in the
- <span>data state</span>.</dd>
-
- </dl>
-
</dd>
- <dt><dfn>Character reference data state</dfn></dt>
+ <dt>U+003C LESS-THAN SIGN (<)</dt>
+ <dd>When the <span>content model flag</span> is set to the PCDATA
+ state: switch to the <span>tag open state</span>.</dd>
+ <dd>When the <span>content model flag</span> is set to either the
+ RCDATA state or the CDATA state and the <span>escape flag</span>
+ is false: switch to the <span>tag open state</span>.</dd>
+ <dd>Otherwise: treat it as per the "anything else" entry
+ below.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
<dd>
- <p><em>(This cannot happen if the <span>content model flag</span>
- is set to the CDATA state.)</em></p>
+ <p>If the <span>content model flag</span> is set to either the
+ RCDATA state or the CDATA state, and the <span>escape
+ flag</span> is true, and the last three characters in the input
+ stream including this one are U+002D HYPHEN-MINUS, U+002D
+ HYPHEN-MINUS, U+003E GREATER-THAN SIGN ("-->"), set the
+ <span>escape flag</span> to false.</p> <!-- no need to check
+ that there are enough characters, since you can only run into
+ this if the flag is true in the first place, which requires four
+ characters. -->
- <p>Attempt to <span>consume a character reference</span>, with no
- <span>additional allowed character</span>.</p>
+ <p>In any case, emit the input character as a character
+ token. Stay in the <span>data state</span>.</p>
- <p>If nothing is returned, emit a U+0026 AMPERSAND character
- token.</p>
-
- <p>Otherwise, emit the character token that was returned.</p>
-
- <p>Finally, switch to the <span>data state</span>.</p>
-
</dd>
- <dt><dfn>Tag open state</dfn></dt>
+ <dt>EOF</dt>
+ <dd>Emit an end-of-file token.</dd>
- <dd>
+ <dt>Anything else</dt>
+ <dd>Emit the input character as a character token. Stay in the
+ <span>data state</span>.</dd>
- <p>The behavior of this state depends on the <span>content model
- flag</span>.</p>
+ </dl>
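
As an illustration only (not part of the commit), the escape flag handling of the data state can be sketched in Python. The function name `track_escape_flag` and the string values used for the content model flag are assumptions made for this sketch; it tracks only the flag and does not emit tokens or switch states.

```python
def track_escape_flag(text, content_model="RCDATA"):
    """Return the value of the escape flag after each character of text.

    Hypothetical helper: it models only the "<!--" / "-->" rules of the
    data state, not token emission or state switching.
    """
    escape = False
    flags = []
    for i, ch in enumerate(text):
        if content_model in ("RCDATA", "CDATA"):
            # i >= 3 means at least three characters precede this one.
            if ch == "-" and not escape and i >= 3 and text[i - 3:i + 1] == "<!--":
                escape = True
            # No length check needed here: the flag can only be true
            # after four characters have already been seen.
            elif ch == ">" and escape and text[i - 2:i + 1] == "-->":
                escape = False
        flags.append(escape)
    return flags
```

In this sketch the flag never changes when the content model is PCDATA, which matches the condition stated in both paragraphs.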
- <dl>
- <dt>If the <span>content model flag</span> is set to the RCDATA
- or CDATA states</dt>
+ <h5><dfn>Character reference data state</dfn></h5>
- <dd>
+ <p><em>(This cannot happen if the <span>content model flag</span>
+ is set to the CDATA state.)</em></p>
- <p>Consume the <span>next input character</span>. If it is a
- U+002F SOLIDUS (/) character, switch to the <span>close tag open
- state</span>. Otherwise, emit a U+003C LESS-THAN SIGN character
- token and reconsume the current input character in the
- <span>data state</span>.</p>
+ <p>Attempt to <span>consume a character reference</span>, with no
+ <span>additional allowed character</span>.</p>
- </dd>
+ <p>If nothing is returned, emit a U+0026 AMPERSAND character
+ token.</p>
- <dt>If the <span>content model flag</span> is set to the PCDATA
- state</dt>
+ <p>Otherwise, emit the character token that was returned.</p>
- <dd>
+ <p>Finally, switch to the <span>data state</span>.</p>
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>Tag open state</dfn></h5>
- <dt>U+0021 EXCLAMATION MARK (!)</dt>
- <dd>Switch to the <span>markup declaration open state</span>.</dd>
+ <p>The behavior of this state depends on the <span>content model
+ flag</span>.</p>
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>close tag open state</span>.</dd>
+ <dl>
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Create a new start tag token, set its tag name to the
- lowercase version of the input character (add 0x0020 to the
- character's code point), then switch to the <span>tag name
- state</span>. (Don't emit the token yet; further details will
- be filled in before it is emitted.)</dd>
+ <dt>If the <span>content model flag</span> is set to the RCDATA
+ or CDATA states</dt>
- <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
- <dd>Create a new start tag token, set its tag name to the input
- character, then switch to the <span>tag name
- state</span>. (Don't emit the token yet; further details will
- be filled in before it is emitted.)</dd>
+ <dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
- character token and a U+003E GREATER-THAN SIGN character
- token. Switch to the <span>data state</span>.</dd>
+ <p>Consume the <span>next input character</span>. If it is a
+ U+002F SOLIDUS (/) character, switch to the <span>close tag open
+ state</span>. Otherwise, emit a U+003C LESS-THAN SIGN character
+ token and reconsume the current input character in the
+ <span>data state</span>.</p>
- <dt>U+003F QUESTION MARK (?)</dt>
- <dd><span>Parse error</span>. Switch to the <span>bogus
- comment state</span>.</dd>
-
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
- character token and reconsume the current input character in the
- <span>data state</span>.</dd>
-
- </dl>
-
- </dd>
-
- </dl>
-
</dd>
- <dt><dfn>Close tag open state</dfn></dt>
+ <dt>If the <span>content model flag</span> is set to the PCDATA
+ state</dt>
<dd>
- <p>If the <span>content model flag</span> is set to the RCDATA or
- CDATA states but no start tag token has ever been emitted by this
- instance of the tokeniser (<span>fragment case</span>), or, if the
- <span>content model flag</span> is set to the RCDATA or CDATA
- states and the next few characters do not match the tag name of
- the last start tag token emitted (case insensitively), or if they
- do but they are not immediately followed by one of the following
- characters:</p>
+ <p>Consume the <span>next input character</span>:</p>
- <ul class="brief">
- <li>U+0009 CHARACTER TABULATION</li>
- <li>U+000A LINE FEED (LF)</li>
- <li>U+000C FORM FEED (FF)</li>
- <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
- <li>U+0020 SPACE</li>
- <li>U+003E GREATER-THAN SIGN (>)</li>
- <li>U+002F SOLIDUS (/)</li>
- <li>EOF</li>
- </ul>
+ <dl class="switch">
- <p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F
- SOLIDUS character token, and switch to the <span>data state</span>
- to process the <span>next input character</span>.</p>
+ <dt>U+0021 EXCLAMATION MARK (!)</dt>
+ <dd>Switch to the <span>markup declaration open state</span>.</dd>
- <p>Otherwise, if the <span>content model flag</span> is set to the
- PCDATA state, or if the next few characters <em>do</em> match that tag
- name, consume the <span>next input character</span>:</p>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>close tag open state</span>.</dd>
- <dl class="switch">
-
<dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Create a new end tag token, set its tag name to the lowercase
- version of the input character (add 0x0020 to the character's
- code point), then switch to the <span>tag name
- state</span>. (Don't emit the token yet; further details will be
- filled in before it is emitted.)</dd>
+ <dd>Create a new start tag token, set its tag name to the
+ lowercase version of the input character (add 0x0020 to the
+ character's code point), then switch to the <span>tag name
+ state</span>. (Don't emit the token yet; further details will
+ be filled in before it is emitted.)</dd>
<dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
- <dd>Create a new end tag token, set its tag name to the input
- character, then switch to the <span>tag name state</span>. (Don't
- emit the token yet; further details will be filled in before it
- is emitted.)</dd>
+ <dd>Create a new start tag token, set its tag name to the input
+ character, then switch to the <span>tag name
+ state</span>. (Don't emit the token yet; further details will
+ be filled in before it is emitted.)</dd>
<dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Switch to the <span>data
- state</span>.</dd>
-
- <dt>EOF</dt>
<dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
- character token and a U+002F SOLIDUS character token. Reconsume
- the EOF character in the <span>data state</span>.</dd>
+ character token and a U+003E GREATER-THAN SIGN character
+ token. Switch to the <span>data state</span>.</dd>
- <dt>Anything else</dt>
+ <dt>U+003F QUESTION MARK (?)</dt>
<dd><span>Parse error</span>. Switch to the <span>bogus
comment state</span>.</dd>
- </dl>
-
- </dd>
-
- <dt><dfn>Tag name state</dfn></dt>
-
- <dd>
-
- <p>Consume the <span>next input character</span>:</p>
-
- <dl class="switch">
-
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>before attribute name state</span>.</dd>
-
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
-
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Append the lowercase version of the current input character
- (add 0x0020 to the character's code point) to the current tag
- token's tag name. Stay in the <span>tag name state</span>.</dd>
-
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
-
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>self-closing start tag state</span>.</dd>
-
<dt>Anything else</dt>
- <dd>Append the current input character to the current tag token's
- tag name. Stay in the <span>tag name state</span>.</dd>
+ <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
+ character token and reconsume the current input character in the
+ <span>data state</span>.</dd>
</dl>
</dd>
- <dt><dfn>Before attribute name state</dfn></dt>
+ </dl>
- <dd>
- <p>Consume the <span>next input character</span>:</p>
+ <h5><dfn>Close tag open state</dfn></h5>
- <dl class="switch">
+ <p>If the <span>content model flag</span> is set to the RCDATA or
+ CDATA states but no start tag token has ever been emitted by this
+ instance of the tokeniser (<span>fragment case</span>), or, if the
+ <span>content model flag</span> is set to the RCDATA or CDATA
+ states and the next few characters do not match the tag name of
+ the last start tag token emitted (case insensitively), or if they
+ do but they are not immediately followed by one of the following
+ characters:</p>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>before attribute name state</span>.</dd>
+ <ul class="brief">
+ <li>U+0009 CHARACTER TABULATION</li>
+ <li>U+000A LINE FEED (LF)</li>
+ <li>U+000C FORM FEED (FF)</li>
+ <!--<li>U+000D CARRIAGE RETURN (CR)</li>-->
+ <li>U+0020 SPACE</li>
+ <li>U+003E GREATER-THAN SIGN (>)</li>
+ <li>U+002F SOLIDUS (/)</li>
+ <li>EOF</li>
+ </ul>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <p>...then emit a U+003C LESS-THAN SIGN character token, a U+002F
+ SOLIDUS character token, and switch to the <span>data state</span>
+ to process the <span>next input character</span>.</p>
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Start a new attribute in the current tag token. Set that
- attribute's name to the lowercase version of the current input
- character (add 0x0020 to the character's code point), and its
- value to the empty string. Switch to the <span>attribute name
- state</span>.</dd>
+ <p>Otherwise, if the <span>content model flag</span> is set to the
+ PCDATA state, or if the next few characters <em>do</em> match that tag
+ name, consume the <span>next input character</span>:</p>
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>self-closing start tag state</span>.</dd>
+ <dl class="switch">
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dt>U+003D EQUALS SIGN (=)</dt>
- <dd><span>Parse error</span>. Treat it as per the "anything else"
- entry below.</dd>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+ <dd>Create a new end tag token, set its tag name to the lowercase
+ version of the input character (add 0x0020 to the character's
+ code point), then switch to the <span>tag name
+ state</span>. (Don't emit the token yet; further details will be
+ filled in before it is emitted.)</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
-
- <dt>Anything else</dt>
- <dd>Start a new attribute in the current tag token. Set that
- attribute's name to the current input character, and its value to
- the empty string. Switch to the <span>attribute name
- state</span>.</dd>
+ <dt>U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z</dt>
+ <dd>Create a new end tag token, set its tag name to the input
+ character, then switch to the <span>tag name state</span>. (Don't
+ emit the token yet; further details will be filled in before it
+ is emitted.)</dd>
- </dl>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Switch to the <span>data
+ state</span>.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit a U+003C LESS-THAN SIGN
+ character token and a U+002F SOLIDUS character token. Reconsume
+ the EOF character in the <span>data state</span>.</dd>
- <dt><dfn>Attribute name state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Switch to the <span>bogus
+ comment state</span>.</dd>
- <dd>
+ </dl>
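
As an illustration only (not part of the commit), the RCDATA/CDATA check at the top of the close tag open state can be sketched in Python. The names `ascii_lower`, `TERMINATORS` and `end_tag_matches` are assumptions for this sketch, and passing `None` for the tag name is a hypothetical way to model the fragment case where no start tag token has been emitted.

```python
# Characters that may follow the tag name (EOF is handled separately).
TERMINATORS = {"\t", "\n", "\f", " ", ">", "/"}


def ascii_lower(s):
    """Lowercase only A-Z by adding 0x0020 to the code point."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in s)


def end_tag_matches(rest, last_start_tag_name):
    """Decide whether the input after "</" closes the last start tag.

    rest is the remaining input following "</"; last_start_tag_name is
    None when no start tag token has been emitted (fragment case).
    """
    if last_start_tag_name is None:
        return False
    n = len(last_start_tag_name)
    if ascii_lower(rest[:n]) != ascii_lower(last_start_tag_name):
        return False
    # Reaching the end of the input right after the name counts as EOF.
    return len(rest) == n or rest[n] in TERMINATORS
```

When the function returns false, the text above says to emit the "<" and "/" character tokens and return to the data state; when it returns true, the switch that follows applies.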
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>Tag name state</dfn></h5>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>after attribute name state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+003D EQUALS SIGN (=)</dt>
- <dd>Switch to the <span>before attribute value state</span>.</dd>
+ <dl class="switch">
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>before attribute name state</span>.</dd>
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Append the lowercase version of the current input character
- (add 0x0020 to the character's code point) to the current
- attribute's name. Stay in the <span>attribute name
- state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>self-closing start tag state</span>.</dd>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+ <dd>Append the lowercase version of the current input character
+ (add 0x0020 to the character's code point) to the current tag
+ token's tag name. Stay in the <span>tag name state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd><span>Parse error</span>. Treat it as per the "anything else"
- entry below.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>self-closing start tag state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current attribute's
- name. Stay in the <span>attribute name state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current tag token's
+ tag name. Stay in the <span>tag name state</span>.</dd>
- </dl>
+ </dl>
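
As an illustration only (not part of the commit), the tag name state can be sketched as a small Python loop. The function name `run_tag_name_state`, its return shape, and the strings used to label the next state are assumptions for this sketch; a real tokeniser would emit tokens rather than return labels.

```python
WHITESPACE = {"\t", "\n", "\f", " "}


def run_tag_name_state(text, pos, name):
    """Consume characters from text[pos:], extending the tag name.

    Returns (name, next_state, pos). The state labels are hypothetical
    strings chosen for this sketch.
    """
    while pos < len(text):
        ch = text[pos]
        pos += 1
        if ch in WHITESPACE:
            return name, "before attribute name", pos
        if ch == ">":
            return name, "data (emit tag)", pos
        if ch == "/":
            return name, "self-closing start tag", pos
        if "A" <= ch <= "Z":
            # Lowercase by adding 0x0020 to the code point.
            ch = chr(ord(ch) + 0x20)
        name += ch
    # Ran out of input: EOF is a parse error, and the tag is still emitted.
    return name, "data (parse error, emit tag)", pos
```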
- <p>When the user agent leaves the attribute name state (and before
- emitting the tag token, if appropriate), the complete attribute's
- name must be compared to the other attributes on the same token;
- if there is already an attribute on the token with the exact same
- name, then this is a <span>parse error</span> and the new
- attribute must be dropped, along with the value that gets
- associated with it (if any).</p>
- </dd>
+ <h5><dfn>Before attribute name state</dfn></h5>
- <dt><dfn>After attribute name state</dfn></dt>
+ <p>Consume the <span>next input character</span>:</p>
- <dd>
+ <dl class="switch">
- <p>Consume the <span>next input character</span>:</p>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>before attribute name state</span>.</dd>
- <dl class="switch">
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>after attribute name state</span>.</dd>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+ <dd>Start a new attribute in the current tag token. Set that
+ attribute's name to the lowercase version of the current input
+ character (add 0x0020 to the character's code point), and its
+ value to the empty string. Switch to the <span>attribute name
+ state</span>.</dd>
- <dt>U+003D EQUALS SIGN (=)</dt>
- <dd>Switch to the <span>before attribute value state</span>.</dd>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>self-closing start tag state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dt>U+003D EQUALS SIGN (=)</dt>
+ <dd><span>Parse error</span>. Treat it as per the "anything else"
+ entry below.</dd>
- <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
- <dd>Start a new attribute in the current tag token. Set that
- attribute's name to the lowercase version of the current input character
- (add 0x0020 to the character's code point), and its value to
- the empty string. Switch to the <span>attribute name
- state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>self-closing start tag state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Start a new attribute in the current tag token. Set that
+ attribute's name to the current input character, and its value to
+ the empty string. Switch to the <span>attribute name
+ state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
+ </dl>
- <dt>Anything else</dt>
- <dd>Start a new attribute in the current tag token. Set that
- attribute's name to the current input character, and its value to
- the empty string. Switch to the <span>attribute name
- state</span>.</dd>
- </dl>
+ <h5><dfn>Attribute name state</dfn></h5>
- </dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt><dfn>Before attribute value state</dfn></dt>
+ <dl class="switch">
- <dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>after attribute name state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>U+003D EQUALS SIGN (=)</dt>
+ <dd>Switch to the <span>before attribute value state</span>.</dd>
- <dl class="switch">
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>before attribute value state</span>.</dd>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+ <dd>Append the lowercase version of the current input character
+ (add 0x0020 to the character's code point) to the current
+ attribute's name. Stay in the <span>attribute name
+ state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Switch to the <span>attribute value (double-quoted) state</span>.</dd>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>self-closing start tag state</span>.</dd>
- <dt>U+0026 AMPERSAND (&)</dt>
- <dd>Switch to the <span>attribute value (unquoted) state</span>
- and reconsume this input character.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd><span>Parse error</span>. Treat it as per the "anything else"
+ entry below.</dd>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Switch to the <span>attribute value (single-quoted) state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current attribute's
+ name. Stay in the <span>attribute name state</span>.</dd>
- <dt>U+003D EQUALS SIGN (=)</dt>
- <dd><span>Parse error</span>. Treat it as per the "anything else"
- entry below.</dd>
+ </dl>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the character in the <span>data
- state</span>.</dd>
+ <p>When the user agent leaves the attribute name state (and before
+ emitting the tag token, if appropriate), the complete attribute's
+ name must be compared to the other attributes on the same token;
+ if there is already an attribute on the token with the exact same
+ name, then this is a <span>parse error</span> and the new
+ attribute must be dropped, along with the value that gets
+ associated with it (if any).</p>
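
As an illustration only (not part of the commit), the duplicate attribute rule above can be sketched in Python. The function name `add_attribute` and the list-of-pairs representation of a tag token's attributes are assumptions for this sketch.

```python
def add_attribute(attributes, name, value):
    """Add (name, value) to a token's attribute list unless name exists.

    Returns True when the attribute was dropped, which corresponds to a
    parse error in the text above. The comparison is an exact match.
    """
    if any(existing == name for existing, _ in attributes):
        return True
    attributes.append((name, value))
    return False
```

The first attribute with a given name wins, so a later duplicate never overwrites the earlier value.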
- <dt>Anything else</dt>
- <dd>Append the current input character to the current attribute's
- value. Switch to the <span>attribute value (unquoted)
- state</span>.</dd>
- </dl>
+ <h5><dfn>After attribute name state</dfn></h5>
- </dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt><dfn>Attribute value (double-quoted) state</dfn></dt>
+ <dl class="switch">
- <dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>after attribute name state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>U+003D EQUALS SIGN (=)</dt>
+ <dd>Switch to the <span>before attribute value state</span>.</dd>
- <dl class="switch">
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Switch to the <span>after attribute value (quoted)
- state</span>.</dd>
+ <dt>U+0041 LATIN CAPITAL LETTER A through to U+005A LATIN CAPITAL LETTER Z</dt>
+ <dd>Start a new attribute in the current tag token. Set that
+ attribute's name to the lowercase version of the current input character
+ (add 0x0020 to the character's code point), and its value to
+ the empty string. Switch to the <span>attribute name
+ state</span>.</dd>
- <dt>U+0026 AMPERSAND (&)</dt>
- <dd>Switch to the <span>character reference in attribute value
- state</span>, with the <span>additional allowed character</span>
- being U+0022 QUOTATION MARK (").</dd>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>self-closing start tag state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the character in the <span>data
- state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current attribute's
- value. Stay in the <span>attribute value (double-quoted)
- state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Start a new attribute in the current tag token. Set that
+ attribute's name to the current input character, and its value to
+ the empty string. Switch to the <span>attribute name
+ state</span>.</dd>
- </dl>
+ </dl>
- </dd>
- <dt><dfn>Attribute value (single-quoted) state</dfn></dt>
+ <h5><dfn>Before attribute value state</dfn></h5>
- <dd>
+ <p>Consume the <span>next input character</span>:</p>
- <p>Consume the <span>next input character</span>:</p>
+ <dl class="switch">
- <dl class="switch">
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>before attribute value state</span>.</dd>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Switch to the <span>after attribute value (quoted)
- state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Switch to the <span>attribute value (double-quoted) state</span>.</dd>
- <dt>U+0026 AMPERSAND (&)</dt>
- <dd>Switch to the <span>character reference in attribute value
- state</span>, with the <span>additional allowed character</span>
- being U+0027 APOSTROPHE (').</dd>
+ <dt>U+0026 AMPERSAND (&)</dt>
+ <dd>Switch to the <span>attribute value (unquoted) state</span>
+ and reconsume this input character.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the character in the <span>data
- state</span>.</dd>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Switch to the <span>attribute value (single-quoted) state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current attribute's
- value. Stay in the <span>attribute value (single-quoted)
- state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- </dl>
+ <dt>U+003D EQUALS SIGN (=)</dt>
+ <dd><span>Parse error</span>. Treat it as per the "anything else"
+ entry below.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the character in the <span>data
+ state</span>.</dd>
- <dt><dfn>Attribute value (unquoted) state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current attribute's
+ value. Switch to the <span>attribute value (unquoted)
+ state</span>.</dd>
- <dd>
+ </dl>
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>Attribute value (double-quoted) state</dfn></h5>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>before attribute name state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+0026 AMPERSAND (&)</dt>
- <dd>Switch to the <span>character reference in attribute value
- state</span>, with no <span>additional allowed
- character</span>.</dd>
+ <dl class="switch">
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Switch to the <span>after attribute value (quoted)
+ state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dt>U+003D EQUALS SIGN (=)</dt>
- <dd><span>Parse error</span>. Treat it as per the "anything else"
- entry below.</dd>
+ <dt>U+0026 AMPERSAND (&)</dt>
+ <dd>Switch to the <span>character reference in attribute value
+ state</span>, with the <span>additional allowed character</span>
+ being U+0022 QUOTATION MARK (").</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the character in the <span>data
- state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the character in the <span>data
+ state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current attribute's
- value. Stay in the <span>attribute value (unquoted)
- state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current attribute's
+ value. Stay in the <span>attribute value (double-quoted)
+ state</span>.</dd>
- </dl>
+ </dl>
- </dd>
- <dt><dfn>Character reference in attribute value state</dfn></dt>
+ <h5><dfn>Attribute value (single-quoted) state</dfn></h5>
- <dd>
+ <p>Consume the <span>next input character</span>:</p>
- <p>Attempt to <span>consume a character reference</span>.</p>
+ <dl class="switch">
- <p>If nothing is returned, append a U+0026 AMPERSAND character to
- the current attribute's value.</p>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Switch to the <span>after attribute value (quoted)
+ state</span>.</dd>
- <p>Otherwise, append the returned character token to the current
- attribute's value.</p>
+ <dt>U+0026 AMPERSAND (&)</dt>
+ <dd>Switch to the <span>character reference in attribute value
+ state</span>, with the <span>additional allowed character</span>
+ being U+0027 APOSTROPHE (').</dd>
- <p>Finally, switch back to the attribute value state that you were
- in when you were switched into this state.</p>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the character in the <span>data
+ state</span>.</dd>
- </dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current attribute's
+ value. Stay in the <span>attribute value (single-quoted)
+ state</span>.</dd>
- <dt><dfn>After attribute value (quoted) state</dfn></dt>
+ </dl>
- <dd>
- <p>Consume the <span>next input character</span>:</p>
+ <h5><dfn>Attribute value (unquoted) state</dfn></h5>
- <dl class="switch">
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>before attribute name state</span>.</dd>
+ <dl class="switch">
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>before attribute name state</span>.</dd>
- <dt>U+002F SOLIDUS (/)</dt>
- <dd>Switch to the <span>self-closing start tag state</span>.</dd>
+ <dt>U+0026 AMPERSAND (&)</dt>
+ <dd>Switch to the <span>character reference in attribute value
+ state</span>, with no <span>additional allowed
+ character</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Reconsume the character in
- the <span>before attribute name state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dt>U+003D EQUALS SIGN (=)</dt>
+ <dd><span>Parse error</span>. Treat it as per the "anything else"
+ entry below.</dd>
- </dl>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- </dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current attribute's
+ value. Stay in the <span>attribute value (unquoted)
+ state</span>.</dd>
- <dt><dfn>Self-closing start tag state</dfn></dt>
+ </dl>
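
The unquoted attribute value state added above can be read as a single-step transition function. The following is a minimal Python sketch, not part of the commit: the function name, the action strings, and the use of None as an EOF sentinel are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

EOF = None  # hypothetical sentinel for end of file
WHITESPACE = frozenset("\t\n\f ")  # tab, LF, FF, space (CR is commented out in the spec text)


def attr_value_unquoted_step(ch: Optional[str], value: List[str]) -> Tuple[str, List[str]]:
    """Process one input character in the attribute value (unquoted) state.

    `value` is the current attribute's value as a list of characters and is
    mutated in place. Returns (next_state, actions); the action names are
    illustrative labels, not spec terms.
    """
    if ch is EOF:
        return "data", ["parse-error", "emit-tag", "reconsume"]
    if ch in WHITESPACE:
        return "before attribute name", []
    if ch == "&":
        # No additional allowed character in the unquoted case.
        return "character reference in attribute value", []
    if ch == ">":
        return "data", ["emit-tag"]
    actions: List[str] = []
    if ch in ('"', "'", "="):
        # Parse error, then treated as per the "anything else" entry.
        actions.append("parse-error")
    value.append(ch)
    return "attribute value (unquoted)", actions
```

Returning the actions as data rather than performing them keeps the sketch independent of any particular token or error-reporting interface.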
- <dd>
- <p>Consume the <span>next input character</span>:</p>
+ <h5><dfn>Character reference in attribute value state</dfn></h5>
- <dl class="switch">
+ <p>Attempt to <span>consume a character reference</span>.</p>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Set the <i>self-closing flag</i> of the current tag
- token. Emit the current tag token. Switch to the <span>data
- state</span>.</dd>
+ <p>If nothing is returned, append a U+0026 AMPERSAND character to
+ the current attribute's value.</p>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the current tag
- token. Reconsume the EOF character in the <span>data
- state</span>.</dd>
-
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Reconsume the character in
- the <span>before attribute name state</span>.</dd>
+ <p>Otherwise, append the returned character token to the current
+ attribute's value.</p>
- </dl>
+ <p>Finally, switch back to the attribute value state that you were
+ in when you were switched into this state.</p>
- </dd>
- <dt><dfn>Bogus comment state</dfn></dt>
+ <h5><dfn>After attribute value (quoted) state</dfn></h5>
- <dd>
+ <p>Consume the <span>next input character</span>:</p>
- <p><em>(This can only happen if the <span>content model
- flag</span> is set to the PCDATA state.)</em></p>
+ <dl class="switch">
- <p>Consume every character up to and including the first U+003E
- GREATER-THAN SIGN character (>) or the end of the file (EOF),
- whichever comes first. Emit a comment token whose data is the
- concatenation of all the characters starting from and including
- the character that caused the state machine to switch into the
- bogus comment state, up to and including the character immediately
- before the last consumed character (i.e. up to the character just
- before the U+003E or EOF character). (If the comment was started
- by the end of the file (EOF), the token is empty.)</p>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>before attribute name state</span>.</dd>
- <p>Switch to the <span>data state</span>.</p>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- <p>If the end of the file was reached, reconsume the EOF
- character.</p>
+ <dt>U+002F SOLIDUS (/)</dt>
+ <dd>Switch to the <span>self-closing start tag state</span>.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt><dfn>Markup declaration open state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Reconsume the character in
+ the <span>before attribute name state</span>.</dd>
- <dd>
+ </dl>
- <p><em>(This can only happen if the <span>content model
- flag</span> is set to the PCDATA state.)</em></p>
- <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
- characters, consume those two characters, create a comment token
- whose data is the empty string, and switch to the <span>comment
- start state</span>.</p>
+ <h5><dfn>Self-closing start tag state</dfn></h5>
- <p>Otherwise, if the next seven characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "DOCTYPE", then consume those characters and switch
- to the <span>DOCTYPE state</span>.</p>
+ <p>Consume the <span>next input character</span>:</p>
- <p>Otherwise, if the <span>insertion mode</span> is "<span
- title="insertion mode: in foreign content">in foreign
- content</span>" and the <span>current node</span> is not an
- element in the <span>HTML namespace</span> and the next seven
- characters are a <span>case-sensitive</span><!-- XXX xref, ascii
- only --> match for the string "[CDATA[" (the five uppercase
- letters "CDATA" with a U+005B LEFT SQUARE BRACKET character before
- and after), then consume those characters and switch to the
- <span>CDATA section state</span> (which is unrelated to the
- <span>content model flag</span>'s CDATA state).</p>
+ <dl class="switch">
- <p>Otherwise, this is a <span>parse error</span>. Switch to the
- <span>bogus comment state</span>. The next character that is
- consumed, if any, is the first character that will be in the
- comment.</p>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Set the <i>self-closing flag</i> of the current tag
+ token. Emit the current tag token. Switch to the <span>data
+ state</span>.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the current tag
+ token. Reconsume the EOF character in the <span>data
+ state</span>.</dd>
- <dt><dfn>Comment start state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Reconsume the character in
+ the <span>before attribute name state</span>.</dd>
- <dd>
+ </dl>
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>Bogus comment state</dfn></h5>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>Switch to the <span>comment start dash state</span>.</dd>
+ <p><em>(This can only happen if the <span>content model
+ flag</span> is set to the PCDATA state.)</em></p>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Emit the comment token. Switch to
- the <span>data state</span>.</dd>
+ <p>Consume every character up to and including the first U+003E
+ GREATER-THAN SIGN character (>) or the end of the file (EOF),
+ whichever comes first. Emit a comment token whose data is the
+ concatenation of all the characters starting from and including
+ the character that caused the state machine to switch into the
+ bogus comment state, up to and including the character immediately
+ before the last consumed character (i.e. up to the character just
+ before the U+003E or EOF character). (If the comment was started
+ by the end of the file (EOF), the token is empty.)</p>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the comment token. Reconsume
- the EOF character in the <span>data state</span>.</dd>
+ <p>Switch to the <span>data state</span>.</p>
- <dt>Anything else</dt>
- <dd>Append the input character to the comment token's
- data. Switch to the <span>comment state</span>.</dd>
+ <p>If the end of the file was reached, reconsume the EOF
+ character.</p>
- </dl>
- </dd>
+ <h5><dfn>Markup declaration open state</dfn></h5>
- <dt><dfn>Comment start dash state</dfn></dt>
+ <p><em>(This can only happen if the <span>content model
+ flag</span> is set to the PCDATA state.)</em></p>
- <dd>
+ <p>If the next two characters are both U+002D HYPHEN-MINUS (-)
+ characters, consume those two characters, create a comment token
+ whose data is the empty string, and switch to the <span>comment
+ start state</span>.</p>
- <p>Consume the <span>next input character</span>:</p>
+ <p>Otherwise, if the next seven characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match
+ for the word "DOCTYPE", then consume those characters and switch
+ to the <span>DOCTYPE state</span>.</p>
- <dl class="switch">
+ <p>Otherwise, if the <span>insertion mode</span> is "<span
+ title="insertion mode: in foreign content">in foreign
+ content</span>" and the <span>current node</span> is not an
+ element in the <span>HTML namespace</span> and the next seven
+ characters are a <span>case-sensitive</span><!-- XXX xref, ascii
+ only --> match for the string "[CDATA[" (the five uppercase
+ letters "CDATA" with a U+005B LEFT SQUARE BRACKET character before
+ and after), then consume those characters and switch to the
+ <span>CDATA section state</span> (which is unrelated to the
+ <span>content model flag</span>'s CDATA state).</p>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>Switch to the <span>comment end state</span></dd>
+ <p>Otherwise, this is a <span>parse error</span>. Switch to the
+ <span>bogus comment state</span>. The next character that is
+ consumed, if any, is the first character that will be in the
+ comment.</p>
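
The lookahead order in the markup declaration open state can be sketched as follows. This is an assumption-laden Python illustration: `in_foreign_non_html` is a hypothetical flag standing in for the combined "in foreign content" insertion mode and non-HTML current node test, and `pos` is assumed to point just past the "<!" characters.

```python
from typing import Tuple


def markup_declaration_open(text: str, pos: int,
                            in_foreign_non_html: bool = False) -> Tuple[str, int]:
    """Return (next_state, new_position) for the markup declaration open state."""
    if text.startswith("--", pos):
        # The caller creates a comment token whose data is the empty string.
        return "comment start", pos + 2
    word = text[pos:pos + 7]
    # ASCII-only, case-insensitive comparison against "DOCTYPE".
    if word.isascii() and word.upper() == "DOCTYPE":
        return "DOCTYPE", pos + 7
    # The CDATA check is case-sensitive and only applies in foreign content.
    if in_foreign_non_html and text.startswith("[CDATA[", pos):
        return "CDATA section", pos + 7
    # Parse error: nothing is consumed, so the next character consumed
    # becomes the first character of the bogus comment.
    return "bogus comment", pos
```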
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Emit the comment token. Switch to
- the <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the comment token. Reconsume
- the EOF character in the <span>data state</span>.</dd>
+ <h5><dfn>Comment start state</dfn></h5>
- <dt>Anything else</dt>
- <dd>Append a U+002D HYPHEN-MINUS (-) character and the input
- character to the comment token's data. Switch to the
- <span>comment state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- </dl>
+ <dl class="switch">
- </dd>
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd>Switch to the <span>comment start dash state</span>.</dd>
- <dt><dfn id="comment">Comment state</dfn></dt>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Switch to
+ the <span>data state</span>.</dd>
- <dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Reconsume
+ the EOF character in the <span>data state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>Anything else</dt>
+ <dd>Append the input character to the comment token's
+ data. Switch to the <span>comment state</span>.</dd>
- <dl class="switch">
+ </dl>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>Switch to the <span>comment end dash state</span></dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the comment token. Reconsume
- the EOF character in the <span>data state</span>.</dd> <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <h5><dfn>Comment start dash state</dfn></h5>
- <dt>Anything else</dt>
- <dd>Append the input character to the comment token's data. Stay
- in the <span>comment state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- </dl>
+ <dl class="switch">
- </dd>
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd>Switch to the <span>comment end state</span>.</dd>
- <dt><dfn>Comment end dash state</dfn></dt>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Switch to
+ the <span>data state</span>.</dd>
- <dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Reconsume
+ the EOF character in the <span>data state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>Anything else</dt>
+ <dd>Append a U+002D HYPHEN-MINUS (-) character and the input
+ character to the comment token's data. Switch to the
+ <span>comment state</span>.</dd>
- <dl class="switch">
+ </dl>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd>Switch to the <span>comment end state</span></dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the comment token. Reconsume
- the EOF character in the <span>data state</span>.</dd> <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <h5><dfn id="comment">Comment state</dfn></h5>
- <dt>Anything else</dt>
- <dd>Append a U+002D HYPHEN-MINUS (-) character and the input
- character to the comment token's data. Switch to the
- <span>comment state</span>.</dd>
- </dl>
+ <p>Consume the <span>next input character</span>:</p>
- </dd>
+ <dl class="switch">
- <dt><dfn>Comment end state</dfn></dt>
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd>Switch to the <span>comment end dash state</span>.</dd>
- <dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Reconsume
+ the EOF character in the <span>data state</span>.</dd> <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <p>Consume the <span>next input character</span>:</p>
+ <dt>Anything else</dt>
+ <dd>Append the input character to the comment token's data. Stay
+ in the <span>comment state</span>.</dd>
- <dl class="switch">
+ </dl>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the comment token. Switch to the <span>data
- state</span>.</dd>
- <dt>U+002D HYPHEN-MINUS (-)</dt>
- <dd><span>Parse error</span>. Append a U+002D HYPHEN-MINUS
- (-) character to the comment token's data. Stay in the
- <span>comment end state</span>.</dd>
+ <h5><dfn>Comment end dash state</dfn></h5>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Emit the comment token. Reconsume
- the EOF character in the <span>data state</span>.</dd> <!-- For
- security reasons: otherwise, hostile user could put a <script> in
- a comment e.g. in a blog comment and then DOS the server so that
- the end tag isn't read, and then the commented <script> tag would
- be treated as live code -->
+ <p>Consume the <span>next input character</span>:</p>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Append two U+002D HYPHEN-MINUS (-)
- characters and the input character to the comment token's
- data. Switch to the <span>comment state</span>.</dd>
+ <dl class="switch">
- </dl>
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd>Switch to the <span>comment end state</span>.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Reconsume
+ the EOF character in the <span>data state</span>.</dd> <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <dt><dfn>DOCTYPE state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd>Append a U+002D HYPHEN-MINUS (-) character and the input
+ character to the comment token's data. Switch to the
+ <span>comment state</span>.</dd>
- <dd>
+ </dl>
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>Comment end state</dfn></h5>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>before DOCTYPE name state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Reconsume the current
- character in the <span>before DOCTYPE name state</span>.</dd>
+ <dl class="switch">
- </dl>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the comment token. Switch to the <span>data
+ state</span>.</dd>
- </dd>
+ <dt>U+002D HYPHEN-MINUS (-)</dt>
+ <dd><span>Parse error</span>. Append a U+002D HYPHEN-MINUS
+ (-) character to the comment token's data. Stay in the
+ <span>comment end state</span>.</dd>
- <dt><dfn>Before DOCTYPE name state</dfn></dt>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Emit the comment token. Reconsume
+ the EOF character in the <span>data state</span>.</dd> <!-- For
+ security reasons: otherwise, hostile user could put a <script> in
+ a comment e.g. in a blog comment and then DOS the server so that
+ the end tag isn't read, and then the commented <script> tag would
+ be treated as live code -->
- <dd>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Append two U+002D HYPHEN-MINUS (-)
+ characters and the input character to the comment token's
+ data. Switch to the <span>comment state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ </dl>
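
The five comment states above (comment start, comment start dash, comment, comment end dash, comment end) can be exercised together in a small loop. The Python sketch below is illustrative only: it assumes the input is a complete string, that `pos` points just past "<!--", and that counting parse errors is enough; the function name and short state labels are hypothetical.

```python
from typing import List, Tuple


def tokenize_comment(text: str, pos: int) -> Tuple[str, int, int]:
    """Run the comment states from the comment start state.

    Returns (comment_data, next_position, parse_error_count). On EOF the
    position is left at len(text) so EOF can be reconsumed in the data state.
    """
    data: List[str] = []
    errors = 0
    state = "start"
    while True:
        if pos >= len(text):
            # EOF is a parse error in every comment state; emit what we have.
            return "".join(data), pos, errors + 1
        ch = text[pos]
        pos += 1
        if state == "start":
            if ch == "-":
                state = "start dash"
            elif ch == ">":
                return "".join(data), pos, errors + 1
            else:
                data.append(ch)
                state = "comment"
        elif state == "start dash":
            if ch == "-":
                state = "end"
            elif ch == ">":
                return "".join(data), pos, errors + 1
            else:
                data.append("-" + ch)
                state = "comment"
        elif state == "comment":
            if ch == "-":
                state = "end dash"
            else:
                data.append(ch)
        elif state == "end dash":
            if ch == "-":
                state = "end"
            else:
                data.append("-" + ch)
                state = "comment"
        else:  # comment end state
            if ch == ">":
                return "".join(data), pos, errors
            errors += 1
            if ch == "-":
                data.append("-")  # stay in the comment end state
            else:
                data.append("--" + ch)
                state = "comment"
```

For example, under these assumptions `tokenize_comment("<!--a--b-->", 4)` yields the data "a--b" with one parse error, because "--" followed by a character other than ">" is flagged but kept in the comment.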
- <dl class="switch">
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>before DOCTYPE name state</span>.</dd>
+ <h5><dfn>DOCTYPE state</dfn></h5>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Create a new DOCTYPE token. Set its
- <i>force-quirks flag</i> to <i>on</i>. Emit the token. Switch to
- the <span>data state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Create a new DOCTYPE token. Set its
- <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume
- the EOF character in the <span>data state</span>.</dd>
+ <dl class="switch">
- <dt>Anything else</dt>
- <dd>Create a new DOCTYPE token. Set the token's name to the
- current input character. Switch to the <span>DOCTYPE name
- state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>before DOCTYPE name state</span>.</dd>
- </dl>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Reconsume the current
+ character in the <span>before DOCTYPE name state</span>.</dd>
- </dd>
+ </dl>
- <dt><dfn>DOCTYPE name state</dfn></dt>
- <dd>
+ <h5><dfn>Before DOCTYPE name state</dfn></h5>
- <p>First, consume the <span>next input character</span>:</p>
+ <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <dl class="switch">
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Switch to the <span>after DOCTYPE name state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>before DOCTYPE name state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current DOCTYPE token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Create a new DOCTYPE token. Set its
+ <i>force-quirks flag</i> to <i>on</i>. Emit the token. Switch to
+ the <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Create a new DOCTYPE token. Set its
+ <i>force-quirks flag</i> to <i>on</i>. Emit the token. Reconsume
+ the EOF character in the <span>data state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current DOCTYPE
- token's name. Stay in the <span>DOCTYPE name state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Create a new DOCTYPE token. Set the token's name to the
+ current input character. Switch to the <span>DOCTYPE name
+ state</span>.</dd>
- </dl>
+ </dl>
- </dd>
- <dt><dfn>After DOCTYPE name state</dfn></dt>
+ <h5><dfn>DOCTYPE name state</dfn></h5>
- <dd>
+ <p>First, consume the <span>next input character</span>:</p>
- <p>Consume the <span>next input character</span>:</p>
+ <dl class="switch">
- <dl class="switch">
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Switch to the <span>after DOCTYPE name state</span>.</dd>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>after DOCTYPE name state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current DOCTYPE token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current DOCTYPE token. Switch to the <span>data
- state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current DOCTYPE
+ token's name. Stay in the <span>DOCTYPE name state</span>.</dd>
- <dt>Anything else</dt>
- <dd>
+ </dl>
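
The DOCTYPE state, before DOCTYPE name state, and DOCTYPE name state above can be sketched as one loop. This Python illustration assumes the input is a complete string and that `pos` points just past the word "DOCTYPE"; the token is modelled as a plain dict with hypothetical keys `name` and `force_quirks`.

```python
from typing import Optional, Tuple

WHITESPACE = frozenset("\t\n\f ")  # tab, LF, FF, space


def read_doctype_name(text: str, pos: int) -> Tuple[dict, str, int, int]:
    """Return (token, next_state, next_position, parse_error_count)."""
    token: dict = {}
    errors = 0
    state = "DOCTYPE"
    while True:
        ch: Optional[str] = text[pos] if pos < len(text) else None  # None = EOF
        if state == "DOCTYPE":
            if ch is not None and ch in WHITESPACE:
                pos += 1
            else:
                errors += 1  # parse error; the character is reconsumed
            state = "before DOCTYPE name"
        elif state == "before DOCTYPE name":
            if ch is None or ch == ">":
                errors += 1
                token = {"name": None, "force_quirks": True}
                if ch == ">":
                    pos += 1  # '>' is consumed; EOF is left for the data state
                return token, "data", pos, errors
            pos += 1
            if ch not in WHITESPACE:
                token = {"name": ch, "force_quirks": False}
                state = "DOCTYPE name"
        else:  # DOCTYPE name state
            if ch is None:
                errors += 1
                token["force_quirks"] = True
                return token, "data", pos, errors
            pos += 1
            if ch in WHITESPACE:
                return token, "after DOCTYPE name", pos, errors
            if ch == ">":
                return token, "data", pos, errors
            token["name"] += ch
```

The sketch stops as soon as the name is complete, so handling of PUBLIC and SYSTEM identifiers is left to whatever implements the after DOCTYPE name state.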
- <p>If the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "PUBLIC", then consume those characters and switch
- to the <span>before DOCTYPE public identifier state</span>.</p>
- <p>Otherwise, if the next six characters are a
- <span>case-insensitive</span><!-- XXX xref, ascii only --> match
- for the word "SYSTEM", then consume those characters and switch
- to the <span>before DOCTYPE system identifier state</span>.</p>
+ <h5><dfn>After DOCTYPE name state</dfn></h5>
- <p>Otherwise, this is the <span>parse error</span>. Set the
- DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to
- the <span>bogus DOCTYPE state</span>.</p>
+ <p>Consume the <span>next input character</span>:</p>
- </dd>
+ <dl class="switch">
- </dl>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>after DOCTYPE name state</span>.</dd>
- </dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current DOCTYPE token. Switch to the <span>data
+ state</span>.</dd>
- <dt><dfn>Before DOCTYPE public identifier state</dfn></dt>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>Anything else</dt>
<dd>
- <p>Consume the <span>next input character</span>:</p>
+ <p>If the next six characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match
+ for the word "PUBLIC", then consume those characters and switch
+ to the <span>before DOCTYPE public identifier state</span>.</p>
- <dl class="switch">
+ <p>Otherwise, if the next six characters are a
+ <span>case-insensitive</span><!-- XXX xref, ascii only --> match
+ for the word "SYSTEM", then consume those characters and switch
+ to the <span>before DOCTYPE system identifier state</span>.</p>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>before DOCTYPE public identifier state</span>.</dd>
+ <p>Otherwise, this is a <span>parse error</span>. Set the
+ DOCTYPE token's <i>force-quirks flag</i> to <i>on</i>. Switch to
+ the <span>bogus DOCTYPE state</span>.</p>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Set the DOCTYPE token's public identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE public identifier
- (double-quoted) state</span>.</dd>
+ </dd>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Set the DOCTYPE token's public identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE public identifier
- (single-quoted) state</span>.</dd>
+ </dl>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <h5><dfn>Before DOCTYPE public identifier state</dfn></h5>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
- DOCTYPE state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- </dl>
+ <dl class="switch">
- </dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>before DOCTYPE public identifier state</span>.</dd>
- <dt><dfn>DOCTYPE public identifier (double-quoted) state</dfn></dt>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Set the DOCTYPE token's public identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE public identifier
+ (double-quoted) state</span>.</dd>
- <dd>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Set the DOCTYPE token's public identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE public identifier
+ (single-quoted) state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- <dl class="switch">
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Switch to the <span>after DOCTYPE public identifier state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
+ DOCTYPE state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
+ </dl>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current DOCTYPE
- token's public identifier. Stay in the <span>DOCTYPE public
- identifier (double-quoted) state</span>.</dd>
+ <h5><dfn>DOCTYPE public identifier (double-quoted) state</dfn></h5>
- </dl>
+ <p>Consume the <span>next input character</span>:</p>
- </dd>
+ <dl class="switch">
- <dt><dfn>DOCTYPE public identifier (single-quoted) state</dfn></dt>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Switch to the <span>after DOCTYPE public identifier state</span>.</dd>
- <dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dl class="switch">
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current DOCTYPE
+ token's public identifier. Stay in the <span>DOCTYPE public
+ identifier (double-quoted) state</span>.</dd>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Switch to the <span>after DOCTYPE public identifier state</span>.</dd>
+ </dl>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <h5><dfn>DOCTYPE public identifier (single-quoted) state</dfn></h5>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current DOCTYPE
- token's public identifier. Stay in the <span>DOCTYPE public
- identifier (single-quoted) state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- </dl>
+ <dl class="switch">
- </dd>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Switch to the <span>after DOCTYPE public identifier state</span>.</dd>
- <dt><dfn>After DOCTYPE public identifier state</dfn></dt>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- <dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current DOCTYPE
+ token's public identifier. Stay in the <span>DOCTYPE public
+ identifier (single-quoted) state</span>.</dd>
- <dl class="switch">
+ </dl>
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>after DOCTYPE public identifier state</span>.</dd>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Set the DOCTYPE token's system identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE system identifier
- (double-quoted) state</span>.</dd>
+ <h5><dfn>After DOCTYPE public identifier state</dfn></h5>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Set the DOCTYPE token's system identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE system identifier
- (single-quoted) state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current DOCTYPE token. Switch to the <span>data
- state</span>.</dd>
+ <dl class="switch">
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>after DOCTYPE public identifier state</span>.</dd>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
- DOCTYPE state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Set the DOCTYPE token's system identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE system identifier
+ (double-quoted) state</span>.</dd>
- </dl>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Set the DOCTYPE token's system identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE system identifier
+ (single-quoted) state</span>.</dd>
- </dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current DOCTYPE token. Switch to the <span>data
+ state</span>.</dd>
- <dt><dfn>Before DOCTYPE system identifier state</dfn></dt>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dd>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
+ DOCTYPE state</span>.</dd>
- <p>Consume the <span>next input character</span>:</p>
+ </dl>
- <dl class="switch">
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>before DOCTYPE system identifier state</span>.</dd>
+ <h5><dfn>Before DOCTYPE system identifier state</dfn></h5>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Set the DOCTYPE token's system identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE system identifier
- (double-quoted) state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Set the DOCTYPE token's system identifier to the empty string
- (not missing), then switch to the <span>DOCTYPE system identifier
- (single-quoted) state</span>.</dd>
+ <dl class="switch">
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>before DOCTYPE system identifier state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Set the DOCTYPE token's system identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE system identifier
+ (double-quoted) state</span>.</dd>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
- DOCTYPE state</span>.</dd>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Set the DOCTYPE token's system identifier to the empty string
+ (not missing), then switch to the <span>DOCTYPE system identifier
+ (single-quoted) state</span>.</dd>
- </dl>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- </dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dt><dfn>DOCTYPE system identifier (double-quoted) state</dfn></dt>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Switch to the <span>bogus
+ DOCTYPE state</span>.</dd>
- <dd>
+ </dl>
- <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <h5><dfn>DOCTYPE system identifier (double-quoted) state</dfn></h5>
- <dt>U+0022 QUOTATION MARK (")</dt>
- <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
+ <dl class="switch">
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>U+0022 QUOTATION MARK (")</dt>
+ <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current DOCTYPE
- token's system identifier. Stay in the <span>DOCTYPE system
- identifier (double-quoted) state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- </dl>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- </dd>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current DOCTYPE
+ token's system identifier. Stay in the <span>DOCTYPE system
+ identifier (double-quoted) state</span>.</dd>
- <dt><dfn>DOCTYPE system identifier (single-quoted) state</dfn></dt>
+ </dl>
- <dd>
- <p>Consume the <span>next input character</span>:</p>
+ <h5><dfn>DOCTYPE system identifier (single-quoted) state</dfn></h5>
- <dl class="switch">
+ <p>Consume the <span>next input character</span>:</p>
- <dt>U+0027 APOSTROPHE (')</dt>
- <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
+ <dl class="switch">
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
- token. Switch to the <span>data state</span>.</dd>
+ <dt>U+0027 APOSTROPHE (')</dt>
+ <dd>Switch to the <span>after DOCTYPE system identifier state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE
+ token. Switch to the <span>data state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Append the current input character to the current DOCTYPE
- token's system identifier. Stay in the <span>DOCTYPE system
- identifier (single-quoted) state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- </dl>
+ <dt>Anything else</dt>
+ <dd>Append the current input character to the current DOCTYPE
+ token's system identifier. Stay in the <span>DOCTYPE system
+ identifier (single-quoted) state</span>.</dd>
- </dd>
+ </dl>
- <dt><dfn>After DOCTYPE system identifier state</dfn></dt>
- <dd>
+ <h5><dfn>After DOCTYPE system identifier state</dfn></h5>
- <p>Consume the <span>next input character</span>:</p>
+ <p>Consume the <span>next input character</span>:</p>
- <dl class="switch">
+ <dl class="switch">
- <dt>U+0009 CHARACTER TABULATION</dt>
- <dt>U+000A LINE FEED (LF)</dt>
- <dt>U+000C FORM FEED (FF)</dt>
- <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
- <dt>U+0020 SPACE</dt>
- <dd>Stay in the <span>after DOCTYPE system identifier state</span>.</dd>
+ <dt>U+0009 CHARACTER TABULATION</dt>
+ <dt>U+000A LINE FEED (LF)</dt>
+ <dt>U+000C FORM FEED (FF)</dt>
+ <!--<dt>U+000D CARRIAGE RETURN (CR)</dt>-->
+ <dt>U+0020 SPACE</dt>
+ <dd>Stay in the <span>after DOCTYPE system identifier state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the current DOCTYPE token. Switch to the <span>data
- state</span>.</dd>
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the current DOCTYPE token. Switch to the <span>data
+ state</span>.</dd>
- <dt>EOF</dt>
- <dd><span>Parse error</span>. Set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
- Reconsume the EOF character in the <span>data state</span>.</dd>
+ <dt>EOF</dt>
+ <dd><span>Parse error</span>. Set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>. Emit that DOCTYPE token.
+ Reconsume the EOF character in the <span>data state</span>.</dd>
- <dt>Anything else</dt>
- <dd><span>Parse error</span>. Switch to the <span>bogus DOCTYPE
- state</span>. (This does <em>not</em> set the DOCTYPE token's
- <i>force-quirks flag</i> to <i>on</i>.)</dd>
+ <dt>Anything else</dt>
+ <dd><span>Parse error</span>. Switch to the <span>bogus DOCTYPE
+ state</span>. (This does <em>not</em> set the DOCTYPE token's
+ <i>force-quirks flag</i> to <i>on</i>.)</dd>
- </dl>
+ </dl>
- </dd>
- <dt><dfn>Bogus DOCTYPE state</dfn></dt>
+ <h5><dfn>Bogus DOCTYPE state</dfn></h5>
- <dd>
+ <p>Consume the <span>next input character</span>:</p>
- <p>Consume the <span>next input character</span>:</p>
+ <dl class="switch">
- <dl class="switch">
+ <dt>U+003E GREATER-THAN SIGN (>)</dt>
+ <dd>Emit the DOCTYPE token. Switch to the <span>data
+ state</span>.</dd>
- <dt>U+003E GREATER-THAN SIGN (>)</dt>
- <dd>Emit the DOCTYPE token. Switch to the <span>data
- state</span>.</dd>
+ <dt>EOF</dt>
+ <dd>Emit the DOCTYPE token. Reconsume the EOF character in the
+ <span>data state</span>.</dd>
- <dt>EOF</dt>
- <dd>Emit the DOCTYPE token. Reconsume the EOF character in the
- <span>data state</span>.</dd>
+ <dt>Anything else</dt>
+ <dd>Stay in the <span>bogus DOCTYPE state</span>.</dd>
- <dt>Anything else</dt>
- <dd>Stay in the <span>bogus DOCTYPE state</span>.</dd>
+ </dl>
- </dl>
- </dd>
+ <h5><dfn>CDATA section state</dfn></h5>
- <dt><dfn>CDATA section state</dfn></dt>
+ <p><em>(This can only happen if the <span>content model
+ flag</span> is set to the PCDATA state, and is unrelated to the
+ <span>content model flag</span>'s CDATA state.)</em></p>
- <dd>
+ <p>Consume every character up to the next occurrence of the three
+ character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
+ BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or
+ the end of the file (EOF), whichever comes first. Emit a series of
+ text tokens consisting of all the characters consumed except the
+ matching three character sequence at the end (if one was found
+ before the end of the file).</p>
- <p><em>(This can only happen if the <span>content model
- flag</span> is set to the PCDATA state, and is unrelated to the
- <span>content model flag</span>'s CDATA state.)</em></p>
+ <p>Switch to the <span>data state</span>.</p>
- <p>Consume every character up to the next occurrence of the three
- character sequence U+005D RIGHT SQUARE BRACKET U+005D RIGHT SQUARE
- BRACKET U+003E GREATER-THAN SIGN (<code title="">]]></code>), or
- the end of the file (EOF), whichever comes first. Emit a series of
- text tokens consisting of all the characters consumed except the
- matching three character sequence at the end (if one was found
- before the end of the file).</p>
+ <p>If the end of the file was reached, reconsume the EOF
+ character.</p>
- <p>Switch to the <span>data state</span>.</p>
- <p>If the end of the file was reached, reconsume the EOF
- character.</p>
- </dd>
-
- </dl>
-
-
<h5>Tokenizing character references</h5>
<p>This section defines how to <dfn>consume a character
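For readers following the restructured states, the DOCTYPE system identifier portion of the diff above can be sketched as a small state machine. This is only an illustrative reading of the "+" side of the patch, not code from the specification or from any parser. The function name `tokenize_system_identifier`, the short state labels, and the returned tuple are all hypothetical. The sketch assumes the tokeniser has already reached the "before DOCTYPE system identifier state", models a "missing" identifier as `None`, and models EOF as running off the end of the string.

```python
WHITESPACE = {"\t", "\n", "\f", " "}  # TAB, LF, FF, SPACE (CR is commented out in the spec)
EOF = None                            # assumption: EOF is modelled as None


def tokenize_system_identifier(text):
    """Hypothetical helper: run the system identifier states over `text`.

    Returns (system_identifier, force_quirks, parse_errors, unconsumed_rest).
    """
    state = "before"       # before DOCTYPE system identifier state
    system_id = None       # None models "missing", "" models "empty string"
    force_quirks = False
    errors = 0
    pos = 0
    while True:
        ch = text[pos] if pos < len(text) else EOF
        pos += 1           # consume the next input character
        if state == "before":
            if ch in WHITESPACE:
                pass       # stay in this state
            elif ch == '"':
                system_id, state = "", "dq"
            elif ch == "'":
                system_id, state = "", "sq"
            elif ch == ">":
                errors += 1
                force_quirks = True
                break      # emit the token, switch to the data state
            elif ch is EOF:
                errors += 1
                force_quirks = True
                pos -= 1   # reconsume EOF in the data state
                break
            else:
                errors += 1
                force_quirks = True
                state = "bogus"
        elif state in ("dq", "sq"):
            closing = '"' if state == "dq" else "'"
            if ch == closing:
                state = "after"
            elif ch == ">":
                errors += 1
                force_quirks = True
                break
            elif ch is EOF:
                errors += 1
                force_quirks = True
                pos -= 1
                break
            else:
                system_id += ch
        elif state == "after":
            if ch in WHITESPACE:
                pass
            elif ch == ">":
                break      # clean end: no parse error
            elif ch is EOF:
                errors += 1
                force_quirks = True
                pos -= 1
                break
            else:
                errors += 1        # parse error, but force-quirks is NOT set here
                state = "bogus"
        else:                      # bogus DOCTYPE state
            if ch == ">":
                break
            elif ch is EOF:
                pos -= 1
                break
    return system_id, force_quirks, errors, text[pos:]
```

Under these assumptions, `tokenize_system_identifier("'x' junk>")` reports one parse error but leaves the force-quirks flag off, which mirrors the parenthetical note in the "after DOCTYPE system identifier state".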