[html5] r1312 - /
whatwg at whatwg.org
Sun Mar 2 18:26:33 PST 2008
Author: ianh
Date: 2008-03-02 18:26:32 -0800 (Sun, 02 Mar 2008)
New Revision: 1312
Modified:
index
source
Log:
[act] (2) Merged phases and insertion modes. Theoretically, this should make absolutely no difference. Please let me know the many ways in which I screwed up.
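A minimal sketch of what the merge amounts to, assuming a tree builder that dispatches every token on a single "insertion mode" value instead of tracking a phase and an insertion mode separately. It models only the new "after after body" mode as worded in this revision, with "in body" reduced to a stub. The names TreeBuilder, Token, body_log and the handler methods are hypothetical and do not come from the specification.

```python
# Illustrative sketch only: class, field and method names are assumptions,
# not identifiers from the specification.
from dataclasses import dataclass, field

# Whitespace characters listed in the rule (CR is commented out in the spec text).
WHITESPACE = {"\t", "\n", "\x0b", "\x0c", " "}


@dataclass
class Token:
    kind: str       # "doctype", "comment", "character", "start", "end" or "eof"
    data: str = ""  # tag name, comment data, or the single character


@dataclass
class TreeBuilder:
    mode: str = "after after body"
    document: list = field(default_factory=list)  # nodes appended to the Document
    body_log: list = field(default_factory=list)  # tokens handed to the "in body" stub
    errors: int = 0
    stopped: bool = False

    def process(self, token):
        # One dispatch table keyed on the insertion mode replaces the old
        # "which phase, then which insertion mode" lookup.
        handlers = {
            "in body": self.in_body,
            "after after body": self.after_after_body,
        }
        handlers[self.mode](token)

    def in_body(self, token):
        # Stub: the real "in body" rules are far longer; just record the token.
        self.body_log.append(token)

    def after_after_body(self, token):
        if token.kind == "eof":
            self.stopped = True  # "Stop parsing"
        elif token.kind == "comment":
            self.document.append(("comment", token.data))
        elif (token.kind == "doctype"
              or (token.kind == "character" and token.data in WHITESPACE)
              or (token.kind == "start" and token.data == "html")):
            # Process as if the mode had been "in body"; the mode is unchanged.
            self.in_body(token)
        else:
            # Parse error: set the mode to "in body" and reprocess the token.
            self.errors += 1
            self.mode = "in body"
            self.process(token)
```

Feeding the builder a comment, then a space, then a start tag such as "p" exercises the three non-EOF branches: the comment lands on the document, the space is passed to the stub without a mode change, and the start tag counts one parse error and leaves the builder in "in body".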
Modified: index
===================================================================
--- index 2008-03-03 01:11:51 UTC (rev 1311)
+++ index 2008-03-03 02:26:32 UTC (rev 1312)
@@ -1602,82 +1602,81 @@
<li><a href="#tree-construction"><span class=secno>8.2.4 </span>Tree
construction</a>
<ul class=toc>
- <li><a href="#the-main"><span class=secno>8.2.4.1. </span>The main
- phase</a>
- <ul class=toc>
- <li><a href="#the-stack"><span class=secno>8.2.4.1.1. </span>The
- stack of open elements</a>
+ <li><a href="#the-stack"><span class=secno>8.2.4.1. </span>The stack
+ of open elements</a>
- <li><a href="#the-list"><span class=secno>8.2.4.1.2. </span>The
- list of active formatting elements</a>
+ <li><a href="#the-list"><span class=secno>8.2.4.2. </span>The list
+ of active formatting elements</a>
- <li><a href="#creating"><span class=secno>8.2.4.1.3.
- </span>Creating and inserting HTML elements</a>
+ <li><a href="#creating"><span class=secno>8.2.4.3. </span>Creating
+ and inserting HTML elements</a>
- <li><a href="#closing"><span class=secno>8.2.4.1.4. </span>Closing
- elements that have implied end tags</a>
+ <li><a href="#closing"><span class=secno>8.2.4.4. </span>Closing
+ elements that have implied end tags</a>
- <li><a href="#the-element"><span class=secno>8.2.4.1.5. </span>The
- element pointers</a>
+ <li><a href="#the-element"><span class=secno>8.2.4.5. </span>The
+ element pointers</a>
- <li><a href="#the-insertion"><span class=secno>8.2.4.1.6.
- </span>The insertion mode</a>
- </ul>
+ <li><a href="#the-insertion"><span class=secno>8.2.4.6. </span>The
+ insertion mode</a>
- <li><a href="#the-initial"><span class=secno>8.2.4.2. </span>The
- initial phase</a>
+ <li><a href="#the-initial"><span class=secno>8.2.4.7. </span>The
+ initial insertion mode</a>
- <li><a href="#the-root0"><span class=secno>8.2.4.3. </span>The root
- element phase</a>
+ <li><a href="#the-root0"><span class=secno>8.2.4.8. </span>The root
+ element insertion mode</a>
- <li><a href="#the-before"><span class=secno>8.2.4.4. </span>The
+ <li><a href="#the-before"><span class=secno>8.2.4.9. </span>The
before head insertion mode</a>
- <li><a href="#parsing-main-inhead"><span class=secno>8.2.4.5.
+ <li><a href="#parsing-main-inhead"><span class=secno>8.2.4.10.
</span>The in head insertion mode</a>
<li><a href="#parsing-main-inheadnoscript"><span
- class=secno>8.2.4.6. </span>The in head noscript insertion mode</a>
-
+ class=secno>8.2.4.11. </span>The in head noscript insertion
+ mode</a>
- <li><a href="#the-after"><span class=secno>8.2.4.7. </span>The after
- head insertion mode</a>
+ <li><a href="#the-after"><span class=secno>8.2.4.12. </span>The
+ after head insertion mode</a>
- <li><a href="#parsing-main-inbody"><span class=secno>8.2.4.8.
+ <li><a href="#parsing-main-inbody"><span class=secno>8.2.4.13.
</span>The in body insertion mode</a>
- <li><a href="#parsing-main-intable"><span class=secno>8.2.4.9.
+ <li><a href="#parsing-main-intable"><span class=secno>8.2.4.14.
</span>The in table insertion mode</a>
- <li><a href="#parsing-main-incaption"><span class=secno>8.2.4.10.
+ <li><a href="#parsing-main-incaption"><span class=secno>8.2.4.15.
</span>The in caption insertion mode</a>
- <li><a href="#parsing-main-incolgroup"><span class=secno>8.2.4.11.
+ <li><a href="#parsing-main-incolgroup"><span class=secno>8.2.4.16.
</span>The in column group insertion mode</a>
- <li><a href="#parsing-main-intbody"><span class=secno>8.2.4.12.
+ <li><a href="#parsing-main-intbody"><span class=secno>8.2.4.17.
</span>The in table body insertion mode</a>
- <li><a href="#parsing-main-intr"><span class=secno>8.2.4.13.
+ <li><a href="#parsing-main-intr"><span class=secno>8.2.4.18.
</span>The in row insertion mode</a>
- <li><a href="#parsing-main-intd"><span class=secno>8.2.4.14.
+ <li><a href="#parsing-main-intd"><span class=secno>8.2.4.19.
</span>The in cell insertion mode</a>
- <li><a href="#parsing-main-inselect"><span class=secno>8.2.4.15.
+ <li><a href="#parsing-main-inselect"><span class=secno>8.2.4.20.
</span>The in select insertion mode</a>
- <li><a href="#parsing-main-afterbody"><span class=secno>8.2.4.16.
+ <li><a href="#parsing-main-afterbody"><span class=secno>8.2.4.21.
</span>The after body insertion mode</a>
- <li><a href="#parsing-main-inframeset"><span class=secno>8.2.4.17.
+ <li><a href="#parsing-main-inframeset"><span class=secno>8.2.4.22.
</span>The in frameset insertion mode</a>
<li><a href="#parsing-main-afterframeset"><span
- class=secno>8.2.4.18. </span>The after frameset insertion mode</a>
+ class=secno>8.2.4.23. </span>The after frameset insertion mode</a>
- <li><a href="#the-trailing"><span class=secno>8.2.4.19. </span>The
- trailing end phase</a>
+ <li><a href="#the-after0"><span class=secno>8.2.4.24. </span>The
+ after after body insertion mode</a>
+
+ <li><a href="#the-after1"><span class=secno>8.2.4.25. </span>The
+ after after frameset insertion mode</a>
</ul>
<li><a href="#the-unexpected"><span class=secno>8.2.5 </span>The
@@ -40436,13 +40435,9 @@
is created. The "output" of this stage consists of dynamically modifying
or extending that document's DOM tree.
- <p>Tree construction passes through several phases. Initially, UAs must act
- according to the steps described as being those of <a
- href="#the-initial0">the initial phase</a>.
-
<p>This specification does not define when an interactive user agent has to
- render the <code>Document</code> available to the user, or when it has to
- begin accepting user input.
+ render the <code>Document</code> so that it is available to the user, or
+ when it has to begin accepting user input.
<p>When the steps below require the UA to <dfn id=append>append a
character</dfn> to a node, the UA must collect it and all subsequent
@@ -40474,37 +40469,39 @@
href="#hardwareLimitations">practical concerns</a> will likely force user
agents to impose nesting depths.
- <h5 id=the-main><span class=secno>8.2.4.1. </span><dfn id=the-main0>The
- main phase</dfn></h5>
-
- <p>After <a href="#the-root1">the root element phase</a>, each token
- emitted from the <a href="#tokenisation0">tokenisation</a> stage must be
- processed as described in <em>this</em> section. This is by far the most
- involved part of parsing an HTML document.
-
- <p>The tree construction stage in this phase has several pieces of state: a
- <a href="#stack">stack of open elements</a>, a <a href="#list-of4">list of
+ <p>The tree construction stage has several pieces of state: a <a
+ href="#stack">stack of open elements</a>, a <a href="#list-of4">list of
active formatting elements</a>, a <a href="#head-element"><code
title="">head</code> element pointer</a>, a <a href="#form-element"><code
title="">form</code> element pointer</a>, and an <a
href="#insertion0">insertion mode</a>.
- <p class=big-issue>We could just fold insertion modes and phases into one
- concept (and duplicate the two rules common to all insertion modes into
- all of them).
+ <p>As each token is emitted from the tokeniser, the user agent must process
+ the token according to the rules given in the section corresponding to the
+ current <a href="#insertion0">insertion mode</a>.
- <h6 id=the-stack><span class=secno>8.2.4.1.1. </span>The stack of open
- elements</h6>
+ <h5 id=the-stack><span class=secno>8.2.4.1. </span>The stack of open
+ elements</h5>
- <p>Initially the <dfn id=stack>stack of open elements</dfn> contains just
- the <code><a href="#html">html</a></code> root element node created in the
- <a href="#the-root1" title="the root element phase">last phase</a> before
- switching to <em>this</em> phase (or, in the <a href="#fragment">fragment
- case</a>, the <code><a href="#html">html</a></code> element created as
- part of <a href="#html-fragment0" title="html fragment parsing
- algorithm">that algorithm</a>). That's the topmost node of the stack. It
- never gets popped off the stack. (This stack grows downwards.)
+ <p>Initially the <dfn id=stack>stack of open elements</dfn> is empty.
+ <p>The <a href="#root-element0" title="insertion mode: root element">root
+ element insertion mode</a> creates the <code><a
+ href="#html">html</a></code> root element node, which is then added to the
+ stack.
+
+ <p>In the <a href="#fragment">fragment case</a>, the <a href="#stack">stack
+ of open elements</a> is initialised to contain an <code><a
+ href="#html">html</a></code> element that is created as part of <a
+ href="#html-fragment0" title="html fragment parsing algorithm">that
+ algorithm</a>. (The <a href="#fragment">fragment case</a> skips the <a
+ href="#root-element0" title="insertion mode: root element">root element
+ insertion mode</a>.)
+
+ <p>The <code><a href="#html">html</a></code> node, however it is created,
+ is the topmost node of the stack. It never gets popped off the stack.
+ (This stack grows downwards.)
+
<p>The <dfn id=current4>current node</dfn> is the bottommost node in this
stack.
@@ -40653,8 +40650,8 @@
misnested formatting elements</a>), the stack is manipulated in a
random-access fashion.
- <h6 id=the-list><span class=secno>8.2.4.1.2. </span>The list of active
- formatting elements</h6>
+ <h5 id=the-list><span class=secno>8.2.4.2. </span>The list of active
+ formatting elements</h5>
<p>Initially the <dfn id=list-of4>list of active formatting elements</dfn>
is empty. It is used to handle mis-nested <a href="#formatting"
@@ -40742,8 +40739,8 @@
<li>Go to step 1.
</ol>
- <h6 id=creating><span class=secno>8.2.4.1.3. </span>Creating and inserting
- HTML elements</h6>
+ <h5 id=creating><span class=secno>8.2.4.3. </span>Creating and inserting
+ HTML elements</h5>
<p>When the steps below require the UA to <dfn id=create title="create an
element for the token">create an element for a token</dfn>, the UA must
@@ -40814,8 +40811,8 @@
error</a>.
</ol>
- <h6 id=closing><span class=secno>8.2.4.1.4. </span>Closing elements that
- have implied end tags</h6>
+ <h5 id=closing><span class=secno>8.2.4.4. </span>Closing elements that have
+ implied end tags</h5>
<p>When the steps below require the UA to <dfn id=generate>generate implied
end tags</dfn>, then, if the <a href="#current4">current node</a> is a
@@ -40835,7 +40832,7 @@
element to exclude from the process, then the UA must perform the above
steps as if that element was not in the above list.
- <h6 id=the-element><span class=secno>8.2.4.1.5. </span>The element pointers</h6>
+ <h5 id=the-element><span class=secno>8.2.4.5. </span>The element pointers</h5>
<p>Initially the <dfn id=head-element><code title="">head</code> element
pointer</dfn> and the <dfn id=form-element><code title="">form</code>
@@ -40851,12 +40848,13 @@
associate with forms in the face of dramatically bad markup, for
historical reasons.
- <h6 id=the-insertion><span class=secno>8.2.4.1.6. </span>The insertion mode</h6>
+ <h5 id=the-insertion><span class=secno>8.2.4.6. </span>The insertion mode</h5>
<p>Initially the <dfn id=insertion0>insertion mode</dfn> is "<a
- href="#before4" title="insertion mode: before head">before head</a>". It
- can change to "<a href="#in-head" title="insertion mode: in head">in
- head</a>", "<a href="#in-head0" title="insertion mode: in head
+ href="#initial" title="insertion mode: initial">initial</a>". It can
+ change to "<a href="#root-element0" title="insertion mode: root
+ element">root element</a>", "<a href="#in-head" title="insertion mode: in
+ head">in head</a>", "<a href="#in-head0" title="insertion mode: in head
noscript">in head noscript</a>", "<a href="#after4" title="insertion mode:
after head">after head</a>", "<a href="#in-body" title="insertion mode: in
body">in body</a>", "<a href="#in-table" title="insertion mode: in
@@ -40869,15 +40867,12 @@
title="insertion mode: in select">in select</a>", "<a href="#after5"
title="insertion mode: after body">after body</a>", "<a
href="#in-frameset" title="insertion mode: in frameset">in frameset</a>",
- and "<a href="#after6" title="insertion mode: after frameset">after
- frameset</a>" during the course of the parsing, as described below. It
- affects how certain tokens are processed.
+ "<a href="#after6" title="insertion mode: after frameset">after
+ frameset</a>", "<a href="#after7" title="insertion mode: after after
+ body">after after body</a>", and "<a href="#after8" title="insertion mode:
+ after after frameset">after after frameset</a>" during the course of the
+ parsing, as described below. It affects how certain tokens are processed.
- <p>If the tree construction stage is switched from <a href="#the-main0">the
- main phase</a> to <a href="#the-trailing0">the trailing end phase</a> and
- back again, the various pieces of state are not reset; the UA must act as
- if the state was maintained.
-
<p>When the steps below require the UA to <dfn id=reset>reset the insertion
mode appropriately</dfn>, it means the UA must follow these steps:
@@ -41027,14 +41022,11 @@
</ol>
-->
- <p>`
+ <h5 id=the-initial><span class=secno>8.2.4.7. </span>The <dfn id=initial
+ title="insertion mode: initial">initial</dfn> insertion mode</h5>
- <h5 id=the-initial><span class=secno>8.2.4.2. </span><dfn
- id=the-initial0>The initial phase</dfn></h5>
+ <p>Handle the token as follows:
- <p>Initially, the tree construction stage must handle each token emitted
- from the <a href="#tokenisation0">tokenisation</a> stage as follows:
-
<dl class=switch>
<dt>A character token that is one of one of U+0009 CHARACTER TABULATION,
U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C FORM FEED (FF),
@@ -41335,8 +41327,9 @@
compared to the values given in the lists above in a
case-insensitive<!-- ASCII --> manner.</p>
- <p>Then, switch to <a href="#the-root1">the root element phase</a> of the
- tree construction stage.</p>
+ <p>Then, change the <a href="#insertion0">insertion mode</a> to "<a
+ href="#root-element0" title="insertion mode: root element">root
+ element</a>".</p>
<dt>A start tag token
@@ -41353,16 +41346,16 @@
<p>Set the document to <a href="#quirks">quirks mode</a>.</p>
- <p>Then, switch to <a href="#the-root1">the root element phase</a> of the
- tree construction stage and reprocess the current token.</p>
+ <p>Then, change the <a href="#insertion0">insertion mode</a> to "<a
+ href="#root-element0" title="insertion mode: root element">root
+ element</a>".</p>
</dl>
- <h5 id=the-root0><span class=secno>8.2.4.3. </span><dfn id=the-root1>The
- root element phase</dfn></h5>
+ <h5 id=the-root0><span class=secno>8.2.4.8. </span>The <dfn
+ id=root-element0 title="insertion mode: root element">root element</dfn>
+ insertion mode</h5>
- <p>After <a href="#the-initial0">the initial phase</a>, as each token is
- emitted from the <a href="#tokenisation0">tokenisation</a> stage, it must
- be processed as described in this section.
+ <p>Handle the token as follows:
<dl class=switch>
<dt>A DOCTYPE token
@@ -41407,9 +41400,11 @@
<p>Create an <code><a href="#htmlelement">HTMLElement</a></code> node
with the tag name <code><a href="#html">html</a></code>, in the <a
href="#html-namespace0">HTML namespace</a>. Append it to the
- <code>Document</code> object. Switch to <a href="#the-main0">the main
- phase</a> and reprocess the current token.</p>
+ <code>Document</code> object.</p>
+ <p>Change the <a href="#insertion0">insertion mode</a> to "<a
+ href="#before4" title="insertion mode: before head">before head</a>".</p>
+
<p class=big-issue>Should probably make end tags be ignored, so that
"</head><!-- --><html>" puts the comment before the root node
(or should we?)</p>
@@ -41420,7 +41415,7 @@
content continues being appended to the nodes as described in the next
section.
- <h5 id=the-before><span class=secno>8.2.4.4. </span>The <dfn id=before4
+ <h5 id=the-before><span class=secno>8.2.4.9. </span>The <dfn id=before4
title="insertion mode: before head">before head</dfn> insertion mode</h5>
<p>Handle the token as follows:
@@ -41513,7 +41508,7 @@
after head">after head</a>" <a href="#insertion0">insertion mode</a>.</p>
</dl>
- <h5 id=parsing-main-inhead><span class=secno>8.2.4.5. </span>The <dfn
+ <h5 id=parsing-main-inhead><span class=secno>8.2.4.10. </span>The <dfn
id=in-head title="insertion mode: in head">in head</dfn> insertion mode</h5>
<p>Handle the token as follows.
@@ -41764,7 +41759,7 @@
get put into the head. Do we want to copy that?</p>
</dl>
- <h5 id=parsing-main-inheadnoscript><span class=secno>8.2.4.6. </span>The
+ <h5 id=parsing-main-inheadnoscript><span class=secno>8.2.4.11. </span>The
<dfn id=in-head0 title="insertion mode: in head noscript">in head
noscript</dfn> insertion mode</h5>
@@ -41832,7 +41827,7 @@
name "noscript" had been seen and reprocess the current token.</p>
</dl>
- <h5 id=the-after><span class=secno>8.2.4.7. </span>The <dfn id=after4
+ <h5 id=the-after><span class=secno>8.2.4.12. </span>The <dfn id=after4
title="insertion mode: after head">after head</dfn> insertion mode</h5>
<p>Handle the token as follows:
@@ -41915,7 +41910,7 @@
had been seen, and then reprocess the current token.</p>
</dl>
- <h5 id=parsing-main-inbody><span class=secno>8.2.4.8. </span>The <dfn
+ <h5 id=parsing-main-inbody><span class=secno>8.2.4.13. </span>The <dfn
id=in-body title="insertion mode: in body">in body</dfn> insertion mode</h5>
<p>Handle the token as follows:
@@ -42871,7 +42866,7 @@
</ol>
</dl>
- <h5 id=parsing-main-intable><span class=secno>8.2.4.9. </span>The <dfn
+ <h5 id=parsing-main-intable><span class=secno>8.2.4.14. </span>The <dfn
id=in-table title="insertion mode: in table">in table</dfn> insertion mode</h5>
<dl class=switch>
@@ -43048,7 +43043,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-incaption><span class=secno>8.2.4.10. </span>The <dfn
+ <h5 id=parsing-main-incaption><span class=secno>8.2.4.15. </span>The <dfn
id=in-caption title="insertion mode: in caption">in caption</dfn>
insertion mode</h5>
@@ -43108,7 +43103,7 @@
was "<a href="#in-body" title="insertion mode: in body">in body</a>".</p>
</dl>
- <h5 id=parsing-main-incolgroup><span class=secno>8.2.4.11. </span>The <dfn
+ <h5 id=parsing-main-incolgroup><span class=secno>8.2.4.16. </span>The <dfn
id=in-column title="insertion mode: in column group">in column group</dfn>
insertion mode</h5>
@@ -43184,7 +43179,7 @@
href="#fragment">fragment case</a>.</p>
</dl>
- <h5 id=parsing-main-intbody><span class=secno>8.2.4.12. </span>The <dfn
+ <h5 id=parsing-main-intbody><span class=secno>8.2.4.17. </span>The <dfn
id=in-table0 title="insertion mode: in table body">in table body</dfn>
insertion mode</h5>
@@ -43274,7 +43269,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-intr><span class=secno>8.2.4.13. </span>The <dfn
+ <h5 id=parsing-main-intr><span class=secno>8.2.4.18. </span>The <dfn
id=in-row title="insertion mode: in row">in row</dfn> insertion mode</h5>
<p>Handle the token as follows.
@@ -43363,7 +43358,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-intd><span class=secno>8.2.4.14. </span>The <dfn
+ <h5 id=parsing-main-intd><span class=secno>8.2.4.19. </span>The <dfn
id=in-cell title="insertion mode: in cell">in cell</dfn> insertion mode</h5>
<p>Handle the token as follows.
@@ -43462,7 +43457,7 @@
neither when the <a href="#insertion0">insertion mode</a> is "<a
href="#in-cell" title="insertion mode: in cell">in cell</a>".
- <h5 id=parsing-main-inselect><span class=secno>8.2.4.15. </span>The <dfn
+ <h5 id=parsing-main-inselect><span class=secno>8.2.4.20. </span>The <dfn
id=in-select title="insertion mode: in select">in select</dfn> insertion
mode</h5>
@@ -43581,7 +43576,7 @@
<p><a href="#parse0">Parse error</a>. Ignore the token.</p>
</dl>
- <h5 id=parsing-main-afterbody><span class=secno>8.2.4.16. </span>The <dfn
+ <h5 id=parsing-main-afterbody><span class=secno>8.2.4.21. </span>The <dfn
id=after5 title="insertion mode: after body">after body</dfn> insertion
mode</h5>
@@ -43632,8 +43627,9 @@
an <code><a href="#html">html</a></code> element in this case.) (<a
href="#fragment">fragment case</a>)</p>
- <p>Otherwise, switch to <a href="#the-trailing0">the trailing end
- phase</a>.</p>
+ <p>Then, change the <a href="#insertion0">insertion mode</a> to "<a
+ href="#after7" title="insertion mode: after after body">after after
+ body</a>".</p>
<dt>Anything else
@@ -43643,7 +43639,7 @@
title="insertion mode: in body">in body</a>" and reprocess the token.</p>
</dl>
- <h5 id=parsing-main-inframeset><span class=secno>8.2.4.17. </span>The <dfn
+ <h5 id=parsing-main-inframeset><span class=secno>8.2.4.22. </span>The <dfn
id=in-frameset title="insertion mode: in frameset">in frameset</dfn>
insertion mode</h5>
@@ -43726,7 +43722,7 @@
<p><a href="#parse0">Parse error</a>. Ignore the token.</p>
</dl>
- <h5 id=parsing-main-afterframeset><span class=secno>8.2.4.18. </span>The
+ <h5 id=parsing-main-afterframeset><span class=secno>8.2.4.23. </span>The
<dfn id=after6 title="insertion mode: after frameset">after frameset</dfn>
insertion mode</h5>
@@ -43768,7 +43764,9 @@
<dt>An end tag whose tag name is "html"
<dd>
- <p>Switch to <a href="#the-trailing0">the trailing end phase</a>.</p>
+ <p>Change the <a href="#insertion0">insertion mode</a> to "<a
+ href="#after8" title="insertion mode: after after frameset">after after
+ frameset</a>".</p>
<dt>A start tag whose tag name is "noframes"
@@ -43787,18 +43785,17 @@
that do support frames but want to show the NOFRAMES content. Supporting
the former is easy; supporting the latter is harder.
- <h5 id=the-trailing><span class=secno>8.2.4.19. </span><dfn
- id=the-trailing0>The trailing end phase</dfn></h5>
+ <h5 id=the-after0><span class=secno>8.2.4.24. </span>The <dfn id=after7
+ title="insertion mode: after after body">after after body</dfn> insertion
+ mode</h5>
- <p>After <a href="#the-main0">the main phase</a>, as each token is emitted
- from the <a href="#tokenisation0">tokenisation</a> stage, it must be
- processed as described in this section.
+ <p>Handle the token as follows:
<dl class=switch>
- <dt>A DOCTYPE token
+ <dt>An end-of-file token
<dd>
- <p><a href="#parse0">Parse error</a>. Ignore the token.</p>
+ <p><a href="#stops0">Stop parsing</a>.</p>
<dt>A comment token
@@ -43807,35 +43804,66 @@
with the <code title="">data</code> attribute set to the data given in
the comment token.</p>
+ <dt>A DOCTYPE token
+
<dt>A character token that is one of one of U+0009 CHARACTER TABULATION,
U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C FORM FEED (FF),
<!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE
+ <dt>A start tag whose tag name is "html"
+
<dd>
- <p>Process the token as it would be processed in <a href="#the-main0">the
- main phase</a>.</p>
- <!-- if there was a <body>, the space will go
- into it, otherwise (e.g. if there was a <frameset>) it'll go into
- the <html> node (this is important in case we have "foo</html>
- bar", as we don't want that to become one word) -->
-
+ <p>Process the token as if the <a href="#insertion0">insertion mode</a>
+ had been "<a href="#in-body" title="insertion mode: in body">in
+ body</a>".</p>
- <dt>A character token that is <em>not</em> one of U+0009 CHARACTER
- TABULATION, U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C FORM
- FEED (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE
+ <dt>Anything else
- <dt>A start tag token
+ <dd>
+ <p><a href="#parse0">Parse error</a>. Set the <a
+ href="#insertion0">insertion mode</a> to "<a href="#in-body"
+ title="insertion mode: in body">in body</a>" and reprocess the token.</p>
+ </dl>
- <dt>An end tag token
+ <h5 id=the-after1><span class=secno>8.2.4.25. </span>The <dfn id=after8
+ title="insertion mode: after after frameset">after after frameset</dfn>
+ insertion mode</h5>
- <dd>
- <p><a href="#parse0">Parse error</a>. Switch back to <a
- href="#the-main0">the main phase</a> and reprocess the token.</p>
+ <p>Handle the token as follows:
+ <dl class=switch>
<dt>An end-of-file token
<dd>
<p><a href="#stops0">Stop parsing</a>.</p>
+
+ <dt>A comment token
+
+ <dd>
+ <p>Append a <code>Comment</code> node to the <code>Document</code> object
+ with the <code title="">data</code> attribute set to the data given in
+ the comment token.</p>
+
+ <dt>A DOCTYPE token
+
+ <dt>A character token that is one of one of U+0009 CHARACTER TABULATION,
+ U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C FORM FEED (FF),
+ <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE
+
+ <dt>A start tag whose tag name is "html"
+
+ <dd>
+ <p>Process the token as if the <a href="#insertion0">insertion mode</a>
+ had been "<a href="#in-body" title="insertion mode: in body">in
+ body</a>".</p>
+
+ <dt>Anything else
+
+ <dd>
+ <p><a href="#parse0">Parse error</a>. Set the <a
+ href="#insertion0">insertion mode</a> to "<a href="#in-frameset"
+ title="insertion mode: in frameset">in frameset</a>" and reprocess the
+ token.</p>
</dl>
<h4 id=the-unexpected><span class=secno>8.2.5 </span>The unexpected end</h4>
@@ -44245,11 +44273,6 @@
</dl>
<li>
- <p>Switch the <a href="#html-0">HTML parser</a>'s <a
- href="#tree-construction0">tree construction</a> stage to <a
- href="#the-main0">the main phase</a>.
-
- <li>
<p>Let <var title="">root</var> be a new <code><a
href="#html">html</a></code> element with no attributes.</p>
Modified: source
===================================================================
--- source 2008-03-03 01:11:51 UTC (rev 1311)
+++ source 2008-03-03 02:26:32 UTC (rev 1312)
@@ -37722,13 +37722,9 @@
parser is created. The "output" of this stage consists of
dynamically modifying or extending that document's DOM tree.</p>
- <p>Tree construction passes through several phases. Initially, UAs
- must act according to the steps described as being those of
- <span>the initial phase</span>.</p>
-
<p>This specification does not define when an interactive user agent
- has to render the <code>Document</code> available to the user, or
- when it has to begin accepting user input.</p>
+ has to render the <code>Document</code> so that it is available to
+ the user, or when it has to begin accepting user input.</p>
<p>When the steps below require the UA to <dfn>append a
character</dfn> to a node, the UA must collect it and all subsequent
@@ -37758,36 +37754,36 @@
concerns</a> will likely force user agents to impose nesting
depths.</p>
-
- <h5><dfn>The main phase</dfn></h5>
-
- <p>After <span>the root element phase</span>, each token emitted
- from the <span>tokenisation</span> stage must be processed as
- described in <em>this</em> section. This is by far the most involved
- part of parsing an HTML document.</p>
-
- <p>The tree construction stage in this phase has several pieces of
- state: a <span>stack of open elements</span>, a <span>list of active
+ <p>The tree construction stage has several pieces of state: a
+ <span>stack of open elements</span>, a <span>list of active
formatting elements</span>, a <span><code title="">head</code>
element pointer</span>, a <span><code title="">form</code> element
pointer</span>, and an <span>insertion mode</span>.</p>
- <p class="big-issue">We could just fold insertion modes and phases
- into one concept (and duplicate the two rules common to all
- insertion modes into all of them).</p>
+ <p>As each token is emitted from the tokeniser, the user agent must
+ process the token according to the rules given in the section
+ corresponding to the current <span>insertion mode</span>.</p>
- <h6>The stack of open elements</h6>
+ <h5>The stack of open elements</h5>
- <p>Initially the <dfn>stack of open elements</dfn> contains just the
- <code>html</code> root element node created in the <span title="the
- root element phase">last phase</span> before switching to
- <em>this</em> phase (or, in the <span>fragment case</span>, the
- <code>html</code> element created as part of <span title="html
- fragment parsing algorithm">that algorithm</span>). That's the
- topmost node of the stack. It never gets popped off the stack. (This
- stack grows downwards.)</p>
+ <p>Initially the <dfn>stack of open elements</dfn> is empty.</p>
+ <p>The <span title="insertion mode: root element">root element
+ insertion mode</span> creates the <code>html</code> root element
+ node, which is then added to the stack.</p>
+
+ <p>In the <span>fragment case</span>, the <span>stack of open
+ elements</span> is initialised to contain an <code>html</code>
+ element that is created as part of <span title="html fragment
+ parsing algorithm">that algorithm</span>. (The <span>fragment
+ case</span> skips the <span title="insertion mode: root
+ element">root element insertion mode</span>.)</p>
+
+ <p>The <code>html</code> node, however it is created, is the topmost
+ node of the stack. It never gets popped off the stack. (This stack
+ grows downwards.)</p>
+
<p>The <dfn>current node</dfn> is the bottommost node in this
stack.</p>
@@ -37903,7 +37899,7 @@
the stack is manipulated in a random-access fashion.</p>
- <h6>The list of active formatting elements</h6>
+ <h5>The list of active formatting elements</h5>
<p>Initially the <dfn>list of active formatting elements</dfn> is
empty. It is used to handle mis-nested <span
@@ -38001,7 +37997,7 @@
</ol>
- <h6>Creating and inserting HTML elements</h6>
+ <h5>Creating and inserting HTML elements</h5>
<p>When the steps below require the UA to <dfn title="create an
element for the token">create an element for a token</dfn>, the UA
@@ -38070,7 +38066,7 @@
- <h6>Closing elements that have implied end tags</h6>
+ <h5>Closing elements that have implied end tags</h5>
<p>When the steps below require the UA to <dfn>generate implied end
tags</dfn>, then, if the <span>current node</span> is a
@@ -38088,7 +38084,7 @@
list.</p>
- <h6>The element pointers</h6>
+ <h5>The element pointers</h5>
<p>Initially the <dfn><code title="">head</code> element
pointer</dfn> and the <dfn><code title="">form</code> element
@@ -38105,32 +38101,31 @@
markup, for historical reasons.</p>
- <h6>The insertion mode</h6>
+ <h5>The insertion mode</h5>
<p>Initially the <dfn>insertion mode</dfn> is "<span
- title="insertion mode: before head">before head</span>". It can
- change to "<span title="insertion mode: in head">in head</span>",
- "<span title="insertion mode: in head noscript">in head
- noscript</span>", "<span title="insertion mode: after head">after
- head</span>", "<span title="insertion mode: in body">in
- body</span>", "<span title="insertion mode: in table">in
- table</span>", "<span title="insertion mode: in caption">in
- caption</span>", "<span title="insertion mode: in column group">in
- column group</span>", "<span title="insertion mode: in table
- body">in table body</span>", "<span title="insertion mode: in
- row">in row</span>", "<span title="insertion mode: in cell">in
- cell</span>", "<span title="insertion mode: in select">in
- select</span>", "<span title="insertion mode: after body">after
- body</span>", "<span title="insertion mode: in frameset">in
- frameset</span>", and "<span title="insertion mode: after
- frameset">after frameset</span>" during the course of the parsing,
- as described below. It affects how certain tokens are processed.</p>
+ title="insertion mode: initial">initial</span>". It can change to
+ "<span title="insertion mode: root element">root element</span>",
+ "<span title="insertion mode: in head">in head</span>", "<span
+ title="insertion mode: in head noscript">in head noscript</span>",
+ "<span title="insertion mode: after head">after head</span>", "<span
+ title="insertion mode: in body">in body</span>", "<span
+ title="insertion mode: in table">in table</span>", "<span
+ title="insertion mode: in caption">in caption</span>", "<span
+ title="insertion mode: in column group">in column group</span>",
+ "<span title="insertion mode: in table body">in table body</span>",
+ "<span title="insertion mode: in row">in row</span>", "<span
+ title="insertion mode: in cell">in cell</span>", "<span
+ title="insertion mode: in select">in select</span>", "<span
+ title="insertion mode: after body">after body</span>", "<span
+ title="insertion mode: in frameset">in frameset</span>", "<span
+ title="insertion mode: after frameset">after frameset</span>",
+ "<span title="insertion mode: after after body">after after
+ body</span>", and "<span title="insertion mode: after after
+ frameset">after after frameset</span>" during the course of the
+ parsing, as described below. It affects how certain tokens are
+ processed.</p>
- <p>If the tree construction stage is switched from <span>the main
- phase</span> to <span>the trailing end phase</span> and back again,
- the various pieces of state are not reset; the UA must act as if the
- state was maintained.</p>
-
<p>When the steps below require the UA to <dfn>reset the insertion
mode appropriately</dfn>, it means the UA must follow these
steps:</p>
@@ -38277,13 +38272,12 @@
</ol>
-->
-`
- <h5><dfn>The initial phase</dfn></h5>
- <p>Initially, the tree construction stage must handle each token
- emitted from the <span>tokenisation</span> stage as follows:</p>
+ <h5>The <dfn title="insertion mode: initial">initial</dfn> insertion mode</h5>
+ <p>Handle the token as follows:</p>
+
<dl class="switch">
<dt>A character token that is one of one of U+0009 CHARACTER
@@ -38428,8 +38422,8 @@
be compared to the values given in the lists above in a
case-insensitive<!-- ASCII --> manner.</p>
- <p>Then, switch to <span>the root element phase</span> of the tree
- construction stage.</p>
+ <p>Then, change the <span>insertion mode</span> to "<span
+ title="insertion mode: root element">root element</span>".</p>
</dd>
@@ -38445,19 +38439,17 @@
<p>Set the document to <span>quirks mode</span>.</p>
- <p>Then, switch to <span>the root element phase</span> of the tree
- construction stage and reprocess the current token.</p>
+ <p>Then, change the <span>insertion mode</span> to "<span
+ title="insertion mode: root element">root element</span>".</p>
</dd>
</dl>
- <h5><dfn>The root element phase</dfn></h5>
+ <h5>The <dfn title="insertion mode: root element">root element</dfn> insertion mode</h5>
- <p>After <span>the initial phase</span>, as each token is emitted
- from the <span>tokenisation</span> stage, it must be processed as
- described in this section.</p>
+ <p>Handle the token as follows:</p>
<dl class="switch">
@@ -38499,9 +38491,11 @@
<p>Create an <code>HTMLElement</code> node with the tag name
<code>html</code>, in the <span>HTML namespace</span>. Append it
- to the <code>Document</code> object. Switch to <span>the main
- phase</span> and reprocess the current token.</p>
+ to the <code>Document</code> object.</p>
+ <p>Change the <span>insertion mode</span> to "<span
+ title="insertion mode: before head">before head</span>".</p>
+
<p class="big-issue">Should probably make end tags be ignored, so
that "</head><!-- --><html>" puts the comment before the
root node (or should we?)</p>
@@ -40911,8 +40905,9 @@
be an <code>html</code> element in this case.)
(<span>fragment case</span>)</p>
- <p>Otherwise, switch to <span>the trailing end
- phase</span>.</p>
+ <p>Then, change the <span>insertion mode</span> to "<span
+ title="insertion mode: after after body">after after
+ body</span>".</p>
</dd>
@@ -41056,7 +41051,9 @@
<dt>An end tag whose tag name is "html"</dt>
<dd>
- <p>Switch to <span>the trailing end phase</span>.</p>
+ <p>Change the <span>insertion mode</span> to "<span
+ title="insertion mode: after after frameset">after after
+ frameset</span>".</p>
</dd>
<dt>A start tag whose tag name is "noframes"</dt>
@@ -41078,17 +41075,15 @@
harder.</p>
- <h5><dfn>The trailing end phase</dfn></h5>
+ <h5>The <dfn title="insertion mode: after after body">after after body</dfn> insertion mode</h5>
- <p>After <span>the main phase</span>, as each token is emitted from
- the <span>tokenisation</span> stage, it must be processed as
- described in this section.</p>
+ <p>Handle the token as follows:</p>
<dl class="switch">
- <dt>A DOCTYPE token</dt>
+ <dt>An end-of-file token</dt>
<dd>
- <p><span>Parse error</span>. Ignore the token.</p>
+ <p><span>Stop parsing</span>.</p>
</dd>
<dt>A comment token</dt>
@@ -41098,33 +41093,63 @@
data given in the comment token.</p>
</dd>
+ <dt>A DOCTYPE token</dt>
<dt>A character token that is one of one of U+0009 CHARACTER
TABULATION, U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C
FORM FEED (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020
SPACE</dt>
+ <dt>A start tag whose tag name is "html"</dt>
<dd>
- <p>Process the token as it would be processed in <span>the main
- phase</span>.</p> <!-- if there was a <body>, the space will go
- into it, otherwise (e.g. if there was a <frameset>) it'll go into
- the <html> node (this is important in case we have "foo</html>
- bar", as we don't want that to become one word) -->
+ <p>Process the token as if the <span>insertion mode</span> had
+ been "<span title="insertion mode: in body">in body</span>".</p>
</dd>
- <dt>A character token that is <em>not</em> one of U+0009 CHARACTER
- TABULATION, U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C
- FORM FEED (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020 SPACE</dt>
- <dt>A start tag token</dt>
- <dt>An end tag token</dt>
+ <dt>Anything else</dt>
<dd>
- <p><span>Parse error</span>. Switch back to <span>the main
- phase</span> and reprocess the token.</p>
+ <p><span>Parse error</span>. Set the <span>insertion mode</span>
+ to "<span title="insertion mode: in body">in body</span>" and
+ reprocess the token.</p>
</dd>
+ </dl>
+
+
+ <h5>The <dfn title="insertion mode: after after frameset">after after frameset</dfn> insertion mode</h5>
+
+ <p>Handle the token as follows:</p>
+
+ <dl class="switch">
+
<dt>An end-of-file token</dt>
<dd>
<p><span>Stop parsing</span>.</p>
</dd>
+
+ <dt>A comment token</dt>
+ <dd>
+ <p>Append a <code>Comment</code> node to the <code>Document</code>
+ object with the <code title="">data</code> attribute set to the
+ data given in the comment token.</p>
+ </dd>
+ <dt>A DOCTYPE token</dt>
+ <dt>A character token that is one of one of U+0009 CHARACTER
+ TABULATION, U+000A LINE FEED (LF), U+000B LINE TABULATION, U+000C
+ FORM FEED (FF), <!--U+000D CARRIAGE RETURN (CR),--> or U+0020
+ SPACE</dt>
+ <dt>A start tag whose tag name is "html"</dt>
+ <dd>
+ <p>Process the token as if the <span>insertion mode</span> had
+ been "<span title="insertion mode: in body">in body</span>".</p>
+ </dd>
+
+ <dt>Anything else</dt>
+ <dd>
+ <p><span>Parse error</span>. Set the <span>insertion mode</span>
+ to "<span title="insertion mode: in frameset">in frameset</span>" and
+ reprocess the token.</p>
+ </dd>
+
</dl>
@@ -41590,13 +41615,6 @@
<li>
- <p>Switch the <span>HTML parser</span>'s <span>tree
- construction</span> stage to <span>the main phase</span>.
-
- </li>
-
- <li>
-
<p>Let <var title="">root</var> be a new <code>html</code> element
with no attributes.</p>
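
Note on the diff above: the two new "after after body" and "after after frameset" insertion modes replace the old trailing end phase. The following is a minimal sketch of their dispatch rules as they read in this revision, not the specification's algorithm. The names Token, Parser, delegated and WHITESPACE are hypothetical, and tokens that the text says to process under the "in body" or "in frameset" rules are only recorded rather than handled.

```python
from dataclasses import dataclass, field

# Whitespace set as listed in this revision (U+000D is commented out in the source).
WHITESPACE = {"\t", "\n", "\x0b", "\x0c", " "}


@dataclass
class Token:
    """Hypothetical token shape: kind is one of "eof", "comment", "doctype",
    "character", "start_tag", "end_tag"; data holds the comment text,
    the character, or the tag name."""
    kind: str
    data: str = ""


@dataclass
class Parser:
    """Illustrative stand-in for the tree construction stage."""
    mode: str = "after after body"
    document: list = field(default_factory=list)   # children appended to the Document
    delegated: list = field(default_factory=list)  # (rules used, token) pairs
    errors: int = 0
    stopped: bool = False

    def process(self, token: Token) -> None:
        if self.mode in ("after after body", "after after frameset"):
            self._after_after(token)
        else:
            # Assumption: every other insertion mode is out of scope here,
            # so the token is just recorded against the current mode.
            self.delegated.append((self.mode, token))

    def _after_after(self, token: Token) -> None:
        # The two modes differ only in which mode "anything else" falls back to.
        fallback = "in body" if self.mode == "after after body" else "in frameset"
        if token.kind == "eof":
            self.stopped = True  # "Stop parsing"
        elif token.kind == "comment":
            # The Comment node is appended to the Document object itself.
            self.document.append(("comment", token.data))
        elif (
            token.kind == "doctype"
            or (token.kind == "character" and token.data in WHITESPACE)
            or (token.kind == "start_tag" and token.data == "html")
        ):
            # Processed as if the mode had been "in body"; the mode does not change.
            self.delegated.append(("in body", token))
        else:
            # Parse error: set the insertion mode to the fallback and reprocess.
            self.errors += 1
            self.mode = fallback
            self.process(token)
```

In this sketch a stray start tag after "</html>" counts one parse error and leaves the parser in "in body" (or "in frameset"), while whitespace and comments leave the insertion mode untouched. That matches the intent stated in the log message that the merge should change no behaviour.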