[html5] r2139 - [ct] (0) Rearchitect how RCDATA/CDATA blocks work so that they don't involve inv [...]
whatwg at whatwg.org
Tue Sep 2 02:43:08 PDT 2008
Author: ianh
Date: 2008-09-02 02:42:45 -0700 (Tue, 02 Sep 2008)
New Revision: 2139
Modified:
index
source
Log:
[ct] (0) Rearchitect how RCDATA/CDATA blocks work so that they don't involve invoking the tokeniser in a weird way. (credit: w)
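As a rough illustration of the change described in the log, the following Python sketch models the new "in CDATA/RCDATA" insertion mode. The tree builder records the original insertion mode, switches mode, appends character tokens to the current node, and returns to the original mode on an end tag or end-of-file. All names here (Element, TreeBuilder, start_cdata_element, process, the token tuples) are hypothetical and not taken from the spec. The tokeniser's content model flag, script execution, and every other insertion mode are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    text: str = ""
    already_executed: bool = False
    children: list = field(default_factory=list)


class TreeBuilder:
    """Toy model of the tree construction stage (hypothetical names)."""

    def __init__(self):
        self.root = Element("#root")
        self.open_elements = [self.root]
        self.insertion_mode = "in body"
        self.original_insertion_mode = None
        self.parse_errors = 0

    @property
    def current_node(self):
        return self.open_elements[-1]

    def start_cdata_element(self, name):
        # Insert the element, remember where to return, then switch mode.
        element = Element(name)
        self.current_node.children.append(element)
        self.open_elements.append(element)
        self.original_insertion_mode = self.insertion_mode
        self.insertion_mode = "in CDATA/RCDATA"

    def process(self, token):
        kind, data = token
        if self.insertion_mode != "in CDATA/RCDATA":
            return  # other insertion modes are out of scope for this sketch
        if kind == "character":
            # Characters go straight into the current node.
            self.current_node.text += data
        elif kind == "eof":
            # Parse error: the element was never closed.
            self.parse_errors += 1
            if self.current_node.name == "script":
                self.current_node.already_executed = True
            self.open_elements.pop()
            self.insertion_mode = self.original_insertion_mode
            self.process(token)  # reprocess in the restored mode
        elif kind == "end_tag":
            self.open_elements.pop()
            self.insertion_mode = self.original_insertion_mode


# Usage: a closed <title> element, then an unclosed <script> element.
builder = TreeBuilder()
builder.start_cdata_element("title")
for ch in "Hi":
    builder.process(("character", ch))
builder.process(("end_tag", "title"))

unclosed = TreeBuilder()
unclosed.start_cdata_element("script")
unclosed.process(("character", "x"))
unclosed.process(("eof", None))
```

The point of the design is that the tree builder no longer pulls tokens out of the tokeniser itself; it only reacts to tokens as they arrive, which is what makes the save-and-restore of the insertion mode necessary.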
Modified: index
===================================================================
--- index 2008-09-02 07:25:09 UTC (rev 2138)
+++ index 2008-09-02 09:42:45 UTC (rev 2139)
@@ -2071,48 +2071,51 @@
<li><a href="#parsing-main-inbody"><span class=secno>8.2.5.10.
</span>The "in body" insertion mode</a>
- <li><a href="#parsing-main-intable"><span class=secno>8.2.5.11.
+ <li><a href="#parsing-main-incdata"><span class=secno>8.2.5.11.
+ </span>The "in CDATA/RCDATA" insertion mode</a>
+
+ <li><a href="#parsing-main-intable"><span class=secno>8.2.5.12.
</span>The "in table" insertion mode</a>
- <li><a href="#parsing-main-incaption"><span class=secno>8.2.5.12.
+ <li><a href="#parsing-main-incaption"><span class=secno>8.2.5.13.
</span>The "in caption" insertion mode</a>
- <li><a href="#parsing-main-incolgroup"><span class=secno>8.2.5.13.
+ <li><a href="#parsing-main-incolgroup"><span class=secno>8.2.5.14.
</span>The "in column group" insertion mode</a>
- <li><a href="#parsing-main-intbody"><span class=secno>8.2.5.14.
+ <li><a href="#parsing-main-intbody"><span class=secno>8.2.5.15.
</span>The "in table body" insertion mode</a>
- <li><a href="#parsing-main-intr"><span class=secno>8.2.5.15.
+ <li><a href="#parsing-main-intr"><span class=secno>8.2.5.16.
</span>The "in row" insertion mode</a>
- <li><a href="#parsing-main-intd"><span class=secno>8.2.5.16.
+ <li><a href="#parsing-main-intd"><span class=secno>8.2.5.17.
</span>The "in cell" insertion mode</a>
- <li><a href="#parsing-main-inselect"><span class=secno>8.2.5.17.
+ <li><a href="#parsing-main-inselect"><span class=secno>8.2.5.18.
</span>The "in select" insertion mode</a>
<li><a href="#parsing-main-inselectintable"><span
- class=secno>8.2.5.18. </span>The "in select in table" insertion
+ class=secno>8.2.5.19. </span>The "in select in table" insertion
mode</a>
- <li><a href="#parsing-main-inforeign"><span class=secno>8.2.5.19.
+ <li><a href="#parsing-main-inforeign"><span class=secno>8.2.5.20.
</span>The "in foreign content" insertion mode</a>
- <li><a href="#parsing-main-afterbody"><span class=secno>8.2.5.20.
+ <li><a href="#parsing-main-afterbody"><span class=secno>8.2.5.21.
</span>The "after body" insertion mode</a>
- <li><a href="#parsing-main-inframeset"><span class=secno>8.2.5.21.
+ <li><a href="#parsing-main-inframeset"><span class=secno>8.2.5.22.
</span>The "in frameset" insertion mode</a>
<li><a href="#parsing-main-afterframeset"><span
- class=secno>8.2.5.22. </span>The "after frameset" insertion
+ class=secno>8.2.5.23. </span>The "after frameset" insertion
mode</a>
- <li><a href="#the-after0"><span class=secno>8.2.5.23. </span>The
+ <li><a href="#the-after0"><span class=secno>8.2.5.24. </span>The
"after after body" insertion mode</a>
- <li><a href="#the-after1"><span class=secno>8.2.5.24. </span>The
+ <li><a href="#the-after1"><span class=secno>8.2.5.25. </span>The
"after after frameset" insertion mode</a>
</ul>
@@ -26746,9 +26749,25 @@
encoding</var></dfn>. They are determined when the script is run, based on
the attributes on the element at that time.
+ <p>When an <span>XML parser</span> creates a <code><a
+ href="#script1">script</a></code> element, it must be marked as being <a
+ href="#parser-inserted">"parser-inserted"</a>. When the element's end tag
+ is parsed, the user agent must <a href="#running" title="running a
+ script">run</a> the <code><a href="#script1">script</a></code> element.
+
+ <p class=note>Equivalent requirements exist for the <a href="#html-0">HTML
+ parser</a>, but they are detailed in that section instead.
+
+ <p>When a <code><a href="#script1">script</a></code> element that is marked
+ as neither having <a href="#already">"already executed"</a> nor being <a
+ href="#parser-inserted">"parser-inserted"</a> is <span>inserted into a
+ document</span><!-- XXX xref -->, the user agent must <a href="#running"
+ title="running a script">run</a> the <code><a
+ href="#script1">script</a></code> element.
+
<p><dfn id=running title="running a script">Running a script</dfn>: When a
- script block is <span>inserted into a document</span>, the user agent must
- act as follows:
+ <code><a href="#script1">script</a></code> element is to be run, the user
+ agent must act as follows:
<ol>
<li>
@@ -26815,10 +26834,8 @@
or if the user agent does not <a href="#support">support the scripting
language</a> given by <var><a href="#the-scripts">the script's
type</a></var> for this <code><a href="#script1">script</a></code>
- element, or if the <code><a href="#script1">script</a></code> element
- has its <a href="#already">"already executed"</a> flag set, then the
- user agent must abort these steps at this point. The script is not
- executed.</p>
+ element, then the user agent must abort these steps at this point. The
+ script is not executed.</p>
<li>
<p>The user agent must set the element's <a href="#already">"already
@@ -46921,43 +46938,52 @@
href="#in-head0" title="insertion mode: in head noscript">in head
noscript</a>", "<a href="#after9" title="insertion mode: after head">after
head</a>", "<a href="#in-body" title="insertion mode: in body">in
- body</a>", "<a href="#in-table" title="insertion mode: in table">in
- table</a>", "<a href="#in-caption" title="insertion mode: in caption">in
- caption</a>", "<a href="#in-column" title="insertion mode: in column
- group">in column group</a>", "<a href="#in-table0" title="insertion mode:
- in table body">in table body</a>", "<a href="#in-row" title="insertion
- mode: in row">in row</a>", "<a href="#in-cell" title="insertion mode: in
- cell">in cell</a>", "<a href="#in-select" title="insertion mode: in
- select">in select</a>", "<a href="#in-select0" title="insertion mode: in
- select in table">in select in table</a>", "<a href="#in-foreign"
- title="insertion mode: in foreign content">in foreign content</a>", "<a
- href="#after10" title="insertion mode: after body">after body</a>", "<a
- href="#in-frameset" title="insertion mode: in frameset">in frameset</a>",
- "<a href="#after11" title="insertion mode: after frameset">after
- frameset</a>", "<a href="#after12" title="insertion mode: after after
- body">after after body</a>", and "<a href="#after13" title="insertion
- mode: after after frameset">after after frameset</a>" during the course of
- the parsing, as described in the <a href="#tree-construction0">tree
- construction</a> stage. The insertion mode affects how tokens are
- processed and whether CDATA sections are supported.
+ body</a>", "<a href="#in-cdatarcdata" title="insertion mode: in
+ CDATA/RCDATA">in CDATA/RCDATA</a>", "<a href="#in-table" title="insertion
+ mode: in table">in table</a>", "<a href="#in-caption" title="insertion
+ mode: in caption">in caption</a>", "<a href="#in-column" title="insertion
+ mode: in column group">in column group</a>", "<a href="#in-table0"
+ title="insertion mode: in table body">in table body</a>", "<a
+ href="#in-row" title="insertion mode: in row">in row</a>", "<a
+ href="#in-cell" title="insertion mode: in cell">in cell</a>", "<a
+ href="#in-select" title="insertion mode: in select">in select</a>", "<a
+ href="#in-select0" title="insertion mode: in select in table">in select in
+ table</a>", "<a href="#in-foreign" title="insertion mode: in foreign
+ content">in foreign content</a>", "<a href="#after10" title="insertion
+ mode: after body">after body</a>", "<a href="#in-frameset"
+ title="insertion mode: in frameset">in frameset</a>", "<a href="#after11"
+ title="insertion mode: after frameset">after frameset</a>", "<a
+ href="#after12" title="insertion mode: after after body">after after
+ body</a>", and "<a href="#after13" title="insertion mode: after after
+ frameset">after after frameset</a>" during the course of the parsing, as
+ described in the <a href="#tree-construction0">tree construction</a>
+ stage. The insertion mode affects how tokens are processed and whether
+ CDATA sections are supported.
<p>Seven of these modes, namely "<a href="#in-head" title="insertion mode:
in head">in head</a>", "<a href="#in-body" title="insertion mode: in
- body">in body</a>", "<a href="#in-table" title="insertion mode: in
- table">in table</a>", "<a href="#in-table0" title="insertion mode: in
- table body">in table body</a>", "<a href="#in-row" title="insertion mode:
- in row">in row</a>", "<a href="#in-cell" title="insertion mode: in
- cell">in cell</a>", and "<a href="#in-select" title="insertion mode: in
- select">in select</a>", are special, in that the other modes defer to them
- at various times. When the algorithm below says that the user agent is to
- do something "<dfn id=using10>using the rules for</dfn> the <var
- title="">m</var> insertion mode", where <var title="">m</var> is one of
- these modes, the user agent must use the rules described under the <var
- title="">m</var> <span>insertion mode</span>'s section, but must leave the
- <span>insertion mode</span> unchanged unless the rules in <var
- title="">m</var> themselves switch the <span>insertion mode</span> to a
- new value.
+ body">in body</a>", "<a href="#in-cdatarcdata" title="insertion mode: in
+ CDATA/RCDATA">in CDATA/RCDATA</a>", "<a href="#in-table" title="insertion
+ mode: in table">in table</a>", "<a href="#in-table0" title="insertion
+ mode: in table body">in table body</a>", "<a href="#in-row"
+ title="insertion mode: in row">in row</a>", "<a href="#in-cell"
+ title="insertion mode: in cell">in cell</a>", and "<a href="#in-select"
+ title="insertion mode: in select">in select</a>", are special, in that the
+ other modes defer to them at various times. When the algorithm below says
+ that the user agent is to do something "<dfn id=using10>using the rules
+ for</dfn> the <var title="">m</var> insertion mode", where <var
+ title="">m</var> is one of these modes, the user agent must use the rules
+ described under the <var title="">m</var> <span>insertion mode</span>'s
+ section, but must leave the <span>insertion mode</span> unchanged unless
+ the rules in <var title="">m</var> themselves switch the <span>insertion
+ mode</span> to a new value.
+ <p>When the insertion mode is switched to "<a href="#in-cdatarcdata"
+ title="insertion mode: in CDATA/RCDATA">in CDATA/RCDATA</a>", the <dfn
+ id=original>original insertion mode</dfn> is also set. This is the
+ insertion mode to which the tree construction stage will return when the
+ corresponding end tag is parsed.
+
<p>When the insertion mode is switched to "<a href="#in-foreign"
title="insertion mode: in foreign content">in foreign content</a>", the
<dfn id=secondary1>secondary insertion mode</dfn> is also set. This
@@ -46965,6 +46991,8 @@
title="insertion mode: in foreign content">in foreign content</a>" mode to
handle HTML (i.e. not foreign) content.
+ <hr>
+
<p>When the steps below require the UA to <dfn id=reset>reset the insertion
mode appropriately</dfn>, it means the UA must follow these steps:
@@ -49510,13 +49538,9 @@
<ol>
<li>
- <p><a href="#create0">Create an element for the token</a> in the <a
- href="#html-namespace0">HTML namespace</a>.
+ <p><a href="#insert0">Insert an HTML element</a> for the token.
<li>
- <p>Append the new element to the <a href="#current5">current node</a>.
-
- <li>
<p>If the algorithm that was invoked is the <a href="#generic">generic
CDATA element parsing algorithm</a>, switch the tokeniser's <a
href="#content4">content model flag</a> to the CDATA state; otherwise
@@ -49525,23 +49549,13 @@
href="#content4">content model flag</a> to the RCDATA state.
<li>
- <p>Then, collect all the character tokens that the tokeniser returns
- until it returns a token that is not a character token, or until it
- stops tokenizing.
+ <p>Let the <a href="#original">original insertion mode</a> be the current
+ <span>insertion mode</span>.</p>
<li>
- <p>If this process resulted in a collection of character tokens, append a
- single <code>Text</code> node, whose contents is the concatenation of
- all those tokens' characters, to the new element node.
-
- <li>
- <p>The tokeniser's <a href="#content4">content model flag</a> will have
- switched back to the PCDATA state.
-
- <li>
- <p>If the next token is an end tag token with the same tag name as the
- start tag token, ignore it. Otherwise, it's an end-of-file token, and
- this is a <a href="#parse2">parse error</a>.
+ <p>Then, switch the <span>insertion mode</span> to "<a
+ href="#in-cdatarcdata" title="insertion mode: in CDATA/RCDATA">in
+ CDATA/RCDATA</a>".
</ol>
<h5 id=closing1><span class=secno>8.2.5.2. </span>Closing elements that
@@ -50157,120 +50171,45 @@
<dt id=scriptTag>A start tag whose tag name is "script"
<dd>
- <p><a href="#create0">Create an element for the token</a> in the <a
- href="#html-namespace0">HTML namespace</a>.</p>
+ <ol>
+ <li>
+ <p><a href="#create0">Create an element for the token</a> in the <a
+ href="#html-namespace0">HTML namespace</a>.
- <p>Mark the element as being <a
- href="#parser-inserted">"parser-inserted"</a>. This ensures that, if the
- script is external, any <code title=dom-document-write-HTML><a
- href="#document.write...">document.write()</a></code> calls in the
- script will execute in-line, instead of blowing the document away, as
- would happen in most other cases.</p>
+ <li>
+ <p>Mark the element as being <a
+ href="#parser-inserted">"parser-inserted"</a>.</p>
- <p>Switch the tokeniser's <a href="#content4">content model flag</a> to
- the CDATA state.</p>
+ <p class=note>This ensures that, if the script is external, any <code
+ title=dom-document-write-HTML><a
+ href="#document.write...">document.write()</a></code> calls in the
+ script will execute in-line, instead of blowing the document away, as
+ would happen in most other cases. It also prevents the script from
+ executing until the end tag is seen.</p>
- <p>Then, collect all the character tokens that the tokeniser returns
- until it returns a token that is not a character token, or until it
- stops tokenizing.</p>
+ <li>
+ <p>If the parser was originally created for the <a
+ href="#html-fragment0">HTML fragment parsing algorithm</a>, then mark
+ the <code><a href="#script1">script</a></code> element as <a
+ href="#already">"already executed"</a>. (<a href="#fragment">fragment
+ case</a>)
- <p>If this process resulted in a collection of character tokens, append a
- single <code>Text</code> node to the <code><a
- href="#script1">script</a></code> element node whose contents is the
- concatenation of all those tokens' characters.</p>
+ <li>
+ <p>Append the new element to the <a href="#current5">current node</a>.</p>
- <p>The tokeniser's <a href="#content4">content model flag</a> will have
- switched back to the PCDATA state.</p>
+ <li>
+ <p>Switch the tokeniser's <a href="#content4">content model flag</a> to
+ the CDATA state.
- <p>If the next token is not an end tag token with the tag name "script",
- then this is a <a href="#parse2">parse error</a>; mark the <code><a
- href="#script1">script</a></code> element as <a href="#already">"already
- executed"</a>. Otherwise, the token is the <code><a
- href="#script1">script</a></code> element's end tag, so ignore it.</p>
+ <li>
+ <p>Let the <a href="#original">original insertion mode</a> be the
+ current <span>insertion mode</span>.</p>
- <p>If the parser was originally created for the <a
- href="#html-fragment0">HTML fragment parsing algorithm</a>, then mark
- the <code><a href="#script1">script</a></code> element as <a
- href="#already">"already executed"</a>, and skip the rest of the
- processing described for this token (including the part below where
- "<span title="pending external script">pending external scripts</span>"
- are executed). (<a href="#fragment">fragment case</a>)</p>
+ <li>
+ <p>Switch the <span>insertion mode</span> to "<a href="#in-cdatarcdata"
+ title="insertion mode: in CDATA/RCDATA">in CDATA/RCDATA</a>".
+ </ol>
- <p class=note>Marking the <code><a href="#script1">script</a></code>
- element as "already executed" prevents it from executing when it is
- inserted into the document a few paragraphs below. Thus, scripts missing
- their end tags and scripts that were inserted using <code
- title=dom-innerHTML-HTML><a href="#innerhtml0">innerHTML</a></code>,
- <code title=dom-outerHTML-HTML><a
- href="#outerhtml0">outerHTML</a></code>, or <code
- title=dom-insertAdjacentHTML-HTML><a
- href="#insertadjacenthtml0">insertAdjacentHTML()</a></code> aren't
- executed.</p>
-
- <p>Let the <var title="">old insertion point</var> have the same value as
- the current <a href="#insertion">insertion point</a>. Let the <a
- href="#insertion">insertion point</a> be just before the <a
- href="#next-input">next input character</a>.</p>
-
- <p>Append the new element to the <a href="#current5">current node</a>. <a
- href="#running" title="running a script">Special processing occurs when
- a <code>script</code> element is inserted into a document</a> that might
- cause some script to execute, which might cause <a
- href="#document.write..." title=dom-document-write-HTML>new characters
- to be inserted into the tokeniser</a>.</p>
-
- <p>Let the <a href="#insertion">insertion point</a> have the value of the
- <var title="">old insertion point</var>. (In other words, restore the <a
- href="#insertion">insertion point</a> to the value it had before the
- previous paragraph. This value might be the "undefined" value.)</p>
-
- <p id=scriptTagParserResumes>At this stage, if there is a <span>pending
- external script</span>, then:</p>
-
- <dl class=switch>
- <dt>If the tree construction stage is <a href="#nestedParsing">being
- called reentrantly</a>, say from a call to <code
- title=dom-document-write-HTML><a
- href="#document.write...">document.write()</a></code>:
-
- <dd>
- <p>Abort the processing of any nested invocations of the tokeniser,
- yielding control back to the caller. (Tokenization will resume when
- the caller returns to the "outer" tree construction stage.)
-
- <dt>Otherwise:
-
- <dd>
- <p>Follow these steps:</p>
-
- <ol>
- <li>
- <p>Let <var title="">the script</var> be the <span>pending external
- script</span>. There is no longer a <span>pending external
- script</span>.
-
- <li>
- <p><a href="#pause">Pause</a> until the script has <a
- href="#completed">completed loading</a>.
-
- <li>
- <p>Let the <a href="#insertion">insertion point</a> be just before
- the <a href="#next-input">next input character</a>.
-
- <li>
- <p><a href="#executing0" title="executing a script block">Execute the
- script</a>.
-
- <li>
- <p>Let the <a href="#insertion">insertion point</a> be undefined
- again.
-
- <li>
- <p>If there is once again a <span>pending external script</span>,
- then repeat these steps from step 1.
- </ol>
- </dl>
-
<dt>An end tag whose tag name is "head"
<dd>
@@ -51625,7 +51564,127 @@
</ol>
</dl>
- <h5 id=parsing-main-intable><span class=secno>8.2.5.11. </span>The "<dfn
+ <h5 id=parsing-main-incdata><span class=secno>8.2.5.11. </span>The "<dfn
+ id=in-cdatarcdata title="insertion mode: in CDATA/RCDATA">in
+ CDATA/RCDATA</dfn>" insertion mode</h5>
+
+ <p>When the <span>insertion mode</span> is "<a href="#in-cdatarcdata"
+ title="insertion mode: in CDATA/RCDATA">in CDATA/RCDATA</a>", tokens must
+ be handled as follows:
+
+ <dl class=switch>
+ <dt>A character token
+
+ <dd>
+ <p><a href="#insert" title="insert a character">Insert the token's
+ character</a> into the <a href="#current5">current node</a>.</p>
+
+ <dt>An end-of-file token
+
+ <dd> <!-- can't be the fragment case -->
+ <p><a href="#parse2">Parse error</a>.</p>
+
+ <p>If the <a href="#current5">current node</a> is a <code><a
+ href="#script1">script</a></code> element, mark the <code><a
+ href="#script1">script</a></code> element as <a href="#already">"already
+ executed"</a>.</p>
+
+ <p>Pop the <a href="#current5">current node</a> off the <a
+ href="#stack">stack of open elements</a>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <a
+ href="#original">original insertion mode</a> and reprocess the current
+ token.</p>
+
+ <dt>An end tag whose tag name is "script"
+
+ <dd>
+ <p>Let <var title="">script</var> be the <a href="#current5">current
+ node</a> (which will be a <code><a href="#script1">script</a></code>
+ element).</p>
+
+ <p>Pop the <a href="#current5">current node</a> off the <a
+ href="#stack">stack of open elements</a>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <a
+ href="#original">original insertion mode</a>.</p>
+
+ <p>Let the <var title="">old insertion point</var> have the same value as
+ the current <a href="#insertion">insertion point</a>. Let the <a
+ href="#insertion">insertion point</a> be just before the <a
+ href="#next-input">next input character</a>.</p>
+
+ <p><a href="#running" title="running a script">Run</a> the <var
+ title="">script</var>. This might cause some script to execute, which
+ might cause <a href="#document.write..."
+ title=dom-document-write-HTML>new characters to be inserted into the
+ tokeniser</a>, and might cause the tokeniser to output more tokens,
+ resulting in a <a href="#nestedParsing">reentrant invocation of the
+ parser</a>.</p>
+
+ <p>Let the <a href="#insertion">insertion point</a> have the value of the
+ <var title="">old insertion point</var>. (In other words, restore the <a
+ href="#insertion">insertion point</a> to the value it had before the
+ previous paragraph. This value might be the "undefined" value.)</p>
+
+ <p id=scriptTagParserResumes>At this stage, if there is a <span>pending
+ external script</span>, then:</p>
+
+ <dl class=switch>
+ <dt>If the tree construction stage is <a href="#nestedParsing">being
+ called reentrantly</a>, say from a call to <code
+ title=dom-document-write-HTML><a
+ href="#document.write...">document.write()</a></code>:
+
+ <dd>
+ <p>Abort the processing of any nested invocations of the tokeniser,
+ yielding control back to the caller. (Tokenization will resume when
+ the caller returns to the "outer" tree construction stage.)
+
+ <dt>Otherwise:
+
+ <dd>
+ <p>Follow these steps:</p>
+
+ <ol>
+ <li>
+ <p>Let <var title="">the script</var> be the <span>pending external
+ script</span>. There is no longer a <span>pending external
+ script</span>.
+
+ <li>
+ <p><a href="#pause">Pause</a> until the script has <a
+ href="#completed">completed loading</a>.
+
+ <li>
+ <p>Let the <a href="#insertion">insertion point</a> be just before
+ the <a href="#next-input">next input character</a>.
+
+ <li>
+ <p><a href="#executing0" title="executing a script block">Execute the
+ script</a>.
+
+ <li>
+ <p>Let the <a href="#insertion">insertion point</a> be undefined
+ again.
+
+ <li>
+ <p>If there is once again a <span>pending external script</span>,
+ then repeat these steps from step 1.
+ </ol>
+ </dl>
+
+ <dt>Any other end tag
+
+ <dd>
+ <p>Pop the <a href="#current5">current node</a> off the <a
+ href="#stack">stack of open elements</a>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <a
+ href="#original">original insertion mode</a>.</p>
+ </dl>
+
+ <h5 id=parsing-main-intable><span class=secno>8.2.5.12. </span>The "<dfn
id=in-table title="insertion mode: in table">in table</dfn>" insertion
mode</h5>
@@ -51810,7 +51869,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-incaption><span class=secno>8.2.5.12. </span>The "<dfn
+ <h5 id=parsing-main-incaption><span class=secno>8.2.5.13. </span>The "<dfn
id=in-caption title="insertion mode: in caption">in caption</dfn>"
insertion mode</h5>
@@ -51873,7 +51932,7 @@
<span>insertion mode</span>.</p>
</dl>
- <h5 id=parsing-main-incolgroup><span class=secno>8.2.5.13. </span>The "<dfn
+ <h5 id=parsing-main-incolgroup><span class=secno>8.2.5.14. </span>The "<dfn
id=in-column title="insertion mode: in column group">in column
group</dfn>" insertion mode</h5>
@@ -51958,7 +52017,7 @@
href="#fragment">fragment case</a>.</p>
</dl>
- <h5 id=parsing-main-intbody><span class=secno>8.2.5.14. </span>The "<dfn
+ <h5 id=parsing-main-intbody><span class=secno>8.2.5.15. </span>The "<dfn
id=in-table0 title="insertion mode: in table body">in table body</dfn>"
insertion mode</h5>
@@ -52048,7 +52107,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-intr><span class=secno>8.2.5.15. </span>The "<dfn
+ <h5 id=parsing-main-intr><span class=secno>8.2.5.16. </span>The "<dfn
id=in-row title="insertion mode: in row">in row</dfn>" insertion mode</h5>
<p>When the <span>insertion mode</span> is "<a href="#in-row"
@@ -52137,7 +52196,7 @@
href="#html">html</a></code> element after this process is a <a
href="#fragment">fragment case</a>.
- <h5 id=parsing-main-intd><span class=secno>8.2.5.16. </span>The "<dfn
+ <h5 id=parsing-main-intd><span class=secno>8.2.5.17. </span>The "<dfn
id=in-cell title="insertion mode: in cell">in cell</dfn>" insertion mode</h5>
<p>When the <span>insertion mode</span> is "<a href="#in-cell"
@@ -52238,7 +52297,7 @@
neither when the <span>insertion mode</span> is "<a href="#in-cell"
title="insertion mode: in cell">in cell</a>".
- <h5 id=parsing-main-inselect><span class=secno>8.2.5.17. </span>The "<dfn
+ <h5 id=parsing-main-inselect><span class=secno>8.2.5.18. </span>The "<dfn
id=in-select title="insertion mode: in select">in select</dfn>" insertion
mode</h5>
@@ -52360,7 +52419,7 @@
<p><a href="#parse2">Parse error</a>. Ignore the token.</p>
</dl>
- <h5 id=parsing-main-inselectintable><span class=secno>8.2.5.18. </span>The
+ <h5 id=parsing-main-inselectintable><span class=secno>8.2.5.19. </span>The
"<dfn id=in-select0 title="insertion mode: in select in table">in select
in table</dfn>" insertion mode</h5>
@@ -52396,7 +52455,7 @@
<span>insertion mode</span>.</p>
</dl>
- <h5 id=parsing-main-inforeign><span class=secno>8.2.5.19. </span>The "<dfn
+ <h5 id=parsing-main-inforeign><span class=secno>8.2.5.20. </span>The "<dfn
id=in-foreign title="insertion mode: in foreign content">in foreign
content</dfn>" insertion mode</h5>
@@ -52578,7 +52637,7 @@
flag">acknowledge the token's <i>self-closing flag</i></a>.</p>
</dl>
- <h5 id=parsing-main-afterbody><span class=secno>8.2.5.20. </span>The "<dfn
+ <h5 id=parsing-main-afterbody><span class=secno>8.2.5.21. </span>The "<dfn
id=after10 title="insertion mode: after body">after body</dfn>" insertion
mode</h5>
@@ -52642,7 +52701,7 @@
body</a>" and reprocess the token.</p>
</dl>
- <h5 id=parsing-main-inframeset><span class=secno>8.2.5.21. </span>The "<dfn
+ <h5 id=parsing-main-inframeset><span class=secno>8.2.5.22. </span>The "<dfn
id=in-frameset title="insertion mode: in frameset">in frameset</dfn>"
insertion mode</h5>
@@ -52737,7 +52796,7 @@
<p><a href="#parse2">Parse error</a>. Ignore the token.</p>
</dl>
- <h5 id=parsing-main-afterframeset><span class=secno>8.2.5.22. </span>The
+ <h5 id=parsing-main-afterframeset><span class=secno>8.2.5.23. </span>The
"<dfn id=after11 title="insertion mode: after frameset">after
frameset</dfn>" insertion mode</h5>
@@ -52802,7 +52861,7 @@
that do support frames but want to show the NOFRAMES content. Supporting
the former is easy; supporting the latter is harder.
- <h5 id=the-after0><span class=secno>8.2.5.23. </span>The "<dfn id=after12
+ <h5 id=the-after0><span class=secno>8.2.5.24. </span>The "<dfn id=after12
title="insertion mode: after after body">after after body</dfn>" insertion
mode</h5>
@@ -52844,7 +52903,7 @@
body</a>" and reprocess the token.</p>
</dl>
- <h5 id=the-after1><span class=secno>8.2.5.24. </span>The "<dfn id=after13
+ <h5 id=the-after1><span class=secno>8.2.5.25. </span>The "<dfn id=after13
title="insertion mode: after after frameset">after after frameset</dfn>"
insertion mode</h5>
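The script-related hunks in the index diff above separate "a script element is to be run" from "a script element is inserted into a document". A minimal sketch of the insertion rule, assuming the two markers can be modelled as plain boolean flags (ScriptElement and should_run_on_insertion are hypothetical names, not from the spec):

```python
from dataclasses import dataclass


@dataclass
class ScriptElement:
    already_executed: bool = False
    parser_inserted: bool = False


def should_run_on_insertion(script: ScriptElement) -> bool:
    # A script runs on insertion only if it carries neither marker.
    return not script.already_executed and not script.parser_inserted


dom_created = ScriptElement()
from_parser = ScriptElement(parser_inserted=True)
from_fragment = ScriptElement(already_executed=True, parser_inserted=True)
```

Under this reading, a parser-created script is not run at insertion time; the parser runs it later, when the end tag is seen.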
Modified: source
===================================================================
--- source 2008-09-02 07:25:09 UTC (rev 2138)
+++ source 2008-09-02 09:42:45 UTC (rev 2139)
@@ -24107,9 +24107,25 @@
encoding</var></dfn>. They are determined when the script is run,
based on the attributes on the element at that time.</p>
+ <p>When an <span>XML parser</span> creates a <code>script</code>
+ element, it must be marked as being
+ <span>"parser-inserted"</span>. When the element's end tag is
+ parsed, the user agent must <span title="running a
+ script">run</span> the <code>script</code> element.</p>
+
+ <p class="note">Equivalent requirements exist for the <span>HTML
+ parser</span>, but they are detailed in that section instead.</p>
+
+ <p>When a <code>script</code> element that is marked as neither
+ having <span>"already executed"</span> nor being
+ <span>"parser-inserted"</span> is <span>inserted into a
+ document</span><!-- XXX xref -->, the user agent must <span
+ title="running a script">run</span> the <code>script</code>
+ element.</p>
+
<p><dfn title="running a script">Running a script</dfn>: When a
- script block is <span>inserted into a document</span>, the user
- agent must act as follows:</p>
+ <code>script</code> element is to be run, the user agent must act as
+ follows:</p>
<ol>
@@ -24179,10 +24195,8 @@
no need to worry about the HTML case, as the HTML parser handles
that for us -->, or if the user agent does not <span>support the
scripting language</span> given by <var>the script's type</var>
- for this <code>script</code> element, or if the
- <code>script</code> element has its <span>"already
- executed"</span> flag set, then the user agent must abort these
- steps at this point. The script is not executed.</p>
+ for this <code>script</code> element, then the user agent must
+ abort these steps at this point. The script is not executed.</p>
</li>
@@ -44313,7 +44327,8 @@
title="insertion mode: in head noscript">in head noscript</span>",
"<span title="insertion mode: after head">after head</span>", "<span
title="insertion mode: in body">in body</span>", "<span
- title="insertion mode: in table">in table</span>", "<span
+ title="insertion mode: in CDATA/RCDATA">in CDATA/RCDATA</span>",
+ "<span title="insertion mode: in table">in table</span>", "<span
title="insertion mode: in caption">in caption</span>", "<span
title="insertion mode: in column group">in column group</span>",
"<span title="insertion mode: in table body">in table body</span>",
@@ -44335,7 +44350,8 @@
<p>Seven of these modes, namely "<span title="insertion mode: in
head">in head</span>", "<span title="insertion mode: in body">in
- body</span>", "<span title="insertion mode: in table">in
+ body</span>", "<span title="insertion mode: in CDATA/RCDATA">in
+ CDATA/RCDATA</span>", "<span title="insertion mode: in table">in
table</span>", "<span title="insertion mode: in table body">in table
body</span>", "<span title="insertion mode: in row">in row</span>",
"<span title="insertion mode: in cell">in cell</span>", and "<span
@@ -44351,12 +44367,19 @@
to a new value.</p>
<p>When the insertion mode is switched to "<span title="insertion
+ mode: in CDATA/RCDATA">in CDATA/RCDATA</span>", the <dfn>original
+ insertion mode</dfn> is also set. This is the insertion mode to
+ which the tree construction stage will return when the corresponding
+ end tag is parsed.</p>
+
+ <p>When the insertion mode is switched to "<span title="insertion
mode: in foreign content">in foreign content</span>", the
<dfn>secondary insertion mode</dfn> is also set. This secondary mode
is used within the rules for the "<span title="insertion mode: in
foreign content">in foreign content</span>" mode to handle HTML
(i.e. not foreign) content.</p>
+ <hr>
<p>When the steps below require the UA to <dfn>reset the insertion
mode appropriately</dfn>, it means the UA must follow these
@@ -46466,12 +46489,8 @@
<ol>
- <li><p><span>Create an element for the token</span> in the
- <span>HTML namespace</span>.</p></li>
+ <li><p><span>Insert an HTML element</span> for the token.</p></li>
- <li><p>Append the new element to the <span>current
- node</span>.</p></li>
-
<li><p>If the algorithm that was invoked is the <span>generic CDATA
element parsing algorithm</span>, switch the tokeniser's
<span>content model flag</span> to the CDATA state; otherwise the
@@ -46479,22 +46498,13 @@
algorithm</span>, switch the tokeniser's <span>content model
flag</span> to the RCDATA state.</p></li>
- <li><p>Then, collect all the character tokens that the tokeniser
- returns until it returns a token that is not a character token, or
- until it stops tokenizing.</p></li>
+ <li><p>Let the <span>original insertion mode</span> be the current
+ <span>insertion mode</span>.</p>
- <li><p>If this process resulted in a collection of character
- tokens, append a single <code>Text</code> node, whose contents is
- the concatenation of all those tokens' characters, to the new
- element node.</p></li>
+ <li><p>Then, switch the <span>insertion mode</span> to "<span
+ title="insertion mode: in CDATA/RCDATA">in
+ CDATA/RCDATA</span>".</p></li>
- <li><p>The tokeniser's <span>content model flag</span> will have
- switched back to the PCDATA state.</p></li>
-
- <li><p>If the next token is an end tag token with the same tag name
- as the start tag token, ignore it. Otherwise, it's an end-of-file
- token, and this is a <span>parse error</span>.</p></li>
-
</ol>
@@ -46985,120 +46995,42 @@
<dt id="scriptTag">A start tag whose tag name is "script"</dt>
<dd>
- <p><span>Create an element for the token</span> in the <span>HTML
- namespace</span>.</p>
+ <ol>
- <p>Mark the element as being
- <span>"parser-inserted"</span>. This ensures that, if the
- script is external, any <code
- title="dom-document-write-HTML">document.write()</code> calls
- in the script will execute in-line, instead of blowing the
- document away, as would happen in most other cases.</p>
+ <li><p><span>Create an element for the token</span> in the
+ <span>HTML namespace</span>.</p></li>
- <p>Switch the tokeniser's <span>content model flag</span> to
- the CDATA state.</p>
+ <li>
- <p>Then, collect all the character tokens that the tokeniser
- returns until it returns a token that is not a character
- token, or until it stops tokenizing.</p>
+ <p>Mark the element as being <span>"parser-inserted"</span>.</p>
- <p>If this process resulted in a collection of character
- tokens, append a single <code>Text</code> node to the
- <code>script</code> element node whose contents is the
- concatenation of all those tokens' characters.</p>
+ <p class="note">This ensures that, if the script is external, any
+ <code title="dom-document-write-HTML">document.write()</code>
+ calls in the script will execute in-line, instead of blowing the
+ document away, as would happen in most other cases. It also
+ prevents the script from executing until the end tag is seen.</p>
- <p>The tokeniser's <span>content model flag</span> will have
- switched back to the PCDATA state.</p>
+ </li>
- <p>If the next token is not an end tag token with the tag name
- "script", then this is a <span>parse error</span>; mark the
- <code>script</code> element as <span>"already
- executed"</span>. Otherwise, the token is the
- <code>script</code> element's end tag, so ignore it.</p>
+ <li><p>If the parser was originally created for the <span>HTML
+ fragment parsing algorithm</span>, then mark the
+ <code>script</code> element as <span>"already
+ executed"</span>. (<span>fragment case</span>)</p></li>
- <p>If the parser was originally created for the <span>HTML
- fragment parsing algorithm</span>, then mark the
- <code>script</code> element as <span>"already executed"</span>,
- and skip the rest of the processing described for this token
- (including the part below where "<span title="pending external
- script">pending external scripts</span>" are
- executed). (<span>fragment case</span>)</p>
+ <li><p>Append the new element to the <span>current node</span>.</p>
- <p class="note">Marking the <code>script</code> element as
- "already executed" prevents it from executing when it is inserted
- into the document a few paragraphs below. Thus, scripts missing
- their end tags and scripts that were inserted using <code
- title="dom-innerHTML-HTML">innerHTML</code>, <code
- title="dom-outerHTML-HTML">outerHTML</code>, or <code
- title="dom-insertAdjacentHTML-HTML">insertAdjacentHTML()</code>
- aren't executed.</p>
+ <li><p>Switch the tokeniser's <span>content model flag</span> to
+ the CDATA state.</p></li>
- <p>Let the <var title="">old insertion point</var> have the
- same value as the current <span>insertion point</span>. Let
- the <span>insertion point</span> be just before the <span>next
- input character</span>.</p>
+ <li><p>Let the <span>original insertion mode</span> be the current
+ <span>insertion mode</span>.</p>
- <p>Append the new element to the <span>current node</span>.
- <span title="running a script">Special processing occurs when
- a <code>script</code> element is inserted into a
- document</span> that might cause some script to execute, which
- might cause <span title="dom-document-write-HTML">new
- characters to be inserted into the tokeniser</span>.</p>
+ <li><p>Switch the <span>insertion mode</span> to "<span
+ title="insertion mode: in CDATA/RCDATA">in
+ CDATA/RCDATA</span>".</p></li>
- <p>Let the <span>insertion point</span> have the value of the
- <var title="">old insertion point</var>. (In other words,
- restore the <span>insertion point</span> to the value it had
- before the previous paragraph. This value might be the
- "undefined" value.)</p>
+ </ol>
- <p id="scriptTagParserResumes">At this stage, if there is a
- <span>pending external script</span>, then:</p>
-
- <dl class="switch">
-
- <dt>If the tree construction stage is <a
- href="#nestedParsing">being called reentrantly</a>, say from
- a call to <code
- title="dom-document-write-HTML">document.write()</code>:</dt>
-
- <dd><p>Abort the processing of any nested invocations of the
- tokeniser, yielding control back to the caller. (Tokenization
- will resume when the caller returns to the "outer" tree
- construction stage.)</p></dd>
-
- <dt>Otherwise:</dt>
-
- <dd>
-
- <p>Follow these steps:</p>
-
- <ol>
-
- <li><p>Let <var title="">the script</var> be the <span>pending
- external script</span>. There is no longer a <span>pending
- external script</span>.</p></li>
-
- <li><p><span>Pause</span> until the script has <span>completed
- loading</span>.</p></li>
-
- <li><p>Let the <span>insertion point</span> be just before the
- <span>next input character</span>.</p></li>
-
- <li><p><span title="executing a script block">Execute the
- script</span>.</p></li>
-
- <li><p>Let the <span>insertion point</span> be undefined
- again.</p></li>
-
- <li><p>If there is once again a <span>pending external
- script</span>, then repeat these steps from step 1.</p></li>
-
- </ol>
-
- </dd>
-
- </dl>
-
</dd>
<dt>An end tag whose tag name is "head"</dt>
@@ -48536,6 +48468,136 @@
</dl>
+
+ <h5 id="parsing-main-incdata">The "<dfn title="insertion mode: in CDATA/RCDATA">in CDATA/RCDATA</dfn>" insertion mode</h5>
+
+ <p>When the <span>insertion mode</span> is "<span title="insertion
+ mode: in CDATA/RCDATA">in CDATA/RCDATA</span>", tokens must be
+ handled as follows:</p>
+
+ <dl class="switch">
+
+ <dt>A character token</dt>
+ <dd>
+
+ <p><span title="insert a character">Insert the token's
+ character</span> into the <span>current node</span>.</p>
+
+ </dd>
+
+ <dt>An end-of-file token</dt>
+ <dd>
+
+ <!-- can't be the fragment case -->
+ <p><span>Parse error</span>.</p>
+
+ <p>If the <span>current node</span> is a <code>script</code>
+ element, mark the <code>script</code> element as <span>"already
+ executed"</span>.</p>
+
+ <p>Pop the <span>current node</span> off the <span>stack of open
+ elements</span>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <span>original
+ insertion mode</span> and reprocess the current token.</p>
+
+ </dd>
+
+ <dt>An end tag whose tag name is "script"</dt>
+ <dd>
+
+ <p>Let <var title="">script</var> be the <span>current node</span>
+ (which will be a <code>script</code> element).</p>
+
+ <p>Pop the <span>current node</span> off the <span>stack of open
+ elements</span>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <span>original
+ insertion mode</span>.</p>
+
+ <p>Let the <var title="">old insertion point</var> have the
+ same value as the current <span>insertion point</span>. Let
+ the <span>insertion point</span> be just before the <span>next
+ input character</span>.</p>
+
+ <p><span title="running a script">Run</span> the <var
+ title="">script</var>. This might cause some script to execute,
+ which might cause <span title="dom-document-write-HTML">new
+ characters to be inserted into the tokeniser</span>, and might
+ cause the tokeniser to output more tokens, resulting in a <a
+ href="#nestedParsing">reentrant invocation of the parser</a>.</p>
+
+ <p>Let the <span>insertion point</span> have the value of the
+ <var title="">old insertion point</var>. (In other words,
+ restore the <span>insertion point</span> to the value it had
+ before the previous paragraph. This value might be the
+ "undefined" value.)</p>
+
+ <p id="scriptTagParserResumes">At this stage, if there is a
+ <span>pending external script</span>, then:</p>
+
+ <dl class="switch">
+
+ <dt>If the tree construction stage is <a
+ href="#nestedParsing">being called reentrantly</a>, say from a
+ call to <code
+ title="dom-document-write-HTML">document.write()</code>:</dt>
+
+ <dd><p>Abort the processing of any nested invocations of the
+ tokeniser, yielding control back to the caller. (Tokenization
+ will resume when the caller returns to the "outer" tree
+ construction stage.)</p></dd>
+
+
+ <dt>Otherwise:</dt>
+
+ <dd>
+
+ <p>Follow these steps:</p>
+
+ <ol>
+
+ <li><p>Let <var title="">the script</var> be the <span>pending
+ external script</span>. There is no longer a <span>pending
+ external script</span>.</p></li>
+
+ <li><p><span>Pause</span> until the script has <span>completed
+ loading</span>.</p></li>
+
+ <li><p>Let the <span>insertion point</span> be just before the
+ <span>next input character</span>.</p></li>
+
+ <li><p><span title="executing a script block">Execute the
+ script</span>.</p></li>
+
+ <li><p>Let the <span>insertion point</span> be undefined
+ again.</p></li>
+
+ <li><p>If there is once again a <span>pending external
+ script</span>, then repeat these steps from step 1.</p></li>
+
+ </ol>
+
+ </dd>
+
+ </dl>
+
+ </dd>
+
+ <dt>Any other end tag</dt>
+ <dd>
+
+ <p>Pop the <span>current node</span> off the <span>stack of open
+ elements</span>.</p>
+
+ <p>Switch the <span>insertion mode</span> to the <span>original
+ insertion mode</span>.</p>
+
+ </dd>
+
+ </dl>
+
+
<h5 id="parsing-main-intable">The "<dfn title="insertion mode: in table">in table</dfn>" insertion mode</h5>
<p>When the <span>insertion mode</span> is "<span title="insertion
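
Reader's note (not part of the commit): the new "in CDATA/RCDATA" insertion mode added above can be read as a small state machine. The following is only a rough sketch of that reading, under several assumptions: tokens are modelled as hypothetical (kind, data) tuples, every name here (TreeBuilder, Element, passed_on, ran_scripts) is invented for illustration, "running" a script is reduced to recording its text, the separate steps for the "script" start tag (such as the "parser-inserted" marking) are not modelled, and the tokeniser is not modelled at all, so the content model flag is only recorded and never switched back.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    name: str
    text: str = ""
    already_executed: bool = False
    children: list = field(default_factory=list)


class TreeBuilder:
    """Hypothetical, heavily simplified tree construction stage."""

    def __init__(self):
        self.root = Element("#root")
        self.open_elements = [self.root]   # stack of open elements
        self.insertion_mode = "in body"
        self.original_insertion_mode = None
        self.content_model = "PCDATA"      # recorded only; no tokeniser here
        self.parse_errors = 0
        self.ran_scripts = []              # stand-in for "run the script"
        self.passed_on = []                # tokens left to modes not modelled

    @property
    def current_node(self):
        return self.open_elements[-1]

    def start_cdata_or_rcdata(self, name, content_model):
        # Insert an HTML element for the token (append and push).
        element = Element(name)
        self.current_node.children.append(element)
        self.open_elements.append(element)
        # Switch the content model flag to CDATA or RCDATA.
        self.content_model = content_model
        # Remember where to return, then enter the new mode.
        self.original_insertion_mode = self.insertion_mode
        self.insertion_mode = "in CDATA/RCDATA"

    def process(self, token):
        kind, data = token
        if self.insertion_mode != "in CDATA/RCDATA":
            self.passed_on.append(token)   # other modes are out of scope
            return
        if kind == "char":
            # Insert the token's character into the current node.
            self.current_node.text += data
        elif kind == "eof":
            self.parse_errors += 1
            if self.current_node.name == "script":
                self.current_node.already_executed = True
            self.open_elements.pop()
            self.insertion_mode = self.original_insertion_mode
            self.process(token)            # reprocess in the original mode
        elif kind == "end":
            node = self.open_elements.pop()
            self.insertion_mode = self.original_insertion_mode
            if data == "script" and not node.already_executed:
                self.ran_scripts.append(node.text)
        else:
            raise ValueError("token kind not covered by this sketch")


tb = TreeBuilder()
tb.start_cdata_or_rcdata("title", "RCDATA")
for ch in "Hi":
    tb.process(("char", ch))
tb.process(("end", "title"))
```

After the end tag, the builder is back in the mode it started from and the title element holds the collected text. The point of the rearchitecture, as the log message puts it, is that the tree construction stage no longer pulls tokens out of the tokeniser itself; it simply handles whatever token arrives next in the new mode.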