[html5] r931 - /
whatwg at whatwg.org
Tue Jun 19 16:49:50 PDT 2007
Author: ianh
Date: 2007-06-19 16:49:49 -0700 (Tue, 19 Jun 2007)
New Revision: 931
Modified:
index
source
Log:
[e] (0) Abstract out the innerHTML serialisation algorithm.
Modified: index
===================================================================
--- index 2007-06-19 23:27:35 UTC (rev 930)
+++ index 2007-06-19 23:49:49 UTC (rev 931)
@@ -1518,7 +1518,10 @@
<li><a href="#namespaces"><span class=secno>8.3. </span>Namespaces</a>
- <li><a href="#entities"><span class=secno>8.4. </span>Entities</a>
+ <li><a href="#serialising"><span class=secno>8.4. </span>Serialising
+ HTML fragments</a>
+
+ <li><a href="#entities"><span class=secno>8.5. </span>Entities</a>
</ul>
<li><a href="#wysiwyg"><span class=secno>9. </span>WYSIWYG editors</a>
@@ -3710,166 +3713,9 @@
<p>On getting, the <code title=dom-innerHTML-HTML><a
href="#innerhtml0">innerHTML</a></code> DOM attribute must return the
- result of running the following algorithm:
+ result of running the <a href="#html-fragment">HTML fragment serialisation
+ algorithm</a> on the node.
- <ol>
- <li>
- <p>Let <var title="">s</var> be a string, and initialise it to the empty
- string.
-
- <li>
- <p>For each child node <var title="">child</var>, in <a
- href="#tree-order">tree order</a>, append the appropriate string from
- the following list to <var title="">s</var>:</p>
-
- <dl class=switch>
- <dt>If the child node is an <code title="">Element</code>
-
- <dd>
- <p>Append a U+003C LESS-THAN SIGN (<code title=""><</code>)
- character, followed by the element's tag name. (For nodes created by
- the <a href="#html-0">HTML parser</a>, <code
- title="">Document.createElement()</code>, or <code
- title="">Document.renameNode()</code>, the tag name will be
- lowercase.)</p>
-
- <p>For each attribute that the element has, append a U+0020 SPACE
- character, the attribute's name (which, for attributes set by the <a
- href="#html-0">HTML parser</a> or by <code
- title="">Element.setAttributeNode()</code> or <code
- title="">Element.setAttribute()</code>, will be lowercase), a U+003D
- EQUALS SIGN (<code title="">=</code>) character, a U+0022 QUOTATION
- MARK (<code title="">"</code>) character, the attribute's value,
- <a href="#escapingString" title="escaping a string">escaped as
- described below</a>, and a second U+0022 QUOTATION MARK (<code
- title="">"</code>) character.</p>
-
- <p>While the exact order of attributes is UA-defined, and may depend on
- factors such as the order that the attributes were given in the
- original markup, the sort order must be stable, such that consecutive
- calls to <code title=dom-innerHTML-HTML><a
- href="#innerhtml0">innerHTML</a></code> serialise an element's
- attributes in the same order.</p>
-
- <p>Append a U+003E GREATER-THAN SIGN (<code title="">></code>)
- character.</p>
-
- <p>If the child node is an <code><a href="#area">area</a></code>,
- <code><a href="#base">base</a></code>, <code>basefont</code>,
- <code>bgsound</code>, <code><a href="#br">br</a></code>, <code><a
- href="#col">col</a></code>, <code><a href="#embed">embed</a></code>,
- <code>frame</code>, <code><a href="#hr">hr</a></code>, <code><a
- href="#img">img</a></code>, <code>input</code>, <code><a
- href="#link">link</a></code>, <code><a href="#meta0">meta</a></code>,
- <code><a href="#param">param</a></code>, <code>spacer</code>, or
- <code>wbr</code> element, then continue on to the next child node at
- this point.</p>
- <!-- also, i guess:
- image, isindex, and keygen, but we don't list those because we
- don't consider those "elements", more "macros", and thus we
- should never serialise them -->
- <!-- XXX when we get around to
- it, add event-source -->
- <p>If the child node is a <code><a href="#pre">pre</a></code> or
- <code>textarea</code> element, append a U+000A LINE FEED (LF)
- character.</p>
-
- <p>Append the value of the <var title="">child</var> element's <code
- title=dom-innerHTML-HTML><a href="#innerhtml0">innerHTML</a></code>
- DOM attribute (thus recursing into this algorithm for that element),
- followed by a U+003C LESS-THAN SIGN (<code title=""><</code>)
- character, a U+002F SOLIDUS (<code title="">/</code>) character, the
- element's tag name again, and finally a U+003E GREATER-THAN SIGN
- (<code title="">></code>) character.</p>
-
- <dt>If the child node is a <code title="">Text</code> or <code
- title="">CDATASection</code> node
-
- <dd>
- <p>If one of the ancestors of the child node is a <code><a
- href="#style">style</a></code>, <code><a
- href="#script0">script</a></code>, <code>xmp</code>, <code><a
- href="#iframe">iframe</a></code>, <code>noembed</code>,
- <code>noframes</code>, or <code><a
- href="#noscript">noscript</a></code> element, then append the value of
- the <var title="">child</var> node's <code title="">data</code> DOM
- attribute literally.</p>
- <!-- note about noscript: because this is defining an API, it
- can assume that scripting is enabled, and that thus the
- <noscript> element in the DOM will have been parsed in the
- scripting-enabled mode, and that thus the text node is raw
- markup -->
-
- <p>Otherwise, append the value of the <var title="">child</var> node's
- <code title="">data</code> DOM attribute, <a href="#escapingString"
- title="escaping a string">escaped as described below</a>.</p>
-
- <dt>If the child node is a <code title="">Comment</code>
-
- <dd>
- <p>Append the literal string <code><!--</code> (U+003C LESS-THAN
- SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, U+002D
- HYPHEN-MINUS), followed by the value of the <var title="">child</var>
- node's <code title="">data</code> DOM attribute, followed by the
- literal string <code>--></code> (U+002D HYPHEN-MINUS, U+002D
- HYPHEN-MINUS, U+003E GREATER-THAN SIGN).</p>
-
- <dt>If the child node is a <code title="">DocumentType</code>
-
- <dd>
- <p>Append the literal string <code><!DOCTYPE</code> (U+003C
- LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+0044 LATIN CAPITAL LETTER
- D, U+004F LATIN CAPITAL LETTER O, U+0043 LATIN CAPITAL LETTER C,
- U+0054 LATIN CAPITAL LETTER T, U+0059 LATIN CAPITAL LETTER Y, U+0050
- LATIN CAPITAL LETTER P, U+0045 LATIN CAPITAL LETTER E), followed by a
- space (U+0020 SPACE), followed by the value of the <var
- title="">child</var> node's <code title="">name</code> DOM attribute,
- followed by the literal string <code>></code> (U+003E GREATER-THAN
- SIGN).</p>
- </dl>
-
- <p>Other nodes types (e.g. <code title="">Attr</code>) cannot occur as
- children of elements. If they do, the <code title=dom-innerHTML-HTML><a
- href="#innerhtml0">innerHTML</a></code> attribute must raise an
- <code>INVALID_STATE_ERR</code> exception.</p>
-
- <li>
- <p>The result of the algorithm is the string <var title="">s</var>.
- </ol>
-
- <p><dfn id=escapingString>Escaping a string</dfn> (for the purposes of the
- algorithm above) consists of replacing any occurrences of the "<code
- title="">&amp;</code>" character by the string "<code
- title="">&amp;amp;</code>", any occurrences of the "<code
- title="">&lt;</code>" character by the string "<code
- title="">&amp;lt;</code>", any occurrences of the "<code
- title="">&gt;</code>" character by the string "<code
- title="">&amp;gt;</code>", and any occurrences of the "<code
- title="">&quot;</code>" character by the string "<code
- title="">&amp;quot;</code>".
-
- <p class=note>Entity reference nodes are <a
- href="#entity-references">assumed to be expanded</a> by the user agent,
- and are therefore not covered in the algorithm above.
-
- <p class=note>It is possible that the roundtripping through <code
- title=dom-innerHTML-HTML><a href="#innerhtml0">innerHTML</a></code> will
- not work. For instance, if the element is a <code>textarea</code> element
- to which a <code title="">Comment</code> node has been appended, then
- assigning <code title=dom-innerHTML-HTML><a
- href="#innerhtml0">innerHTML</a></code> to itself will result in the
- comment being displayed in the text field. Similarly, if, as a result of
- DOM manipulation, the element contains a comment that contains the literal
- string "<code title="">--></code>", then when the result of serialising
- the element is parsed, the comment will be truncated at that point and the
- rest of the comment will be interpreted as markup. More examples would be
- making a <code><a href="#script0">script</a></code> element contain a text
- node with the text string "<code></script></code>", or having a
- <code><a href="#p">p</a></code> element that contains a <code><a
- href="#ul">ul</a></code> element (as the <code><a href="#ul">ul</a></code>
- element's <span title=syntax-start-tag>start tag</span> would imply the
- end tag for the <code><a href="#p">p</a></code>).
-
<p>On setting, if the node is a document, the <code
title=dom-innerHTML-HTML><a href="#innerhtml0">innerHTML</a></code> DOM
attribute must run the following algorithm:
@@ -38631,7 +38477,170 @@
<p>The <dfn id=html-namespace0>HTML namespace</dfn> is:
<code>http://www.w3.org/1999/xhtml</code>
- <h3 id=entities><span class=secno>8.4. </span><dfn
+ <h3 id=serialising><span class=secno>8.4. </span>Serialising HTML fragments</h3>
+
+ <p>The following steps form the <dfn id=html-fragment>HTML fragment
+ serialisation algorithm</dfn>. The algorithm takes as input a DOM
+ <code>Element</code> or <code>Document</code>, referred to as <var
+ title="">the node</var>, and either returns a string or raises an
+ exception.
+
+ <ol>
+ <li>
+ <p>Let <var title="">s</var> be a string, and initialise it to the empty
+ string.
+
+ <li>
+ <p>For each child node <var title="">child</var> of <var title="">the
+ node</var>, in <a href="#tree-order">tree order</a>, append the
+ appropriate string from the following list to <var title="">s</var>:</p>
+
+ <dl class=switch>
+ <dt>If the child node is an <code title="">Element</code>
+
+ <dd>
+ <p>Append a U+003C LESS-THAN SIGN (<code title="">&lt;</code>)
+ character, followed by the element's tag name. (For nodes created by
+ the <a href="#html-0">HTML parser</a>, <code
+ title="">Document.createElement()</code>, or <code
+ title="">Document.renameNode()</code>, the tag name will be
+ lowercase.)</p>
+
+ <p>For each attribute that the element has, append a U+0020 SPACE
+ character, the attribute's name (which, for attributes set by the <a
+ href="#html-0">HTML parser</a> or by <code
+ title="">Element.setAttributeNode()</code> or <code
+ title="">Element.setAttribute()</code>, will be lowercase), a U+003D
+ EQUALS SIGN (<code title="">=</code>) character, a U+0022 QUOTATION
+ MARK (<code title="">"</code>) character, the attribute's value,
+ <a href="#escapingString" title="escaping a string">escaped as
+ described below</a>, and a second U+0022 QUOTATION MARK (<code
+ title="">"</code>) character.</p>
+
+ <p>While the exact order of attributes is UA-defined, and may depend on
+ factors such as the order that the attributes were given in the
+ original markup, the sort order must be stable, such that consecutive
+ invocations of this algorithm serialise an element's attributes in the
+ same order.</p>
+
+ <p>Append a U+003E GREATER-THAN SIGN (<code title="">&gt;</code>)
+ character.</p>
+
+ <p>If the child node is an <code><a href="#area">area</a></code>,
+ <code><a href="#base">base</a></code>, <code>basefont</code>,
+ <code>bgsound</code>, <code><a href="#br">br</a></code>, <code><a
+ href="#col">col</a></code>, <code><a href="#embed">embed</a></code>,
+ <code>frame</code>, <code><a href="#hr">hr</a></code>, <code><a
+ href="#img">img</a></code>, <code>input</code>, <code><a
+ href="#link">link</a></code>, <code><a href="#meta0">meta</a></code>,
+ <code><a href="#param">param</a></code>, <code>spacer</code>, or
+ <code>wbr</code> element, then continue on to the next child node at
+ this point.</p>
+ <!-- also, i guess:
+ image, isindex, and keygen, but we don't list those because we
+ don't consider those "elements", more "macros", and thus we
+ should never serialise them -->
+ <!-- XXX when we get around to
+ it, add event-source -->
+ <p>If the child node is a <code><a href="#pre">pre</a></code> or
+ <code>textarea</code> element, append a U+000A LINE FEED (LF)
+ character.</p>
+
+ <p>Append the value of running the <a href="#html-fragment">HTML
+ fragment serialisation algorithm</a> on the <var title="">child</var>
+ element (thus recursing into this algorithm for that element),
+ followed by a U+003C LESS-THAN SIGN (<code title="">&lt;</code>)
+ character, a U+002F SOLIDUS (<code title="">/</code>) character, the
+ element's tag name again, and finally a U+003E GREATER-THAN SIGN
+ (<code title="">&gt;</code>) character.</p>
+
+ <dt>If the child node is a <code title="">Text</code> or <code
+ title="">CDATASection</code> node
+
+ <dd>
+ <p>If one of the ancestors of the child node is a <code><a
+ href="#style">style</a></code>, <code><a
+ href="#script0">script</a></code>, <code>xmp</code>, <code><a
+ href="#iframe">iframe</a></code>, <code>noembed</code>,
+ <code>noframes</code>, or <code><a
+ href="#noscript">noscript</a></code> element, then append the value of
+ the <var title="">child</var> node's <code title="">data</code> DOM
+ attribute literally.</p>
+ <!-- note about noscript: because this is defining an API, it
+ can assume that scripting is enabled, and that thus the
+ <noscript> element in the DOM will have been parsed in the
+ scripting-enabled mode, and that thus the text node is raw
+ markup -->
+
+ <p>Otherwise, append the value of the <var title="">child</var> node's
+ <code title="">data</code> DOM attribute, <a href="#escapingString"
+ title="escaping a string">escaped as described below</a>.</p>
+
+ <dt>If the child node is a <code title="">Comment</code>
+
+ <dd>
+ <p>Append the literal string <code>&lt;!--</code> (U+003C LESS-THAN
+ SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS, U+002D
+ HYPHEN-MINUS), followed by the value of the <var title="">child</var>
+ node's <code title="">data</code> DOM attribute, followed by the
+ literal string <code>--&gt;</code> (U+002D HYPHEN-MINUS, U+002D
+ HYPHEN-MINUS, U+003E GREATER-THAN SIGN).</p>
+
+ <dt>If the child node is a <code title="">DocumentType</code>
+
+ <dd>
+ <p>Append the literal string <code>&lt;!DOCTYPE</code> (U+003C
+ LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+0044 LATIN CAPITAL LETTER
+ D, U+004F LATIN CAPITAL LETTER O, U+0043 LATIN CAPITAL LETTER C,
+ U+0054 LATIN CAPITAL LETTER T, U+0059 LATIN CAPITAL LETTER Y, U+0050
+ LATIN CAPITAL LETTER P, U+0045 LATIN CAPITAL LETTER E), followed by a
+ space (U+0020 SPACE), followed by the value of the <var
+ title="">child</var> node's <code title="">name</code> DOM attribute,
+ followed by the literal string <code>&gt;</code> (U+003E GREATER-THAN
+ SIGN).</p>
+ </dl>
+
+ <p>Other node types (e.g. <code title="">Attr</code>) cannot occur as
+ children of elements. If they do, this algorithm must raise an
+ <code>INVALID_STATE_ERR</code> exception.</p>
+
+ <li>
+ <p>The result of the algorithm is the string <var title="">s</var>.
+ </ol>
+
+ <p><dfn id=escapingString>Escaping a string</dfn> (for the purposes of the
+ algorithm above) consists of replacing any occurrences of the "<code
+ title="">&amp;</code>" character by the string "<code
+ title="">&amp;amp;</code>", any occurrences of the "<code
+ title="">&lt;</code>" character by the string "<code
+ title="">&amp;lt;</code>", any occurrences of the "<code
+ title="">&gt;</code>" character by the string "<code
+ title="">&amp;gt;</code>", and any occurrences of the "<code
+ title="">&quot;</code>" character by the string "<code
+ title="">&amp;quot;</code>".
+
+ <p class=note>Entity reference nodes are <a
+ href="#entity-references">assumed to be expanded</a> by the user agent,
+ and are therefore not covered in the algorithm above.
+
+ <p class=note>It is possible that the output of this algorithm, if parsed
+ with an <a href="#html-0">HTML parser</a>, will not return the original
+ tree structure. For instance, if a <code>textarea</code> element to which
+ a <code title="">Comment</code> node has been appended is serialised and
+ the output is then reparsed, the comment will end up being displayed in
+ the text field. Similarly, if, as a result of DOM manipulation, an element
+ contains a comment that contains the literal string "<code
+ title="">--></code>", then when the result of serialising the element
+ is parsed, the comment will be truncated at that point and the rest of the
+ comment will be interpreted as markup. More examples would be making a
+ <code><a href="#script0">script</a></code> element contain a text node
+ with the text string "<code>&lt;/script&gt;</code>", or having a <code><a
+ href="#p">p</a></code> element that contains a <code><a
+ href="#ul">ul</a></code> element (as the <code><a href="#ul">ul</a></code>
+ element's <span title=syntax-start-tag>start tag</span> would imply the
+ end tag for the <code><a href="#p">p</a></code>).
+
+ <h3 id=entities><span class=secno>8.5. </span><dfn
id=entities0>Entities</dfn></h3>
<p>This table lists the entity names that are supported by HTML, and the
Modified: source
===================================================================
--- source 2007-06-19 23:27:35 UTC (rev 930)
+++ source 2007-06-19 23:49:49 UTC (rev 931)
@@ -2280,183 +2280,9 @@
follow.</p>
<p>On getting, the <code title="dom-innerHTML-HTML">innerHTML</code>
- DOM attribute must return the result of running the following
- algorithm:</p>
+ DOM attribute must return the result of running the <span>HTML
+ fragment serialisation algorithm</span> on the node.</p>
- <ol>
-
- <li><p>Let <var title="">s</var> be a string, and initialise it to
- the empty string.</p></li>
-
- <li>
-
- <p>For each child node <var title="">child</var>, in <span>tree order</span>,
- append the appropriate string from the following list to <var
- title="">s</var>:</p>
-
- <dl class="switch">
-
- <dt>If the child node is an <code title="">Element</code></dt>
-
- <dd>
-
- <p>Append a U+003C LESS-THAN SIGN (<code title=""><</code>)
- character, followed by the element's tag name. (For nodes
- created by the <span>HTML parser</span>, <code
- title="">Document.createElement()</code>, or <code
- title="">Document.renameNode()</code>, the tag name will be
- lowercase.)</p>
-
- <p>For each attribute that the element has, append a U+0020
- SPACE character, the attribute's name (which, for attributes set
- by the <span>HTML parser</span> or by <code
- title="">Element.setAttributeNode()</code> or <code
- title="">Element.setAttribute()</code>, will be lowercase), a
- U+003D EQUALS SIGN (<code title="">=</code>) character, a U+0022
- QUOTATION MARK (<code title="">"</code>) character, the
- attribute's value, <span title="escaping a string">escaped as
- described below</span>, and a second U+0022 QUOTATION MARK
- (<code title="">"</code>) character.</p>
-
- <p>While the exact order of attributes is UA-defined, and may
- depend on factors such as the order that the attributes were
- given in the original markup, the sort order must be stable,
- such that consecutive calls to <code
- title="dom-innerHTML-HTML">innerHTML</code> serialise an
- element's attributes in the same order.</p>
-
- <p>Append a U+003E GREATER-THAN SIGN (<code title="">></code>)
- character.</p>
-
- <p>If the child node is an <code>area</code>, <code>base</code>,
- <code>basefont</code>, <code>bgsound</code>, <code>br</code>,
- <code>col</code>, <code>embed</code>, <code>frame</code>,
- <code>hr</code>, <code>img</code>, <code>input</code>,
- <code>link</code>, <code>meta</code>, <code>param</code>,
- <code>spacer</code>, or <code>wbr</code> element, then continue
- on to the next child node at this point.</p> <!-- also, i guess:
- image, isindex, and keygen, but we don't list those because we
- don't consider those "elements", more "macros", and thus we
- should never serialise them --> <!-- XXX when we get around to
- it, add event-source -->
-
- <p>If the child node is a <code>pre</code> or
- <code>textarea</code> element, append a U+000A LINE FEED (LF)
- character.</p>
-
- <p>Append the value of the <var title="">child</var> element's
- <code title="dom-innerHTML-HTML">innerHTML</code> DOM attribute
- (thus recursing into this algorithm for that element), followed
- by a U+003C LESS-THAN SIGN (<code title=""><</code>)
- character, a U+002F SOLIDUS (<code title="">/</code>) character,
- the element's tag name again, and finally a U+003E GREATER-THAN
- SIGN (<code title="">></code>) character.</p>
-
- </dd>
-
-
- <dt>If the child node is a <code title="">Text</code> or <code
- title="">CDATASection</code> node</dt>
-
- <dd>
-
- <p>If one of the ancestors of the child node is a
- <code>style</code>, <code>script</code>, <code>xmp</code>,
- <code>iframe</code>, <code>noembed</code>,
- <code>noframes</code>, or <code>noscript</code> element, then
- append the value of the <var title="">child</var> node's <code
- title="">data</code> DOM attribute literally.</p>
- <!-- note about noscript: because this is defining an API, it
- can assume that scripting is enabled, and that thus the
- <noscript> element in the DOM will have been parsed in the
- scripting-enabled mode, and that thus the text node is raw
- markup -->
-
- <p>Otherwise, append the value of the <var title="">child</var>
- node's <code title="">data</code> DOM attribute, <span
- title="escaping a string">escaped as described below</span>.</p>
-
- </dd>
-
-
- <dt>If the child node is a <code title="">Comment</code></dt>
-
- <dd>
-
- <p>Append the literal string <code><!--</code> (U+003C
- LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS,
- U+002D HYPHEN-MINUS), followed by the value of the <var
- title="">child</var> node's <code title="">data</code> DOM
- attribute, followed by the literal string <code>--></code>
- (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN
- SIGN).</p>
-
- </dd>
-
-
- <dt>If the child node is a <code title="">DocumentType</code></dt>
-
- <dd>
-
- <p>Append the literal string <code><!DOCTYPE</code> (U+003C
- LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+0044 LATIN CAPITAL
- LETTER D, U+004F LATIN CAPITAL LETTER O, U+0043 LATIN CAPITAL
- LETTER C, U+0054 LATIN CAPITAL LETTER T, U+0059 LATIN CAPITAL
- LETTER Y, U+0050 LATIN CAPITAL LETTER P, U+0045 LATIN CAPITAL
- LETTER E), followed by a space (U+0020 SPACE), followed by the
- value of the <var title="">child</var> node's <code
- title="">name</code> DOM attribute, followed by the literal
- string <code>></code> (U+003E GREATER-THAN SIGN).</p>
-
- </dd>
-
-
- </dl>
-
- <p>Other nodes types (e.g. <code title="">Attr</code>) cannot
- occur as children of elements. If they do, the <code
- title="dom-innerHTML-HTML">innerHTML</code> attribute must raise
- an <code>INVALID_STATE_ERR</code> exception.</p>
-
- </li>
-
- <li><p>The result of the algorithm is the string <var
- title="">s</var>.</p></li>
-
- </ol>
-
- <p><dfn id="escapingString">Escaping a string</dfn> (for the
- purposes of the algorithm above) consists of replacing any
- occurrences of the "<code title="">&amp;</code>" character by the
- string "<code title="">&amp;amp;</code>", any occurrences of the
- "<code title="">&lt;</code>" character by the string "<code
- title="">&amp;lt;</code>", any occurrences of the "<code
- title="">&gt;</code>" character by the string "<code
- title="">&amp;gt;</code>", and any occurrences of the "<code
- title="">&quot;</code>" character by the string "<code
- title="">&amp;quot;</code>".</p>
-
- <p class="note">Entity reference nodes are <a
- href="#entity-references">assumed to be expanded</a> by the user
- agent, and are therefore not covered in the algorithm above.</p>
-
- <p class="note">It is possible that the roundtripping through <code
- title="dom-innerHTML-HTML">innerHTML</code> will not work. For
- instance, if the element is a <code>textarea</code> element to which
- a <code title="">Comment</code> node has been appended, then
- assigning <code title="dom-innerHTML-HTML">innerHTML</code> to
- itself will result in the comment being displayed in the text
- field. Similarly, if, as a result of DOM manipulation, the element
- contains a comment that contains the literal string "<code
- title="">--></code>", then when the result of serialising the
- element is parsed, the comment will be truncated at that point and
- the rest of the comment will be interpreted as markup. More examples
- would be making a <code>script</code> element contain a text node
- with the text string "<code></script></code>", or having a
- <code>p</code> element that contains a <code>ul</code> element (as
- the <code>ul</code> element's <span title="syntax-start-tag">start
- tag</span> would imply the end tag for the <code>p</code>).</p>
-
<p>On setting, if the node is a document, the <code
title="dom-innerHTML-HTML">innerHTML</code> DOM attribute must run
the following algorithm:</p>
@@ -36017,6 +35843,190 @@
<p>The <dfn>HTML namespace</dfn> is: <code>http://www.w3.org/1999/xhtml</code></p>
+
+
+ <h3>Serialising HTML fragments</h3>
+
+ <p>The following steps form the <dfn>HTML fragment serialisation
+ algorithm</dfn>. The algorithm takes as input a DOM
+ <code>Element</code> or <code>Document</code>, referred to as <var
+ title="">the node</var>, and either returns a string or raises an
+ exception.</p>
+
+ <ol>
+
+ <li><p>Let <var title="">s</var> be a string, and initialise it to
+ the empty string.</p></li>
+
+ <li>
+
+ <p>For each child node <var title="">child</var> of <var
+ title="">the node</var>, in <span>tree order</span>, append the
+ appropriate string from the following list to <var
+ title="">s</var>:</p>
+
+ <dl class="switch">
+
+ <dt>If the child node is an <code title="">Element</code></dt>
+
+ <dd>
+
+ <p>Append a U+003C LESS-THAN SIGN (<code title="">&lt;</code>)
+ character, followed by the element's tag name. (For nodes
+ created by the <span>HTML parser</span>, <code
+ title="">Document.createElement()</code>, or <code
+ title="">Document.renameNode()</code>, the tag name will be
+ lowercase.)</p>
+
+ <p>For each attribute that the element has, append a U+0020
+ SPACE character, the attribute's name (which, for attributes set
+ by the <span>HTML parser</span> or by <code
+ title="">Element.setAttributeNode()</code> or <code
+ title="">Element.setAttribute()</code>, will be lowercase), a
+ U+003D EQUALS SIGN (<code title="">=</code>) character, a U+0022
+ QUOTATION MARK (<code title="">"</code>) character, the
+ attribute's value, <span title="escaping a string">escaped as
+ described below</span>, and a second U+0022 QUOTATION MARK
+ (<code title="">"</code>) character.</p>
+
+ <p>While the exact order of attributes is UA-defined, and may
+ depend on factors such as the order that the attributes were
+ given in the original markup, the sort order must be stable,
+ such that consecutive invocations of this algorithm serialise an
+ element's attributes in the same order.</p>
+
+ <p>Append a U+003E GREATER-THAN SIGN (<code title="">&gt;</code>)
+ character.</p>
+
+ <p>If the child node is an <code>area</code>, <code>base</code>,
+ <code>basefont</code>, <code>bgsound</code>, <code>br</code>,
+ <code>col</code>, <code>embed</code>, <code>frame</code>,
+ <code>hr</code>, <code>img</code>, <code>input</code>,
+ <code>link</code>, <code>meta</code>, <code>param</code>,
+ <code>spacer</code>, or <code>wbr</code> element, then continue
+ on to the next child node at this point.</p> <!-- also, i guess:
+ image, isindex, and keygen, but we don't list those because we
+ don't consider those "elements", more "macros", and thus we
+ should never serialise them --> <!-- XXX when we get around to
+ it, add event-source -->
+
+ <p>If the child node is a <code>pre</code> or
+ <code>textarea</code> element, append a U+000A LINE FEED (LF)
+ character.</p>
+
+ <p>Append the value of running the <span>HTML fragment
+ serialisation algorithm</span> on the <var title="">child</var>
+ element (thus recursing into this algorithm for that element),
+ followed by a U+003C LESS-THAN SIGN (<code title="">&lt;</code>)
+ character, a U+002F SOLIDUS (<code title="">/</code>) character,
+ the element's tag name again, and finally a U+003E GREATER-THAN
+ SIGN (<code title="">&gt;</code>) character.</p>
+
+ </dd>
+
+
+ <dt>If the child node is a <code title="">Text</code> or <code
+ title="">CDATASection</code> node</dt>
+
+ <dd>
+
+ <p>If one of the ancestors of the child node is a
+ <code>style</code>, <code>script</code>, <code>xmp</code>,
+ <code>iframe</code>, <code>noembed</code>,
+ <code>noframes</code>, or <code>noscript</code> element, then
+ append the value of the <var title="">child</var> node's <code
+ title="">data</code> DOM attribute literally.</p>
+ <!-- note about noscript: because this is defining an API, it
+ can assume that scripting is enabled, and that thus the
+ <noscript> element in the DOM will have been parsed in the
+ scripting-enabled mode, and that thus the text node is raw
+ markup -->
+
+ <p>Otherwise, append the value of the <var title="">child</var>
+ node's <code title="">data</code> DOM attribute, <span
+ title="escaping a string">escaped as described below</span>.</p>
+
+ </dd>
+
+
+ <dt>If the child node is a <code title="">Comment</code></dt>
+
+ <dd>
+
+ <p>Append the literal string <code>&lt;!--</code> (U+003C
+ LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+002D HYPHEN-MINUS,
+ U+002D HYPHEN-MINUS), followed by the value of the <var
+ title="">child</var> node's <code title="">data</code> DOM
+ attribute, followed by the literal string <code>--&gt;</code>
+ (U+002D HYPHEN-MINUS, U+002D HYPHEN-MINUS, U+003E GREATER-THAN
+ SIGN).</p>
+
+ </dd>
+
+
+ <dt>If the child node is a <code title="">DocumentType</code></dt>
+
+ <dd>
+
+ <p>Append the literal string <code>&lt;!DOCTYPE</code> (U+003C
+ LESS-THAN SIGN, U+0021 EXCLAMATION MARK, U+0044 LATIN CAPITAL
+ LETTER D, U+004F LATIN CAPITAL LETTER O, U+0043 LATIN CAPITAL
+ LETTER C, U+0054 LATIN CAPITAL LETTER T, U+0059 LATIN CAPITAL
+ LETTER Y, U+0050 LATIN CAPITAL LETTER P, U+0045 LATIN CAPITAL
+ LETTER E), followed by a space (U+0020 SPACE), followed by the
+ value of the <var title="">child</var> node's <code
+ title="">name</code> DOM attribute, followed by the literal
+ string <code>&gt;</code> (U+003E GREATER-THAN SIGN).</p>
+
+ </dd>
+
+
+ </dl>
+
+ <p>Other node types (e.g. <code title="">Attr</code>) cannot
+ occur as children of elements. If they do, this algorithm must
+ raise an <code>INVALID_STATE_ERR</code> exception.</p>
+
+ </li>
+
+ <li><p>The result of the algorithm is the string <var
+ title="">s</var>.</p></li>
+
+ </ol>
+
+ <p><dfn id="escapingString">Escaping a string</dfn> (for the
+ purposes of the algorithm above) consists of replacing any
+ occurrences of the "<code title="">&amp;</code>" character by the
+ string "<code title="">&amp;amp;</code>", any occurrences of the
+ "<code title="">&lt;</code>" character by the string "<code
+ title="">&amp;lt;</code>", any occurrences of the "<code
+ title="">&gt;</code>" character by the string "<code
+ title="">&amp;gt;</code>", and any occurrences of the "<code
+ title="">&quot;</code>" character by the string "<code
+ title="">&amp;quot;</code>".</p>
+
+ <p class="note">Entity reference nodes are <a
+ href="#entity-references">assumed to be expanded</a> by the user
+ agent, and are therefore not covered in the algorithm above.</p>
+
+ <p class="note">It is possible that the output of this algorithm, if
+ parsed with an <span>HTML parser</span>, will not return the
+ original tree structure. For instance, if a <code>textarea</code>
+ element to which a <code title="">Comment</code> node has been
+ appended is serialised and the output is then reparsed, the comment
+ will end up being displayed in the text field. Similarly, if, as a
+ result of DOM manipulation, an element contains a comment that
+ contains the literal string "<code title="">--&gt;</code>", then
+ when the result of serialising the element is parsed, the comment
+ will be truncated at that point and the rest of the comment will be
+ interpreted as markup. More examples would be making a
+ <code>script</code> element contain a text node with the text string
+ "<code>&lt;/script&gt;</code>", or having a <code>p</code> element that
+ contains a <code>ul</code> element (as the <code>ul</code> element's
+ <span title="syntax-start-tag">start tag</span> would imply the end
+ tag for the <code>p</code>).</p>
+
+
<h3><dfn>Entities</dfn></h3>
<p>This table lists the entity names that are supported by HTML, and
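
Illustrative note (not part of the commit): the HTML fragment serialisation algorithm added in this revision can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the specification's or any browser's implementation. The classes Element, Text, Comment and DocumentType, the InvalidStateError exception, and the function names are hypothetical stand-ins for the DOM interfaces and for INVALID_STATE_ERR. The sketch accepts only an Element as input (no Document), treats CDATASection like Text, and tracks raw-text ancestors only within the subtree being serialised.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

# Elements serialised without children or an end tag, as listed in the revision.
VOID_ELEMENTS = {"area", "base", "basefont", "bgsound", "br", "col", "embed",
                 "frame", "hr", "img", "input", "link", "meta", "param",
                 "spacer", "wbr"}
# Elements whose descendant text is appended literally, without escaping.
RAW_TEXT_ELEMENTS = {"style", "script", "xmp", "iframe", "noembed",
                     "noframes", "noscript"}


class InvalidStateError(Exception):
    """Hypothetical stand-in for the INVALID_STATE_ERR exception."""


@dataclass
class Text:
    data: str


@dataclass
class Comment:
    data: str


@dataclass
class DocumentType:
    name: str


@dataclass
class Element:
    tag_name: str
    attributes: List[Tuple[str, str]] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


Node = Union[Element, Text, Comment, DocumentType]


def escape_string(value: str) -> str:
    # "&" is replaced first so the entities produced below are not re-escaped.
    return (value.replace("&", "&amp;").replace("<", "&lt;")
                 .replace(">", "&gt;").replace('"', "&quot;"))


def serialise_fragment(node: Element, in_raw_text: bool = False) -> str:
    s = ""  # step 1: start from the empty string
    raw = in_raw_text or node.tag_name in RAW_TEXT_ELEMENTS
    for child in node.children:  # step 2: children in tree order
        if isinstance(child, Element):
            s += "<" + child.tag_name
            # Attribute order is the stored list order, so it is stable.
            for name, value in child.attributes:
                s += f' {name}="{escape_string(value)}"'
            s += ">"
            if child.tag_name in VOID_ELEMENTS:
                continue  # no children, no end tag
            if child.tag_name in ("pre", "textarea"):
                s += "\n"
            s += serialise_fragment(child, raw) + "</" + child.tag_name + ">"
        elif isinstance(child, Text):
            s += child.data if raw else escape_string(child.data)
        elif isinstance(child, Comment):
            s += "<!--" + child.data + "-->"
        elif isinstance(child, DocumentType):
            s += "<!DOCTYPE " + child.name + ">"
        else:
            raise InvalidStateError("unexpected child node type")
    return s  # step 3: the result is s
```

Calling serialise_fragment on an element returns the markup of its children only, which matches how the innerHTML getter now delegates to this algorithm. As the note in the diff warns, the output does not necessarily round-trip: a Comment whose data contains "-->" is serialised verbatim by this sketch as well.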