[whatwg] Empty elements
Ian Hickson
ian at hixie.ch
Thu Feb 16 12:02:16 PST 2006
On Thu, 16 Feb 2006, Tim Altman wrote:
>
> OK. Assuming the HTML5 document is served with a text/html doctype, how would
> the following markup be parsed?
>
> <table>
> <tr>
> <td>
> <canvas/>
> <p>Foo</p>
> </td>
> </tr>
> </table>
You omitted the DOCTYPE, which makes it a "difficult parse error", so the
handling isn't currently defined (i.e. it triggers Quirks mode). Assuming
the document started with "<!DOCTYPE HTML>", though, and ignoring all
whitespace (nothing interesting happens with whitespace):
* Tree Construction starts in the Initial Phase.
* A DOCTYPE token marked as being correct
-- Append a DocumentType node to the Document node
-- Switch to the Root Element Phase.
* A start tag token (<table>)
-- Append an <html> element to the Document node.
-- Switch to the Main Phase
* Main phase state:
-- Insertion mode is in the "before head" mode.
-- Stack of open elements has just <html>.
-- Reprocess the token:
* "Anything else" (<table>) in "before head"
-- Act as if <head> had been seen:
* A start tag token with the tag name "head"
-- Append a <head> element to the <html> element.
-- Stack of open elements has <html><head>.
-- Switch to the "in head" insertion mode.
-- Reprocess the token:
* "Anything else" (<table>) in "in head"
-- Act as if </head> had been seen:
* An end tag token with the tag name "head"
-- Stack of open elements again just has <html>.
-- Change the insertion mode to "after head".
-- Reprocess the token:
* "Anything else" (<table>) in "after head"
-- Act as if <body> had been seen:
* A start tag token with the tag name "body"
-- Append a <body> element to the <html> element.
-- Stack of open elements has <html><body>.
-- Switch to the "in body" insertion mode.
-- Reprocess the token:
* A start tag whose tag name is "table"
-- Append a <table> element to the <body> element.
-- Stack of open elements has <html><body><table>.
-- Switch to the "in table" insertion mode.
* A start tag whose tag name is one of: "td", "th", "tr"
-- Act as if <tbody> had been seen:
* A start tag whose tag name is one of: "tbody", "tfoot", "thead"
-- Append a <tbody> to the <table> element.
-- Stack of open elements has <html><body><table><tbody>.
-- Switch to the "in table body" insertion mode.
-- Reprocess the token:
* A start tag whose tag name is "tr"
-- Append a <tr> element to the <tbody> element.
-- Stack of open elements has <html><body><table><tbody><tr>.
-- Switch to the "in row" insertion mode.
* A start tag whose tag name is one of: "th", "td"
-- Append a <td> element to the <tr> element.
-- Stack of open elements has <html><body><table><tbody><tr><td>.
-- Switch to the "in cell" insertion mode.
* Anything else (<canvas>) in "in cell"
-- Process as if it was "in body":
* A start tag token not covered by the previous entries (<canvas>)
-- Append a <canvas> element to the <td> element.
-- Stack of open elements has:
<html><body><table><tbody><tr><td><canvas>
* Anything else (<p>) in "in cell"
-- Process as if it was "in body":
* A start tag whose tag name is one of: "address", "blockquote",
"center", "dir", "div", "dl", "fieldset", "h1", "h2", "h3",
"h4", "h5", "h6", "listing", "menu", "ol", "p", "pre", "ul"
-- Append a <p> element to the <canvas> element.
-- Stack of open elements has:
<html><body><table><tbody><tr><td><canvas><p>
* Anything else (character "F", then later "o" and "o") in "in cell"
-- Process as if it was "in body":
* Append a text node Foo to the <p> element.
* Anything else (</p>) in "in cell"
-- Process as if it was "in body":
* An end tag whose tag name is "p"
-- Stack of open elements once again has just:
<html><body><table><tbody><tr><td><canvas>
-- Insertion mode is still "in cell".
* An end tag whose tag name is one of: "td", "th"
-- Current node is not a <td> (it's <canvas>): EASY PARSE ERROR.
-- Pop elements until a <td> is popped. Stack of open elements once
again has just <html><body><table><tbody><tr>.
-- Switch insertion mode to "in row".
* An end tag whose tag name is "tr"
-- Stack of open elements is now: <html><body><table><tbody>.
-- Switch insertion mode to "in table body".
* An end tag whose tag name is "table"
-- Act as if </tbody> had been seen:
* An end tag whose tag name is one of: "tbody", "tfoot", "thead"
-- Stack of open elements is <html><body><table>.
-- Change insertion mode to "in table".
-- Reprocess the token.
* An end tag whose tag name is "table"
-- Stack of open elements is <html><body>.
-- Change insertion mode to "in body".
* An end-of-file token
-- Act as if </body> had been seen:
* An end tag with the tag name "body"
-- Switch insertion mode to "after body".
-- Reprocess the token.
* An end-of-file token
-- Act as if </html> had been seen:
* An end tag with the tag name "html"
-- Switch to the Trailing End Phase.
-- Reprocess the token.
* An end-of-file token
-- Ignore the token.
The result is a DOM that looks like:
   #document
     HTML
       HEAD
       BODY
         TABLE
           TBODY
             TR
               TD
                 CANVAS
                   P
                     #text ("Foo")
Hopefully everyone was able to follow along at home and get the same
result.
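
For anyone who would rather follow along in code, the stack handling above
can be approximated with the rough sketch below. It is not the tree
construction algorithm from the draft: the insertion modes are collapsed
into a few ad hoc rules that happen to be enough for this one document, and
the names Node, build, dump and TOKENS are made up for the illustration.

```python
# Simplified model of the walkthrough above. NOT the real tree construction
# algorithm: insertion modes are reduced to a few rules for this one input.

class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build(tokens):
    document = Node("#document")
    html = Node("HTML", document)
    Node("HEAD", html)           # implied <head>, closed again immediately
    body = Node("BODY", html)    # implied <body>
    stack = [html, body]         # the stack of open elements

    for kind, value in tokens:
        if kind == "start":
            # A <tr> directly inside <table> implies a <tbody>.
            if value == "tr" and stack[-1].name == "TABLE":
                stack.append(Node("TBODY", stack[-1]))
            # The slash in <canvas/> is ignored, so <canvas> is pushed
            # like any other start tag and stays open.
            stack.append(Node(value.upper(), stack[-1]))
        elif kind == "text":
            Node('#text ("%s")' % value, stack[-1])
        elif kind == "end":
            target = value.upper()
            # Pop until the named element has been popped. This is what
            # closes <canvas> at </td> and the implied <tbody> at </table>.
            if any(node.name == target for node in stack):
                while stack.pop().name != target:
                    pass
    return document


def dump(node, depth=0):
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


TOKENS = [
    ("start", "table"), ("start", "tr"), ("start", "td"),
    ("start", "canvas"),                      # from <canvas/>
    ("start", "p"), ("text", "Foo"), ("end", "p"),
    ("end", "td"), ("end", "tr"), ("end", "table"),
]

print("\n".join(dump(build(TOKENS))))
```

Running it prints the same nesting as the DOM above, with P inside CANVAS
inside TD.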
> I skimmed the parsing section of the current HTML5 draft (mainly
> 8.2.2.3.7) and noticed that the canvas element is being treated as a
> "phrasing" element. Is this by mistake? I would think it would be
> treated similar to the object element, since they have similar handling
> of fallback content.
New elements will all be treated like one of <div>, <input>, or <span>,
depending on whether they are structure-like, empty, or something else.
<object> has _complicated_ parsing semantics. We don't want to make any
new elements have complicated parsing semantics (especially because that
wouldn't be backwards-compatible).
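
As a rough illustration of what those three buckets mean for the stack of
open elements, here is a minimal sketch. The category constants, the
start_tag function and the element names "x-block" and "x-empty" are
hypothetical, and the rule that a <div>-like start tag closes an open <p>
is an assumption based on how <div> itself is handled, not something
spelled out in this message.

```python
# Hypothetical categories for new elements; the names are illustrative only.
DIV_LIKE, INPUT_LIKE, SPAN_LIKE = "div-like", "input-like", "span-like"


def start_tag(stack, name, category):
    """Return the stack of open elements after seeing a start tag."""
    stack = list(stack)
    if category == DIV_LIKE and stack and stack[-1] == "p":
        stack.pop()          # assumption: block-level start tag closes <p>
    if category != INPUT_LIKE:
        stack.append(name)   # empty elements are never left open
    return stack


# <canvas> is treated like <span>: it is pushed and stays open, which is
# why the following <p> became its child in the example.
print(start_tag(["td"], "canvas", SPAN_LIKE))
print(start_tag(["td"], "x-empty", INPUT_LIKE))
print(start_tag(["td", "p"], "x-block", DIV_LIKE))
```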
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'