[whatwg] Parsing the string <html>

Ian Hickson ian at hixie.ch
Fri Aug 2 19:20:41 PDT 2013

Ian wrote:
> On Fri, 2 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> > 
> > When parsing the string <html>, the document should supposedly have an 
> > html root with head and body children (this is what the Live DOM Viewer 
> > shows, at least), but according to the spec (if I'm not wrong) we only 
> > get the document with the html element, and the stack of open elements 
> > will have the html, head and body elements in it.
> 
> The "<html>" start tag token causes you to jump from the "initial" 
> insertion mode to the "before html" insertion mode, and then the <html> 
> element is created and you jump to "before head".
> 
> You then hit the "end of file" token, and that causes the <head> element 
> to be generated, and switches you to "in head", where <head> is popped 
> and you switch to "after head", where you insert a <body> element and 
> switch to "in body", at which point you stop parsing.
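
To make that walk concrete, here is a rough Python sketch of it. It is 
not the spec's algorithm, only the branches that the input "<html>" 
followed by end of file actually reaches. The function name, the token 
strings and the dict-based tree are all invented for this illustration.

```python
def trace_html_only():
    """Trace insertion modes for the input "<html>" (simplified sketch)."""
    tokens = ["start:html", "eof"]   # hypothetical token encoding
    mode = "initial"
    visited = [mode]                 # insertion modes, in the order entered
    stack = []                       # stack of open elements (names only)
    children = {"#document": []}     # parent name -> list of child names

    def insert(name):
        # Append under the current node (or the Document if the stack is
        # empty), then push the new element onto the stack.
        parent = stack[-1] if stack else "#document"
        children[parent].append(name)
        children[name] = []
        stack.append(name)

    def switch(new_mode):
        nonlocal mode
        mode = new_mode
        visited.append(new_mode)

    i = 0
    while True:
        token = tokens[i]
        if mode == "initial":
            switch("before html")        # reprocess the same token
        elif mode == "before html":
            insert("html")
            switch("before head")
            i += 1                       # the <html> token is consumed
        elif mode == "before head":
            insert("head")               # generated by the EOF token
            switch("in head")
        elif mode == "in head":
            stack.pop()                  # <head> is popped
            switch("after head")
        elif mode == "after head":
            insert("body")
            switch("in body")
        elif mode == "in body":
            assert token == "eof"
            break                        # stop parsing
    return visited, stack, children


visited, stack, children = trace_html_only()
```

In this sketch the tree ends up with head and body as children of html, 
while the stack holds only html and body, since head was popped on the 
way.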

On Sat, 3 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> That is totally correct. But are the head and body elements added to the 
> document? So basically, when we stop parsing, the document should only 
> have the html element, is that correct?

On Fri, 2 Aug 2013, Tab Atkins Jr. wrote:
> No, the spec clearly says "Insert an HTML element..." for those as you 
> trace through the parsing.

As Tab says, when the elements are generated they are also immediately 
inserted into the document. For example, where it says:

# Insert an HTML element for a "body" start tag token with no attributes.

...in the "after head" mode, "Insert an HTML element" is a hyperlink to 
the definition of that algorithm earlier in the spec, which says:

# 1. Let the adjusted insertion location be the appropriate place for 
#    inserting a node.

...which in this case boils down to "inside the current node, after its 
last child (if any)", followed by:

# 2. Create an element for the token in the HTML namespace, with the 
#    intended parent being the element in which the adjusted insertion 
#    location finds itself.

...followed by (skipping bits irrelevant to this case):

# 4. If it is possible to insert an element at the adjusted insertion 
#    location, then insert the newly created element at the adjusted 
#    insertion location.

...which appends the <body> element to the <html> element (after the 
<head> element, which goes through the same process earlier). When you 
append a node to another, they end up in the same Document, so both 
<head> and <body> are in the document by the time parsing stops.
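
The steps quoted above can be sketched in Python as follows. This is a 
toy model, not a DOM implementation: the Node class, its root() helper 
and insert_html_element() are names invented here, and only the simple 
case from this thread is handled (the insertion location is always 
"inside the current node, after its last child"). The final push onto 
the stack of open elements is one of the steps skipped in the quotes 
above.

```python
class Node:
    """Minimal tree node; stands in for both Document and Element."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    def append_child(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def root(self):
        # Walk up to the top of the tree the node currently lives in.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


def insert_html_element(stack, name):
    # Step 1: the adjusted insertion location is inside the current node
    # (the top of the stack of open elements), after its last child.
    current = stack[-1]
    # Step 2: create the element, with the current node as intended parent.
    element = Node(name)
    # Step 4: insert the element at the adjusted insertion location.
    current.append_child(element)
    # Then push it onto the stack of open elements.
    stack.append(element)
    return element


document = Node("#document")
html = document.append_child(Node("html"))
stack = [html]

head = insert_html_element(stack, "head")
stack.pop()                      # "in head" pops <head> at end of file
body = insert_html_element(stack, "body")
```

After this runs, html has the children head and body, and walking up 
from either of them with root() reaches the same document node.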

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
