[whatwg] Reading a start tag in "text" insertion mode

Mohammad Al Houssami (Alumni) mha53 at mail.aub.edu
Fri Aug 16 02:43:43 PDT 2013

So this is what I am missing. My implementation does not follow the spec 100%. I built the tokenizer completely first and have only now started on tree construction. I pass all the tokens across, so the two stages are still separate for now. I did this to keep the complexity down. The plan was to find a way to go back to the tokenizer once some progress had been made. So basically I can't handle this situation at the moment.
Thanks for clearing that up, Ian. :)

-----Original Message-----
From: Ian Hickson [mailto:ian at hixie.ch] 
Sent: Friday, August 16, 2013 4:24 AM
To: Mohammad Al Houssami (Alumni)
Cc: whatwg at whatwg.org
Subject: Re: [whatwg] Reading a start tag in "text" insertion mode

On Thu, 15 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> I am building a parser incrementally by sets of elements (and not all
> at the same time), so while debugging I noticed that the "text"
> insertion mode does not have an "anything else" branch. Let's assume my
> input is the
> following: <title><head> The title start tag will lead us to the "text"
> insertion mode. And then what should happen? The specification
> doesn't deal with this case, as there is nothing that says what should
> happen in this case... I think I am missing something here?

The generic RCDATA element parsing algorithm puts the tokenizer into the RCDATA state, from which the only possible tokens are text tokens, end tag tokens, and end-of-file tokens. These are the same tokens that the "text" mode handles.

So you parse a <title> start tag token, you go into "text" mode, then you get six character tokens, which get inserted into the <title> element, then you get an EOF token, and you unwind the parser and end.
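
To make that walk-through concrete, here is a minimal Python sketch of the feedback loop, not the spec's actual algorithm. It assumes a heavily simplified tokenizer with only a data state and an RCDATA state (no attributes, character references, or comments). The names Tokenizer, next_token, and tokenize_with_tree_builder are hypothetical, invented for illustration. The point it tries to show is that the tree builder has to switch the tokenizer's state before the next token is produced, which is why tokenizing the whole input up front yields a <head> start tag that the "text" mode never expects.

```python
import re

class Tokenizer:
    """Hypothetical, heavily simplified tokenizer: data and RCDATA states only."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.state = "data"          # the tree builder may change this between tokens
        self.last_start_tag = None   # used for the "appropriate end tag" check

    def next_token(self):
        if self.pos >= len(self.text):
            return ("eof",)
        rest = self.text[self.pos:]
        if self.state == "rcdata":
            # In RCDATA, only an end tag matching the last start tag is markup.
            m = re.match(r"</([A-Za-z][A-Za-z0-9]*)\s*>", rest)
            if m and m.group(1).lower() == self.last_start_tag:
                self.pos += m.end()
                self.state = "data"
                return ("end_tag", m.group(1).lower())
        else:
            # In the data state, both start tags and end tags are recognized.
            m = re.match(r"<(/?)([A-Za-z][A-Za-z0-9]*)\s*>", rest)
            if m:
                self.pos += m.end()
                name = m.group(2).lower()
                if m.group(1):
                    return ("end_tag", name)
                self.last_start_tag = name
                return ("start_tag", name)
        # Anything not recognized as a tag is emitted one character at a time.
        self.pos += 1
        return ("char", rest[0])


def tokenize_with_tree_builder(text):
    """Pull tokens one at a time so the 'tree builder' can switch tokenizer state."""
    tok = Tokenizer(text)
    tokens = []
    while True:
        t = tok.next_token()
        tokens.append(t)
        if t[0] == "eof":
            return tokens
        # Stand-in for the generic RCDATA element parsing algorithm.
        if t == ("start_tag", "title"):
            tok.state = "rcdata"


if __name__ == "__main__":
    for token in tokenize_with_tree_builder("<title><head>"):
        print(token)
```

Run on the input <title><head>, this sketch produces one start tag token for title, then six character tokens (<, h, e, a, d, >), then EOF. If the state switch in tokenize_with_tree_builder is removed, the same input yields a head start tag token instead, which is the unhandled case from the original question.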

What token are you getting that isn't handled?

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
