[whatwg] Reading a start tag in "text" insertion mode
Mohammad Al Houssami (Alumni)
mha53 at mail.aub.edu
Fri Aug 16 02:43:43 PDT 2013
So this is what I was missing. My implementation does not follow the spec 100%. I built the tokenizer completely first and have only now started on tree construction. I pass all the tokens to the tree builder, so the two stages are still separate for now. I did this to keep the complexity down; the plan was to find a way to feed back into the tokenizer once some progress had been made. So basically I can't handle this situation at the moment.
Thanks for clearing that up, Ian. :)
From: Ian Hickson [mailto:ian at hixie.ch]
Sent: Friday, August 16, 2013 4:24 AM
To: Mohammad Al Houssami (Alumni)
Cc: whatwg at whatwg.org
Subject: Re: [whatwg] Reading a start tag in "text" insertion mode
On Thu, 15 Aug 2013, Mohammad Al Houssami (Alumni) wrote:
> I am building a parser incrementally by sets of elements (and not all
> at the same time), so while debugging I noticed that the "text"
> insertion mode does not have an "anything else" branch. Let's assume my
> input is the following: <title><head>
> The title start tag will lead us to the "text" insertion mode. And then
> what should happen? The specification doesn't deal with this case, as
> there is nothing that says what should happen... I think I am missing
> something here?
The generic RCDATA element parsing algorithm puts the tokenizer into the RCDATA state, from which the only possible tokens are text tokens, end tag tokens, and end-of-file tokens. These are the same tokens that the "text" insertion mode handles.
So you parse a <title> start tag token, you go into "text" mode, then you get six character tokens, which get inserted into the <title> element, then you get an EOF token, and you unwind the parser and end.
What token are you getting that isn't handled?
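The sequence described above can be illustrated with a minimal sketch in Python. It is not the spec's algorithm: the names Tokenizer, parse, and the feedback flag are hypothetical, only two tokenizer states are modelled, and tag parsing is reduced to bare alphabetic names. It only shows why the tree builder has to switch the tokenizer state as soon as it sees the <title> start tag, and what a tokenizer that runs with no feedback would emit instead.

```python
# Hypothetical, heavily simplified model of tokenizer/tree-builder feedback.
# Only the data and RCDATA states exist; attributes, comments, and character
# references are ignored.
DATA, RCDATA = "data", "rcdata"


class Tokenizer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.state = DATA
        self.last_start_tag = None

    def next_token(self):
        if self.pos >= len(self.text):
            return ("eof",)
        rest = self.text[self.pos:]
        if self.state == RCDATA:
            # In RCDATA only the matching end tag is recognised as a tag.
            end = "</" + self.last_start_tag + ">"
            if rest.lower().startswith(end):
                self.pos += len(end)
                self.state = DATA
                return ("end_tag", self.last_start_tag)
            self.pos += 1
            return ("char", rest[0])
        # Data state: recognise "<name>" as a start tag, else emit a character.
        close = rest.find(">")
        if rest[0] == "<" and close != -1 and rest[1:close].isalpha():
            name = rest[1:close].lower()
            self.pos += close + 1
            self.last_start_tag = name
            return ("start_tag", name)
        self.pos += 1
        return ("char", rest[0])


def parse(text, feedback=True):
    """Pull tokens one at a time; with feedback, <title> switches to RCDATA."""
    tok = Tokenizer(text)
    mode = "in head"
    seen = []
    while True:
        token = tok.next_token()
        seen.append(token)
        if token[0] == "eof":
            return seen
        if feedback and mode != "text" and token == ("start_tag", "title"):
            tok.state = RCDATA      # the tree builder changes the tokenizer state
            mode = "text"
        elif mode == "text" and token[0] == "end_tag":
            mode = "in head"        # return to the original insertion mode


if __name__ == "__main__":
    for token in parse("<title><head>"):
        print(token)
```

With feedback enabled, the input <title><head> produces the title start tag, six character tokens, and EOF. With feedback=False, which models a tokenizer run to completion before tree construction, the same input yields a second start tag token for head, which is exactly the token the "text" insertion mode has no branch for.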
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'