[whatwg] "Script Data" tokenizer mode

Mon Nov 2 02:29:22 PST 2009

Matt Hall wrote:
> When the "script data" state was added to the tokenizer, the tree construction
> algorithm was updated to switch the tokenizer into this state upon finding a
> start tag named "script" while in the "in head" insertion mode (9.2.5.7). I see
> that a corresponding change was not made to 9.5 about "Parsing HTML Fragments"
> as it still says to switch into the RAWTEXT state upon finding a "script" tag.
> Does anyone know if this difference is intentional, or did someone just forget
> to update the fragment parsing case?

I think that, because no start tag has ever been emitted by the 
tokenizer, the RAWTEXT and script data states should behave 
identically in the script element fragment case. (Once you take into 
account that there is no appropriate end tag token, all the careful 
special-casing for comments effectively becomes a no-op, and regardless 
of input everything will become character tokens. This is true of both 
the script data state and the RAWTEXT state; the latter is probably 
preferable due to its far lower complexity.)
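
To illustrate the point, here is a rough sketch in Python of a 
RAWTEXT-like loop. It is not the spec algorithm: the function name 
tokenize_rawtext and the last_start_tag parameter are made up for this 
example, it stops at the first end tag it accepts, and it ignores NUL 
handling and attributes on end tags. It only shows that an end tag is 
recognised solely when its name matches the last emitted start tag, so 
with no start tag emitted the whole input comes out as characters.

```python
import string

def tokenize_rawtext(data, last_start_tag=None):
    """Simplified RAWTEXT-like tokenizer (hypothetical helper).

    Returns a list of (kind, value) tuples. last_start_tag is the name
    of the last start tag the tokenizer emitted, or None if it has not
    emitted one (assumed here to model the fragment case).
    """
    tokens = []
    text = []
    i = 0
    while i < len(data):
        if data.startswith("</", i):
            # Collect a candidate end tag name made of ASCII letters.
            j = i + 2
            while j < len(data) and data[j] in string.ascii_letters:
                j += 1
            name = data[i + 2:j].lower()
            terminated = j < len(data) and data[j] in " \t\n\f/>"
            # Only an "appropriate" end tag (same name as the last
            # start tag) ends the text run.
            if name and terminated and name == last_start_tag:
                if text:
                    tokens.append(("Character", "".join(text)))
                tokens.append(("EndTag", name))
                return tokens  # this sketch stops at the end tag
        # Anything else, including a non-matching "</...", is text.
        text.append(data[i])
        i += 1
    if text:
        tokens.append(("Character", "".join(text)))
    return tokens

# No start tag has been emitted, so "</script>" is plain text.
print(tokenize_rawtext("a <!-- </script> --> b"))
# With a prior "script" start tag, the same input ends at "</script>".
print(tokenize_rawtext("a <!-- </script> --> b", "script"))
```

The script data state adds the comment-like escape states on top of 
this, but those only change which "</script>" counts as an end tag. If 
none can ever count, the extra states make no observable difference.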

-- 
Geoffrey Sneddon — Opera Software
<http://gsnedders.com/>
<http://www.opera.com/>