[whatwg] Parsing: Tokenisation - DOCTYPE State

Ian Hickson ian at hixie.ch
Tue Jan 31 12:24:38 PST 2006

On Sun, 29 Jan 2006, Lachlan Hunt wrote:
>   I believe there are some mistakes in the DOCTYPE state section.
> As far as I can tell both of these DOCTYPEs are considered conformant, but
> shouldn't the first be an easy parse error?
>   <!DOCTYPEhtml>
>   <!DOCTYPE html>

Yeah. Fixed. They both still generate the same DOM, but the first now 
causes a parse error to be flagged.

> * That should read "subtract 0x0020 from the character's codepoint"
>   (This error is repeated in the DOCTYPE name state too.)

Fixed. Though I'm not sure we want to be doing this really. I'm torn.
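
For reference, the case folding under discussion amounts to something like 
the following. This is only a minimal sketch of the arithmetic, not spec 
text; the helper name ascii_upper is made up for illustration, and it 
assumes only the ASCII letters a-z are folded.

```python
def ascii_upper(ch: str) -> str:
    """Fold a single ASCII lowercase letter to uppercase.

    Hypothetical helper: subtracts 0x0020 from the code point of a-z
    and leaves every other character unchanged.
    """
    if "a" <= ch <= "z":
        return chr(ord(ch) - 0x0020)
    return ch


def fold_name(name: str) -> str:
    """Apply ascii_upper to each character of a DOCTYPE name."""
    return "".join(ascii_upper(c) for c in name)
```

Adding 0x0020 instead would map "h" to U+0088 rather than "H", which is why 
the direction of the arithmetic matters.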

> * Why is it marked as being in error at that stage?  It doesn't seem to
>   be necessary because of the last step in the DOCTYPE name state that
>   says:
>   "If the name of the DOCTYPE token is exactly the four letters "HTML",
>    then mark the token as being correct. Otherwise, mark it as being in
>    error."

It's mostly just for the case of an EOF during the DOCTYPE name state.
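
To make that concrete, here is a rough, heavily simplified sketch of the 
states being discussed. It is not the spec algorithm: the names 
DoctypeToken and tokenize_doctype, the error strings, and the whitespace 
set are all assumptions for illustration, and everything after the name is 
ignored. It assumes the token starts out marked as being in error and is 
only marked correct at the end of the name state.

```python
from dataclasses import dataclass, field

# Assumed whitespace set for this sketch only.
WHITESPACE = " \t\n\f\r"


@dataclass
class DoctypeToken:
    name: str = ""
    in_error: bool = True  # marked as being in error when created
    parse_errors: list = field(default_factory=list)


def tokenize_doctype(rest: str) -> DoctypeToken:
    """Tokenise the input that immediately follows '<!DOCTYPE'."""
    token = DoctypeToken()
    i = 0

    # DOCTYPE state: whitespace is expected here. If it is missing,
    # flag a parse error and carry on as if it had been present.
    if i < len(rest) and rest[i] in WHITESPACE:
        i += 1
    else:
        token.parse_errors.append("missing whitespace after DOCTYPE")

    # Skip any further whitespace before the name.
    while i < len(rest) and rest[i] in WHITESPACE:
        i += 1

    # DOCTYPE name state: collect characters, folding a-z to A-Z.
    while True:
        if i >= len(rest):
            # EOF: the final check below never runs, so the token
            # keeps the error mark it was created with.
            token.parse_errors.append("EOF in DOCTYPE")
            return token
        ch = rest[i]
        if ch == ">" or ch in WHITESPACE:
            break
        if "a" <= ch <= "z":
            ch = chr(ord(ch) - 0x0020)
        token.name += ch
        i += 1

    # Last step of the name state: correct only if the name is "HTML".
    token.in_error = token.name != "HTML"
    return token
```

With this sketch, tokenize_doctype("html>") and tokenize_doctype(" html>") 
yield the same name and are both marked correct, but only the first records 
a parse error. tokenize_doctype(" html") hits EOF inside the name state, so 
the token stays marked as being in error even though its name is "HTML".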

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
