[whatwg] Parsing: Tokenisation - DOCTYPE State
Lachlan Hunt
lachlan.hunt at lachy.id.au
Sat Jan 28 22:59:33 PST 2006
Hi,
I believe there are some mistakes in the DOCTYPE state section.
As far as I can tell, both of these DOCTYPEs are considered conformant,
but shouldn't the first be an easy parse error?
<!DOCTYPEhtml>
<!DOCTYPE html>
In the DOCTYPE state, it says:
U+0061 LATIN SMALL LETTER A through to U+007A LATIN SMALL LETTER Z
Create a new DOCTYPE token. Set the token's name to the
uppercase version of the current input character (*add 0x0020
to the character's codepoint*), and mark it as being in error.
Switch to the DOCTYPE name state.
* That should read "[subtract] 0x0020 [from] the character's codepoint"
(This error is repeated in the DOCTYPE name state too.)
* Why is it marked as being in error at that stage? It doesn't seem to
be necessary because of the last step in the DOCTYPE name state that
says:
"If the name of the DOCTYPE token is exactly the four letters "HTML",
then mark the token as being correct. Otherwise, mark it as being in
error."
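
To illustrate both points, here is a rough Python sketch. The function
names (ascii_upper, doctype_name_is_correct) are my own and do not come
from the spec, and this covers only the name handling, not the whole
tokeniser. In ASCII, "h" is 0x68 and "H" is 0x48, so uppercasing has to
subtract 0x0020; adding it gives 0x88, which is not a letter at all.

```python
def ascii_upper(ch: str) -> str:
    """Uppercase a single ASCII lowercase letter by subtracting 0x0020.

    Hypothetical helper for illustration; other characters pass through.
    """
    if "a" <= ch <= "z":
        return chr(ord(ch) - 0x0020)
    return ch


def doctype_name_is_correct(name: str) -> bool:
    """Sketch of the final check in the DOCTYPE name state.

    The token is correct only if its uppercased name is exactly "HTML",
    so an earlier "mark it as being in error" step is overridden here.
    """
    upper = "".join(ascii_upper(c) for c in name)
    return upper == "HTML"
```

Since the correctness flag is decided entirely by that final comparison,
whatever the token was marked as when it was created should not matter.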
--
Lachlan Hunt
http://lachy.id.au/