[whatwg] Tokenizer PseudoCode

Mohammad Al Houssami (Alumni) mha53 at mail.aub.edu
Fri Mar 15 12:12:43 PDT 2013

Hello Everyone,

I just want to confirm that where a branch does not call for a state change, the tokenizer stays in the same state, right?
Take the RCDATA state below. In the "Anything else" branch we emit a character token, then consume the next character and check it against all the cases of this same state again.
This is the only reading that makes sense to me, but I just want to make sure :)

Thanks

RCDATA state
Consume the next input character<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#next-input-character>:
  U+0026 AMPERSAND (&)
    Switch to the character reference in RCDATA state<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#character-reference-in-rcdata-state>.
  U+003C LESS-THAN SIGN (<)
    Switch to the RCDATA less-than sign state<http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#rcdata-less-than-sign-state>.
  U+0000 NULL
    Parse error<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#parse-error>. Emit a U+FFFD REPLACEMENT CHARACTER character token.
  EOF
    Emit an end-of-file token.
  Anything else
    Emit the current input character<http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#current-input-character> as a character token.
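
To make the "no switch means same state" reading concrete, here is a minimal Python sketch of the RCDATA state written as a loop. It is only an illustration under assumptions, not the spec's algorithm verbatim: the names State and run_rcdata, the tuple-shaped tokens, and the use of None to stand for EOF are all made up for this example. The "<" and EOF branches follow the spec's RCDATA state, and the sketch stops as soon as another state is selected rather than implementing those other states.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical names for the three states mentioned in the RCDATA rules.
    RCDATA = auto()
    CHARACTER_REFERENCE_IN_RCDATA = auto()
    RCDATA_LESS_THAN_SIGN = auto()


def run_rcdata(text):
    """Run the RCDATA state over `text` until it switches state or hits EOF.

    Returns (final_state, tokens, parse_error_count).
    """
    state = State.RCDATA
    tokens = []
    parse_errors = 0
    pos = 0
    while state is State.RCDATA:
        # "Consume the next input character"; None stands in for EOF (assumption).
        ch = text[pos] if pos < len(text) else None
        pos += 1
        if ch == "&":
            state = State.CHARACTER_REFERENCE_IN_RCDATA
        elif ch == "<":
            state = State.RCDATA_LESS_THAN_SIGN
        elif ch == "\u0000":
            # Parse error, then emit U+FFFD. No switch, so we stay in RCDATA.
            parse_errors += 1
            tokens.append(("character", "\ufffd"))
        elif ch is None:
            tokens.append(("eof",))
            break  # nothing left to consume
        else:
            # "Anything else": emit the character. There is no "switch to",
            # so `state` is unchanged and the loop re-enters RCDATA.
            tokens.append(("character", ch))
    return state, tokens, parse_errors
```

Calling run_rcdata("ab<") emits character tokens for "a" and "b" while staying in RCDATA the whole time, and only leaves the loop when "<" selects the RCDATA less-than sign state.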
