[whatwg] [WebApps] Parsing: bogus DOCTYPE state
jking at dark-phantasy.com
Mon Jul 17 09:16:15 PDT 2006
The bogus DOCTYPE state consumes all characters until it gets to EOF or a
'>' character. I presume this means that the following DOCTYPE:
<!DOCTYPE html blah "http://some<invalid>URI">
...would finish at the first > and emit character tokens for 'URI">'.
Similarly, I imagine this sequence:
<!DOCTYPE html blah <html lang="en"><head>
...would not produce a start-tag token for 'html'.
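A minimal sketch of this reading, in Python, may make the two cases easier to check. It assumes the bogus DOCTYPE state does nothing but skip ahead to the first '>' (or to EOF); the function name consume_bogus_doctype and the choice of "blah" as the point where the tokenizer enters the state are hypothetical, for illustration only, and are not taken from the spec text.

```python
def consume_bogus_doctype(text, pos):
    """Return the index just past the first '>' at or after pos, or len(text) at EOF."""
    end = text.find(">", pos)
    if end == -1:
        return len(text)  # EOF reached before any '>'
    return end + 1        # resume tokenizing right after the '>'


first = '<!DOCTYPE html blah "http://some<invalid>URI">'
second = '<!DOCTYPE html blah <html lang="en"><head>'

for doc in (first, second):
    entry = doc.index("blah")  # assumed entry point into the bogus state
    rest = doc[consume_bogus_doctype(doc, entry):]
    print(repr(rest))  # what is left for the tokenizer to process afterwards
```

Under these assumptions the first input leaves 'URI">' to be tokenized as character data, and the second leaves only '<head>', so the '<html lang="en">' text is swallowed as part of the DOCTYPE.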
Is this what browsers do, or is this an oversight? Even if it -is- what
browsers do, this behaviour would lead conformance checkers to report the
wrong kinds of errors; I would suggest a more complex parsing of DOCTYPEs.