[whatwg] comment parsing
Ian Hickson
ian at hixie.ch
Sun Jan 22 18:33:05 PST 2006
On Sat, 21 Jan 2006, Anne van Kesteren wrote:
>
> Quoting Anne van Kesteren <fora at annevankesteren.nl>:
> > However, from the specification it is not entirely clear what should happen
> > with <!--></p>.
>
> The specification also does not match what is widely implemented for cases
> like:
>
> # <p><!-- --FAIL></p>
>
> Here is how they are parsed more or less (without EOF and error handling):
>
> zcorpan says:
> ok, so it is parsed like this...
> <! marked section open state
> -- comment open state
> anything except --: stay in comment open state
> -- comment end state
> anything except >: stay in comment end state
> > close comment
In my testing, I found that browsers were less than consistent about this.
For example, this:
<!-- a > -- b > c --> EOF
...in Mozilla in quirks mode, is treated as one long comment, but this:
<!-- a > -- b > c EOF
...is treated as if the comment ended after the "a". Given the security
concerns raised by reparsing (see my last e-mail), we don't want to
specify that kind of reparsing. Safari in quirks mode looked like it might
be implementing the behaviour you describe. I couldn't test Opera; it
raises exceptions in my test script when I use it to test unexpected-EOF
situations.
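
For reference, the state machine quoted above can be written down as a
rough Python sketch. The function name and the return convention are made
up for illustration, and since the quoted description leaves out EOF and
error handling, the sketch just returns None when the input runs out:

```python
from typing import Optional, Tuple


def legacy_comment_span(source: str, start: int = 0) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: locate a comment using the quoted state machine.

    Returns (start, end) with end one past the closing ">", or None when
    no comment starts at `start` or the input ends first (EOF handling is
    not covered by the quoted description).
    """
    # "<!" marked section open state, then "--" comment open state.
    if not source.startswith("<!--", start):
        return None
    # Anything except "--": stay in comment open state.
    dashes = source.find("--", start + 4)
    if dashes == -1:
        return None
    # "--" moves to comment end state; anything except ">" stays there.
    close = source.find(">", dashes + 2)
    if close == -1:
        return None
    # ">" closes the comment.
    return (start, close + 1)
```

With this, legacy_comment_span("<p><!-- --FAIL></p>", 3) gives (3, 15), so
the comment spans "<!-- --FAIL>" and the "</p>" that follows is parsed as
markup again.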
IE6 (in both standards mode and quirks mode) has this interesting
behaviour:
   SOURCE                       DOM
   <!-- a > EOF                 Empty comment.
   <!-- a > - EOF               Text node "<!-- a > -".
   <!-- a > -- EOF              Text node "<!-- a > --".
   <!-- a > --> EOF             Comment " a > ".
   <!-- a > -- > EOF            Empty comment, text node " -- >".
   <!-- a > -- b > EOF          Empty comment, text node " -- b >".
   <!-- a > -- b > c - EOF      Text node " a > -- b > c -".
   <!-- a > -- b > c -- EOF     Text node " a > -- b > c --".
   <!-- a > -- b > c --> EOF    Comment " a > -- b > c".
Per the HTML5 spec as it is currently written, the results should be:
   SOURCE                       DOM
   <!-- a > EOF                 Comment " a >".
   <!-- a > - EOF               Comment " a > -".
   <!-- a > -- EOF              Comment " a > --".
   <!-- a > --> EOF             Comment " a > ".
   <!-- a > -- > EOF            Comment " a > -- >".
   <!-- a > -- b > EOF          Comment " a > -- b >".
   <!-- a > -- b > c - EOF      Comment " a > -- b > c -".
   <!-- a > -- b > c -- EOF     Comment " a > -- b > c --".
   <!-- a > -- b > c --> EOF    Comment " a > -- b > c ".
This seems like the most logical lowest-common-denominator way of
describing comment parsing.
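
The rule behind the HTML5 table above boils down to: the comment's data is
everything after "<!--" up to the first "-->", or up to EOF if no "-->"
ever turns up. Here is a minimal Python sketch of just that rule. The
function name is made up for illustration; this is not the spec's
tokenizer algorithm, and it does not try to settle inputs that aren't in
the table, such as <!-->.

```python
def html5_comment_data(source: str) -> str:
    """Hypothetical helper: comment data for input that begins with "<!--".

    The data runs to the first "-->", or to the end of the input (EOF)
    when the comment is never closed. Nothing is reparsed.
    """
    if not source.startswith("<!--"):
        raise ValueError("input does not start a comment")
    body = source[4:]
    end = body.find("-->")
    # No terminator: the rest of the input is the comment.
    return body if end == -1 else body[:end]
```

For example, html5_comment_data("<!-- a > -- b >") returns " a > -- b >",
and html5_comment_data("<!-- a > -- b > c -->") returns " a > -- b > c ".
The other rows of the table come out the same way.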
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'