[whatwg] comment parsing
Ian Hickson
ian at hixie.ch
Sun Jan 22 18:15:19 PST 2006
On Sat, 21 Jan 2006, Anne van Kesteren wrote:
>
> Given the new parsing rules for comments (all those internal discussions...) I
> was trying to write some testcases for how they are defined now.
>
> # <p><!-- -- -->PASS<!--></p>
>
> However, from the specification it is not entirely clear what should
> happen with <!--></p>. Well, perhaps it is, but then I'd like that to be
> changed. If we take the problematic snippet:
>
> # <!--></p>
>
> It seems that per
> <http://whatwg.org/specs/web-apps/current-work/#marked> "<!--" starts
> the comment. It seems that per
> <http://whatwg.org/specs/web-apps/current-work/#comment> all characters
> that follow and are not a dash have to become part of the comment. Is
> that correct?
Yes. The </p> is part of the comment.
> So if I would modify the testcase to say:
>
> # <p><!-- -- -->PASS<!--></p>FAIL
>
> And directly after "FAIL" it is EOF (or a few end tags later) it would never
> show up, right?
Correct.
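
The behaviour confirmed above can be sketched in a few lines of Python. This is a deliberately simplified model, not the specification's tokenizer: it assumes a comment opens at "<!--", closes at the first "-->" found after the opener, and runs to EOF when no closer exists. The function name strip_comments is hypothetical.

```python
def strip_comments(markup: str) -> str:
    """Return markup with comments removed (simplified model).

    Assumption: a comment opens at "<!--" and closes at the first
    "-->" found after the opener; with no closer, it runs to EOF.
    """
    out = []
    pos = 0
    while True:
        start = markup.find("<!--", pos)
        if start == -1:
            # No further comments: keep the rest of the input.
            out.append(markup[pos:])
            break
        out.append(markup[pos:start])
        end = markup.find("-->", start + 4)
        if end == -1:
            # Unterminated comment: everything up to EOF is swallowed.
            break
        pos = end + 3
    return "".join(out)


print(strip_comments("<p><!-- -- -->PASS<!--></p>FAIL"))  # <p>PASS
```

Under this model the trailing "</p>FAIL" never reaches the document, which matches the reading of the testcase given above.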
> Given that most browsers show "FAIL" or "<!-->FAIL" for:
>
> # <p><!-->FAIL</p>
>
> A change might be in order. Or perhaps someone explaining to me what I
> did wrong when reading the specification.
Your reading is correct.
The reason the spec doesn't say that you re-parse if you hit EOF with an
open comment is that re-parsing would be a security risk.
Imagine that the page contains the following:
...
<!--
<script> hostileScript(); </script>
-->
...
...where "hostileScript()" is some script that does something bad.
A DoS attack on the server could cut the transmission short, so that the
text received is:
...
<!--
<script> hostileScript(); </script>
...which, if we re-parse the content upon hitting EOF with an open
comment, would cause the script to be executed.
This scenario could show itself any time that a blog entry system allows
users to enter comments, for instance.
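
The risk can be illustrated with a small Python sketch that compares the two strategies on a truncated response. It is a simplified model under stated assumptions, not any browser's actual parser: "re-parsing" is modelled as treating the unterminated "<!--" as literal text and continuing with the rest of the input as markup. The names FULL, TRUNCATED, visible_markup and has_live_script are all hypothetical.

```python
# The page as the author wrote it: the script sits inside a comment.
FULL = "<p>Post</p>\n<!--\n<script> hostileScript(); </script>\n-->\n<p>End</p>"

# The same page cut off before the closing "-->" arrives.
TRUNCATED = FULL[: FULL.index("-->")]


def visible_markup(markup: str, reparse_on_eof: bool) -> str:
    """Return the markup that survives comment removal (simplified model)."""
    out = []
    pos = 0
    while True:
        start = markup.find("<!--", pos)
        if start == -1:
            out.append(markup[pos:])
            break
        out.append(markup[pos:start])
        end = markup.find("-->", start + 4)
        if end == -1:
            if reparse_on_eof:
                # Assumed re-parse model: emit "<!--" as text and carry on
                # parsing what follows as ordinary markup.
                out.append("<!--")
                pos = start + 4
                continue
            # No re-parse: the open comment swallows everything to EOF.
            break
        pos = end + 3
    return "".join(out)


def has_live_script(markup: str, reparse_on_eof: bool) -> bool:
    """True if a <script> start tag ends up outside any comment."""
    return "<script>" in visible_markup(markup, reparse_on_eof)


print(has_live_script(FULL, reparse_on_eof=True))        # False
print(has_live_script(TRUNCATED, reparse_on_eof=False))  # False
print(has_live_script(TRUNCATED, reparse_on_eof=True))   # True
```

Only the combination of a truncated response and re-parsing on EOF lets the script out of the comment.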
(Thanks to Jesse Ruderman for pointing this out.)
(I could be convinced that <!--> should be a full comment -- allowing the
<!-- and --> parts to overlap -- if it could be shown that UAs implement
this behaviour separately from their implementing <!--EOF as reparsing.)
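
One possible reading of "overlap" can be sketched as follows. This is an assumption about what such a rule might look like, not wording from the spec: the search for "-->" is allowed to begin at the two dashes of the opener instead of after them. The function name comment_end and the allow_overlap flag are hypothetical.

```python
def comment_end(markup: str, start: int, allow_overlap: bool) -> int:
    """Return the index just past the comment opening at `start`.

    Returns -1 if the comment is unterminated. With allow_overlap,
    the dashes of "<!--" may double as the dashes of "-->".
    """
    search_from = start + 2 if allow_overlap else start + 4
    end = markup.find("-->", search_from)
    return -1 if end == -1 else end + 3


print(comment_end("<!-->FAIL", 0, allow_overlap=True))   # 5
print(comment_end("<!-->FAIL", 0, allow_overlap=False))  # -1
```

With overlap allowed, "<!-->" is a complete, empty comment and "FAIL" remains visible; without it, the comment is unterminated and runs to EOF.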
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'