[whatwg] Another issue in parsing tokens in foreign content

Ian Hickson ian at hixie.ch
Wed Jul 31 14:05:57 PDT 2013

On Thu, 4 Jul 2013, Michael Day wrote:
> > 
> > The problem is that we can't do (2) in _all_ cases, e.g. innerHTML on 
> > an <svg> can't possibly break out of the <svg> if it sees one of these 
> > tags, since that's the "root" of what is being parsed.
>
> Yes, HTML has already lost the composability of parsing that XML and 
> other languages have, that's long gone. But that doesn't mean we should 
> try to make it even more irregular :)
>
> Currently Firefox, Chrome, and Prince all treat the fragment case the 
> same as the whole document case, so we already have interoperable 
> behaviour on this issue.

If you treated them the same, you would either crash or have an infinite 
loop, because you'd either pop the root element off the stack and then try 
to append something to null, or you'd try to reprocess the token without 
having popped anything first.
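
To make the two failure modes concrete, here is a minimal sketch in Python. It is a toy model, not the spec's algorithm: the stack of open elements is a plain list, and the names Element, break_out, insertion_parent and reprocess_without_popping are hypothetical. It assumes, as described above, that the <svg> is the root of what is being parsed in the fragment case.

```python
from dataclasses import dataclass

HTML, SVG = "html", "svg"


@dataclass(frozen=True)
class Element:
    name: str
    namespace: str


def break_out(stack):
    """Pop foreign elements until the current node is an HTML element.

    Returns a new list; the caller's stack is left untouched.
    """
    stack = list(stack)
    while stack and stack[-1].namespace != HTML:
        stack.pop()
    return stack


def insertion_parent(stack):
    """Failure mode 1: popping the root leaves nothing to append to."""
    remaining = break_out(stack)
    if not remaining:
        raise LookupError("stack of open elements is empty")
    return remaining[-1]


def reprocess_without_popping(stack, max_steps=1000):
    """Failure mode 2: refuse to pop the root, then reprocess the token.

    The current node stays foreign, so the same rule fires again and
    again. max_steps is an assumed guard so the sketch terminates.
    """
    steps = 0
    while stack[-1].namespace != HTML:
        if len(stack) > 1:
            stack = stack[:-1]
        steps += 1
        if steps >= max_steps:
            return None  # gave up: no HTML element was ever reached
    return stack[-1]


# Whole-document case: <html><body><svg><g>, then a <p> start tag.
document_stack = [Element("html", HTML), Element("body", HTML),
                  Element("svg", SVG), Element("g", SVG)]

# Fragment case as modelled here: the <svg> is the root.
fragment_stack = [Element("svg", SVG), Element("g", SVG)]

print(insertion_parent(document_stack).name)  # body
```

In this model the document case ends with <body> as the insertion parent, so the <p> becomes a sibling of the <svg>. With fragment_stack, insertion_parent raises LookupError and reprocess_without_popping returns None after exhausting its step limit.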

There has to be _some_ special casing of <svg>.innerHTML.

What should the special casing be? Consider this case:

   <svg>.innerHTML = '<g><p>'

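I can see two possible options. Either the <p> becomes a child of the <g>:

    +-- g
        +-- P

...or it becomes a sibling of the <g>, as a direct child of the <svg>:

    +-- g
    +-- P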

Neither is what happens in the non-fragment case (in that case the <p> is 
a sibling of the <svg>).
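
The two options can be sketched as two policies in a toy tree builder. This is an illustration only: Node, parse_fragment, the FOREIGN set and the policy names "current" and "root" are all hypothetical, and the builder handles nothing but unclosed start tags.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    def dump(self, depth=0):
        """Return the subtree as a list of indented lines."""
        lines = ["  " * depth + self.name]
        for child in self.children:
            lines.extend(child.dump(depth + 1))
        return lines


# Toy list of SVG element names; anything else counts as an HTML tag.
FOREIGN = {"svg", "g"}


def parse_fragment(tags, policy):
    """Build a tree for start tags parsed with an <svg> as the context.

    policy "current": a stray HTML tag is appended to the current node.
    policy "root":    the stack is first popped back to the context element.
    """
    if policy not in ("current", "root"):
        raise ValueError("unknown policy: " + policy)
    root = Node("svg (context)")
    stack = [root]
    for tag in tags:
        if tag not in FOREIGN and policy == "root":
            del stack[1:]  # keep only the context element
        node = Node(tag)
        stack[-1].children.append(node)
        stack.append(node)
    return root


option_1 = parse_fragment(["g", "p"], "current")  # p nested inside g
option_2 = parse_fragment(["g", "p"], "root")     # p next to g
print("\n".join(option_1.dump()))
print("\n".join(option_2.dump()))
```

Neither policy can reproduce the whole-document result, because there is no parent of the context element to attach the <p> to.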

Consider this case:

   <svg>.innerHTML = '<g><svg><g><p>'

Here, the <P> node could be a child of the innermost <g>, the innermost 
<svg>, the outermost <g>, or the outermost <svg>. I could see arguments 
for all those cases. It seems unlikely that the author meant any of them.
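
Put differently, the four candidates are simply every element that is still open when the <p> arrives. A short sketch, assuming every start tag in the string is left unclosed (candidate_parents is a hypothetical helper, and depth 0 is the context element):

```python
def candidate_parents(context, start_tags):
    """List the open elements, innermost first, that could adopt the <p>."""
    stack = [context] + list(start_tags)
    return [
        f"{name} (depth {depth})"
        for depth, name in reversed(list(enumerate(stack)))
    ]


# <svg>.innerHTML = '<g><svg><g><p>': open elements when <p> is seen.
candidates = candidate_parents("svg", ["g", "svg", "g"])
print(candidates)
```

The list comes out in the same order as above: innermost <g>, innermost <svg>, outermost <g>, outermost <svg>.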

> Since the HTML spec is supposed to reflect reality, it seems pointless 
> to deliberately introduce an inconsistency in the parsing model that 
> requires changes in all user agents to implement.

All the user agents (or at least, all the browsers I could test) have to 
change anyway. Blink-based browsers and WebKit-based browsers don't 
support innerHTML on <svg> at all. Firefox supports innerHTML on <svg> but 
puts all the nodes in the HTML namespace.

In conclusion, I simply removed the quirk from fragment parsing rather 
than trying to make it work because:

 - all browsers will have to change anyway,

 - the quirk needs special handling in the fragment case anyway,

 - it's not clear what the behaviour should be,

 - in many cases, we're not error-correcting in a useful way anyway.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
