[imps] Tree construction question

Rob Jellinghaus rjelling at microsoft.com
Fri Feb 5 14:03:02 PST 2010


This list seems pretty quiet now; if there is a better list / forum to ask this, please advise.  (Also, to avoid misinterpretation, I am not on the IE team and do not speak for the IE team's plans.)

Section 9.2.5.10 of the latest version of the spec contains this text under the rule for the "a" start tag:

In the non-conforming stream <a href="a">a<table><a href="b">b</table>x, the first a element would be closed upon seeing the second one, and the "x" character would be inside a link to "b", not to "a". This is despite the fact that the outer a element is not in table scope (meaning that a regular </a> end tag at the start of the table wouldn't close the outer a element).

This is consistent with the behavior when feeding this example to http://james.html5.org/parsetree.html which outputs:

|html
  |head
  |body
    |a href="a"
      |#text: a
      |a href="b"
        |#text: b
      |table
    |a href="b"
      |#text: x

My personal implementation delivers identical results, and my reading of the spec is that this is compliant output given the definition of tree construction -- the second <a> tag gets foster parented under the first <a> tag, because the first <a> tag is selected as the foster parent element when the second <a> tag is encountered.

Of course, it is not actually HTML-compliant output, because it contains nested <a> elements, which are invalid.

So it appears that the HTML5 tree construction algorithm produces invalid HTML for this non-compliant input.  IE's result is even worse, but Firefox's seems closer to correct:

|html
  |head
  |body
    |a href="a"
      |#text: a
    |a href="b"
      |#text: b
    |table
    |#text: x
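
For anyone who wants to check the two dumps above mechanically, here is a minimal sketch in Python. It assumes the two-spaces-per-level "|name" dump format shown in this message; the names Node, parse_dump and has_nested_anchor are made up for illustration and are not part of any parser library. It only inspects the dumps, it does not implement tree construction.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One line of a tree dump: an element name or '#text'."""
    name: str
    children: List["Node"] = field(default_factory=list)


def parse_dump(text: str) -> Optional[Node]:
    """Rebuild a tree from a dump indented two spaces per level (assumed format)."""
    root = None
    stack = []  # (depth, node) pairs for the currently open ancestors
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // 2
        label = line.strip().lstrip("|")
        # Keep only the node name: drop attributes and the ':' after '#text'.
        node = Node(label.split()[0].rstrip(":"))
        # Close every open node that is not an ancestor of this line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        if stack:
            stack[-1][1].children.append(node)
        else:
            root = node
        stack.append((depth, node))
    return root


def has_nested_anchor(node: Node, inside_anchor: bool = False) -> bool:
    """Return True if any 'a' node has another 'a' node as an ancestor."""
    if node.name == "a":
        if inside_anchor:
            return True
        inside_anchor = True
    return any(has_nested_anchor(child, inside_anchor) for child in node.children)


SPEC_DUMP = """
|html
  |head
  |body
    |a href="a"
      |#text: a
      |a href="b"
        |#text: b
      |table
    |a href="b"
      |#text: x
"""

FIREFOX_DUMP = """
|html
  |head
  |body
    |a href="a"
      |#text: a
    |a href="b"
      |#text: b
    |table
    |#text: x
"""

spec_tree = parse_dump(SPEC_DUMP)
firefox_tree = parse_dump(FIREFOX_DUMP)

print(has_nested_anchor(spec_tree))     # True: a href="b" sits inside a href="a"
print(has_nested_anchor(firefox_tree))  # False: the two a elements are siblings
```

The check only looks for an a element with an a ancestor, which is the one validity problem raised here; it says nothing about other content-model rules.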

Is this known?  Already planned to be fixed?  Or is this a known bug that will ship?

My apologies if this example has already been discussed; pointers on finding that earlier discussion will help me learn to fish.

Thanks,
Rob Jellinghaus
Microsoft Technical Strategy Incubation