[Imps] liberal XML and implied end tags

James Graham jg307 at cam.ac.uk
Mon Mar 12 05:20:13 PDT 2007


Sam Ruby wrote:
> Inside src/liberalxmlparser.py, I see:
> 
>>              if node.name == name:
>>                  #XXX Something is wrong here... The next (commented) line is
>>                  #html-only
>>                  #self.tree.generateImpliedEndTags()

(Insofar as this is a html5lib-specific issue it should probably be on our 
mailing list. I'm setting followup-to html5lib-discuss at googlegroups.com)

> Problem #1: if I uncomment out that line, no tests fail.  What's up with 
> that?  If I need to make a fix that involves restoring that line, how 
> will I know what that breaks?

You're right, I should have added a test. I will do that ASAP (this evening GMT).

> Problem #2: the functionality that was supposed to be enabled by that 
> logic can be expressed by the following addition to BasicXmlTest:
> 
>>     def test_mismatch(self):
>>       self.assertXmlEquals("<x><y>foo</x>bar","<x><y>foo</y></x>bar")
> 
> Unfortunately, that test doesn't pass, with or without the line in 
> question; my first inclination is to restore the commented out line and 
> then debug why it doesn't work, but I'm reluctant to do so without an 
> understanding of what problem was solved by commenting out the line in 
> the first place.

The "problem" was that parsing the WHATWG Blog feed crashed the parser because 
generateImpliedEndTags was removing a <p> element from the stack of open 
elements so that the later loop:

while self.tree.openElements.pop() != node:
     pass

didn't find a match. In general, however, the problem is that I don't see how 
generateImpliedEndTags, which looks for HTML-specific elements to close, can 
possibly be right here.
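
To make the failure concrete, here is a minimal sketch of what I believe was 
happening. It is not the html5lib code: Node, IMPLIED_END_TAGS, close_element 
and demo are hypothetical stand-ins, and the element names in the set and on 
the stack are chosen only for illustration. The point is that once the matched 
node has already been popped, the loop never sees it and pops until the list 
is empty.

```python
class Node:
    """Hypothetical stand-in for a tree-builder element node."""
    def __init__(self, name):
        self.name = name

# Illustrative subset of HTML elements whose end tags may be implied.
IMPLIED_END_TAGS = {"dd", "dt", "li", "p", "td", "th", "tr"}

def generate_implied_end_tags(open_elements):
    # Pop HTML-specific elements off the top of the stack.
    while open_elements and open_elements[-1].name in IMPLIED_END_TAGS:
        open_elements.pop()

def close_element(open_elements, name, imply_end_tags):
    # Find the nearest open element with a matching name, then pop down to it.
    for node in reversed(open_elements):
        if node.name == name:
            if imply_end_tags:
                generate_implied_end_tags(open_elements)
            # If `node` was already popped above, this never finds it and
            # eventually pops from an empty list.
            while open_elements.pop() != node:
                pass
            return

def demo(imply_end_tags):
    stack = [Node("feed"), Node("entry"), Node("p")]
    try:
        close_element(stack, "p", imply_end_tags)
    except IndexError:
        return "crash"
    return [n.name for n in stack]
```

In this sketch, demo(True) ends in an IndexError from popping an empty list, 
while demo(False) leaves feed and entry on the stack as expected.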

-- 
"Eternity's a terrible thought. I mean, where's it all going to end?"
  -- Tom Stoppard, Rosencrantz and Guildenstern are Dead


