[Imps] Test cases for parsing spec

James Graham jg307 at cam.ac.uk
Wed Dec 6 10:17:34 PST 2006


Sam Ruby wrote:
> Ian Hickson wrote:
>> On Wed, 6 Dec 2006, Sam Ruby wrote:
>>> That being said, if Ian (or somebody) can come up with a small seed of 
>>> test cases, I will try to convert them into a usable form and see if I 
>>> can get html5lib working with it.
>> I have a bunch of tests here, I just need a format to output the tests 
>> into. It would take me a few minutes at most, once someone has defined 
>> the exact format for the tests. They're not currently in a usable form 
>> (outside Google, anyway).
> 
> Negotiable.  What I work with now is:
> 
> Line 1: "<!--"
> Line 2: "Description: [1]"
> Line 3: "Expect:      [2]"
> Line 4: "-->"
> Line 5+: HTML
> 
> where [1] is human readable, and [2] is computer readable.  [2] will 
> likely need to be adapted anyway, so don't worry too much.  Something 
> language neutral or xpath-ish would be ideal.
> 
> Example:
> 
> <!--
> Description: extraneous quotes
> Expect:      html/body/a[@title="foo"]
> -->
> <html><body><a href="#"" title="foo"></body></html>

Something like that looks ideal for the parser/treebuilder. If we want to test 
the tokeniser separately, it might be good to just list the expected tokens in 
order, each with its type and properties:

<!--
Description:
Expect:
StartTag html
StartTag body
StartTag a {'href':'#', 'title':'foo'}
EndTag body
EndTag html
-->
<html><body><a href="#"" title="foo"></body></html>

This means a multi-line Expect field and might be overcomplicated (ideally one 
could define a single test and check the output from both phases). What do you 
think?
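
A minimal sketch of how a harness might read this format, covering both the 
single-line Expect (tree path) and the multi-line Expect (token list). The names 
parse_test and parse_token are hypothetical and are not part of html5lib; the 
sketch also assumes the attribute dictionary is written as a Python literal, as 
in the example above.

```python
import ast


def parse_test(text):
    """Split one test into (description, expect_lines, html).

    Assumes the layout discussed in this thread: a leading "<!--" line,
    "Description:" and "Expect:" fields, a closing "-->" line, then HTML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "<!--":
        raise ValueError("test must start with '<!--'")
    try:
        end = next(i for i, line in enumerate(lines) if line.strip() == "-->")
    except StopIteration:
        raise ValueError("test header is missing the closing '-->'")

    description = ""
    expect = []
    field = None
    for line in lines[1:end]:
        if line.startswith("Description:"):
            description = line[len("Description:"):].strip()
            field = "description"
        elif line.startswith("Expect:"):
            first = line[len("Expect:"):].strip()
            expect = [first] if first else []
            field = "expect"
        elif field == "expect" and line.strip():
            # Continuation line of a multi-line Expect field.
            expect.append(line.strip())

    html = "\n".join(lines[end + 1:])
    return description, expect, html


def parse_token(line):
    """Turn "StartTag a {'href':'#'}" into ("StartTag", "a", {"href": "#"})."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<TokenType> <name> [attributes]'")
    kind, name = parts[0], parts[1]
    # Assumption: attributes, when present, are a Python dict literal.
    attrs = ast.literal_eval(parts[2]) if len(parts) == 3 else {}
    return kind, name, attrs
```

A harness could call parse_test on each file and then either compare the 
Expect line against the tree, or map parse_token over the Expect lines and 
compare the result with the tokeniser's output.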

-- 
"Eternity's a terrible thought. I mean, where's it all going to end?"
  -- Tom Stoppard, Rosencrantz and Guildenstern are Dead


