[whatwg] Test cases for parsing spec (Was: Re: Providing Better Tools)

Anne van Kesteren annevk at opera.com
Wed Dec 6 06:19:48 PST 2006


On Wed, 06 Dec 2006 15:13:26 +0100, Sam Ruby <rubys at intertwingly.net>  
wrote:
> Count me in.  This is actually closer to the reason why I originally
> subscribed to this list.  If given a few tests, I could convert them
> into a useful form, and this form could serve as a model for future
> tests.
>
> My original interest was to write a replacement for Python's sgmllib,
> i.e., one that was not based on the theoretical ideal of how SGML
> vocabularies work, but one based on the practical notion of how HTML
> is actually parsed.

The HTMLTokenizer for such a project is mostly finished already:

   http://code.google.com/p/html5lib/

(As in, it actually emits the tokens it has to. I'm quite happy about it!)

James Graham has been working on the Tree Construction part of the process  
(called HTMLParser in parser.py) and Lachlan Hunt is working on an  
HTMLInputStream class which handles some of the specifics needed for the  
input stream.
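
As a rough illustration of how the three pieces fit together (input
stream, then tokenizer, then tree construction), here is a minimal toy
sketch. The names ToyInputStream, ToyTokenizer and ToyTreeBuilder, and
all of their methods, are hypothetical and are not html5lib's actual
API. The tokenizer only understands attribute-less start tags, end tags
and text, so it is nowhere near what the spec requires.

```python
import re


class ToyInputStream:
    """Hypothetical stand-in for an input stream class."""

    def __init__(self, source):
        # Normalize CRLF and lone CR to LF before tokenizing.
        self.data = source.replace("\r\n", "\n").replace("\r", "\n")


class ToyTokenizer:
    """Hypothetical tokenizer: yields (type, value) tuples."""

    # Either a bare start/end tag, or a run of text without "<".
    TOKEN_RE = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)")

    def __init__(self, stream):
        self.stream = stream

    def __iter__(self):
        for match in self.TOKEN_RE.finditer(self.stream.data):
            closing, name, text = match.groups()
            if text is not None:
                yield ("Characters", text)
            elif closing:
                yield ("EndTag", name.lower())
            else:
                yield ("StartTag", name.lower())


class ToyTreeBuilder:
    """Hypothetical tree construction stage: consumes tokens, builds dicts."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    def parse(self):
        root = {"name": "#document", "children": []}
        stack = [root]  # stack of open elements
        for kind, value in self.tokenizer:
            if kind == "StartTag":
                node = {"name": value, "children": []}
                stack[-1]["children"].append(node)
                stack.append(node)
            elif kind == "EndTag":
                # Close back to the nearest matching open element;
                # an end tag with no match is ignored.
                for i in range(len(stack) - 1, 0, -1):
                    if stack[i]["name"] == value:
                        del stack[i:]
                        break
            else:
                stack[-1]["children"].append(value)
        return root


if __name__ == "__main__":
    stream = ToyInputStream("<p>Hi <b>there</p>")
    print(ToyTreeBuilder(ToyTokenizer(stream)).parse())
```

The point of the split is that each stage can be tested on its own:
the tokenizer against expected token lists, the tree builder against
expected trees.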


-- 
Anne van Kesteren
<http://annevankesteren.nl/>
<http://www.opera.com/>


