[Imps] True DOM TreeBuilder
rubys at intertwingly.net
Wed Jan 10 02:36:40 PST 2007
I just committed a minidom.getDOMImplementation() based TreeBuilder to
1) I had to monkey patch minidom in order to get text nodes that are
immediate children of the document node to work.
2) Based on how html5 is spec'ed, the doctype name becomes "HTML" rather
than the "html" you would expect in an XML DOM.
3) This implementation is not namespace aware, nor are the elements
placed in the XHTML namespace.
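
The three points above can be reproduced with stock minidom. This is only a sketch: it does not use the committed TreeBuilder or the monkey patch, and the doctype name "HTML" is passed in by hand here to stand in for what the parser would supply.

```python
from xml.dom import minidom, HierarchyRequestErr

impl = minidom.getDOMImplementation()

# Point 2: minidom stores the doctype name exactly as given, so "HTML"
# stays uppercase (the name is supplied by hand in this sketch).
doctype = impl.createDocumentType("HTML", None, None)

# Point 3: passing None as the namespace URI leaves the root element
# outside any namespace, XHTML included.
doc = impl.createDocument(None, "html", doctype)

# Point 1: unpatched minidom refuses a text node directly under the document.
try:
    doc.appendChild(doc.createTextNode("stray text"))
    text_rejected = False
except HierarchyRequestErr:
    text_rejected = True

print(doc.doctype.name)
print(doc.documentElement.namespaceURI)
print(text_rejected)
```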
http://code.google.com/p/html5lib/ is purportedly XHTML 1.0 Strict, but
is served as text/html and contains such dubious constructs as
"<div id=gaia>". You can obtain a cleaned-up version of this page
after a side trip through the DOM via:
$ python parse.py -b dom -x http://code.google.com/p/html5lib/
In particular, note what the DOM's default "toxml()" method does to the
script near the end of this page.
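
The message does not spell out what toxml() does to the script, so the following is a guess at the kind of effect meant, not a reproduction of that page. Two minidom behaviours are easy to show: an empty element is written in self-closed form, and "<" and "&" inside text are escaped as XML entities. The script source name and body below are made up for illustration.

```python
from xml.dom import minidom

impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "html", None)

# An external script with no content: minidom writes it self-closed,
# which a text/html parser does not treat as a closed script element.
external = doc.createElement("script")
external.setAttribute("src", "example.js")  # hypothetical file name
doc.documentElement.appendChild(external)

# An inline script: "<" and "&" in the text node are escaped on output.
inline = doc.createElement("script")
inline.appendChild(doc.createTextNode("if (a < b && c) { run(); }"))
doc.documentElement.appendChild(inline)

external_xml = external.toxml()
inline_xml = inline.toxml()
print(external_xml)
print(inline_xml)
```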
- Sam Ruby