[Imps] Liberal XML parsing
James Graham
jg307 at cam.ac.uk
Mon Jan 8 09:41:41 PST 2007
Sam Ruby wrote:
> I've provided one way: by refactoring it so that all the lowercasing of
> element names is done in exactly one place, and that the lowercasing of
> attribute names is also done in exactly one place. That class can be
> subclassed to provide a different behavior.
That sounds fine to me. We need to add some Unicode tests, though, to be sure
we're not lowercasing where we shouldn't be.
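
As a rough illustration of the idea above, here is a minimal sketch of lowercasing done in exactly one place, with a subclass that changes the behaviour, plus the kind of non-ASCII input a Unicode test would want to cover. All the names here (ascii_lower, NameNormalizer, CasePreservingNormalizer) are hypothetical and are not html5lib's actual API; the sketch assumes Python 3 strings and assumes that only ASCII A-Z should be folded.

```python
# Hypothetical sketch; none of these names come from html5lib itself.
import string

# Translation table that maps only ASCII A-Z to a-z.
ASCII_UPPER_TO_LOWER = {ord(c): ord(c.lower()) for c in string.ascii_uppercase}


def ascii_lower(name):
    """Lowercase ASCII letters only, leaving every other character alone."""
    return name.translate(ASCII_UPPER_TO_LOWER)


class NameNormalizer:
    """The single place where element and attribute names are lowercased."""

    def element_name(self, name):
        return ascii_lower(name)

    def attribute_name(self, name):
        return ascii_lower(name)


class CasePreservingNormalizer(NameNormalizer):
    """Subclass for liberal XML parsing, where names are case-sensitive."""

    def element_name(self, name):
        return name

    def attribute_name(self, name):
        return name


# Inputs a Unicode test might use: str.lower() changes these characters,
# while the ASCII-only version leaves them untouched.
KELVIN_SIGN = "\u212a"       # str.lower() turns this into a plain "k"
DOTTED_CAPITAL_I = "\u0130"  # str.lower() turns this into two code points
```

The point of the two constants is that a test comparing ascii_lower(name) with name.lower() on them would catch any code path that still calls the generic Unicode lowercasing.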
> I'm in no particular rush, but if after a few days it turns out that
> people are OK with something *like* this going into the html5lib
> repository, I'd love to put it in there -- at which point it would be
> free to evolve, be renamed, refactored, and enhanced. One thing I would
> love to work on is a true DOM builder (at which point, I could throw
> away my XMLDocument, XMLElement, and XMLComment classes), but I would
> need changes to TreeBuilder so that I could provide my own Text class
> (for example).
FWIW I consider supporting one of the Python DOM implementations a priority for
the 0.3 release of html5lib (of course we need to release 0.2 first -- at this
point that is basically a case of uploading the source archive). Using the
current treebuilder interface, it should be possible to support DOM-like text
nodes without any changes, but it's non-trivial, so maybe the current interface
is in need of improvement (the problem is that we also need to support
ElementTree, which regards text as attributes of elements rather than as nodes).
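
To make the difference concrete, here is a small sketch using the standard library's xml.dom.minidom and xml.etree.ElementTree on the same markup. It only shows how the two models store text; it says nothing about how the html5lib treebuilder is or should be structured, and the sample markup is made up for illustration.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET

# Made-up sample markup with text before, inside, and after a child element.
source = "<p>Hello <b>bold</b> world</p>"

# DOM: text lives in separate Text node objects among the element's children.
dom_p = minidom.parseString(source).documentElement
dom_children = [(node.nodeName, node.nodeValue) for node in dom_p.childNodes]
# dom_children == [('#text', 'Hello '), ('b', None), ('#text', ' world')]

# ElementTree: there are no text nodes. Text before the first child is the
# parent's .text, and text following a child is that child's .tail.
et_p = ET.fromstring(source)
et_b = et_p[0]
et_view = (et_p.text, et_b.text, et_b.tail)
# et_view == ('Hello ', 'bold', ' world')
```

So a builder that inserts a text node "after the b element" has a direct equivalent in the DOM, but in ElementTree the same operation means appending to the tail attribute of b, which is why one interface serving both is awkward.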
--
"Eternity's a terrible thought. I mean, where's it all going to end?"
-- Tom Stoppard, Rosencrantz and Guildenstern are Dead