[whatwg] Drop-in parsers (was: Re: Provding Better Tools)
hsivonen at iki.fi
Tue Dec 5 16:25:22 PST 2006
On Dec 5, 2006, at 16:07, Thomas Broyer wrote:
> 2006/12/5, Mike Schinkel:
>> >> I've just started (today) a .NET implementation (in C#):
>> >> a parser as an XmlReader subclass and writers as XmlWriter
>> >> and HtmlTextWriter subclasses.
Cool! I think making the HTML5 implementations drop-in replacements
for the normal XmlReader and XmlWriter implementations is an
excellent approach. (However, some parts of the prescribed error
correction may not be possible with a truly streaming XmlReader. For
full flexibility and correctness, it would be necessary to provide both
a true streaming mode that treats streaming-incompatible errors as
Draconian fatal errors and a tree-buffering fake streaming mode that
handles those errors in the buffered tree.)
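To make the two modes concrete, here is a rough toy sketch in Python. It is not taken from any real parser. The token tuples, the "late_attrs" token kind and both function names are made up for illustration. "late_attrs" stands in for any error-correction step that has to amend an element that has already been reported, such as a stray second html start tag whose attributes get merged onto the root element.

```python
class StreamingIncompatibleError(Exception):
    """Fatal error: the fix would have to amend an already-reported event."""


def true_streaming(tokens):
    """Report each event at once; fail hard on streaming-incompatible fixes."""
    for token in tokens:
        if token[0] == "late_attrs":
            # The start event for this element has already left the parser,
            # so the only honest option in this mode is a fatal error.
            raise StreamingIncompatibleError(
                "cannot amend <%s>: already reported" % token[1]
            )
        yield token


def buffered_fake_streaming(tokens):
    """Buffer everything, apply late fixes in the buffer, then replay."""
    events = []
    start_index = {}  # element name -> index of its first start event
    for token in tokens:
        if token[0] == "start":
            _, name, attrs = token
            start_index.setdefault(name, len(events))
            events.append(("start", name, dict(attrs)))
        elif token[0] == "late_attrs":
            # Toy assumption: the named element was opened earlier.
            _, name, attrs = token
            target = events[start_index[name]][2]
            for key, value in attrs.items():
                target.setdefault(key, value)  # existing attributes win
        else:
            events.append(token)
    # Nothing is reported until the whole input has been consumed.
    yield from events


# Hypothetical input: the third token needs to change the first event.
TOKENS = [
    ("start", "html", {}),
    ("start", "body", {}),
    ("late_attrs", "html", {"lang": "en"}),
    ("end", "body"),
    ("end", "html"),
]

if __name__ == "__main__":
    for event in buffered_fake_streaming(TOKENS):
        print(event)
```

The consumer sees the same kind of event sequence in both modes. The difference is only whether the late fix surfaces as a fatal error or is silently folded into the buffered events before anything is reported.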
Hopefully, my conformance checker efforts will, as a side effect,
produce a parser written in Java that can be extended to cover
general Java needs as a drop-in SAX/DOM/XOM-compatible parser. (The
conformance checker only needs a true streaming SAX parser with the
streaming-incompatible errors treated as fatal. I have a design
beyond the conformance checking needs in my head, but I have many
other competing action items to attend to, so please consider this
vaporware. I can't promise anything.)
In general, I think HTML5 parser implementations should target the
most important XML APIs for a given language. For Python, this would
likely mean the Python flavor of SAX (again with partly-Draconian
true streaming or buffering fake streaming), DOM and ElementTree. For
Ruby, this would mean a REXML-compatible implementation. For C, it
would make sense for an HTML5 parser to integrate into libxml2. I
believe such a C implementation would eventually benefit PHP, too.
Of course, in all these cases, the element names should be reported
in lower case, unlike in browsers (whose DOM reports HTML element
names in upper case).
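As a rough illustration of the drop-in idea for the Python flavor of SAX, the sketch below feeds events to an ordinary xml.sax ContentHandler and lowercases element names on the way. ContentHandler and AttributesImpl are the real standard library classes. The report_events function, the token tuples and the NameCollector handler are hypothetical stand-ins for the back end of an HTML5 parser and for existing application code.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class NameCollector(ContentHandler):
    """Ordinary SAX consumer; it knows nothing about HTML5 parsing."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def report_events(tokens, handler):
    """Hypothetical parser back end: replay tokens as SAX events."""
    handler.startDocument()
    for token in tokens:
        kind = token[0]
        if kind == "start":
            # Simplification: str.lower() also lowercases non-ASCII letters.
            handler.startElement(token[1].lower(), AttributesImpl(token[2]))
        elif kind == "end":
            handler.endElement(token[1].lower())
        elif kind == "text":
            handler.characters(token[1])
    handler.endDocument()


# Made-up token stream with mixed-case tag names, as found in real markup.
SAMPLE = [
    ("start", "HTML", {}),
    ("start", "Body", {}),
    ("text", "hi"),
    ("end", "Body"),
    ("end", "HTML"),
]

if __name__ == "__main__":
    collector = NameCollector()
    report_events(SAMPLE, collector)
    print(collector.names)
```

Because the handler is a plain ContentHandler, code written against an XML SAX parser would not need to change, which is the point of the drop-in approach.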
>> What license will you release under?
> Probably the MIT licence, I'm not sure yet...
The known Python and Java projects also use the MIT license.
If the goal is to drive adoption, the MIT license is great, because
it is a Free Software license according to the FSF, an Open Source
license according to OSI, Debian-approved (relevant even to C#
because of Mono), GPL-compatible and suitable for embedding in
proprietary products as well.