[whatwg] XSLT: HTML 5 --> HTML
Henri Sivonen
hsivonen at iki.fi
Tue Feb 6 03:59:26 PST 2007
On Feb 6, 2007, at 13:23, Elliotte Harold wrote:
> It would probably have to be done in two parts. First make the
> document well-formed (possibly with a TagSoup fork). Then run the
> stylesheet. The problem with TagSoup is that it treats bogons
> (unknown elements) as empty. It also doesn't quite follow Web Apps
> 1.0's error recovery algorithm. Possibly I could base the initial
> step on html5lib instead.
My parser[1] doesn't follow the WA10 parsing algorithm, either,
*yet*. However, as a tentative Pythonless Java solution, you could
use it together with a RELAX NG validator in the pipeline (using the
whattf.org schemas[2]) to implement Draconian failure in cases where
the error recovery would kick in as per the WA10 parsing algorithm.
Basically, the parser would report to a ContentHandler splitter. The
splitter would show each SAX event to Jing/oNVDL first and then to a
TrAX TransformerHandler. The validator would use DraconianErrorHandler
(Jing/oNVDL is fail-fast), so the first validation error would abort
the pipeline as soon as the validator reports it.
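
To make the splitter idea concrete, here is a minimal sketch using only
the SAX interfaces that ship with the JDK. The class names
SplitterContentHandler and ThrowingErrorHandler are hypothetical and are
not part of the parser, Jing, or oNVDL. The sketch assumes the validator
is exposed as a plain ContentHandler and signals a validation failure by
throwing SAXException; ThrowingErrorHandler is only a stand-in for the
behavior expected from DraconianErrorHandler.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical splitter: every SAX event is shown to the validator first.
// If the validator throws, the exception propagates and the transformer
// never sees that event.
class SplitterContentHandler implements ContentHandler {
    private final ContentHandler validator;
    private final ContentHandler transformer;

    SplitterContentHandler(ContentHandler validator, ContentHandler transformer) {
        this.validator = validator;
        this.transformer = transformer;
    }

    @Override public void setDocumentLocator(Locator locator) {
        validator.setDocumentLocator(locator);
        transformer.setDocumentLocator(locator);
    }
    @Override public void startDocument() throws SAXException {
        validator.startDocument();
        transformer.startDocument();
    }
    @Override public void endDocument() throws SAXException {
        validator.endDocument();
        transformer.endDocument();
    }
    @Override public void startPrefixMapping(String prefix, String uri) throws SAXException {
        validator.startPrefixMapping(prefix, uri);
        transformer.startPrefixMapping(prefix, uri);
    }
    @Override public void endPrefixMapping(String prefix) throws SAXException {
        validator.endPrefixMapping(prefix);
        transformer.endPrefixMapping(prefix);
    }
    @Override public void startElement(String uri, String localName, String qName,
                                       Attributes atts) throws SAXException {
        validator.startElement(uri, localName, qName, atts);
        transformer.startElement(uri, localName, qName, atts);
    }
    @Override public void endElement(String uri, String localName, String qName)
            throws SAXException {
        validator.endElement(uri, localName, qName);
        transformer.endElement(uri, localName, qName);
    }
    @Override public void characters(char[] ch, int start, int length) throws SAXException {
        validator.characters(ch, start, length);
        transformer.characters(ch, start, length);
    }
    @Override public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        validator.ignorableWhitespace(ch, start, length);
        transformer.ignorableWhitespace(ch, start, length);
    }
    @Override public void processingInstruction(String target, String data)
            throws SAXException {
        validator.processingInstruction(target, data);
        transformer.processingInstruction(target, data);
    }
    @Override public void skippedEntity(String name) throws SAXException {
        validator.skippedEntity(name);
        transformer.skippedEntity(name);
    }
}

// Stand-in for a Draconian error handler (an assumption about the intended
// behavior): errors and fatal errors are rethrown, which aborts the parse.
class ThrowingErrorHandler implements ErrorHandler {
    @Override public void warning(SAXParseException e) {
        // Warnings are ignored in this sketch.
    }
    @Override public void error(SAXParseException e) throws SAXException {
        throw e;
    }
    @Override public void fatalError(SAXParseException e) throws SAXException {
        throw e;
    }
}
```

In such a setup the second argument could be a TransformerHandler
obtained from SAXTransformerFactory.newTransformerHandler(Source), with
its output set through setResult(Result). How the Jing/oNVDL side is
constructed is left out here, since that depends on their own APIs.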
[1] http://hsivonen.iki.fi/validator-about/htmlparser.jar
[2] http://syntax.whattf.org/
--
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/