[whatwg] XSLT: HTML 5 --> HTML

Henri Sivonen hsivonen at iki.fi
Tue Feb 6 03:59:26 PST 2007


On Feb 6, 2007, at 13:23, Elliotte Harold wrote:

> It would probably have to be done in two parts. First make the  
> document well-formed (possibly with a TagSoup fork). Then run the  
> stylesheet. The problem with TagSoup is that it treats bogons  
> (unknown elements) as empty. It also doesn't quite follow Web Apps  
> 1.0's error recovery algorithm. Possibly I could base the initial  
> step on html5lib instead.

My parser[1] doesn't follow the WA10 parsing algorithm, either,  
*yet*. However, as a tentative Pythonless Java solution, you could  
use it together with a RELAX NG validator in the pipeline (using the  
whattf.org schemas[2]) to implement Draconian failure in cases where  
the error recovery would kick in as per the WA10 parsing algorithm.

Basically, the parser would report to a ContentHandler splitter. The  
splitter would show each SAX event to Jing/oNVDL first. The validator  
would use DraconianErrorHandler (Jing/oNVDL is fail-fast). Second,  
each SAX event would be shown to a TrAX TransformerHandler.
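
A rough sketch of the splitter idea, using only JDK classes. The names
SplitterContentHandler, SplitterDemo and standInValidator are made up for
illustration. The JDK XML parser stands in for the HTML parser, and a
trivial handler that rejects an element called "bogon" stands in for the
Jing/oNVDL validator, so treat this as an outline under those assumptions
rather than the actual pipeline.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Shows each SAX event to the validator first, then to the transformer. */
final class SplitterContentHandler implements ContentHandler {
    private final ContentHandler validator;
    private final ContentHandler transformer;

    SplitterContentHandler(ContentHandler validator, ContentHandler transformer) {
        this.validator = validator;
        this.transformer = transformer;
    }

    // If the validator throws, the transformer never sees that event.
    public void setDocumentLocator(Locator l) {
        validator.setDocumentLocator(l); transformer.setDocumentLocator(l);
    }
    public void startDocument() throws SAXException {
        validator.startDocument(); transformer.startDocument();
    }
    public void endDocument() throws SAXException {
        validator.endDocument(); transformer.endDocument();
    }
    public void startPrefixMapping(String p, String u) throws SAXException {
        validator.startPrefixMapping(p, u); transformer.startPrefixMapping(p, u);
    }
    public void endPrefixMapping(String p) throws SAXException {
        validator.endPrefixMapping(p); transformer.endPrefixMapping(p);
    }
    public void startElement(String u, String l, String q, Attributes a) throws SAXException {
        validator.startElement(u, l, q, a); transformer.startElement(u, l, q, a);
    }
    public void endElement(String u, String l, String q) throws SAXException {
        validator.endElement(u, l, q); transformer.endElement(u, l, q);
    }
    public void characters(char[] c, int s, int n) throws SAXException {
        validator.characters(c, s, n); transformer.characters(c, s, n);
    }
    public void ignorableWhitespace(char[] c, int s, int n) throws SAXException {
        validator.ignorableWhitespace(c, s, n); transformer.ignorableWhitespace(c, s, n);
    }
    public void processingInstruction(String t, String d) throws SAXException {
        validator.processingInstruction(t, d); transformer.processingInstruction(t, d);
    }
    public void skippedEntity(String n) throws SAXException {
        validator.skippedEntity(n); transformer.skippedEntity(n);
    }
}

final class SplitterDemo {
    /** Hypothetical stand-in for the schema validator: fails on <bogon>. */
    static ContentHandler standInValidator() {
        return new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) throws SAXException {
                if ("bogon".equals(local)) {
                    throw new SAXException("Unknown element: " + local);
                }
            }
        };
    }

    /** Parses xml, validates and transforms it in one pass, returns the output. */
    static String transform(String xml) throws Exception {
        SAXTransformerFactory tf =
                (SAXTransformerFactory) TransformerFactory.newInstance();
        // No stylesheet given, so this handler performs an identity transform.
        TransformerHandler th = tf.newTransformerHandler();
        StringWriter out = new StringWriter();
        th.setResult(new StreamResult(out));

        SAXParserFactory pf = SAXParserFactory.newInstance();
        pf.setNamespaceAware(true);
        XMLReader reader = pf.newSAXParser().getXMLReader();
        reader.setContentHandler(
                new SplitterContentHandler(standInValidator(), th));
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }
}
```

In a real setup the first argument would be the validator's ContentHandler
and the TransformerHandler would be created from the actual stylesheet with
newTransformerHandler(Source). Because the validator sees each event first,
the first validation error aborts the parse before the offending event
reaches the transformer.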

[1] http://hsivonen.iki.fi/validator-about/htmlparser.jar
[2] http://syntax.whattf.org/

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/




