[html5] Incremental rendering of XHTML5

Tue Dec 13 13:53:47 PST 2011

On Fri, 12 Aug 2011, Jesper Tverskov wrote:
> 
> I have not understood why an XML parser for webpages doesn't have the 
> potential of being faster than a HTML parser. Most HTML parsers are able 
> to repair the markup and to show something useful no matter how much 
> code is missing.

The basic answer is that it is no more expensive to correct such mistakes 
than it is to abort when you hit such mistakes. In fact, in many cases it 
can be quicker not to check for a mistake at all, which in theory can make 
HTML parsing quicker than XML parsing in some cases.

An analogy:

An XML parser is like a car driven by someone who carefully checks that 
he's not going to drive into anything. If there's anything on the road, he 
stops and doesn't go any further.

An HTML parser, on the other hand, is like a car driven by someone who 
doesn't check to see if there's anything on the road; he just plows 
straight through.

When there's nothing on the road, the reckless HTML parser car can go 
faster than the careful XML parser car can. On the other hand, when 
there's something in the way (i.e. a markup error), the results are 
predictable with the XML car -- it stops. But with the HTML car, the 
results can be negligible (e.g. driving straight over a newspaper on the 
road, analogous to a minor syntax error), or they can be spectacular (e.g. 
driving into a big rock, analogous to a syntax error with noticeable error 
handling, like unexpected text in a <table>).
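
To make the analogy concrete, here is a minimal sketch using Python's 
standard library. It is only an illustration under some assumptions: 
xml.etree.ElementTree stands in for the careful XML car, and html.parser 
stands in for the reckless HTML car, even though html.parser is a lenient 
tokenizer and not an implementation of the HTML5 parsing algorithm. The 
names TextCollector, parse_strict and parse_lenient are made up for this 
example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects text and never stops on bad markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_strict(markup):
    """XML-style: return the text content, or None if the parser aborted."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # The careful driver saw something on the road and stopped.
        return None
    return "".join(root.itertext())


def parse_lenient(markup):
    """HTML-style: keep going and return whatever text was found."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


well_formed = "<p>Hello <b>world</b></p>"
broken = "<p>Hello <b>world</p>"  # the <b> element is never closed

for label, markup in (("well-formed", well_formed), ("broken", broken)):
    print(label, "strict:", parse_strict(markup))
    print(label, "lenient:", parse_lenient(markup))
```

With the broken input, the strict function gives up and returns None, 
while the lenient one still returns the text. The unclosed <b> here is 
closer to the newspaper case than to the big rock: recovery is cheap and 
the visible result is nearly the same.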

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'