[whatwg] Distinguishing XML and HTML by content sniffing

Michael Day mikeday at yeslogic.com
Sat Mar 3 22:33:51 PST 2007


Hi all,

For user agents like Prince that support both XML and HTML content, it
is sometimes necessary to determine whether a .html file is actually XML
or HTML so that it can be processed correctly.
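
As a rough illustration of the problem, here is a minimal Python sketch
of one possible sniffing heuristic. It is not Prince's actual algorithm;
the function name sniff_markup and the three signals it checks (an XML
declaration, an XHTML doctype, an xmlns attribute on the root element)
are assumptions chosen purely for the example.

```python
import re


def sniff_markup(text: str) -> str:
    """Guess 'xml' or 'html' from the start of a document (illustrative only)."""
    # Ignore a byte order mark and leading whitespace; look only at the head.
    head = text.lstrip("\ufeff \t\r\n")[:1024]

    # Assumption: an XML declaration is a strong signal for XML.
    if head.startswith("<?xml"):
        return "xml"

    # Assumption: an XHTML doctype means XML, any other html doctype means HTML.
    doctype = re.match(r"<!DOCTYPE\s+html\b([^>]*)>", head, re.IGNORECASE)
    if doctype:
        return "xml" if "XHTML" in doctype.group(1) else "html"

    # Assumption: a root <html> element with an xmlns attribute means XML.
    if re.search(r"<html\b[^>]*\bxmlns\s*=", head, re.IGNORECASE):
        return "xml"

    # Fall back to the more forgiving HTML parser.
    return "html"
```

With these assumptions, sniff_markup('<!DOCTYPE html><html>') returns
"html", while a document starting with an XML declaration returns "xml".
A real user agent would need to weigh more signals than this.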

I've written an article for XML.com explaining exactly how Prince 
performs content sniffing to distinguish XML and HTML documents:

     What Does XML Smell Like?
     http://www.xml.com/pub/a/2007/02/28/what-does-xml-smell-like.html

Any feedback would be greatly appreciated. No doubt at some point it 
will be necessary to revise our heuristics for HTML5 :)

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com


