[whatwg] Stability of tokenizing/dom algorithms

James Graham jgraham at opera.com
Mon Dec 15 02:07:23 PST 2008

Edward Z. Yang wrote:
> The reason I'd like to know this is because I am the author of a tool
> named HTML Purifier, which takes user-input HTML and cleans it for
> standards-compliance as well as XSS. We insist on output being standards
> compliant, because the result is unambiguous.

Nothing in section 8 ensures that your output will pass a conformance
check. If you do transform the output into something that is conforming,
you have to make up the rules yourself, so you have just shifted the
ambiguity from the client (where it will hopefully disappear in a few
years, once the HTML5 algorithm has large-scale adoption) to the
sanitizer implementation.
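
To make the point concrete, here is a rough sketch (not anything HTML
Purifier actually does, and not the HTML5 algorithm). It uses Python's
stdlib html.parser purely as a stand-in tokenizer; the OBSOLETE set, the
policy names and the sanitize() helper are all made up for illustration.
The input parses without trouble, but making the result conforming means
picking a rewrite rule, and two reasonable rules give different output.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative subset of non-conforming elements; not the spec's full list.
OBSOLETE = {"font", "center"}


class Rewriter(HTMLParser):
    """Re-serializes markup, applying a made-up policy to obsolete elements."""

    def __init__(self, policy):
        super().__init__(convert_charrefs=True)
        self.policy = policy  # "unwrap" or "to_span" (hypothetical names)
        self.out = []

    def _map(self, tag):
        # Returns the tag to emit, or None to drop the tag but keep its text.
        if tag not in OBSOLETE:
            return tag
        return None if self.policy == "unwrap" else "span"

    def handle_starttag(self, tag, attrs):
        mapped = self._map(tag)
        if mapped is not None:
            # Attributes are dropped on every element to keep the sketch short.
            self.out.append("<%s>" % mapped)

    def handle_endtag(self, tag):
        mapped = self._map(tag)
        if mapped is not None:
            self.out.append("</%s>" % mapped)

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape text.
        self.out.append(escape(data, quote=False))


def sanitize(markup, policy):
    rewriter = Rewriter(policy)
    rewriter.feed(markup)
    rewriter.close()
    return "".join(rewriter.out)


source = "<p><font color=red>hi</font></p>"
print(sanitize(source, "unwrap"))   # <p>hi</p>
print(sanitize(source, "to_span"))  # <p><span>hi</span></p>
```

Neither result is dictated by the parsing rules; the choice between them
lives entirely in the sanitizer.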

