2008/12/17 Ian Hickson <span dir="ltr">&lt;ian@hixie.ch&gt;</span>&nbsp;<div class="Ih2E3d"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>

XML is neither more performant nor stricter than HTML. The main differences<br></div>

are that XML has less user-friendly error recovery and supports arbitrary<br>

namespaces. Authors have clearly indicated that this is not compelling.<br>

Deprecating HTML thus seems like vain effort. (We already tried over the<br>

past few years with XHTML 1.x, and it didn&#39;t work.)<br>

<div><div><br>

</div></div></blockquote></div><p>I don&#39;t write browser code, honestly, but I think that an XML parser doesn&#39;t need to check attribute types (they&#39;re all quoted strings), doesn&#39;t check element types (whether there can or must be a closing tag), and doesn&#39;t check insertion modes; it just parses the input, completely ignoring any semantics or particular behaviour associated with any tag. Then, when the DOMElement, DOMAttr, or other DOM objects are built, they get the appropriate interface (e.g. HTMLElement) depending on the namespace.<br>


</p><p>I think that the XML approach can be faster, but I don&#39;t actually have any benchmark (I can&#39;t have one, since no browser fully implements the HTML5 parsing algorithm yet).</p><p>Secondly, stricter to me means: every warning is an error. Look at the following code:<br>


&lt;div&gt;&lt;p&gt;some text&lt;/div&gt;<br>When the HTML parser finds the character &#39;d&#39; in the end tag, I imagine it expects &#39;p&#39; (as in &lt;/p&gt;) and falls back to some kind of &quot;quirks mode&quot; otherwise, although the official HTML spec makes no such assertion.<br>


When parsing as XML, though, the parser can simply read the character: is it a &#39;p&#39;? Then go forward; otherwise, stop parsing.<br>No quirks, no trying to guess the author&#39;s intentions.</p><p>What about &lt;div&gt;&lt;p&gt;some text&lt;p&gt;some more text&lt;/div&gt;?<br>


Is it this: &lt;div&gt;&lt;p&gt;some text&lt;/p&gt;&lt;p&gt;some more text&lt;/p&gt;&lt;/div&gt;<br>or this: &lt;div&gt;&lt;p&gt;some text&lt;p&gt;some more text&lt;/p&gt;&lt;/p&gt;&lt;/div&gt;<br></p><p>And most of the time strict checking means fewer errors caused by distraction (a misspelled end tag, a missing &#39;/&#39; when self-closing an element that is not always self-closing, etc.).</p>
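<p>To make the contrast concrete, here is a minimal sketch using only Python&#39;s standard library. It is not what any browser does: <code>xml.etree.ElementTree</code> stands in for a strict XML parser, and <code>html.parser</code> is only a lenient tokenizer, not the HTML5 parsing algorithm. The names <code>TagLogger</code>, <code>xml_accepts</code> and <code>html_events</code> are made up for this illustration.</p>

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical helper: records tags as html.parser reports them.

    html.parser is a tokenizer only; it never rejects the input and it
    does not build a tree or insert implied end tags.
    """

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def xml_accepts(markup):
    """Return True if the strict XML parser accepts the markup as well-formed."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Any well-formedness error (e.g. a mismatched end tag) stops parsing.
        return False


def html_events(markup):
    """Return the tag events the lenient HTML tokenizer reports."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


samples = [
    "<div><p>some text</div>",
    "<div><p>some text<p>some more text</div>",
    "<div><p>some text</p><p>some more text</p></div>",
]
for markup in samples:
    print(markup)
    print("  XML accepts:", xml_accepts(markup))
    print("  HTML events:", html_events(markup))
```

<p>Only the third sample is accepted by the XML parser; the tokenizer reports tags for all three without complaint. For the second sample, an HTML tree builder has to pick one reading, and as far as I know a &lt;p&gt; start tag implicitly closes a &lt;p&gt; that is still open, which gives the first of the two interpretations above.</p>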


<p>Lastly, you said that deprecating HTML is a vain effort. Well, IMO, if you deprecate it now, nothing will move this year, or next year, or even the year after. But (according to the WHATWG Wiki) the HTML spec will be ready in 2020.</p>


<p>Do you think that in 12 years everybody will just ignore the string &quot;HTML is deprecated and should no longer be used&quot;?</p><p>By the way, XHTML 1.0 and 1.1 said nothing about HTML4; they were independent specifications.</p>


<p>Giovanni Campagna</p>

<br>