<br><br><div class="gmail_quote">2008/12/17 Ian Hickson <span dir="ltr">&lt;ian@hixie.ch&gt;</span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="Ih2E3d">

<br>This doesn&#39;t cost any time in HTML either, since the tokeniser doesn&#39;t<br>

need to worry about what tags have end tags, the tree construction side<br>

just drops unexpected end tags on the floor.</div></blockquote><div></div><div>I don&#39;t think authors expect tags to disappear.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="Ih2E3d"><br>

&gt; don&#39;t check for insertion modes<br>

<br>

Having an insertion mode isn&#39;t particularly a performance cost. (It<br>

affects code footprint, but that&#39;s about it.)</div></blockquote><div></div><div>1) It needs more code (one block per insertion mode), and more code always costs some performance, even if only to load a bigger executable.</div>

<div>2) It needs code to select the insertion mode for the next element (when the spec says to reset the insertion mode): in the worst case it has to compare nodeName 18 times.</div><div></div><div>&gt; That&#39;s the same as HTML.<br>


<div class="Ih2E3d">No, it is not. HTML defines special behaviour for the following elements: address, area, article, aside, base, basefont, bgsound, blockquote, body, br, center, col, colgroup, command, datagrid, dd, details, dialog, dir, div, dl, dt, embed, eventsource, fieldset, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hr, iframe, img, input, isindex, li, link, listing, menu, meta, nav, noembed, noframes, noscript, ol, p, param, plaintext, pre, script, section, select, spacer, style, tbody, textarea, tfoot, thead, title, tr, ul, and wbr.</div>

<div class="Ih2E3d">I think that is far too many to say that it is like XML.</div><div class="Ih2E3d"><br>&gt; There are a number of HTML5 parser implementations, and data suggests that<br></div>

&gt; there is no particular performance gain.<br>

<div class="Ih2E3d">There is no actual HTML5 parser implementation, only HTML4 parsers that are compatible with the new syntax. (Are you sure that closed-source browsers really do what is written in the specification?)<br>

<br>&gt; There&#39;s no guessing in HTML either; all input streams have very specific<br></div>

&gt; and required results.<br>

<div class="Ih2E3d">Actually, there&#39;s nothing that really says that&nbsp;&lt;div&gt;&lt;p&gt;some text&lt;/p&gt;&lt;p&gt;some more text&lt;/p&gt;&lt;/div&gt; is more correct than&nbsp;&lt;div&gt;&lt;p&gt;some text&lt;p&gt;some more text&lt;/p&gt;&lt;/p&gt;&lt;/div&gt;<br>

<br></div>

<div class="Ih2E3d">It is only when writing the specification that you guess that the first possibility is what the author intended. You are guessing, not the browser.<br>
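<p>To make the point concrete, here is a toy sketch of the kind of rule being discussed: a tree builder in which a new &lt;p&gt; start tag implicitly closes a &lt;p&gt; that is still open, and an end tag with no matching open element is dropped. The token format and the names <code>buildTree</code> and <code>serialize</code> are assumptions made for illustration; the real tree construction algorithm has many more rules and treats some stray end tags specially.</p>

```javascript
// Toy tree builder (illustrative only, not the HTML5 algorithm).
// Tokens are assumed to look like {type: "start"|"end"|"text", name, text}.
function buildTree(tokens) {
  const root = { name: "#root", children: [] };
  const stack = [root]; // stack of currently open elements

  for (const tok of tokens) {
    if (tok.type === "start") {
      // The spec's choice: an open <p> is closed by the next <p> start tag.
      if (tok.name === "p" && stack[stack.length - 1].name === "p") {
        stack.pop();
      }
      const node = { name: tok.name, children: [] };
      stack[stack.length - 1].children.push(node);
      stack.push(node);
    } else if (tok.type === "end") {
      // Close the nearest matching open element; drop the tag if none matches.
      const i = stack.map((n) => n.name).lastIndexOf(tok.name);
      if (i > 0) stack.length = i;
    } else {
      stack[stack.length - 1].children.push({ text: tok.text });
    }
  }
  return root;
}

// Write the tree back out with every element explicitly closed.
function serialize(node) {
  if (node.text !== undefined) return node.text;
  const inner = node.children.map(serialize).join("");
  if (node.name === "#root") return inner;
  return "<" + node.name + ">" + inner + "</" + node.name + ">";
}

// The two markups from the message above, as token lists.
const S = (name) => ({ type: "start", name });
const E = (name) => ({ type: "end", name });
const T = (text) => ({ type: "text", text });

const siblings = [S("div"), S("p"), T("some text"), E("p"),
                  S("p"), T("some more text"), E("p"), E("div")];
const nested = [S("div"), S("p"), T("some text"),
                S("p"), T("some more text"), E("p"), E("p"), E("div")];

console.log(serialize(buildTree(siblings)));
console.log(serialize(buildTree(nested)));
```

<p>In this sketch both inputs produce the same tree, with the two paragraphs as siblings. That outcome comes entirely from the rule written into <code>buildTree</code>, not from anything in the input.</p>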

<br>&gt; Validating code is certainly an important QA point, but once you&#39;ve<br></div>

&gt; shipped code, the presence of an error is not helpful to the end user.<br>

&gt; Often errors in XML files weren&#39;t present when the file was created, but<br>

&gt; appear later when new text is merged in automatically.<br>

<div class="Ih2E3d"><br>

</div><div class="Ih2E3d">As Nils pointed out, it is an error in itself to have content merged automatically into a stream. It is like opening a random file, executing it and expecting no errors. Every input, even from the most trustworthy source, must be parsed for errors and then checked after publishing.</div>

<div class="Ih2E3d">And if an end user finds an error, he will probably report it to the owner of the web site, who in turn will report it (quite angrily) to the web designer. Something like: &quot;What on earth are you doing in front of the coffee machine? I don&#39;t pay you to rest! Fix that website immediately!&quot;</div>

<div class="Ih2E3d"></div><div class="Ih2E3d">&gt; Well, they&#39;ve ignored it for the past 7 years, so why would they change?<br></div>

<div class="Ih2E3d">Nobody told the user that he was browsing a deprecated web site. If something like the IE7 information bar (i.e. a non-modal bar that can be turned off and does not annoy the user, but is immediately visible) appeared on a web site sent as text/html, I think companies would not like their site tagged as &quot;deprecated&quot; and would port it to application/xhtml+xml in no time (can you imagine what &quot;deprecated&quot; would mean for a news web site?)</div>

<div class="Ih2E3d">And don&#39;t forget that the most common browser was IE6, not very standard oriented...</div><div class="Ih2E3d"></div><div class="Ih2E3d">&gt; Anyway, it isn&#39;t clear that we would _want_ to deprecate HTML, even if we<br>

</div>

&gt; had any real choice in the matter.<br>

<div><div class="Wj3C7c"></div></div></div></div><p>I&#39;m not sure I understood your sentence (sorry, English is not my native language). Anyway, you just have to add an &quot;authoring requirement&quot; for text/html:<br>

</p><p>1) user agents will just ignore it and implement the HTML algorithm (we don&#39;t want to &quot;break the web&quot;, to use Microsoft&#39;s term)<br>2) standards-oriented authors will convert their sites to application/xhtml+xml (if they haven&#39;t already)<br>

3) other authors will keep their tag soup (and get their sites yellow-barred)<br>4) company owners will make their decision between 2 and 3<br></p><p>Gradually, option 3 will disappear, because there is no actual need for HTML.</p>

<p>@ Garret:<br>Originally I wrote XBL2, then I deleted it since it was not pertinent (and went in the opposite direction to my opinion), but I forgot to update the list numbering.</p><p>Secondly, what do you mean by bubbling? OK, I can put an event handler for whatever I need on, say, &lt;HTML&gt;, but then how can I execute the proper handler? I must retrieve it, either by attaching it to the DOM node (which I don&#39;t advise, as it is not interoperable) or by maintaining a hash map from class names to function pointers and looking it up there.</p>

<p>Actually, there is no performant and clean solution, just a few hacks.</p>
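<p>For reference, the hash map variant could look roughly like the sketch below. The names <code>handlers</code>, <code>register</code> and <code>dispatch</code> are made up for this example, and it assumes <code>className</code> is a plain string, as it is on HTML elements. A single listener on the root element walks up from the event target and runs the first handler registered for one of the node&#39;s class names.</p>

```javascript
// Hypothetical registry: class name -> handler function.
const handlers = new Map();

function register(className, handler) {
  handlers.set(className, handler);
}

// Walk from the event target up through its ancestors and call the first
// handler whose class name appears on a node along the way.
function dispatch(event) {
  for (let node = event.target; node; node = node.parentNode) {
    const names = typeof node.className === "string"
      ? node.className.split(/\s+/).filter(Boolean)
      : [];
    for (const name of names) {
      const handler = handlers.get(name);
      if (handler) {
        // Call the handler with the matching node as `this`.
        return handler.call(node, event);
      }
    }
  }
  return undefined; // nothing registered for this target or its ancestors
}

// In a browser, one listener on the root element is enough.
if (typeof document !== "undefined") {
  document.documentElement.addEventListener("click", dispatch);
}
```

<p>Every event pays for the walk up the tree and one map lookup per class name, which is the cost I am complaining about.</p>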