[whatwg] Thoughts on HTML 5

Thu Dec 18 06:53:58 PST 2008

2008/12/17 Ian Hickson <ian at hixie.ch>

>
> This doesn't cost any time in HTML either, since the tokeniser doesn't
> need to worry about what tags have end tags, the tree construction side
> just drops unexpected end tags on the floor.
>
I don't think authors expect tags to disappear.

>
> > don't check for insertion modes
>
> Having an insertion mode isn't particularly a performance cost. (It
> affects code footprint, but that's about it.)
>
1) it needs more code (one block per insertion mode): more code always costs
performance, even if only to load a bigger executable
2) it needs code to select the insertion mode for the next element (when
the spec says to reset the insertion mode): in the worst case it has to
compare nodeName 18 times
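
To make point 2 concrete, here is a rough Python sketch of what I mean by
selecting the insertion mode. It is not the algorithm from the spec: the rule
table is abbreviated (so the counts differ from the 18 above), and the names
RULES and reset_insertion_mode, as well as the "in body" fallback, are
assumptions made up for this example.

```python
# Abbreviated, illustrative rule table: (element name, insertion mode).
# The real specification has more cases; this is only a sketch.
RULES = [
    ("select", "in select"),
    ("td", "in cell"),
    ("th", "in cell"),
    ("tr", "in row"),
    ("tbody", "in table body"),
    ("thead", "in table body"),
    ("tfoot", "in table body"),
    ("caption", "in caption"),
    ("colgroup", "in column group"),
    ("table", "in table"),
    ("body", "in body"),
    ("frameset", "in frameset"),
]


def reset_insertion_mode(open_elements):
    """Walk the stack of open elements from the innermost node outwards.

    Returns (mode, comparisons) so the number of nodeName comparisons
    is visible. The "in body" fallback is an assumption of this sketch.
    """
    comparisons = 0
    for node_name in reversed(open_elements):
        for name, mode in RULES:
            comparisons += 1
            if node_name == name:
                return mode, comparisons
    return "in body", comparisons


if __name__ == "__main__":
    print(reset_insertion_mode(["html", "body", "table", "tbody", "tr"]))
    print(reset_insertion_mode(["html", "body", "div"]))
```

Every node on the stack that matches no rule costs a full pass over the
table before the walk moves on to its parent.
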
> That's the same as HTML.
No, it is not. HTML defines special behaviour for the following elements:
 address, area, article, aside, base, basefont, bgsound, blockquote, body,
br, center, col, colgroup, command, datagrid, dd, details, dialog, dir, div,
dl, dt, embed, eventsource, fieldset, figure, footer, form, frame, frameset,
h1, h2, h3, h4, h5, h6, head, header, hr, iframe, img, input, isindex, li,
link, listing, menu, meta, nav, noembed, noframes, noscript, ol, p, param,
plaintext, pre, script, section, select, spacer, style, tbody, textarea,
tfoot, thead, title, tr, ul, and wbr.
I think those are far too many to say that it is like XML.

> There are a number of HTML5 parser implementations, and data suggests that
> there is no particular performance gain.
There are no actual HTML5 parser implementations, only HTML4 parsers made
compatible with the new syntax. (Are you sure that closed-source browsers
really do what is written in the specification?)

> There's no guessing in HTML either; all input streams have very specific
> and required results.
Actually, there's nothing that really says that <div><p>some text</p><p>some
more text</p></div> is more correct than <div><p>some text<p>some more
text</p></p></div>

It is just that, when writing the specification, you guess that the first
possibility is what the author meant. You are guessing, not the browser.
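
A toy sketch of the rule in question, in Python. It is not the HTML5 tree
construction algorithm: it only knows that a new <p> closes an open <p>, and
it drops unmatched end tags as described in the quoted text at the top.
ToyTreeBuilder and build_tree are made-up names; html.parser is the tokeniser
from the Python standard library.

```python
from html.parser import HTMLParser


class ToyTreeBuilder(HTMLParser):
    """Builds nested (name, children) tuples from a token stream."""

    def __init__(self):
        super().__init__()
        self.root = ("#root", [])
        self.stack = [self.root]  # stack of open elements

    def handle_starttag(self, tag, attrs):
        # Sketch rule: a <p> start tag closes a <p> that is still open.
        if tag == "p" and any(node[0] == "p" for node in self.stack):
            while self.stack.pop()[0] != "p":
                pass
        node = (tag, [])
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Sketch rule: an end tag with no matching open element is dropped.
        if tag not in [node[0] for node in self.stack[1:]]:
            return
        while self.stack.pop()[0] != tag:
            pass

    def handle_data(self, data):
        self.stack[-1][1].append(data)


def build_tree(markup):
    builder = ToyTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


if __name__ == "__main__":
    a = "<div><p>some text</p><p>some more text</p></div>"
    b = "<div><p>some text<p>some more text</p></p></div>"
    print(build_tree(a) == build_tree(b))
```

Under these two rules both inputs end up as the same tree, whichever of the
two the author had in mind; the choice of rule is where the guess is made.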

> Validating code is certainly an important QA point, but once you've
> shipped code, the presence of an error is not helpful to the end user.
> Often errors in XML files weren't present when the file was created, but
> appear later when new text is merged in automatically.

As Nils pointed out, it is itself an error to have any content
automatically merged into a stream. It is like opening a random file,
executing it and expecting no errors. Every input, even from the most
trustworthy source, must be parsed for errors and then checked after
publishing.
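
As a minimal sketch of what "parsed for errors" could mean before merging,
assuming Python and its standard xml.etree.ElementTree parser: the function
name is_well_formed and the synthetic <root> wrapper are assumptions of this
example, and it checks well-formedness only, not validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML inside a wrapper element."""
    try:
        # The wrapper lets a fragment with several top-level nodes parse.
        ET.fromstring("<root>" + fragment + "</root>")
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    for fragment in ("<p>ok</p>", "<p>broken"):
        print(fragment, is_well_formed(fragment))
```

A fragment that fails this check would be rejected before it is merged,
instead of breaking the page after publishing.
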
And if an end user finds an error, he will probably report it to the owner
of the web site, who in turn will report it (quite angrily) to the web
designer. Something like: "What on earth are you doing in front of the coffee
machine? I don't pay you to rest! Fix that website immediately!"
> Well, they've ignored it for the past 7 years, so why would they change?
Nobody told the user that he was browsing a deprecated web site. If something
like the IE7 information bar (i.e. a non-modal bar that can be disabled and
does not annoy the user, but is immediately visible) could appear on a web
site sent as text/html, I think companies would not like their sites tagged
as "deprecated" and would port them to application/xhtml+xml in no time (can
you imagine what "deprecated" could mean on a news web site?)
And don't forget that the most common browser was IE6, which is not very
standards-oriented...
> Anyway, it isn't clear that we would _want_ to deprecate HTML, even if we
> had any real choice in the matter.

I'm not sure I understood your sentence (sorry, English is not my mother
tongue). Anyway, you just have to add an "authoring requirement" for
text/html:

1) user agents will just ignore it and implement the HTML algorithm (we don't
want to "break the web", to use Microsoft's terms)
2) standards-oriented authors will convert their sites to
application/xhtml+xml (if they haven't already)
3) other authors will keep their tag soup (and get their sites
yellow-barred)
4) company owners will choose between 2 and 3

Gradually, n° 3 will disappear, because there's no actual need for HTML.

@ Garret:
Originally I wrote XBL2, then I deleted it since it was not pertinent (and
went in the opposite direction to my opinion), but I forgot to update the
list numbering.

Secondly, what do you mean by bubbling? OK, I can put an event handler for
what I need on, say, <HTML>, but then how can I execute the proper handler?
I must retrieve it, either by attaching it to the DOM node (but I don't
advise that, it is not interoperable) or by maintaining a hash map from class
names to function pointers and resolving it.

Actually, there is no performant and clean solution, just a few hacks.
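
For reference, the hash map approach mentioned above could look roughly like
this. It is a sketch in Python over a toy node model, not real DOM code; Node,
HANDLERS and dispatch are hypothetical names invented for the example.

```python
class Node:
    """Toy stand-in for a DOM node: a name, one class name, a parent."""

    def __init__(self, name, class_name=None, parent=None):
        self.name = name
        self.class_name = class_name
        self.parent = parent


# Hash map from class name to handler function.
HANDLERS = {}


def dispatch(target):
    """What a single listener on the root would do after bubbling:
    walk from the event target up to the root and run the first
    handler registered for a class name found on the way."""
    node = target
    while node is not None:
        handler = HANDLERS.get(node.class_name)
        if handler is not None:
            return handler(node)
        node = node.parent
    return None


if __name__ == "__main__":
    html = Node("html")
    item = Node("div", "menu-item", html)
    label = Node("span", parent=item)
    HANDLERS["menu-item"] = lambda node: "clicked " + node.class_name
    print(dispatch(label))
```

The cost I am complaining about is the walk up the ancestors plus one lookup
per node, for every event that reaches the root.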