[whatwg] Thoughts on HTML 5

Giovanni Campagna scampa.giovanni at gmail.com
Wed Dec 17 13:16:46 PST 2008

2008/12/17 Ian Hickson <ian at hixie.ch>
> XML is neither more performant nor stricter than HTML. The main differences
> are that XML has less user-friendly error recovery and supports arbitrary
> namespaces. Authors have clearly indicated that this is not compelling.
> Deprecating HTML thus seems like vain effort. (We already tried over the
> past few years with XHTML 1.x, and it didn't work.)
I don't write browser code, honestly, but I think that an XML parser doesn't
need to check attribute types (they're all quoted strings), doesn't check
element types (whether there can or must be a closing tag), and doesn't check
insertion modes. It just parses the input completely, without any semantics or
particular behaviour associated with any tag. Then, when the DOMElement,
DOMAttr or DOM-whatever nodes are built, they get the appropriate interface
(e.g. HTMLElement) depending on the namespace.
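
To illustrate the idea of choosing the interface from the namespace only after
generic parsing, here is a rough sketch using Python's standard
xml.etree.ElementTree. The function name element_kinds and the labels
"HTMLElement" / "Element" are just assumptions for the example; no browser is
claimed to work this way.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def element_kinds(source):
    """Parse generically, then pick a (hypothetical) interface per namespace."""
    root = ET.fromstring(source)
    kinds = []
    for el in root.iter():  # document order
        # ElementTree spells a namespaced tag as "{namespace-uri}localname".
        local_name = el.tag.split("}", 1)[-1]
        if el.tag.startswith("{" + XHTML_NS + "}"):
            kinds.append((local_name, "HTMLElement"))
        else:
            kinds.append((local_name, "Element"))
    return kinds

doc = ('<div xmlns="http://www.w3.org/1999/xhtml"><p>some text</p>'
       '<x:foo xmlns:x="urn:example"/></div>')
print(element_kinds(doc))
```

The parser itself never looks at what "div" or "p" mean; only the later
mapping step does.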

I think that the XML approach can be faster, but I don't actually have any
benchmark (I can't have one, since no browser completely implements the HTML5
parsing algorithm yet).

Secondly, stricter to me means: every warning is an error. Look at the
following code:
<div><p>some text</div>
When the HTML parser finds the char 'd', I can imagine it expects the char 'p'
(as in </p>) and falls back to "quirk mode" otherwise, although no such
assertion is made in the official HTML spec.
When parsing as XML, though, the parser can simply look at the char: is it a
'p'? Then go forward; otherwise stop parsing.
No quirks, no trying to guess the author's intentions.
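
A small check of the strict behaviour, using Python's standard
xml.etree.ElementTree as a stand-in for "an XML parser" (the helper name
is_well_formed is made up for this sketch):

```python
import xml.etree.ElementTree as ET

def is_well_formed(source):
    """Return True if `source` parses as XML, False on any well-formedness error."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        # The parser stops at the first error (here: a mismatched end tag).
        return False

print(is_well_formed("<div><p>some text</div>"))      # the example above
print(is_well_formed("<div><p>some text</p></div>"))  # with the </p> added
```

The first input is rejected outright instead of being repaired.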

What about <div><p>some text<p>some more text</div>?
Is it this: <div><p>some text</p><p>some more text</p></div>
or this: <div><p>some text<p>some more text</p></p></div>
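
For what it's worth, HTML defines the end tag of p as optional and says that a
new <p> start tag ends a p element that is still open, which gives the first
reading. Below is a toy sketch of just that one rule, built on Python's
html.parser. The class ToyBuilder and the function normalize are invented for
this example, and the "is a p open anywhere on the stack" test is a
simplification; this is not the HTML5 tree construction algorithm.

```python
from html.parser import HTMLParser

class ToyBuilder(HTMLParser):
    """Toy tree builder: handles only the implied </p> rule, nothing else."""

    def __init__(self):
        super().__init__()
        self.out = []    # serialized output pieces
        self.open = []   # stack of currently open element names

    def _close_through(self, tag):
        # Pop and close elements until `tag` itself has been closed.
        while self.open:
            name = self.open.pop()
            self.out.append("</%s>" % name)
            if name == tag:
                break

    def handle_starttag(self, tag, attrs):
        # A new <p> implicitly ends a <p> that is still open.
        if tag == "p" and "p" in self.open:
            self._close_through("p")
        self.open.append(tag)
        self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        # An end tag also closes anything left open inside that element.
        if tag in self.open:
            self._close_through(tag)

    def handle_data(self, data):
        self.out.append(data)

def normalize(source):
    builder = ToyBuilder()
    builder.feed(source)
    builder.close()
    return "".join(builder.out)

print(normalize("<div><p>some text<p>some more text</div>"))
```

So the guess is at least a specified one, but it is still the parser filling
in what the author left out.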

And most of the time strict checking means fewer errors caused by distraction
(a misspelled end tag, a missing '/' on a self-closing element that is not
always self-closing, etc.)

Lastly, you said that deprecating HTML is a vain effort. Well, IMO, if you
deprecate it now, maybe this year, or next year, or even the year after,
nothing will move. But (according to the WHATWG Wiki) the HTML spec will be
ready in 2020.

Do you think that in 12 years everybody will just ignore the string "HTML is
deprecated and should no longer be used"?

By the way, XHTML 1.0 / 1.1 said nothing about HTML4; they were independent.

Giovanni Campagna