[whatwg] Configure Apache to send the right MIME type for XHTML
Elliotte Harold
elharo at metalab.unc.edu
Wed Mar 7 11:04:08 PST 2007
Henri Sivonen wrote:
> TagSoup exists today.
Yes, and I use it. However, the markup it generates constantly surprises
people, as hanging out for a day or two on the tagsoup-friends
mailing list will show. That's not its fault. There's just no one
obvious way to fix all the broken markup that's out there. TagSoup picks
one approach. HTML 5 picks another. Both will surprise people a lot of
the time. At the parser level that can't be helped.
However, at the document level it can be helped. When the document author
takes the care to generate a well-formed document, they are rarely
surprised by the resulting tree the parser builds. The tree is explicit
in the markup. Explicit markup is more obvious and less surprising than
the implicit fill-in both TagSoup and HTML 5 do.
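A minimal sketch of the well-formedness point, assuming Python's standard
xml.etree.ElementTree stands in for an XML parser (chosen here only for
illustration; the helper name try_parse and the sample markup are
hypothetical). The parser accepts the explicit markup and rejects the
markup with omitted end tags outright instead of guessing at a fix:

```python
import xml.etree.ElementTree as ET


def try_parse(markup):
    """Return the root tag if markup is well-formed XML, else None."""
    try:
        return ET.fromstring(markup).tag
    except ET.ParseError:
        # An XML parser reports the error; it does not fill anything in.
        return None


well_formed = "<ul><li>one</li><li>two</li></ul>"
tag_soup = "<ul><li>one<li>two</ul>"  # end tags omitted, HTML style

result_ok = try_parse(well_formed)
result_bad = try_parse(tag_soup)
print(result_ok, result_bad)
```

Whether rejecting the second input is desirable is a separate argument;
the sketch only shows that nothing is filled in implicitly.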
Hmm, that brings up another question. Does the HTML 5 fixup algorithm
ever change the *tree* for a well-formed (but invalid) document? For
instance, if it finds an li element that is a child of a p, what would
it do? Ignoring the <li></li> tags, skipping the li element
completely, or filling in a ul element would each change the tree.
I suspect it does one of these three things (or something similar like
filling in an ol element) but without opening the spec or writing a
sample program, I can't tell you which.
By contrast, with a real XML parser I can tell you what's going to
happen without cracking open the spec. HTML5, TagSoup, and XML parse
trees are all deterministic and thus predictable; but only the XML tree
is *obvious*.
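To make "obvious" concrete, here is a small sketch, again assuming
Python's xml.etree.ElementTree as the XML parser (an illustrative choice;
the outline helper and the sample document are made up for this example).
It parses a well-formed but invalid document with an li inside a p and
prints the element tree:

```python
import xml.etree.ElementTree as ET


def outline(element, depth=0):
    """Return one indented line per element, mirroring the parsed tree."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Well-formed but invalid: li is not allowed as a child of p in HTML.
doc = "<html><body><p>Some text <li>an item</li></p></body></html>"

tree_lines = outline(ET.fromstring(doc))
print("\n".join(tree_lines))
```

The li comes out as a child of p, exactly where the markup put it. The
parser neither moves the element nor invents a ul around it.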
--
Elliotte Rusty Harold elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/