[whatwg] Allow trailing slash in always-empty HTML5 elements?

Elliotte Harold elharo at metalab.unc.edu
Sat Dec 2 04:02:04 PST 2006


Lachlan Hunt wrote:

> HTML and XML have significantly different parsing requirements and they 
> absolutely must be treated as significantly different file formats.  Any 
> attempt to treat them as the same format is an extremely bad idea.

That's only true to the extent that some people seem to insist on making 
them needlessly different. HTML is tantalizingly close to well-formed 
XML. They both derive from SGML. They both use angle bracketed tags. 
They both define a tree structure. Indeed, in many cases an HTML document 
is already a well-formed XML document.

This enables the use of the very powerful XML toolchain for processing 
HTML. In fact, prior to the widespread adoption of XML there were, near 
as I could tell, no reliable open means of parsing HTML documents. There 
were a few proprietary, incompatible, buggy engines locked up in various 
browsers; and that was about it.
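
To make that concrete, here is a minimal sketch of what I mean by using the XML toolchain on HTML. It assumes Python's standard xml.etree.ElementTree module as the XML parser; the sample page and the extract_links helper are made up for illustration, and the approach only works when the HTML happens to be well-formed:

```python
import xml.etree.ElementTree as ET

# A hypothetical HTML page that also happens to be well-formed XML.
PAGE = """<html>
  <head><title>Example</title></head>
  <body>
    <p>See <a href="http://www.w3.org/">W3C</a> and
       <a href="http://www.whatwg.org/">WHATWG</a>.</p>
  </body>
</html>"""


def extract_links(markup):
    """Parse the markup as XML and return every a/@href in document order."""
    root = ET.fromstring(markup)  # raises ET.ParseError if not well-formed
    return [a.get("href") for a in root.iter("a")]


print(extract_links(PAGE))
```

No HTML-specific parser is involved anywhere; any generic XML tool (XPath, XSLT, SAX, and so on) could be substituted in the same way.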

What I don't understand is why some members of this working group are so 
dead set on actively preventing HTML from being XML. The non-draconian 
error handling I understand. But why are you disappointed that <!DOCTYPE 
html> is well-formed XML? Why the active hostility to well-formedness?
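
As a rough check of the point, the following sketch feeds two tiny documents to a plain XML parser. It again assumes Python's xml.etree.ElementTree; the is_well_formed helper and both sample strings are hypothetical. The only difference between the two inputs is the trailing slash on the empty br element:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if a generic XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Same document twice; only the empty element's trailing slash differs.
WITH_SLASH = "<!DOCTYPE html><html><body><p>one<br />two</p></body></html>"
WITHOUT_SLASH = "<!DOCTYPE html><html><body><p>one<br>two</p></body></html>"

print(is_well_formed(WITH_SLASH))
print(is_well_formed(WITHOUT_SLASH))
```

The <!DOCTYPE html> line causes the XML parser no trouble in either case; it is the unclosed <br> that makes the second document fail.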

-- 
Elliotte Rusty Harold  elharo at metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/


