On 11/29/06, <b class="gmail_sendername">Ian Hickson</b> &lt;<a href="mailto:ian@hixie.ch">ian@hixie.ch</a>&gt; wrote:<div><span class="gmail_quote"></span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

On Wed, 29 Nov 2006, Leons Petrazickis wrote:<br>><br>> This rigmarole is going to repeat on every site that has converted to<br>> XHTML sent as text/html. People are emotionally invested in the idea of<br>> trailing slashes. Websites have complex codebases, and going through

<br>> them removing trailing slashes on singleton elements would be very hard.<br><br>Various things are worth noting here:<br><br>XHTML is a minority on the Web. Looking at just which elements specify the<br>XHTML namespace on their &lt;html&gt; element, XHTML has at most 15%

<br>penetration, for example.</blockquote><div><br>I am of the belief that that particular statistic is meaningless.  Even if it were 15%, most of those documents aren't well formed.  Of those that are well formed, most don't have the cojones to serve such documents with the appropriate MIME type as they know that to do so would cause compliant UAs to be rather unforgiving.  And of the few insane enough to do so, it is rare that the page in question is actually valid.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Nothing is going to stop people from continuing to use XHTML1, HTML4,<br>HTML3.2

, HTML2, or whatever their existing content uses. HTML5 is a new<br>language, that happens to be backwards-compatible with all of those. There<br>are probably near zero documents on the Web today that are<br>HTML5-compliant, simply because the DOCTYPE is new. That's fine. Just

<br>getting new documents to be compliant would be fine. WordPress, for<br>example, will eventually create new templates, and those could be based on<br>HTML5 (though of course WordPress would have a harder job there due to its

<br>hardcoding of markup, but that's another story).</blockquote><div><br>... on the other hand, I am not of the belief that version numbers mean what they are supposed to.  You will see HTTP 1.1 headers in HTTP 1.0 requests, RSS 

2.0 elements in RSS 0.91 feeds, and HTML4 elements in XHTML documents.<br><br>We live in a cut and paste world.  The fact that I could find an XHTMLism in the front page of <a href="http://Microsoft.com">Microsoft.com</a>

 will likely surprise few.  Lachlan is free to call the authors of WordPress bozos if he likes, but frankly the bozos outnumber you.  What should be the most damning of all is that I found an example on the most prominent page on the 

<a href="http://mozilla.org">mozilla.org</a> site.  No one can say that the authors of that page didn't make a conscious choice in the DOCTYPE for that page.  No one can say that the authors of that page are ignorant.  No one can say that mozilla has a(n entirely) cavalier attitude towards standards.

<br><br>My theory is that we live in a cut and paste world, one based on partial understanding.  Few understand DOCTYPEs and xmlns attributes; mostly, people crib from something that works.<br><br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

If people want to make HTML5 syntactically compatible with XHTML1, such<br>that XHTML1 documents don't cause syntax errors in HTML5, we'll have to do<br>a whole lot more than just allowing trailing /s. I don't really see why

<br>that would be a goal, though. Going further, if we want to make documents<br>in general compliant with HTML5, then we've got our work cut out for us --<br>at least 78% of documents are syntactically incorrect today (not counting

<br>things like trailing /s in attributes, or missing DOCTYPEs -- if you<br>include those, the number is more like 93%).</blockquote><div><br>At the present time being valid is an ideal that is virtually unattainable.  For most people, if your web page is broken, a validator is probably the last place you want to go as it will require you to fix a number of things that frankly nobody cares about before you can see the real errors.

<br><br>The situation is not perfect, but perhaps a bit better for feeds.  For the overwhelming majority of errors that the feed validator reports, there is somebody that cares.  Example: try viewing a feed that isn't well formed using IE7.

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">In general, people don't migrate to new versions of HTML. They only use<br>new versions for new documents. Which is fine, since HTML5 UAs are going

<br>to be backwards-compatible (by design).</blockquote><div><br>Now we are getting to the real question:  backwards compatible with what?  Only with compliant  documents (i.e., at most 22% of the web) or with pages like the one at 

<a href="http://mozilla.org">mozilla.org</a>?<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> They've already reaped all the benefits of XHTML -- cleaner, more

<br>> readable, more maintainable code.<br><br>It's a myth that XHTML gives you those benefits, by the way, especially if<br>you don't actually use an XML pipeline (which WordPress doesn't).</blockquote><div><br>I have no interest in that discussion.   

<br></div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">> The very idea of HTML5 is to not demand that the Web be scrapped and<br>> rewritten. We need the people who have rewritten all their pages so that

<br>> they validate on the W3C validator -- they have the fire and the zeal<br>> and the will to spread our format. We need to make the migration from<br>> invalid XHTML to valid HTML5 very, very easy for them. We can't require

<br>> them to dig through PHP spaghetti. And that means that, no matter how<br>> it's achieved, &lt;br/&gt; needs to be valid HTML5.<br><br>I don't really understand this argument. Those who use XHTML1 because it's<br>

"the latest thing", are as likely to use HTML5 because it's "the latest<br>thing", regardless of how complex that is. After all, they made the<br>transition to XHTML, why wouldn't they make the transition to HTML5?

</blockquote><div><br>More likely, those that chose XHTML1 because it was the latest thing are now jaded by the promises made - and largely unkept - and will take a pass on HTML5.<br><br>Unless, of course, HTML5 compliance is simultaneously both more meaningful and easier to achieve than XHTML1 compliance.

<br><br>Drawing lines in the sand and maintaining that "&lt;br /&gt;" is invalid is only going to make more busy work for a lot of people.  If you try to explain why this decision was made, most won't understand, and eventually most will decide that compliance isn't worth the bother.

<br><br>However, drawing lines in the sand that "&lt;p /&gt; doesn't mean what you think it means" will affect few, and the reason for that particular line is both sound and educational.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I'm being devil's advocate here. As I noted earlier, I don't have an<br>opinion on this yet; I'm interested in what people are saying.</blockquote><div><br>I'm impressed that you are keeping an open mind.<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

What would be most helpful is if it could be clearly stated what the<br>proposal is exactly (trailing /s are already handled by the parser, is<br>the proposal just to make them not raise an error in some cases? What<br>cases, exactly? How would this change the parser spec?), what the reason

<br>for this proposal is, and what the pros and cons are.</blockquote><div><br>Just FYI, I'm in no rush here.   What I said about living in a world where mostly what exists out there consists of partial understanding applies to me too.  Without running code and test cases, I don't yet fully 'grok' what the parser described in the document is supposed to do.  But I will get there.

<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">--<br>Ian Hickson               U+1047E                )\._.,--....,'``.    fL<br>

<a href="http://ln.hixie.ch/">http://ln.hixie.ch/</a>       U+263A                /,   _.. \   _\  ;`._ ,.<br>Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'<br></blockquote></div><br>- Sam Ruby<br>