[whatwg] several messages about XML syntax and HTML5

Mike Schinkel mikeschinkel at gmail.com
Tue Dec 5 03:30:15 PST 2006

Ian Hickson wrote:
>> > Of course the fastest to display would be XHTML, giving site 
>> > owners a reason to go with that. OR the user agent could be 
>> > given the authority to try the different ones whenever it sees 
>> > text/html.
>> It isn't clear to me why you think XHTML would be fastest. 
>> In practice, HTML is considerably better optimised in browsers 
>> than XHTML. 

Only because, in what I proposed, the browser would first try XHTML before
trying the others. If they tried HTML5 first, HTML5 would display faster.
But since it doesn't appear trying multiple is an option, my comment is

>> XHTML5 is not really intended to be used, it's only defined for 
>> the purposes of making sure XML users don't try to each invent 
>> their own version, resulting in dozens of incompatible versions.
>> HTML5 as text/html is the main serialisation format for HTML5.

So am I to understand that, moving forward, the W3C will recommend HTML5 for
web pages and XHTML only for special cases?  If so, and there are freely
available conformant parsers on all platforms that the spec explicitly
recognizes, I'd be happy with that.

>> Well, this is out of scope for the WHATWG...


>> but I encourage you to speak to browser vendors and search 
>> engines and see what they say.

It's nothing a browser vendor would need to support. OTOH, from a search
engine perspective, you work for Google...? :)

>> > How about *real* XML Data Islands then?
>> What would those be?

For example:


<XMLDATA>
	Data in XML format goes here.
</XMLDATA>


The HTML5 parser would pass anything within <XMLDATA> elements to an XML
parser and insert whatever it returns into the response stream.  This could
allow SVG and MathML to work, no?
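
To make the idea concrete, here is a rough sketch in Python of the hand-off
described above. It is only an illustration under stated assumptions: the
XMLDATA element is the hypothetical one from this message (it is not in any
spec), the function name expand_islands is invented, and a regular expression
stands in for a real HTML5 tokenizer.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical element name from the proposal above; not part of any spec.
ISLAND = re.compile(r"<XMLDATA>(.*?)</XMLDATA>", re.DOTALL | re.IGNORECASE)


def expand_islands(html):
    """Hand each <XMLDATA> body to an XML parser and splice the result back in."""

    def handle(match):
        try:
            # A strict XML parser: well-formedness errors raise ParseError.
            root = ET.fromstring(match.group(1).strip())
        except ET.ParseError:
            # Ill-formed island: leave the original markup untouched.
            return match.group(0)
        # Re-serialise whatever the XML parser returned into the stream.
        return ET.tostring(root, encoding="unicode")

    return ISLAND.sub(handle, html)
```

What to do with an ill-formed island is left open by the proposal; the sketch
simply passes it through unchanged, which is one choice among several.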

>> > > Move to HTML5 with an XML pipeline. (This is basically the same as 
>> > > number 6, except that there's no code to drop first.)
>> > 
>> > You do realize that this will happen over a period of many years if it 
>> > happens at all?  And in the mean time...?
>> In the mean time... what? HTML5 won't be "complete" for decades, 
>> I don't see what the problem is here. Everything we're doing here is 
>> on a large timescale.

I mean that a long time will pass between when people start using HTML5 and
when HTML5 pipeline tools become ubiquitous, if it is left up to the free
market to develop them and they are not specified as part of the spec.  That
means lots of string-concatenation apps will be built in the meantime,
creating another slate of legacy apps.  I know the W3C hasn't done this
before, but can't we learn from past mistakes and take a fresh approach?  I
think it's very important to do this, but I won't keep beating the dead
horse if it's not going to happen.

>> XHTML, which introduced a new format, provides a single direction? 
>> I'm confused. I thought it was the introduction of XHTML that 
>> introduced multiple formats!

Actually, text/plain came even before text/html. :)  Anyway, XHTML was
presented by the W3C as an eventual replacement for text/html.  I'm ideally
hoping that we can have one target with the rest marked as legacy, not
multiple incompatible ones.

>> Anyway, with HTML5, you have a single direction: HTML5-as-text/html.

What's the position of XHTML?  It *seems* like it will still be presented as
a viable option by the W3C.

>> I'm confused as to how XHTML, which introduced a new way of 
>> doing things, reduced the number of ways of doing things.

See above answer.

>> Anyway. Just consider HTML5-as-text/html to be your only 
>> language, and you'll be set. 

No man is an island, especially on the Internet.  I can't consider HTML5 as
the only one to target for the future if others head down the XHTML path.  For
one of 1000 considerations, how do I know if the website I'm posting a
comment to uses HTML5 or XHTML (as a non-technical user)?

>> > By designing in extensibility [...]
>> HTML has a well-defined extensibility model, as 
>> used by the Microformats community. It's even got 
>> a good accessibility story.

One of the limitations on Microformat design is the lack of available tags.
The issue I'm bringing up here is new (from me), but what about
allowing several more attributes to be added to the standard attribute list
for all elements?  For example, it would be really nice if attributes like
abbr, href, name, rel, rev, scope, size, src, type, and value were available
on ALL elements. (Please, pretty please... :)

If it's already in the spec, forgive me but I'm still trying to wade through
all 200+ pages of it.

>> > Until then, the preferred technique for extracting things 
>> > like trackback metadata will continue to be screen scraping 
>> > with regular expressions.
>> I believe pingback shows quite clearly that extension mechanisms 
>> for such things already exist and that the fact that trackback 
>> doesn't use them is not a fault of HTML.

Mind if I ask for clarification on this?  I am not advocating anything here,
you just piqued my interest in learning what you meant.

>> Ok. HTML supports this today, in both the HTML and XML 
>> serialisations, using class values and rel types. Microformats.org 
>> is the community that is most actively working with these 
>> mechanisms, but the mechanisms are open to anyone to use. 
>> As I mentioned above, this even has a pretty decent accessibility 
>> story (which is unusual for extension mechanisms).

Let me amplify the need to have more attributes available for extension.
Without more attributes, the story is really not that decent. ;-)
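
For reference, the class/rel mechanism quoted above can be read with an
ordinary HTML parser instead of regular expressions. The following Python
sketch is only an illustration: the names RelClassCollector and collect are
invented here, and which rel types or class names a consumer should care
about is left as an assumption.

```python
from html.parser import HTMLParser


class RelClassCollector(HTMLParser):
    """Collect rel-typed links and class values without regex scraping."""

    def __init__(self):
        super().__init__()
        self.links = []       # (rel token, href) pairs, in document order
        self.classes = set()  # every class token seen on any element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel and class are both space-separated token lists.
        for rel in (attrs.get("rel") or "").split():
            self.links.append((rel.lower(), attrs.get("href")))
        self.classes.update((attrs.get("class") or "").split())


def collect(html):
    """Return (links, classes) found in the given HTML text."""
    parser = RelClassCollector()
    parser.feed(html)
    parser.close()
    return parser.links, parser.classes
```

A consumer would then pick out the tokens it understands, for example a link
whose rel token is "pingback" or an element whose class includes "vcard".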

Thanks for listenin'

-Mike Schinkel
