[whatwg] several messages about XML syntax and HTML5

Wed Dec 6 21:44:05 PST 2006

On Wed, 6 Dec 2006, Mike Schinkel wrote:
> > >
> > > The HTML5 parser would pass anything within <XMLDATA> elements to an 
> > > XML parser and insert whatever it returns into the response stream.  
> > > This could allow SVG and MathML to work, no?
> >
> > What's the use case?
> 
> The use-case is to allow arbitrary XML to be embedded into HTML

That's not a use case, that's a feature proposal. A use case is something 
like "I want to share my calendar information with my children", or "I 
wish I could include spreadsheets in my financial documents written in 
HTML", or "I need to be able to include flowcharts in my documents".

> > What's the processing model?
> 
> I don't understand what you are asking.

How would your proposal work? What would the rules be for how Web browsers 
are to handle the content?

> > It's not clear to me what the problem is we're trying to solve here, 
> > nor what the proposal for solving it is.
> 
> Ability to insert XML-based solutions into HTML and have them processed 
> as XML.

That's not a problem. That's a solution, looking for a problem.

What is the problem that this solves?

> This would allow almost ultimate flexibility moving forward and not 
> require an HTML6 for many things.

Why would requiring HTML6 be a bad thing?

> What's more, it would help to see what the world at large has created 
> for extensions to understand what interests people.

We already have such a mechanism, namely, plugins.

> And because it would be required to be valid (or at least well formed) 
> XML you'd give HTML publishers a chance to learn the rules of XML.

Why is that a good thing?

> It would also allow embedding of what I'll call "Microdirectives", i.e. 
> basically metadata, but visible, like Microformats.  It would let 
> someone publish both a human readable document and a machine readable 
> document.

HTML already lets you do this. Microformats are an example of this.

> Oh, and it would be a great place to store RDF. ;-)

Again, that seems like a solution in search of a problem.

> > What are the parsing requirements?
> 
> Again, not exactly sure what you are asking.  Have I answered already?

I mean, what exactly would the browser have to do to parse the content 
you are proposing? Something like the parsing rules in the spec today, but 
specifically for your proposed feature.

> > Note that any feature that, when misused, will "work better" in 
> > browsers that _don't_ support the feature than in browsers that _do_ 
> > support the feature, is doomed to failure, because browsers will be 
> > forced to emulate the browsers that don't support the feature instead. 
> > This basically implies that any syntax checking in a text/html 
> > document that results in fatal error (even for a subpart) but that 
> > renders ok in legacy browsers is a non-starter.
> 
> I don't follow this. Did you state it correctly?

Yes.

> Does it apply to what I'm talking about?  And if so, why?

Here's an example. If this:

   ...text...
   <new-feature><erroneous content></new-feature>
   ...text...

...displays like this:

   ...text... ...text...

...in existing browsers, but like this:

   ...text... ERROR ...text...

...in new browsers, then it looks worse in new browsers than old ones. 
Thus, new browsers will want to go back to the way that old browsers 
handled it, so that they don't handle it worse than the (old) competition.

> A legacy browser would have to ignore an <xmldata> element.  Why would 
> it be bad if older browsers worked better?  I can't even conceive of an 
> example.

If legacy browsers ignore it, but new browsers show an error, then the 
legacy browsers, to the user, are doing a better job, and the browser 
vendors will be discouraged from implementing the feature correctly.
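
To make the comparison concrete, here is a minimal Python sketch of the two 
behaviours. It is only an illustration under stated assumptions: the 
<xmldata> element is the hypothetical one proposed in this thread, the 
function names are made up, and neither function reflects how any real 
browser is implemented. The "legacy" renderer drops tags it does not know 
and keeps the text; the "strict" renderer hands the <xmldata> contents to an 
XML parser and shows ERROR when they are not well-formed.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Legacy-style rendering: every tag is dropped, text is kept."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def render_legacy(markup):
    """Render as a browser that has never heard of <xmldata> (hypothetical)."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace, roughly as HTML rendering would.
    return " ".join("".join(parser.chunks).split())


def render_strict(markup):
    """Render as a browser that XML-checks <xmldata> contents (hypothetical)."""

    def check(match):
        try:
            # Wrap in a root element so a fragment can be checked.
            ET.fromstring("<root>" + match.group(1) + "</root>")
        except ET.ParseError:
            return "ERROR"
        return ""  # Well-formed XML: nothing visible in this sketch.

    checked = re.sub(r"<xmldata>(.*?)</xmldata>", check, markup, flags=re.S)
    return render_legacy(checked)


# <bar> is never closed, so the embedded XML is not well-formed.
SAMPLE = "...text... <xmldata><chart><bar></chart></xmldata> ...text..."
print(render_legacy(SAMPLE))
print(render_strict(SAMPLE))
```

With the malformed sample, the legacy path prints the two text runs and the 
strict path puts ERROR between them, which is the situation described 
above: the browser that implements the feature looks worse to the user.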

> > They're using HTML5. Anything using text/html is HTML5, and everyone 
> > basically uses text/html. There are exceptions (Sam, e.g.), but there 
> > are _always_ exceptions.
> 
> Clarification then; what happens if IE8 supports application/xhtml+xml? 
> Sounds like that would actually be a bad thing.  Otherwise we might end 
> up with two camps: the HTML5 camp and the XHTML camp, and all the 
> associated chaos.

I have no opinion on this.

> > > This issue I'm bringing up is new (from me), but what about 
> > > allowing several more attributes to be added to the standard 
> > > attribute list for all elements?  For example, it would be really 
> > > nice if attributes like abbr, href, name, rel, rev, scope, size, 
> > > src, type, and value were available on ALL elements. (Please, pretty 
> > > please... :)
> >
> > Could you elaborate on what each one of these attributes would mean?
> 
> I don't have specifics

Then it is not clear that they are required.

> but I know from participating on the Microformat list that one of the 
> biggest problems is lack of available attributes.

That certainly isn't what I've heard from the Microformats community.

> If there were more attributes, the Microformat community could develop 
> much less verbose markup; i.e. instead of having three sets of <DIV> 
> tags around an element each with one attribute that can be used, they 
> could define Microformats that only required one <DIV> tag.  But I can't 
> give you exactly what they would be used for, just that the Microformat 
> community could learn to apply them effectively. By analogy, it would be 
> like someone requiring TimBL to specify upfront all the kinds of content 
> people would put on a web page before giving him the green light to work 
> on the web.  I'm asking for building blocks; the Microformat community 
> would define how they apply.

Meaningless building blocks -- or building blocks whose meaning varies 
from page to page depending on who wrote them -- are not good for the Web.

> > Taking "abbr", though, if you want to extend HTML with some custom 
> > features and one of the things you have to add is an abbreviation, then 
> > use the <abbr> element. It already supports abbreviations.
> 
> For one example, if I already have <td> tags enclosing a value, why do I 
> need to add almost 50% more characters when I could instead do it more 
> cleanly?
> 
> 	<td><abbr title="United States">USA</abbr></td>
> vs.
> 	<td abbr="United States">USA</td>

14 characters is not enough to warrant an entirely new attribute with its 
own processing model, conformance requirements, semantics, etc. Every 
element and attribute is very expensive to add. We have to keep it to a 
bare minimum.

> 	<abbr class="currency" title="USD">
> 		<span class="amount">54.97</span>
> 	</abbr>
> vs.
> 	<span class="currency" type="USD" description="amount">54.97</span>

What's wrong with:

   $54.97 (USD)

...? It doesn't seem like you actually need _any_ markup here. If you 
really want it:

   <span class="amount"><abbr title="USD">$</abbr>54.97</span>

I don't see why this is a problem.
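
As a rough check that the existing markup already carries everything a 
tool would need, here is a small Python sketch that reads the currency and 
the amount back out of that span. The class and function names are made up 
for the example, and the "amount" class name is just the one used above, 
not part of any published Microformat.

```python
from html.parser import HTMLParser


class AmountReader(HTMLParser):
    """Collects the <abbr title> and the text of a span with class "amount"."""

    def __init__(self):
        super().__init__()
        self.in_amount = False
        self.in_abbr = False
        self.currency = None
        self.text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "amount":
            self.in_amount = True
        elif tag == "abbr" and self.in_amount:
            self.in_abbr = True
            self.currency = attrs.get("title")

    def handle_endtag(self, tag):
        if tag == "abbr":
            self.in_abbr = False
        elif tag == "span":
            self.in_amount = False

    def handle_data(self, data):
        # Keep the number, skip the symbol that sits inside <abbr>.
        if self.in_amount and not self.in_abbr:
            self.text.append(data)


def read_amount(markup):
    """Return (currency, amount) as strings; a sketch, with no validation."""
    reader = AmountReader()
    reader.feed(markup)
    reader.close()
    return reader.currency, "".join(reader.text).strip()


print(read_amount('<span class="amount"><abbr title="USD">$</abbr>54.97</span>'))
```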

> Another thing that would be nice would be to add a <uf> tag (for u=Micro 
> & f=Format) and give it lots of attributes with short names for semantic 
> application:
> 
>        <span class="money">
>                <span class="symbol" title="dollar">$</span>
>                <abbr class="currency" title="USD">
>                        <span class="amount">54.97</span>
>                </abbr>
>        </span>
> vs.
>        <uf c="money">
>                <uf c="symbol" t="dollar">$</uf>
>                <uf a="currency" t="USD" n="amount">54.97</uf>
>        </uf>
> 
> But I know you'll probably consider <uf> too strange...

It's not that it's strange, it's that it's meaningless. Those values are 
all opaque, they mean nothing. That's terrible for accessibility.

> > The pingback specification does exactly what the trackback 
> > specification does, but without relying on RDF blocks in comments or 
> > anything silly like that. It just uses the Microformats approach, and 
> > is far easier to use, and doesn't require any additional bits to add 
> > to HTML.
> 
> [offtopic]
> I'd never heard of pingback. I googled for it and found your website 
> first, but couldn't find the RFC number.  You have a copyright of 2002, 
> and it appears that Trackback was also developed in 2002. So are you 
> implying they should have used Pingback instead?  It appears they were 
> developed in parallel?

They were made around the same time (Trackback was invented first). My 
point was just that Trackback is not a good example of why you need more 
attributes in HTML, since there are equivalent technologies that do it 
with existing markup and no loss of detail.

> BTW, why did you use XMLRPC when a simple RESTful POST would have 
> sufficed (and been easier to implement)?

Inexperience. Pingback is a terrible design. You really only need a single 
HTTP header ("Referer") to do it. The entire spec is in fact redundant.
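
For what it's worth, the header-only approach could look something like 
the Python sketch below. This is an assumption-laden illustration, not the 
Pingback or Trackback protocol: the function name, the plain dict of 
request headers, and the example host names are all hypothetical, and a 
real server would still want to fetch the referring page to confirm that 
it actually links back.

```python
from urllib.parse import urlsplit


def linkback_source(headers, own_host):
    """Return the referring URL if another site sent the visitor, else None.

    `headers` is assumed to be a plain dict of request headers keyed with
    the exact name "Referer" (the header's standard, misspelled name).
    """
    referer = headers.get("Referer")
    if not referer:
        return None
    parts = urlsplit(referer)
    # Only accept absolute http(s) URLs that name a host.
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return None
    # Links from our own pages are not linkbacks.
    if parts.hostname == own_host.lower():
        return None
    return referer


print(linkback_source({"Referer": "http://blog.example/post"}, "ln.example"))
```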

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'