[whatwg] Joe Clark's Criticisms of the WHATWG and HTML 5

Ian Hickson ian at hixie.ch
Mon Oct 30 12:33:55 PST 2006

On Sun, 29 Oct 2006, Henri Sivonen wrote:
> > 
> > Due in no small part to WHAT WG’s leadership by a strict standardista
> Well, the leadership applies different kinds of strictness to the 
> tokenizer/DOM level and to semantics. Personally, I'd like the 
> tokenizer/DOM part to be a tad stricter and the semantics part to be 
> more lax.

FWIW, I'm pretty sure that when I get to going through your comments in 
detail, this is pretty much the direction the spec will go in.

> FWIW, I think <samp> and <kbd> don't deserve to be in HTML and I am not 
> convinced that the use cases for <var> could not be satisfied by <i>.

I'm lukewarm on all three, but the cost of keeping these is probably 
slightly less than the cost of removing them, so I'm tending towards 
keeping them... FWIW, <var> is used the most of those three, and <samp> 
the least; they are all three used more often than <bdo> or <ruby>, at 
least in the sample of several billion files I last made. (We're talking 
in the 0.01% to 0.05% range here.)

> I can't remember seeing any use case-based rationale for the <t> 
> element. I'm not convinced that having it is a good idea.

<t> (or an equivalent) has been widely requested, especially in the 
microformats and CSS communities. Several microformats have need for 
encoding specific times and/or dates, and are currently (ab?)using <abbr> 
for this purpose. The CSS community has requested a <date> or <time> 
element because they want to restyle dates and times according to locale. 
The blogging and content publishing communities have also raised the need 
for a way to unambiguously mark up what part of their document is a date 
and/or time, though in their case (as with microformats) they need a way 
to then mark each date/time element as being a particular semantic 
(publishing date, birth date, calendar event time etc).
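
To illustrate the <abbr> workaround mentioned above, here is a minimal 
sketch of the microformats date-time pattern. The class names come from 
hCalendar; the event text and timestamp are made up for illustration, and 
nothing here shows the syntax of the proposed <t> element, which this 
message does not spell out.

```html
<!-- hCalendar-style date: human-readable text as content, ISO 8601 value in title="" -->
<p class="vevent">
  <span class="summary">List discussion</span> on
  <abbr class="dtstart" title="2006-10-30T12:33:55-08:00">October 30th</abbr>
</p>
```

The machine-readable value rides in title="", which is meant to hold the 
expansion of an abbreviation; that mismatch is the "(ab?)using" in question.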

Also, the aforementioned research indicated that there are substantial 
amounts of content on the Web that uses invented elements, IDs, and class 
attributes to mark up dates and times. For example, I found about the same 
number of pages with the obscure ID "updatedtime" as I did pages with a 
<button> element; "date" was the 14th most frequently seen class name.

> I'm inclined to think that the <cite> element is useless. <i> could be 
> used for marking up titles of works and <b> could be used for magazine 
> and newspaper-style marking up of first instance of personal names. I 
> have yet to see a markup consumption use case that would work on the 
> public Web and would use <cite>.

<cite> is used more than <button>. It's used almost as often as <h6>.

One of the reasons for keeping <var>, <cite>, <em>, etc, separate, instead 
of saying that authors should just use <i> for all of them, is that it 
makes styling them differently much easier. (Why is <i class="var"> better 
than <var>?)
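
A rough sketch of the styling point, assuming a page that wants variables, 
citations, and emphasis to look different from one another. The particular 
style rules and sample sentence are arbitrary choices for illustration.

```html
<style>
  /* Distinct elements can be targeted directly by element name */
  var  { font-style: italic; font-family: serif; }
  cite { font-style: italic; }
  em   { font-style: normal; font-weight: bold; }

  /* With <i> alone, every use needs a class to be told apart */
  i.var  { font-style: italic; font-family: serif; }
  i.cite { font-style: italic; }
</style>

<p>Let <var>n</var> be the count given in <cite>Some Title</cite>; it <em>must</em> be positive.</p>
<p>Let <i class="var">n</i> be the count given in <i class="cite">Some Title</i>.</p>
```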

> Also, I was unable to explain to my mother why she should use <cite> 
> instead of whatever command-i does in Dreamweaver. (Apparently, 
> command-i applied <i> in Dreamweaver 4 but applies <em> in Dreamweaver MX, 
> which should indicate to semanticists that <em> and <strong> are a lost 
> cause and really are only aliases for <i> and <b>.)

WYSIWYG editors really should use <i> and <b>, I think. I'll probably be 
including a section specifically targeted at WYSIWYG editors that don't 
have semantic support.

> >     * note and reference for footnotes, endnotes, and sidenotes (not 
> > aside in “HTML5”)
> Yes, this is an area where document and converter authors currently need 
> to come up with their own class-based hacks. Ideally a continuous media 
> user agent could show footnotes in context so that they don't become de 
> facto endnotes.

If anyone has any ideas on this, please post them to the list. (The CSS 
group is also looking at footnotes closely.) One thing to consider when 
looking at footnotes is "would the title="" attribute handle this use case 
as well as what I'm proposing?". If the answer is "yes", or "almost", then 
it's probably not a good idea to introduce the new feature.
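
For comparison, a minimal sketch of the title="" baseline that a footnote 
proposal would be measured against. The sentence and note text are invented 
for illustration.

```html
<!-- The note text lives in title=""; most browsers show it as a tooltip on hover -->
<p>
  The sample covered several billion files<span title="The note text would go here.">*</span>.
</p>
```

One limitation of this baseline is that title="" holds only plain text, so a 
note that itself needs links or markup is not covered by it.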

> > * bibliographies, tables of contents, and indices (some in “HTML5”)
> One of the issues here is the tension of HTML as an authoring format and 
> HTML as a delivery format. That is, do we really want the browser to do 
> the stuff BibTeX does? OTOH, if the browser just displays output from a 
> bibliography generator, what level of semantic encoding is actually 
> useful for the consumers of the document? PDF doesn't attempt to go 
> further than identifying what blocks are bibliography entries. Is that 
> useful enough to bother? If the markup is very detailed so that Google 
> Scholar (or whatever) could analyze cross-references in scientific 
> papers, wouldn't that veer back into focusing on computer science 
> papers?
> I, for one, am writing about HTML5 in LaTeX. One of the reasons was 
> BibTeX even though I have to hack a .bst of my own.

I have to be honest that personally I've not really found a need for a 
bibliography tool. In fact, the preprocessor I use to generate the WHATWG 
specs (the one that does all the cross-references) supports automatic 
bibliography generation, but I go out of my way to not use it and to do 
the bibliography manually. (And the WF2 spec doesn't have a small 
references section!)

But if you could elaborate on exactly what it is you need in terms of 
bibliography in HTML5, it's certainly an area open for discussion.

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
