[whatwg] Joe Clark's Criticisms of the WHATWG and HTML 5

Tue Dec 11 02:53:36 PST 2007

(Despite the subject line, this thread quickly veered away from Joe's blog 
post and instead covered a variety of subjects. I have attempted to address 
the points that had substance and may affect the spec in my replies below. 
Please let me know if I missed something in this thread that you believe 
should have resulted in a change to the spec.)

On Thu, 22 Mar 2007, Henri Sivonen wrote:
> > >
> > > FWIW, I think <samp> and <kbd> don't deserve to be in HTML and I am 
> > > not convinced that the use cases for <var> could not be satisfied by 
> > > <i>.
> > 
> > I'm lukewarm on all three, but the cost to keeping these is probably 
> > slightly less than the cost to removing them, so I'm tending towards 
> > keeping them...
> 
> I tend to agree. But then they should not be used as a basis for arguing 
> anything about the design of HTML5 or as bases for analogies for 
> including new "semantic" elements of similar kind.

Agreed.

> > The CSS community has requested a <date> or <time> element because 
> > they want to restyle dates and times according to locale.
> 
> I tend to think that this has huge potential for people getting confused 
> and missing appointments. Time zones are impractical and confusing 
> enough without DWIM changing them.

Sorry, by locale I just meant the syntax, not the time zone.

> > Also, the aforementioned research indicated that there are substantial 
> > amounts of content on the Web that uses invented elements, IDs, and 
> > class attributes to mark up dates and times. For example, I found 
> > about the same number of pages with the obscure ID "updatedtime" as I 
> > did pages with a <button> element; "date" was the 14th most frequently 
> > seen class name.
> 
> However, merely marking up something as *a* date without knowing *what* 
> date is not particularly interesting. (Compare with the fluffiness of 
> Dublin Core.)

Well, it helps a little -- microformats for example add the "what" using 
their own semantics, and CSS doesn't need the "what" to decide on the 
"how" (to present).

> > > I'm inclined to think that the <cite> element is useless. <i> could 
> > > be used for marking up titles of works and <b> could be used for 
> > > magazine and newspaper-style marking up of first instance of 
> > > personal names. I have yet to see a markup consumption use case that 
> > > would work on the public Web and would use <cite>.
> > 
> > <cite> is used more than <button>. It's used almost as often as <h6>.
> > 
> > One of the reasons for keeping <var>, <cite>, <em>, etc, separate, 
> > instead of saying that authors should just use <i> for all of them, is 
> > that it makes styling them differently much easier.
> 
> Assuming, of course, that you want to style them differently instead of 
> just italicizing all of them.

This seems common enough to keep them, given the arguments listed at the 
top of this e-mail.

> I am still on the fence about using <cite> in my thesis. Currently I am 
> using it to mark up titles of works.

Any advice as to what the spec should say on the matter is welcome; in fact 
I have a whole folder of such advice that I'll be addressing in due 
course.

> > (Why is <i class="var"> better than <var>?)
> 
> It isn't. But <i> is better than <var> for editor UIs if all you want to 
> do is to italicize (the common case).

Granted, and <i> is allowed. It is semantically less precise, but that 
doesn't necessarily matter.

> > > |     * note and reference for footnotes, endnotes, and sidenotes 
> > > |       (not aside in “HTML5”)
> > > 
> > > Yes, this is an area where document and converter authors currently 
> > > need to come up with their own class-based hacks. Ideally a 
> > > continuous media user agent could show footnotes in context so that 
> > > they don't become de facto endnotes.
> > 
> > If anyone has any ideas on this, please post them to the list. (The 
> > CSS group is also looking at footnotes closely.) One thing to consider 
> > when looking at footnotes is "would the title="" attribute handle this 
> > use case as well as what I'm proposing?". If the answer is "yes", or 
> > "almost", then it's probably not a good idea to introduce the new 
> > feature.
> 
> I am not happy with title='' for footnotes.
> 
> First, there are all the usual objections against putting 
> natural-language text in attributes.

Agreed.

> Second, tooltips (the typical screen media presentation of title='') 
> have significantly different properties compared to print footnotes when 
> it comes to reader attention. Tooltips aren't very discoverable and are 
> inconvenient to read. Footnotes, on the other hand, are rather easy to 
> read. Moreover, footnotes containing prose (as opposed to just URIs or 
> other identifier data) actually work as a device for emphasizing stuff 
> that the author pretends to de-emphasize while knowing that (s)he is 
> really emphasizing. Tooltips don't work like this. I remember reading 
> somewhere (I forget where) that many people read the footnotes first 
> when they turn a new page in a book.
> 
> This is why I'd be interested in being able to turn <aside>s into 
> footnotes in print.

CSS is working to improve support for footnotes. I think we may have to 
introduce footnote-specific markup in a future version of HTML.

> > > | * bibliographies, tables of contents, and indices (some in 
> > > |   “HTML5”)
> > > 
> > > One of the issues here is the tension of HTML as an authoring format 
> > > and HTML as a delivery format. That is, do we really want the 
> > > browser to do the stuff BibTeX does? OTOH, if the browser just 
> > > displays output from a bibliography generator, what level of 
> > > semantic encoding is actually useful for the consumers of the 
> > > document? PDF doesn't attempt to go further than identifying what 
> > > blocks are bibliography entries. Is that useful enough to bother? If 
> > > the markup is very detailed so that Google Scholar (or whatever) 
> > > could analyze cross-references in scientific papers, wouldn't that 
> > > veer back into focusing on computer science papers?
> > > 
> > > I, for one, am writing about HTML5 in LaTeX. One of the reasons was 
> > > BibTeX even though I have to hack a .bst of my own.
> > 
> > I have to be honest that personally I've not really found a need for a 
> > bibliography tool. In fact, the preprocessor I use to generate the 
> > WHATWG specs (the one that does all the cross-references) supports 
> > automatic bibliography generation, but I go out of my way to not use 
> > it and to do the bibliography manually. (And the WF2 spec doesn't have 
> > a small references section!)
> > 
> > But if you could elaborate on exactly what it is you need in terms of 
> > bibliography in HTML5, it's certainly an area open for discussion.
> 
> Actually, I don't think HTML5 should leave bibliography formatting to 
> the browser. There are just too many ways to want to micromanage the 
> bibliography presentation. I guess that leaves HTML as both an editing 
> format and a delivery format but the editing and delivery instances 
> cannot always be the same. Instead, intermediate processing on the 
> authoring side is needed.

That's how I've used HTML for writing specs for some time.

> BTW, after writing what is quoted above, I moved from .tex, .bib, 
> TeXlipse and LaTeX to .xhtml, .bib, oXygen, TeXlipse and Prince. So I 
> got rid of LaTeX but kept .bib. To make it work, with the help of the 
> TeXlipse lead developer, I wrote a special-purpose bibliography 
> generator for XHTML that uses the .bib parser from TeXlipse.
> 
> It appears that the microformat folks are working on hCite. I can't 
> figure out what hCite is exactly (see the thread starting at 
> http://microformats.org/discuss/mail/microformats-discuss/2007-March/008996.html 
> ), but it seems to me that it isn't workable as a hand-authoring format. 
> Instead, you'd have to generate it from something like .bib.

So in conclusion... we should leave HTML5 as is for now?

On Fri, 23 Mar 2007, Nicholas Shanks wrote:
> 
> I was thinking more along the lines of:
> 
> 1) We start with a set containing all potential authors, human and 
> robotic, past, present and future.
> 2) We remove from that set the people and programs who don't care about 
> or are not willing to learn correct methods of authorship, these people 
> should have no say.
> 3) We then take a poll of every possible string value for new elements, 
> and sort the result as a priority list, amalgamating words that mean the 
> same thing.
> 4) We decide how many elements HTML should have (i.e. how complicated it 
> should be/how hard for new people to learn), and cut the list at this 
> number.
> 5) We then use this as the new HTML.
> 
> That way I'm sure there would be 100 million votes for <copyright> and 
> perhaps 250,000 votes for <var>, <dfn>, <kbd> etc.

In a way, I did that, with the study of what authors are using today.

In practice, though, since browser vendors aren't going to drop support 
for existing elements, existing elements end up being much cheaper to 
"add" than new elements, and end up being not especially worth removing.

> But how can you justify the presence of <kbd> when so few people write 
> content where keyboard input has to be represented? I've never met 
> anyone whose hobby is writing computer software manuals in HTML. By 
> contrast there are millions upon millions of people who watch television 
> and discuss it on the internet. Why isn't <tv-show> an element?

History, basically. <kbd> exists, it costs us basically nothing to keep it 
(and actually would cost us more, probably, to remove it). <tv-show>, on 
the other hand, has little demand and would cost a lot to add.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'