<section> and headings [was: Re: [whatwg] LABEL and radio/checkbox onclick]

Sun Aug 29 07:03:27 PDT 2004

Ian Hickson wrote:
> On Thu, 26 Aug 2004, James Graham wrote:

>>The new scheme makes it very easy to create illogical page structures. 
>>For example, it's not clear how the following should work:
>>
>><section>
>><h1>Title</h1>
>><section>
>><h1>Subheading</h1>
>><section>
>><h2>Second subheading</h2>
>></section>
>></section>
>></section>
> 
> 
> What is unclear about it?

Right, I had missed the fact that h1 differs only in the style; I had 
the mistaken idea that <h1> was the only element that worked with <section>.

> 
> 
> 
>>On the other hand, there is some merit to a situation in which <section> 
>>creates structure and the choice of n in <h{n}> denotes the 'importance' 
>>of the heading relative to the content of the page (so, for example, 
>>search bots give lower weight to <h6> elements than <h1> elements 
>>regardless of the nesting).
> 
> 
> I considered this, but making <h1>-<h6> have _different_ semantics than 
> each other in <section> elements basically makes it impossible to do the 
> whole backwards-compatibility trick.
> 

In principle, I think I see the point. In practice, the way that many 
authors seem to use the existing <h1> to <h6> elements is broadly 
compatible with this principle. IIRC the HTML 4 spec states that the 
different elements denote the "importance" of different heading levels. 
Authors tend to ignore the text that follows, which says that headings 
can be used to create a document outline, and mark e.g. headings in 
their sidebar as being of lower importance than headings in the main 
document even if, structurally, they're at the same depth. This makes 
creating a useful outline impossible, but it does mean that some search 
bots give higher weight to more important content.

From my limited observation of author practice, the most 
backward-compatible and author-compatible heading model I can imagine is:

The first h1...h6 child of a <section> element is the heading for that 
section. Subsequent h1...h6 elements in the same <section> are 
subheadings (in the sense of subtitles, e.g. "A New Hope" in 
"Star Wars - A New Hope") of that section. h1...h6 elements have 
decreasing levels of importance, with h1 being the most important and 
h6 the least. Higher-level heading tags should be used to mark the 
headings most important to the page content as a whole, and lower-level 
headings should be used to mark less significant divisions.

When an h1...h6 element is the child of a <section> element, UAs which 
construct a document outline must do so from the depth of <section> 
nesting alone and ignore which of h1...h6 is used. Similarly, when 
headings are contained in <section> elements, visual UAs should use 
default heading styles based on the depth of nesting and not on the 
heading element used. When the heading is not contained in a <section> 
element, it should be treated as an HTML 4 heading.
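
To make the model above concrete, here is a rough Python sketch of how a 
UA might derive an outline from it. Everything here (OutlineBuilder, 
build_outline, the "heading"/"subheading"/"html4" labels) is a name made 
up for illustration, and it simplifies "child of a <section>" to "inside 
the nearest open <section>", so treat it as a sketch under those 
assumptions rather than a proposed algorithm.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineBuilder(HTMLParser):
    """Collect (depth, role, importance, text) for every heading."""

    def __init__(self):
        super().__init__()
        self.section_has_heading = []  # one flag per open <section>
        self.current = None            # heading being read, or None
        self.outline = []

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.section_has_heading.append(False)
        elif tag in HEADINGS:
            # Outline depth comes from <section> nesting alone.
            depth = len(self.section_has_heading)
            if depth == 0:
                role = "html4"        # not inside any <section>
            elif self.section_has_heading[-1]:
                role = "subheading"   # a later heading in the same section
            else:
                role = "heading"      # first heading in this section
                self.section_has_heading[-1] = True
            # The n in h{n} is kept only as an importance value.
            self.current = [depth, role, int(tag[1]), ""]

    def handle_data(self, data):
        if self.current is not None:
            self.current[3] += data

    def handle_endtag(self, tag):
        if tag == "section" and self.section_has_heading:
            self.section_has_heading.pop()
        elif tag in HEADINGS and self.current is not None:
            depth, role, importance, text = self.current
            self.outline.append((depth, role, importance, text.strip()))
            self.current = None


def build_outline(markup):
    parser = OutlineBuilder()
    parser.feed(markup)
    parser.close()
    return parser.outline
```

Fed the nested example from my earlier message, this reports the three 
headings at depths 1, 2 and 3 whichever of h1 or h2 was used, while 
still recording 1, 1 and 2 as their importance.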

Now, I'm not saying that's a practical model to use. In particular, I'm 
not sure CSS can represent the constraints on when to ignore the type of 
heading and when to take account of it. However, I believe it is entirely 
backward compatible and consistent with author demands, both those I've 
heard explicitly (I want to mark some headings as low importance so they 
don't flood search results) and those that are implicit (I've noticed 
sites that use the "a second heading in a <div> is a subheading" 
paradigm; off the top of my head, both Eric Meyer (meyerweb.com) and 
Dave Shea (mezzoblue.com) do this in their weblogs). So I think we 
should make some effort to address these needs.

> 
> 
>>In general, I think that explicit markup for document sections is good 
>>(although I would like to see more single-purpose elements such as 
>><header> or <footer> to provide additional semantics for UAs - the 
>>ability to separate out sitewide elements from page-specific content is, 
>>in my opinion, particularly important) but I think we need to carefully 
>>consider the way the old and new heading styles will interact, 
>>particularly since backward compatibility is important.
> 
> 
> Yeah, <header> and <footer> or similar elements are almost certainly going 
> to be defined at some point, along with <content> (for the main body of 
> the page), <entry> or <post> or <article> to refer to a unit of text 
> bigger than a section but smaller than a page, <sidebar> to mean a, well,
> side bar, <note> to mean a note... and so forth. Suggestions welcome. 

The most obvious use case I have in mind would be a UA hiding certain 
sections of the page so that the content is easily accessible. It 
might therefore be good to have a general-purpose <chrome> element to 
denote a section of the page other than the main content. One could then 
subdivide using an attribute (<chrome type="header">, <chrome 
type="footer"> and so on). This is, at least, easily extensible and 
allows a browser to use CSS like "chrome {display:none;}". For 
document-type pages, it might be good to have <pullout> or something 
similar to define a box of content that relates to a part of the article 
but is not in the main flow. This would be superior to just using 
<section> or <div> with float because it would work in non-visual 
browsers. I think a well-defined way of adding footnotes would be 
popular, although it's not quite in the same class of functionality.
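
As an illustration of the hiding use case for a non-CSS UA, here is a 
small Python sketch that drops the text inside <chrome> elements and 
keeps the rest. The <chrome> element and its type attribute are only the 
suggestion made above, not anything that exists, and 
MainContentExtractor and main_text are hypothetical names.

```python
from html.parser import HTMLParser


class MainContentExtractor(HTMLParser):
    """Keep text that is not inside a hidden <chrome> element."""

    def __init__(self, hidden_types=None):
        super().__init__()
        self.hidden_types = hidden_types  # None means hide every <chrome>
        self.open_chrome = []             # True for each hidden open <chrome>
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "chrome":
            kind = dict(attrs).get("type")
            hidden = self.hidden_types is None or kind in self.hidden_types
            self.open_chrome.append(hidden)

    def handle_endtag(self, tag):
        if tag == "chrome" and self.open_chrome:
            self.open_chrome.pop()

    def handle_data(self, data):
        # Text is dropped if any enclosing <chrome> is hidden.
        if not any(self.open_chrome):
            self.parts.append(data)


def main_text(markup, hidden_types=None):
    parser = MainContentExtractor(hidden_types)
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace so the result is easy to compare.
    return " ".join(" ".join(parser.parts).split())
```

Calling main_text(page) corresponds to "chrome {display:none;}", while 
main_text(page, {"footer"}) hides only the footer and shows how the 
attribute allows finer control.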

> We'll probably keep it to a minimum though. The idea is just to relieve
> the most common pseudo-semantic uses of <div>.

Ideally we could get a large sample of actual sites to find out what the 
most common uses actually are. Is there an existing bot available that 
would allow one to spider (part of!) the web and extract the class names 
given to <div> elements?
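
The extraction half of that is simple enough to sketch. The Python below 
tallies class names on <div> elements for pages that have already been 
fetched; the spidering itself is left out, and DivClassCounter and 
count_div_classes are names invented for this example.

```python
from collections import Counter
from html.parser import HTMLParser


class DivClassCounter(HTMLParser):
    """Tally the class names that appear on <div> start tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.counts.update(value.split())


def count_div_classes(pages):
    """pages: an iterable of HTML strings that were fetched elsewhere."""
    totals = Counter()
    for markup in pages:
        parser = DivClassCounter()
        parser.feed(markup)
        parser.close()
        totals += parser.counts
    return totals
```

Calling count_div_classes(pages).most_common(20) on a large enough 
sample would give a first list of the pseudo-semantic uses of <div> that 
are worth looking at.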

-- 
"If anybody ever tells you that you’re using the language incorrectly, 
just yell 'prescriptive grammarian!' at the top of your voice and all 
the linguists in the building will run over and surround the guy... and 
then they’ll rough him up"