[whatwg] Re: <section> and headings

James Graham jg307 at cam.ac.uk
Sun Nov 14 05:49:55 PST 2004

Laurens Holst wrote:

> Ian Hickson wrote:
>> Well, <h> wouldn't be backwards compatible at all. At least <h1> would
>> look like a heading of sorts.
> I give you one abbreviation: CSS. 

Sure, one can make anything look like a heading. But no HTML4 UA would
recognise <h> as a heading, whereas <h1> would at least be considered to
be a heading element.

Put another way, postulating CSS as a solution to a problem of semantics
is about as useful as reintroducing <font> to HTML.

>>>  <h1>Foo</h1>
>>>  <section>
>>>   <h3>Bar</h3>
>>>   <h6>Quuz</h6>
>>>  </section>
>>> Would be the same as H1, H2, H2, right?
>> Yes.
> Arbitrary heading elements (1 out of 6) are incredibly verbose to 
> express in CSS, and you'd have to place h1|h2|h3|h4|h5|h6 in any XPath 
> expression as well. So in practice, I don't think this is a good option. 

Backwards compatibility means that these elements have to stay whatever
we do. The fact that they are a pain to work with programmatically is
true but, unfortunately, unavoidable.

>> And if we don't redefine <h1> (and <h2> to <h6>), then you end up with
>> the weird situation of having six elements which could easily be used
>> but end up with meaningless semantics. (And they would be inline
>> elements in legacy UAs, which is even worse.)
> XHTML 2.0 does this. Probably for well-discussed reasons, amongst 
> others a number of concerns you raised (like the search engine thing). 
> I don't see why it shouldn't. 

XHTML 2 has entirely different design goals to "HTML5". Specifically,
backward compatibility is not one of these design goals. Given the
lengths to which many successful software products go to maintain 
backward compatibility, there is some evidence that the XHTML2 path is a 

>> e.g. at the moment, this:
>>    <body>
>>     <h1>A</h1>
>>     <section>
>>      <h2>A.1</h2>
>>      <section>
>>       <h3>A.1.1</h3>
>>      </section>
>>     </section>
>>    </body>
>> ...makes sense, but if we say you have to use a new element for
>> headers, then the above is now meaningless and trying to make an
>> outline from it would not do anything useful.
> That's just not true, or I'm missing your point.
> Try making a tree view of a document based on h1...h6 headings.
> CSS: euh...
> XSLT: euh... 

That can be done without too much trouble (n.b. I'm not sure what CSS 
has to do with making a tree view). In fact tools already exist to do 
exactly that.
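
As a rough illustration of how little is needed, here is a minimal
sketch in Python using only the standard html.parser module. The names
HeadingCollector and build_outline are hypothetical, and the nesting
rule (a heading becomes a child of the nearest preceding heading with a
lower number) is an assumption, since HTML 4 does not define one.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1..h6, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADINGS and self._level is not None:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def build_outline(markup):
    """Turn the flat heading list into a tree of title/children nodes."""
    collector = HeadingCollector()
    collector.feed(markup)
    collector.close()

    root = {"title": None, "children": []}
    stack = [(0, root)]  # (level, node); the root sits below every heading
    for level, title in collector.headings:
        # Assumed rule: pop back to the nearest heading with a lower number.
        while stack[-1][0] >= level:
            stack.pop()
        node = {"title": title, "children": []}
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]
```

Feeding it a document with <h1>A</h1>, <h2>A.1</h2> and <h3>A.1.1</h3>
gives a single top-level node "A" with "A.1" nested under it and
"A.1.1" nested under that. The same code finds nothing at all in a
document that uses <h>.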

> Now try making a tree view based on h headings.

Well, it's impossible unless you explicitly support HTML5, i.e. it is
not backwards compatible.

> CSS: section { padding: 2em; border-left: 2px solid red; } 

That would work with the markup above, no?

> <template match="/">
>   <html><body><apply-templates select="//section|h" /></body></html>
> </template>
> <template match="section">
>   <section><apply-templates select="//section|h" /></section>
> </template>
> <template match="h">
>   <h><apply-templates select="//section|h" /></h>
> </template>
> I don't think that can become more straightforward.
>> Basically I want three things:
>>  1. It has to be possible to take existing markup (which correctly
>>     uses <h1>-<h6>) and wrap the sections up with <section> (and the
>>     other new section elements) and have it be correct markup.
>>     Basically, allowing authors to replace <div class="section"> with
>>     <section>, <div class="post"> with <article>, etc.
> Aside from that I don't see why when you're changing the markup anyway 
> you would still want to retain the old headings, the XHTML 2.0 
> solution allows for this just fine.

Because you accept that you still have to deal with UAs that don't
support the new markup. In this case the transformation <div> ->
<section> is unlikely to be problematic (a non-semantic element replaced
with an unsupported element) whereas <hn> -> <h> is a problem (a semantic
element replaced by a non-semantic one).

>>  2. It has to be possible to write new documents that use the section
>>     elements and have the headers be automatically styled to the right
>>     depth (and maybe automatically numbered, with appropriate CSS),
>>     and yet still be readable in legacy UAs, without having to think
>>     about old UAs. Basically, the header element has to be header-like
>>     in old browsers.
> Let me just refer to my first (two) paragraph(s) here. 

"Basically the header element has to be header-like in old browsers". If 
'header-like' means anything other than 'has a heading-like appearance'
(in which case <font size="24"> might be heading-like), you've totally
ignored this point.

>>  3. It shouldn't be too easy to end up with meaningless markup when
>>     doing either of the above. So a random <h4> in the middle of an
>>     <h2> and an <h3> has to be defined as meaning _something_.
> This is no different than the existing spec. This would mean a 4th 
> level heading between a second- and a third-level heading. 

HTML 4 doesn't really specify how this should work.

> Inside sections one could let the section level determine the heading 
> level and treat all headings the same

Now that I agree with.

> , or use the highest level of either the section or the heading. I 
> don't see a need to define this more specificly, as h1...h6 just don't 
> go very well with sections.

In a backwards-compatible world, they have to interact somehow (if the 
XHTML2 people haven't defined this yet they will have to or their 
heading model will be totally broken).
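
One possible interaction is the rule agreed on above: the section depth
alone determines the heading level, and the number in the element name
is ignored. The following is a minimal sketch of that rule in Python,
using the standard html.parser module. The names SectionDepthLevels and
effective_levels are hypothetical, and only <section> is counted as a
sectioning element, which is a simplifying assumption.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SectionDepthLevels(HTMLParser):
    """Record each heading with a level derived from <section> nesting."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # number of <section> elements currently open
        self.levels = []  # (effective level, heading text) pairs
        self._in_heading = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "section":
            self.depth += 1
        elif tag in HEADINGS:
            self._in_heading = True
            self._text = []

    def handle_data(self, data):
        if self._in_heading:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "section":
            self.depth = max(0, self.depth - 1)
        elif tag in HEADINGS and self._in_heading:
            # Assumed rule: the level is the section depth plus one,
            # whichever of h1..h6 the author actually wrote.
            text = "".join(self._text).strip()
            self.levels.append((self.depth + 1, text))
            self._in_heading = False


def effective_levels(markup):
    """Return (level, text) for every h1..h6 in the markup."""
    parser = SectionDepthLevels()
    parser.feed(markup)
    parser.close()
    return parser.levels
```

Applied to the earlier example (<h1>Foo</h1> followed by a <section>
containing <h3>Bar</h3> and <h6>Quuz</h6>), this reports levels 1, 2
and 2, which matches the "H1, H2, H2" reading given there.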

> That's the way it is, and it won't really harm anyone. 

Except anyone trying, say, to create a tree view of a document. Other 
document formats allow tree views to be constructed. Saying that this 
should be impossible in HTML seems rather shortsighted. There are other 
types of UAs that want to know about headings too. Searchbots are an 
obvious example.

>> At the moment what I'm thinking of doing is this (most of these ideas
>> are in the draft at the moment, but mostly in contradictory ways):
> [...]
> I think this is all needlessly complicated.
> Note that for navigation XHTML 2.0 has <nl> Navigation Lists, which 
> would correspond to your <navigation> tag. 

> A sidebar (which side?

Can you say 'not mixing presentation with content'?

> how is it different from navigation and why is navigation not a sidebar?)

Because a sidebar, typically, isn't something that contains navigation. 
It is a piece of content that is related to the main text but not in the 
flow of the main text. The spec needs to make this clear.

>  usually consists out of links, and on places where it doesn't it is 
> conte. And <article> (content) is implied (everything not navigation). 
> All in all this set of tags you proposed sounds too specific to me.
>> To simplify the CSS rules for <h1>, we could limit the ways in which
>> sections can be nested, and say that other nesting combinations do not
>> cause the <h1>'s presentation to change by default in CSS-based UAs.
>>     Element       Meaningful descendents
>>     <body>        <section> <article> <sidebar> <navigation>
>>     <section>     <section> <article> <sidebar>
>>     <article>     <section> <sidebar>
>>     <sidebar>     <section>
>>     <navigation>
>> Unfortunately the rules still become unmanageable after 3 levels (that
>> is to say, the <h5> and <h6> levels have an insane number of rules).
>> An alternative would be to ask the CSS working group for an :or()
>> selector of sorts, and then have:
>>    :or(section, article, sidebar, navigation) h1 { /* h2 */ }
>>    :or(section, article, sidebar, navigation) h1
>>    :or(section, article, sidebar, navigation) h1 { /* h3 */ }
>>    :or(section, article, sidebar, navigation) h1
>>    :or(section, article, sidebar, navigation) h1
>>    :or(section, article, sidebar, navigation) h1 { /* h4 */ }
>> That might work.
> Woohoo.
> Note the amount of sarcasm in my voice, which can unfortunately not be 
> transferred through this medium (well, I guess I could include some 
> XVoice markup :)). Just use <section> with <h> headings and <nl> with 
> <label> headings. 

But then, what does

mean? It varies according to which UA you ask: an HTML4 UA would report
a single heading, an "HTML5" UA would not.

>>>> I don't disagree. But it is backwards compatible.
>>> Not really. If search engines don't get upgraded to support this new
>>> kind of H1 semantic all kinds of documents can be indexed wrong or
>>> they can be marked inappropriate because they mis-use the H1 element
>>> in the eyes of the search engine. (The same as with creating a page
>>> full of links, but now you are mis-using a heading element.)
>> You are assuming that search engines trust authors to use <h1>
>> elements correctly in the first place, and, more importantly, that
>> they treat them differently to <h2> elements in a way that would be
>> noticeable if this became widespread.
>> I highly doubt this.
>> Also, using <h> would have the same problem in reverse -- content
>> would no longer be indexed as a header at all.
> That is up to the site author to decide, isn't it. Not all content 
> needs a high search rank, and not all content is used on the web. I 
> also think it is a slight adjustment for e.g. Google to make to their 
> engine, so who knows they will.

Who knows indeed. The point of being backwards compatible is that people
don't have to run the risk that product X will not be updated to the new
requirements. Seriously, how many sites will use the new markup if they
believe that it might decrease their search ranking (bearing in mind
that Google is quite secretive about such things)?

> At least if you don't try, you can be sure they never will. In any 
> case h1...h6 would not be deprecated so there is no reason not to use 
> them if you want to. 

But how would they interact with <section>? That's the question, no? I 
feel I'm missing something here...

>> The other advantage of using the existing <hX> elements is that
>> Assistive Technologies will continue working, reporting the section
>> headers, instead of saying there are no headers on the page.
> Assistive Technologies don't work on pages using headers created with 
> font tags or styled divs either. Assistive technologies can be updated.

can != will be
In fact, a failure to work with existing technologies might be enough of
a barrier to adoption that people avoid "HTML5" altogether, so products
are never updated to work with it.
