[whatwg] Re: <section> and headings and other threads

Ian Hickson ian at hixie.ch
Tue Apr 5 07:53:46 PDT 2005


Ok, the spec has been updated to define headings and sections:

   http://whatwg.org/specs/web-apps/current-work/#sectioning

(Only section 2.4 and its subsections; ignore sections from 2.5 onwards.)

Comments?


Below is my response to all the header-related feedback received so far.

On Sat, 13 Nov 2004, Laurens Holst wrote:
> >
> > Well, <h> wouldn't be backwards compatible at all. At least <h1> would 
> > look like a heading of sorts.
> 
> I give you one abbreviation: CSS.

Lynx doesn't support CSS.


> > >  <h1>Foo</h1>
> > >  <section>
> > >   <h3>Bar</h3>
> > >   <h6>Quuz</h6>
> > >  </section>
> > > 
> > > Would be the same as H1, H2, H2, right?
> > 
> > Yes.

Actually in the model now in the spec that would be equivalent to three 
nested sections with H1, H2, H3 headers.
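
A rough Python sketch of that reading, under a simplified model: each 
heading opens a subsection under the nearest preceding heading of higher 
rank in the same container, and each <section> nests under the heading 
that precedes it. The node tuples and the outline() function are made up 
for illustration. The spec's actual algorithm is more detailed and may 
differ, in particular for a <section> that appears before any heading 
(assumed here to add no depth).

```python
def outline(nodes, base=1):
    """Return (depth, title) pairs for a list of nodes.

    A node is ("h", rank, title) or ("section", [child nodes]).
    This is an illustrative model, not the algorithm in the spec.
    """
    result = []
    open_ranks = []  # ranks of the headings still "open" in this container
    for node in nodes:
        if node[0] == "h":
            _, rank, title = node
            # A heading closes every open heading whose rank number is
            # the same or larger (that is, equally or less important).
            while open_ranks and open_ranks[-1] >= rank:
                open_ranks.pop()
            open_ranks.append(rank)
            result.append((base + len(open_ranks) - 1, title))
        else:
            # Assumption: a nested section sits one level below the
            # heading that is currently open in the parent container.
            result.extend(outline(node[1], base + len(open_ranks)))
    return result


# <h1>Foo</h1> <section> <h3>Bar</h3> <h6>Quuz</h6> </section>
doc = [("h", 1, "Foo"), ("section", [("h", 3, "Bar"), ("h", 6, "Quuz")])]
print(outline(doc))  # [(1, 'Foo'), (2, 'Bar'), (3, 'Quuz')]
```

Under the same model, a flat <h1>, <h2>, <h3> run and the same three 
headings wrapped in nested <section> elements produce the same depths.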


> Arbitrary heading elements (1 out of 6) are incredibly verbose to express in
> CSS, and you'd have to place h1|h2|h3|h4|h5|h6 in any XPath expression as
> well. So in practice, I don't think this is a good option.
> 
> section h1, section h2, section h3, section h4, section h5, section h6 {
> 	font-size: xx-large;
> }
> section section h1, section section h2, section section h3, section section
> h4, section section h5, section section h6 {
> 	font-size: xx-large;
> }
> 
> etc.
> 
> This should visually be the same as h1, h3, h6 and semantically the same as
> sections with weird headings inside.

I haven't looked into the styling solution yet, but it is definitely my 
opinion that CSS should adapt to fit the markup and not the other way 
around. Since I'm on the CSSWG and an editor of the Selectors spec, I can 
work directly with the CSSWG on this. :-)


> > And if we don't redefine <h1> (and <h2> to <h6>), then you end up with 
> > the weird situation of having six elements which could easily be used 
> > but end up with meaningless semantics. (And they would be inline 
> > elements in legacy UAs, which is even worse.)
> 
> XHTML 2.0 does this.

I disagree with many of XHTML2's design decisions.


> > e.g. at the moment, this:
> > 
> >    <body>
> >     <h1>A</h1>
> >     <section>
> >      <h2>A.1</h2>
> >      <section>
> >       <h3>A.1.1</h3>
> >      </section>
> >     </section>
> >    </body>
> > 
> > ...makes sense, but if we say you have to use a new element for
> > headers, then the above is now meaningless and trying to make an
> > outline from it would not do anything useful.
> 
> That's just not true, or I'm missing your point.
> 
> Try making a tree view of a document based on h1...h6 headings.
> CSS: euh...
> XSLT: euh...

I can't speak for XSLT, but for CSS it would be something like:

   section { padding: 2em; border-left: 2px solid red; }

...which is in fact exactly what you wrote for the <h> version:

> Now try making a tree view based on h headings.
> CSS: section { padding: 2em; border-left: 2px solid red; }

I'm not sure what the problem is.


> >  3. It shouldn't be too easy to end up with meaningless markup when
> >     doing either of the above. So a random <h4> in the middle of an
> >     <h2> and an <h3> has to be defined as meaning _something_.

Note that the current definitions do indeed define every possible use of 
<hx> headers, I think.

> This is no different than the existing spec.

The existing spec is silent on this.

> This would mean a 4th level heading between a second- and a third-level 
> heading. Inside sections one could let the section level determine the 
> heading level and treat all headings the same, or use the highest level 
> of either the section or the heading. I don't see a need to define this 
> more specifically, as h1...h6 just don't go very well with sections. 
> That's the way it is, and it won't really harm anyone.

Except those trying to create outliners that interoperate: for example, 
for document contents pages, for jumping between sections (as in PDF 
viewers), and so on.


> I think [the spec proposals from back then are] needlessly complicated.
> 
> Note that for navigation XHTML 2.0 has <nl> Navigation Lists, which would
> correspond to your <navigation> tag. A sidebar (which side? how is it
> different from navigation and why is navigation not a sidebar?) usually
> consists out of links, and on places where it doesn't it is content. And
> <article> (content) is implied (everything not navigation). All in all this
> set of tags you proposed sounds too specific to me.

Please look at the current spec draft and let me know if you still have 
those concerns.

Note that the tags in the spec come directly from research I did into what 
markup people use for typical sites (especially looking into <div> abuse).


> > The other advantage of using the existing <hX> elements is that 
> > Assistive Technologies will continue working, reporting the section 
> > headers, instead of saying there are no headers on the page.
> 
> Assistive Technologies don't work on pages using headers created with 
> font tags or styled divs either.

Those pages are broken.


> Assistive technologies can be updated. For technologies such as those, 
> section tags actually make much more sense than headings as they're 
> currently used.

I think the current definition (as of a few days ago) should take care of 
this.


On Thu, 18 Nov 2004, Henri Sivonen wrote:
> 
> <h> and <section> allow naïve inclusions, so that the content author or 
> the CMS does not need to deal with heading level shifting.
> 
> IMO <h> and <section> are better than <h1> through <h6>, but I'm not 
> convinced that they are better enough to warrant the incompatibility. 
> Besides, when you have both, the required CMS code gets even uglier than 
> what is needed with only <h1> through <h6>.

<h1> through <h6> in <section> are equivalent to <h> in XHTML2 (mostly -- 
they are better defined than in XHTML2). They only imply relative header 
relations, not fixed ones. <h3> doesn't always mean "third level header"; 
what it means depends on context, in the way described in the spec, just 
like <h> does.


On Wed, 17 Nov 2004, James Graham wrote:
> >
> > This is also why I feel that <section> should define headings such 
> > that there is no way to end up with a "missing level". Not by making 
> > such constructs non-conforming, but by simply defining them so that it 
> > isn't a problem and the headings are automatically nested 
> > appropriately.

Note that this is now done.


> > I do like this idea, but it isn't really workable. We need authors to 
> > be able to use HTML5 markup and yet still have it render correctly in 
> > HTML4 UAs, which basically means that we need <h2>-<h6> to mean 
> > exactly what they do in HTML4, or at least mean that as much as 
> > anything else. So we couldn't say that <h3> meant a minor heading, 
> > since otherwise the following:
> > 
> >   <h1>...</h1>
> >   <section>
> >    <h2>...</h2>
> >    <section>
> >     <h3>...</h3>
> > 
> > ...would not be exactly equivalent to:
> > 
> >   <h1>...</h1>
> >   <h2>...</h2>
> >   <h3>...</h3>
> > 
> > ...which we want.
> 
> Why are those two inequivalent under my definition?

Under the current proposed spec, as under your definition, they are in 
fact equivalent now.


> As far as I can tell, the differences come when one looks at fragments 
> like:
>
>  <section>
>  <h1>..</h1>
>    <section>
>    <h1>..</h1>
>      <section>
>      <h1>..</h1>
> 
> Unless I have missed something, in the current webapps spec, this is (in 
> HTML5) exactly equivalent to the two examples that you gave,

Correct.

> and indeed authors are encouraged to use this form.

The spec now says "Sections may contain headers of any rank, but authors 
are strongly encouraged to either use only h1 elements, or to use 
elements of the appropriate rank for the section's nesting level".


> Clearly this is not equivalent to the HTML4ised version:
>
>  <h1>..</h1>
>  <h1>..</h1>
>  <h1>..</h1>
>
> My proposal would make this example semantically different to your 
> examples in both HTML4 and HTML5, and would retain the letter of the 
> HTML4 spec (and, indeed, the sense in which many people have interpreted 
> it). It therefore makes authors more likely to use markup that will 
> behave as expected in HTML4 UAs.

I don't see any way we can have nested <section> elements _not_ mean 
nested sections. That strikes at the very core of what nesting a <section> 
element would mean, IMHO.


On Sat, 20 Nov 2004, fantasai wrote:
> 
> I would define things as follows:
> 
>  - The first header in a <section> is that section's top-level header
>  - Depth of section increases:
>      - when heading number increases
>      - when <section> nesting increases--but this increments from
>        the last top-level <section> header rather than the last header
>  - Depth of section does not decrease with a header number that is higher
>    than the section's top-level header's number. (This means all
>    subsequent header number increments increment based on this header's
>    number instead of the top-level header's number.)

That's roughly what the spec says now (albeit in more detail and coping 
with nested sections better).

>  - Section header immediately following a section header of the same level
>    is considered a subtitle.

That's what <header> is for. I see no reason to disallow empty sections, 
and I have problems defining anything of this nature because of the 
differences between:

   <h1/><h1/>
   <h1/> <h1/>
   <div><h1/></div><div><h1/></div>
   <h1/>x<h1/>
   <h1/><p/><h1/>

Those should IMHO all be exactly identical as far as the outline goes.
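
One way to picture this, as a sketch only: if an outliner looks solely at 
the sequence of heading elements, then whitespace, text, and wrapper 
elements such as <div> or <p> cannot change the result. The HeadingRanks 
class and heading_ranks() function below are hypothetical. They rely on 
Python's html.parser, which reports <h1/> as an empty element. This is 
not how the spec defines the outline.

```python
from html.parser import HTMLParser


class HeadingRanks(HTMLParser):
    """Collect the rank of every h1..h6 start tag, ignoring all else."""

    def __init__(self):
        super().__init__()
        self.ranks = []

    def handle_starttag(self, tag, attrs):
        # Also reached for empty tags such as <h1/>, because the default
        # handle_startendtag() forwards to handle_starttag().
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.ranks.append(int(tag[1]))


def heading_ranks(markup):
    parser = HeadingRanks()
    parser.feed(markup)
    parser.close()
    return parser.ranks


fragments = [
    "<h1/><h1/>",
    "<h1/> <h1/>",
    "<div><h1/></div><div><h1/></div>",
    "<h1/>x<h1/>",
    "<h1/><p/><h1/>",
]
# Every fragment yields the same sequence of heading ranks.
print({tuple(heading_ranks(f)) for f in fragments})
```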


> Example of double header:
>   http://www.alistapart.com/
> (ISSN bit is <h6>, but is semantically a top-level header for the whole
> section)

Perfect example of the use case for <header>.


On Mon, 15 Nov 2004, Matthew Raymond wrote:
>
>    The following steps COMBINED should solve all problems related to 
> section headers:
> 
> 1) The <h#> elements should be [deprecated].
> 4) The <h> element will be the only way to create a semantically valid header
> for a section.

I strongly disagree with this. I don't see the point. Introducing a new 
element just to replace an old one with near-identical semantics seems 
silly, especially in light of the fact that we want an easy migration 
path with a good backwards compatibility story.


> 2) The <h#> elements will have no SEMANTIC meaning when inside a <section>
> header. Their presentation, however, will remain the same.

The spec defines their semantics now.


> 3) Within an <h> element, <h#> elements (but not their contents) will be
> ignored entirely.
> 
> 5) There should only be one <h> element for each section. Any <h> element
> after the first <h> element will have no semantic meaning, but can still have
> the same presentation as the first <h> element.
> 
> 6) The only way to create semantically valid subsections within a <section>
> element is to create child <section> elements within the <section> element.

That seems overly strict given that people will be doing these bad things 
anyway. I don't see the point in making it illegal (or saying an element 
"has no semantics", which I guess is the same thing) when we can just 
define what it means and make it ok.


> I still feel that, structurally speaking, there should be a <section> 
> element for every section and subsection, even for sections that are 
> both leaves and immediate siblings.

I agree, but I don't see that we can enforce it.


> Therefore, I'm amending my previous 
> position with the following:
> 
> 1) Nested headers are ignored. Therefore, this markup...
> 
> <h1><h2>Header</h2></h1>
> 
> ...Is the same as...
> 
> <h1>Header</h1>

No, it's the same as

   <h1></h1><h2>Header</h2>


> 2) <h1>-<h6> have the same semantic value as in HTML 4.01, but are 
> additionally defined as not having any semantic meaning related to 
> document _structure_.

Not sure what this means. HTML4 doesn't define them really, and I don't 
see what the second half of your sentence means.


> 3) The <h> element is defined as being the same as <h1>-<h6>, except that the
> importance level is obtained by the parent <section>, and <h> can only be used
> within a <section>. Therefore, the following to example...

Counter proposal: Just define <h1>-<h6> that way. This is what the current 
spec does.


On Sun, 21 Nov 2004, fantasai wrote:
> James Graham wrote:
> > Backwards compatibility must be maintained. <h1> to <h6> must represent
> > headings. Given the abuse of headings-as-structure on the existing web there
> > may be some leeway in (re)defining the way that the headings interact to
> > give e.g. an outline/toc.
> > ...
> > Multiple headings per section will probably happen anyway. So we may as well
> > allow them.
> > ...
> > Many documents on the web do not have a formal structure of the sort that
> > would be expected in a legal report. The heading model should be able to
> > cope with that.
> > ...
> > It has to be possible to get an unambiguous structure from the headings of a
> > document. This means having an algorithm in the spec that UAs can implement
> > that will give a 'tree view' of the document structure.

Agreed. The spec has been updated to match this. Comments welcome!
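
To make the 'tree view' idea concrete, here is a minimal Python sketch 
that nests a flat list of (depth, title) pairs into a tree and renders it 
as an indented contents list. The build_tree() and render() names, the 
dict layout, and the clamping of skipped levels are assumptions for 
illustration. The spec's own algorithm works on the document itself and 
is not reproduced here.

```python
def build_tree(entries):
    """Nest (depth, title) pairs (depth >= 1) into a list of nodes.

    Each node is a dict with "title" and "children" keys.
    """
    root = {"title": None, "children": []}
    path = [root]  # path[d] is the most recent node at depth d
    for depth, title in entries:
        node = {"title": title, "children": []}
        # Clamp the depth so that a skipped level still attaches to the
        # nearest open parent instead of leaving a gap in the tree.
        depth = min(depth, len(path))
        del path[depth:]
        path[-1]["children"].append(node)
        path.append(node)
    return root["children"]


def render(tree, indent=0):
    """Return the tree as a list of lines, two spaces per level."""
    lines = []
    for node in tree:
        lines.append("  " * indent + node["title"])
        lines.extend(render(node["children"], indent + 1))
    return lines


entries = [(1, "A"), (2, "A.1"), (3, "A.1.1"), (2, "A.2")]
print("\n".join(render(build_tree(entries))))
```

Two user agents that agree on the (depth, title) pairs for a document 
would then produce the same tree, which is the interoperability point 
made above.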


On Thu, 25 Nov 2004, dolphinling wrote:
>
> With respect to <section>, <h>, and <hn>, I would suggest the following:
> 
> For any <hn>, if n is less than or equal to the number of sections it is
> nested inside, it is semantically equivalent to <h>;
> 
> <section>
>   <h1>1st level header</h1>
>   <p>content</p>
> </section>
> 
> <section>
>   <h>1st level header</h>
>   <p>content</p>
>   <section>
>     <h2>2nd level header</h2>
>     <p>content</p>
>   </section>
>   <section>
>     <h1>_Also_ 2nd level header</h1>
>     <p>content</p>
>   </section>
> </section>
> 
> Around any hn with n greater than the number of sections, there are implied
> semantic sections. These implied sections end at the end of the containing
> explicit section (or other containing block) or at the start of the next hn
> with an equal or lower value of n;
> 
> <section>
>   <h1>1st level header</h1>
>   <p>content</p>
>   <!-- section -->
>     <h2>2nd level header</h2>
>     <p>content</p>
>   <!-- /section -->
> </section>

Agreed.


> <section>
>   <h1>1st level header</h1>
>   <p>content</p>
>   <!-- section -->
>     <!-- section -->
>       <h3>3rd level header</h3>
>       <p>content</p>
>     <!-- /section -->
>   <!-- /section -->
> </section>

Disagreed; the <h3> simply gets treated as an <h2> in this case, IMHO. I 
don't see the advantage of having deeper sections here.


> For a legacy document:
> 
> <!-- section -->
>   <h1>1st level header</h1>
>   <p>content</p>
>   <!-- section -->
>     <h2>2nd level header</h2>
>     <p>content</p>
>     <!-- section -->
>       <h3>3rd level header</h3>
>       <p>content</p>
>     <!-- end section -->
>   <!-- end section -->
> <!-- /section -->

Agreed.


> A more complex example, with h and hn chosen off the top of my head:
> 
> <section>
>   <h>1st level header</h>
>   <p>content</p>
>   <section>
>     <h1>2nd level header</h1>
>     <p>content</p>
>   <!-- /section -->
>   <!-- section -->     <!-- This implied split I'm not sure about, but
>                             it seems to be best [1] [2] -->
>     <h2>2nd level header</h2>
>     <p>content</p>
>     <section>
>       <h>3rd level header</h>
>       <p>content</p>
>       <section>
>        ...
> 
> [1] This also answers the question of what happens if you have two headers in
> a section. The possibilities are assume the second one is a subsection, assume
> they're both subsections, or assume they're both normal sections, with an
> implied split. I think the implied split is best...
> 
> [2] ...Or it could just be declared invalid, and there could be a limit of one
> header per section. Can you have content before the header, though? How about
> subsections before the header? And what about implied subsections? Hmm... have
> to think about it, but it might work. (Too lazy to revise my big long example,
> though)

The current spec takes care of these cases too.



On Thu, 25 Nov 2004, Matthew Raymond wrote:
> 
> If there is any difference in presentation or the level of importance, 
> then this contradicts the HTML 4.01 specification when the header 
> element is a child of a <section>. If you assume your <h> elements are 
> set up the way mine are, then this is the case, since in my model, <h> 
> elements on level "n" are semantically and presentationally identical to 
> <hn>. It looks to me like you're using <section> to enforce a minimum 
> importance level, and possibly to alter presentation. If so, I oppose 
> this.

Why?


> > Around any hn with n greater than the number of sections, there are implied
> > semantic sections. These implied sections end at the end of the containing
> > explicit section (or other containing block) or at the start of the next hn
> > with an equal or lower value of n;
> > 
> > <section>
> >   <h1>1st level header</h1>
> >   <p>content</p>
> >   <!-- section -->
> >     <h2>2nd level header</h2>
> >     <p>content</p>
> >   <!-- /section -->
> > </section>
> 
>    Let's add a <section>, then:
> 
> | <section>
> |  <section>
> |   <h1>2nd level header</h1>
> |   <p>content</p>
> |  <!-- /section -->
> |  <!-- section -->
> |   <h2>2nd level header</h2>
> |   <p>content</p>
> |  </section>
> | </section>

Indeed. The spec takes care of the above (the two examples above are 
semantically equivalent to the first, not the second).


> The more complicated we make the rules with regards to implied sections, 
> the more likely we'll have the following problems:
> 
> a) Webmasters will get confused and create markup that doesn't have the 
> structure or presentation they desire.

Hopefully the rules are based on something simple enough to avoid that. 
(Someone should probably write a five line summary for the intro to the 
spec which can actually be understood by authors. My current summary is 
terse and to the point but probably hard to understand.)


> b) The UA programmers will overlook certain cases, resulting in the 
> creation of outlines that violate the specification.

Defining it in terms of an algorithm should, hopefully, sidestep that 
issue. It is easy to test each point in the algorithm.


> c) There will be specific cases where it may be impossible for 
> webmasters and vendors to determine how an outline should be generated, 
> resulting in intentional differences in the way markup is written for 
> these cases and how UAs handle it.

In theory the algorithm covers every possible case. (Including odd cases 
like the element not being in the DOM.)


On Thu, 31 Mar 2005, fantasai wrote:
> James Graham wrote:
> > Ian Hickson wrote:
> > 
> > > There are two big issues here:
> > > 
> > > 1. What do <h1> to <h6> mean in a <body>?
> > > 
> > > 2. What do <h1> to <h6> mean in a <section>?
> > 
> > Incidentally, unless I was convinced otherwise in some way that I've 
> > now forgotten, I believe that question 1 and 2 should be the same i.e.
> 
> I second this. As Anne noted once on IRC, it should be possible to 
> copy-paste the article-text of an existing HTML document into a 
> <section> element and have all the elements inside retain the same 
> relative semantics.

The spec has taken that into account and (with a few exceptions that are 
the entire point of <section>) the two are defined equivalently. In fact 
the spec only has one definition, which is shared by the two.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

