[whatwg] A plea to Hixie to adopt <main>, and main element parsing behaviour

Thu Nov 8 01:51:19 PST 2012

Hi all,

Responses inline.

On 7 November 2012 19:38, Ian Hickson <ian at hixie.ch> wrote:

> On Wed, 7 Nov 2012, Simon Pieters wrote:
> >
> > My impression from TPAC is that implementors are on board with the idea
> > of adding <main> to HTML, and we're left with Hixie objecting to it.
>
> If implementors wish to implement something, my objecting is irrelevant.
> :-)
>
> Just implement it.
>

It appears that some implementers would like Hixie to spec <main>, but he
is unwilling as he disagrees with the feature.

In a hallway discussion at TPAC, a Microsoft rep indicated that IE would
have no objections to implementing the feature if it was introduced via
the HTML WG, which is what is happening at the moment. I also asked other
browser implementers, who indicated that agreement from Hixie was not a
prerequisite for implementation. I suggest that what is a prerequisite is
clear use cases, data to back them up, and a well-defined spec of the
feature. These are provided via the main spec and linked documents [1].

I have also spoken to various browser accessibility engineers who have
agreed that it would be a useful addition to complete the HTML element -
ARIA landmark mapping, and that the accessibility part would be trivial to
implement [4]:

I am generally in favor of a <main> element, and FWIW, implementation of
the semantics should be trivial in WebKit, or any UA that supports the ARIA
'main' landmark role already.

Another data point: when I discussed the main element with one of the
Mozilla accessibility engineers, they suggested that it would be useful for
providing a built-in skip to content link, which is one of the use cases.
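
To illustrate why the accessibility mapping is a small change, here is a
rough Python sketch of an element-to-landmark lookup. The table and the
landmark_role() helper are hypothetical names for illustration, not any
browser's actual code, and the sketch ignores the rule that header and
footer only act as landmarks when not nested inside sectioning content.

```python
# Implied ARIA landmark roles for HTML elements (illustrative table).
# The first four are existing element/landmark pairs; "main" is the
# proposed addition that would complete the mapping.
IMPLIED_LANDMARK = {
    "header": "banner",
    "nav": "navigation",
    "aside": "complementary",
    "footer": "contentinfo",
    "main": "main",
}


def landmark_role(tag, explicit_role=None):
    """Return the explicit role if one is given, else the implied landmark role.

    Returns None when the element has neither.
    """
    if explicit_role:
        return explicit_role.strip().lower()
    return IMPLIED_LANDMARK.get(tag.lower())
```

Under this sketch, <main> and <div role=main> resolve to the same landmark,
so a UA that already exposes role=main only needs the one extra table entry.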

> > Hixie's argument is, I think, that the use case that <main> is intended
> > to address is already possible by applying the Scooby-Doo algorithm, as
> > James put it -- remove all elements that are not main content, <header>,
> > <aside>, etc., and you're left with the main content.
>
> The reason there is no element <main> in the HTML spec currently is that
> there are no use cases for it that aren't already handled, right.
>

The use cases, data and rationale have been provided [1]. If you have
objections, it would be useful to respond to them rather than restating
your position.
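
For reference, the Scooby-Doo algorithm described in the quoted text can be
sketched roughly as follows in Python. This is a minimal illustration only:
the set of elements treated as non-main content and the names ScoobyDoo and
leftover_content are my assumptions, not anything taken from a spec or a
browser.

```python
from html.parser import HTMLParser

# Elements assumed to hold non-main content (an assumption for this sketch).
BOILERPLATE = {"header", "footer", "nav", "aside"}


class ScoobyDoo(HTMLParser):
    """Collect text that is not inside any boilerplate element."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # number of boilerplate elements currently open
        self.kept = []   # text chunks left over after filtering

    def handle_starttag(self, tag, attrs):
        if tag in BOILERPLATE:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in BOILERPLATE and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth == 0 and text:
            self.kept.append(text)


def leftover_content(html):
    """Return the text chunks that remain once boilerplate is removed."""
    parser = ScoobyDoo()
    parser.feed(html)
    parser.close()
    return parser.kept
```

Note that anything the author places outside those elements survives the
filter, and anything placed inside them (an <aside> within the content, for
example) is dropped, whether or not that matches the author's intent.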

> > I think the Scooby-Doo algorithm is a heuristic that is not reliable
> > enough in practice, since authors are likely to put stuff outside the
> > main content that do not get filtered out by the algorithm, and vice
> > versa.
>
> That people will get markup wrong is a given. This will not obviously be
> any less the case with an element named <main> than an element named
> <article> or elements named <nav> or <aside> or <header>.
>

Agreed that people get markup wrong, but I don't agree with your
supposition that <main> would be just as prone to mistakes as the other
elements you cited.

<main> is a simple concept and its use is clearly defined. Its limitation
to one use per page makes it less prone to mistakes. It is based on
concepts that are evident in authoring practice: the strong correlation
between elements (typically divs) identifying the main content area, their
use for role=main, and their use as the target of skip links indicates that
the concept is already understood and in use.

>
> In fact, when we have looked at actual data for this (see e.g. the recent
> thread where I went through Steve's data, or the threads years ago when
> this first came up), it turns out authors are significantly more reliably
> using class names that relate to marking up navigation blocks and headers,
> than they are about marking up "main". Authors seem to put class="main"
> and equivalents around every possible combination of content in a page,
> purely based on their styling needs.
>

The problem is, Ian, you haven't responded to the data and use cases. You
have misdirected the discussion by continuing to talk about class names,
when the data, use cases and rationale are based on the use of id values.

Did the years-old previous discussion take into account id value data,
skip link data or role=main placement data?

What the relevant new data clearly indicates is that in approx 80% of cases
when authors identify the main area of content it is the part of the
content that does not include header, footer or navigation content.

It also indicates that where skip links are present or role=main is used,
their position correlates highly with the use of id values designating the
main content area of a page.

Furthermore, in 95% [3] of the cases in the sampled data where ARIA
role=main is used, it is used only once, which is a clear indicator that
authors get how to identify the main content area of a page.

* Use of a descriptive id value to identify the main content area of a
web page is common
(id="main"|id="content"|id="maincontent"|id="content-main"|id="main-content"
used on 39% of the pages in the sample [2]).
* There is a strong correlation between use of role='main' and id values of
'content' or 'main' or permutations: where role='main' was used (101
pages), 77% of uses were on an element with such an id value.
* There is a strong correlation between id values of 'content' or 'main'
or permutations and the targets of 'skip to content'/'skip to main
content' links: where such links were used (67 pages), 78% of skip link
targets were elements with such an id value.
* There appears to be a strong correlation between the content areas
identified (with id values of 'content' or 'main' or permutations) and what
is described in the spec as appropriate content to be contained within a
<main> element [1].
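
For anyone who wants to run a similar check on their own pages, the kind of
per-page tally behind these figures can be sketched roughly as below. This
is not the script used to produce the data in [2] or [3]; MainSurvey and
survey() are hypothetical names, and the id pattern covers only the five id
values listed above.

```python
import re
from html.parser import HTMLParser

# The five id values listed in the sample data above.
MAIN_ID = re.compile(
    r"main|content|maincontent|content-main|main-content", re.IGNORECASE
)


class MainSurvey(HTMLParser):
    """Tally main-content id values and role=main usage on one page."""

    def __init__(self):
        super().__init__()
        self.main_ids = []               # matching id values, in document order
        self.role_main = 0               # elements carrying role=main
        self.role_main_with_main_id = 0  # of those, how many have a matching id

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        elem_id = attr.get("id") or ""
        has_main_id = MAIN_ID.fullmatch(elem_id) is not None
        if has_main_id:
            self.main_ids.append(elem_id)
        if (attr.get("role") or "").strip().lower() == "main":
            self.role_main += 1
            if has_main_id:
                self.role_main_with_main_id += 1


def survey(html):
    """Return the tallies for a single page as a dict."""
    parser = MainSurvey()
    parser.feed(html)
    parser.close()
    return {
        "main_ids": parser.main_ids,
        "role_main": parser.role_main,
        "role_main_with_main_id": parser.role_main_with_main_id,
    }
```

A page where role_main is 1 and role_main_with_main_id is 1 is the pattern
the data suggests is typical: one main content area, identified once.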

>
> Thus if the use case is "determine where the boilerplate ends", i.e.
> skipping navigation blocks, headers, footers, and sidebars, the evidence
> I've examined suggests that it would be more reliable to have authors mark
> up those blocks than mark up "the main content".
>

As stated above, you have been looking at class names, not id values.

> > Implementations that want to support a "go to main content" or
> > "highlight the main content", like Safari's Reader Mode, or whatever
> > it's called, need to have various heuristics for detecting the main
> > content, and is expected to work even for pages that don't use any of
> > the new elements. However, I think using <main> as a way to opt out of
> > the heuristic works better than using <aside> to opt out of the
> > heuristic.
>
> On what basis do you draw that conclusion?
>
>
> > For instance, it seems reasonable to use <aside> for a pull-quote as
> > part of the main content, and you don't want that to be excluded, but
> > the Scooby-Doo algorithm does that.
>
> If it's a pull quote, why would you _not_ want it excluded?
>
>
> On Wed, 7 Nov 2012, Ojan Vafai wrote:
> >
> > This idea doesn't seem to address any pressing use-cases. I don't expect
> > authors to use it as intended consistently enough for it to be useful in
> > practice for things like Safari's Reader mode. You're stuck needing to
> > use something like the Scooby-Doo algorithm most of the time anyways.
>
> Exactly.
>

Exactly what? That the commenter agrees with your position? Nothing else in
the statement is obvious.

>
>
> On Thu, 8 Nov 2012, Kang-Hao (Kenny) Lu wrote:
> >
> > [...] another argument, if I understand correctly, is to use <article>
> > in place of this role. I think the Web is probably full of mis-used
> > <article> already such that using the first <article> in document order
> > has no chance to work out, but it would nice if this can be verified,
> > even though I can already imagine that an author is unlikely to mark up
> > the main content with <article> when the main content isn't an article
> > in English sense.
>
> For the "jump to the start of the body" use case, <article> and <main>
> seem like they'd be misused exactly as much as each other.
>
>
That is a statement of supposition only. <article> is not as simply defined
as <main>, nor is it as simple to know when and when not to use it.

>
> > James Graham wrote:
> > > The observation that having one element on a page marked — via class
> > > or id — "main" is already a clear cowpath enhances the credibility
> > > of the suggested solution. On the other hand, I agree that not
> > > everyone heading down the cowpath was aiming for the same place; a
> > > <div class=main> wrapping the whole page, headers, footers, and all is
> > > clearly not the same as one that identifies the extent of the primary
> > > content.
> >
> > Right.
>
> Studying the data, as I have done in previous threads, has always
> indicated that there is actually no cowpath here for "main". As James says
> above, these classes and IDs are used for all kinds of combinations of
> content and headers, content and navigation, just content, etc. If this is
> any indication, <main> wouldn't be useful for its stated purpose.
>
>
> > So, assuming "skip to main" is the only use case for <main>, which I am
> > not sure if Steve agrees, I think the proposal should use strong wording
> > to prevent such misuse and the proposal should include one example of
> > such misuse and explains it.
>
> The strength of the wording will have basically no effect, let's be
> realistic here. Few authors read the spec. It doesn't matter most of the
> time, because the failure mode if an author uses <em> instead of <var> or
> vice versa is just that the styling will be slightly off or maintenance
> will be slightly harder. The failure mode for <main> would be that its
> entire reason for existing (making a heuristic simpler) is lost.
>
>
> On Wed, 7 Nov 2012, Simon Pieters wrote:
> >
> > I'm not convinced that we should freeze the parser now just because we
> > have reached interop.
>
> For the record, I personally do not consider the parser frozen.
>
> If an implementor wants to implement <main>, they should IMHO do so by
> supporting it the same way as <article> is supported in the parser, with
> the DOM interface HTMLElement, with the same styling and (for conformance
> checkers and authoring tools) content model as <div>. However, I would
> recommend against implementing <main>, for the reasons given above.
>
>
> > I think not changing the parser here makes <main> (and other future
> > elements; whatever we do here sets a precedent for future elements)
> > inconsistent with the rest of HTML. In the long term, having <main> and
> > <aside> parse differently just because we didn't want to change the
> > behavior from 2012-era browsers will seem silly.
>
> Indeed. Given how relatively painless transitioning from no parser spec at
> all to having one at all actually ended up being, at least relative to
> what I was expecting, I think adding new elements isn't a big deal at all.
> (Even in the <head>.) We shouldn't add elements in general, but that's
> more about not expanding the language, not about the parser.
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

[1]
https://dvcs.w3.org/hg/html-extensions/raw-file/tip/maincontent/index.html
http://www.w3.org/html/wg/wiki/User:Sfaulkne/main-usecases#Introduction
http://lists.w3.org/Archives/Public/public-html/2012Oct/0109.html

[2]
http://www.paciellogroup.com/blog/2012/04/html5-accessibility-chops-data-for-the-masses/

[3]
http://www.paciellogroup.com/blog/2012/04/html5-accessibility-chops-real-world-aria-landmark-use/

[4] http://lists.w3.org/Archives/Public/public-html/2012Nov/0085.html
-- 
with regards

Steve Faulkner
Technical Director - TPG
