[whatwg] Nested list

Aryeh Gregor Simetrical+w3c at gmail.com
Wed Jul 27 12:24:15 PDT 2011


(Ian pointed out this old thread to me that he hadn't yet responded
to, so I'll respond now.)

On Thu, Jul 2, 2009 at 6:05 PM, Ryosuke Niwa <rniwa at google.com> wrote:
> Hi, I just realized that in HTML4.01 spec, DTD doesn't seem to allow
> nested OL or UL without LI.  See
> http://www.w3.org/TR/REC-html40/struct/lists.html#h-10.2  In fact, the
> nested list example is marked deprecated.  But in practice, all major
> user agents produce nested list when execCommand("Indent"...) is
> executed.  Is there any chance we can standardize nested lists, and in
> particular, what UA produce?

I opened a bug about this a while ago:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=12609

> For example, all major browsers (Firefox, IE, & WebKit) produce
> slightly different versions of HTML when indenting "item 2" in the
> following HTML (assume it's content-editable):
> <ol>
> <ol id="u1"><li id="i1">item 1</li></ol>
> <li id="i2">item 2</li>
> <ol id="u3"><li id="i3">item 3</li></ol>
> </ol>
>
> In particular, many UA remove arbitrary id attributes.

If you remove the whitespace nodes (which currently confuse my
algorithms in lots of places -- it's a known issue), indenting "item 2"
produces
<ol><ol id="u1"><li id="i1">item 1</li><li id="i2">ite[m] 2</li><li
id="i3">item 3</li></ol></ol>

according to my implementation.  This is because of the following
steps in the indent algorithm at the time of this writing:

* Let tag be the local name of the parent of first node.
* Wrap node list, with sibling criteria matching only HTML elements
with local name tag and new parent instructions returning the result
of calling createElement(tag) on the ownerDocument of first node.
http://aryeh.name/gitweb.cgi?p=editing;a=blob_plain;f=editing.html;hb=e4c523de#indent

Since the sibling criteria only require the sibling to have local name
tag (in this case "ol"), without regard to attributes, it's considered
valid for merging.  Since both the previous and next siblings match
the sibling criteria, the next sibling is merged into the previous
sibling, so the resulting element has the id of the original first
element.
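
A rough illustration of the merge described above, written as a Python
sketch rather than as the spec's own wrap algorithm.  The Node class and
the wrap() helper are hypothetical stand-ins for DOM elements, whitespace
nodes are assumed to be gone already, and only a single node is wrapped:

```python
class Node:
    """Hypothetical stand-in for a DOM element (not a real DOM API)."""

    def __init__(self, tag, id=None, text="", children=()):
        self.tag = tag
        self.id = id
        self.text = text
        self.parent = None
        self.children = []
        for child in children:
            self.append(child)

    def append(self, child, index=None):
        """Attach child to this node, detaching it from any old parent."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        if index is None:
            self.children.append(child)
        else:
            self.children.insert(index, child)

    def serialize(self):
        attr = f' id="{self.id}"' if self.id else ""
        inner = self.text + "".join(c.serialize() for c in self.children)
        return f"<{self.tag}{attr}>{inner}</{self.tag}>"


def wrap(node, tag):
    """Wrap node in a `tag` element, reusing adjacent siblings named `tag`.

    The sibling criteria look only at the local name, never at attributes.
    """
    parent = node.parent
    i = parent.children.index(node)
    prev = parent.children[i - 1] if i > 0 else None
    nxt = parent.children[i + 1] if i + 1 < len(parent.children) else None
    prev_ok = prev is not None and prev.tag == tag
    next_ok = nxt is not None and nxt.tag == tag

    if prev_ok:
        prev.append(node)
        if next_ok:
            # Both siblings match: fold the next sibling into the previous
            # one, so the previous sibling's id is the one that survives.
            for child in list(nxt.children):
                prev.append(child)
            parent.children.remove(nxt)
            nxt.parent = None
        return prev
    if next_ok:
        nxt.append(node, index=0)
        return nxt
    # No matching sibling: create a fresh wrapper in node's old position.
    wrapper = Node(tag)
    parent.append(wrapper, index=i)
    wrapper.append(node)
    return wrapper


outer = Node("ol", children=[
    Node("ol", "u1", children=[Node("li", "i1", "item 1")]),
    Node("li", "i2", "item 2"),
    Node("ol", "u3", children=[Node("li", "i3", "item 3")]),
])
item2 = outer.children[1]
wrap(item2, item2.parent.tag)  # tag is the local name of the parent
print(outer.serialize())
```

With the example markup from the quoted message, the surviving inner list
is the one with id "u1", and all three items end up inside it.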

Chrome 14 dev merges them but keeps the id of the second list, not the
first.  Firefox 7.0a2 merges into the second list, and ignores the
first list even if it has no id -- it seems to ignore previous
siblings if there's a legitimate next sibling.  Opera 11.50 gives the
same output as the spec.  IE10PP2 merges the three items into one list
that has no id at all.  I think that the spec (= Opera) is reasonable
here -- anyone disagree?

I added a few tests to my test suite based on this feedback:

http://aryeh.name/gitweb.cgi?p=editing;a=commitdiff;h=d4233f8f

On Mon, Jul 13, 2009 at 7:01 AM, Simon Pieters <simonp at opera.com> wrote:
> I think this is a bug in execCommand('indent') and should be fixed in
> browsers.

I disagree.  I found when writing my spec that having <ol>/<ul> nested
inside <li> complicates a lot of things needlessly, because the
<ol>/<ul> is visually part of the parent <ol>/<ul> and not the <li>.
I wound up aggressively normalizing list items so that if they contain
<ol> or <ul>, those are broken out to become siblings in any case
where it might cause problems:

http://aryeh.name/gitweb.cgi?p=editing;a=blob_plain;f=editing.html;hb=e4c523de#normalize-sublists

I don't see that this has any semantic problem.  We can simply define
the semantics of <ol><li>foo</li><ol><li>bar</ol></ol> as being the
same as those of <ol><li>foo<ol><li>bar</ol></ol>.
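
The normalization mentioned above amounts to something like the following
Python sketch.  It is a simplified illustration, not the linked
"normalize sublists" algorithm: Node, is_list() and normalize_sublists()
are hypothetical names, children are modelled as a plain list of strings
and Nodes, and details such as whitespace nodes are ignored:

```python
class Node:
    """Hypothetical stand-in for a DOM element; children are str or Node."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = list(children)

    def serialize(self):
        inner = "".join(
            c if isinstance(c, str) else c.serialize() for c in self.children
        )
        return f"<{self.tag}>{inner}</{self.tag}>"


def is_list(node):
    return isinstance(node, Node) and node.tag in ("ol", "ul")


def normalize_sublists(lst):
    """Move any <ol>/<ul> found inside an <li> of lst out to be a sibling."""
    result = []
    for item in lst.children:
        if is_list(item):
            normalize_sublists(item)  # already nested directly; recurse
            result.append(item)
            continue
        result.append(item)
        if not (isinstance(item, Node) and item.tag == "li"):
            continue
        old_children, item.children = item.children, []
        current = item  # the <li> currently receiving non-list content
        for child in old_children:
            if is_list(child):
                normalize_sublists(child)
                result.append(child)  # sublist becomes a sibling of the <li>
                current = None
            else:
                if current is None:
                    # Content after a sublist goes into a new <li>.
                    current = Node("li")
                    result.append(current)
                current.children.append(child)
    lst.children = result


nested = Node("ol", Node("li", "foo", Node("ol", Node("li", "bar"))))
normalize_sublists(nested)
print(nested.serialize())
```

Under this model the <li>-nested form is rewritten into the directly
nested form, which is why treating the two as semantically identical
keeps the normalization harmless.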

On Thu, Sep 3, 2009 at 8:29 PM, Ian Hickson <ian at hixie.ch> wrote:
> There are lots of bugs that need fixing with execCommand(); I don't see
> why this wouldn't be one of them.

I've considered the issue carefully and determined that it's HTML5's
authoring requirements that should change here, not browsers'
execCommand() behavior.  Basically, to avoid extra branching, we want
to ensure that all nested lists we deal with are consistent, either
all nested directly or all nested inside <li>'s.  Having them nested
directly makes things simpler.

For instance, suppose you have markup like
<ol><li>foo<ol><li>bar</ol>baz</ol> and the user outdents the middle
item.  This needs to be a special case: you have to notice that you're
a descendant of an <li>, break that into two, and stick the <ol> in
between.  If you first normalize and make the markup look like
<ol><li>foo</li><ol><li>bar</ol><li>baz</ol>, outdenting the middle
item can completely ignore the outer list with no need for a special
case.
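
To make the "no special case" point concrete, here is a minimal Python
sketch of outdenting an item from a directly nested list.  It is an
assumption-laden illustration, not the spec's outdent algorithm, which
covers many more cases; Node and outdent_item() are hypothetical names:

```python
class Node:
    """Hypothetical stand-in for a DOM element (not a real DOM API)."""

    def __init__(self, tag, text="", children=()):
        self.tag = tag
        self.text = text
        self.parent = None
        self.children = []
        for child in children:
            child.parent = self
            self.children.append(child)

    def serialize(self):
        inner = self.text + "".join(c.serialize() for c in self.children)
        return f"<{self.tag}>{inner}</{self.tag}>"


def outdent_item(item):
    """Move item out of its directly nested list into that list's parent.

    The inner list is split around item if it has siblings on both sides.
    Nothing here inspects the outer list beyond inserting into it.
    """
    inner = item.parent
    outer = inner.parent
    i = inner.children.index(item)
    before, after = inner.children[:i], inner.children[i + 1:]

    replacement = []
    if before:
        inner.children = before
        replacement.append(inner)
    replacement.append(item)
    if after:
        # Reuse the inner list for the tail unless it already holds `before`.
        tail = Node(inner.tag) if before else inner
        tail.children = after
        for child in after:
            child.parent = tail
        replacement.append(tail)

    pos = outer.children.index(inner)
    outer.children[pos:pos + 1] = replacement
    for node in replacement:
        node.parent = outer


# The normalized markup: <ol><li>foo</li><ol><li>bar</ol><li>baz</ol>
bar = Node("li", "bar")
normalized = Node("ol", children=[
    Node("li", "foo"),
    Node("ol", children=[bar]),
    Node("li", "baz"),
])
outdent_item(bar)
print(normalized.serialize())
```

In this sketch the outer list is only ever the place the item is moved
to; there is no need to find and split an enclosing <li>.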

Or take another case: consider the markup
<ol><li>foo<li>bar<ol><li>baz</ol></ol>, and have the user select
"bar" and try to indent it.  It looks like "baz" is a separate item,
so the user will expect it not to get indented along with "bar".  This
means you need to do a bunch of complicated stuff: first you need to
look for a previous list item to move things into ("foo" in this
case), then check if there's already an <ol> at the end of it, merge
into that <ol> if so, otherwise create a new <ol>, move the whole <li>
containing "barbaz" into the new <ol>, move the "baz" <li> out of its
<ol>, and remove the now-empty <ol>.  Or something like that.  If I
first normalize to
<ol><li>foo<li>bar</li><ol><li>baz</ol></ol>, I can wrap the bar <li>
with no extra work at all, reusing my "wrap" algorithm that checks for
siblings that match given criteria and creates a new wrapper if none
are found.

There are lots of cases like this.  Often if a list is directly nested
inside another list without an intervening <li>, you can treat it
either like a non-nested list or like an <li>.  I found that not
normalizing lists to be directly nested just created pointless
headaches for no gain.  There's no obstacle to us defining new
semantics in this case and making it valid.

