[whatwg] Sortable Tables

Pierre Dubois duboisp2 at gmail.com
Fri Nov 9 20:00:52 PST 2012


On Tue Nov 6 11:25:21 PST 2012, Ian Hickson wrote:
>
> [snip]
> This is a very interesting idea.
>
> Is this something browser vendors would be interested in implementing? I'm
> hesitant to add a feature for this (which could be somewhat involved)
> before having the definite interest of some browser implementors.

I do not represent a browser vendor, nor am I a browser developer,
but I would be happy to create a JavaScript polyfill
for the sortable table, as defined by the WHATWG spec, for the
browsers currently supported by the WxT Toolbox.


On Tue Nov 6 11:39:35 PST 2012, Ojan Vafai wrote:
>
> [snip]
> A couple thoughts off the top of my head:
> 1. Would sorting actually reorder the DOM nodes or just change their visual
> order? It's not clear to me which one is better. I think the former is what
> you'd want most of the time.

From my point of view, the DOM and the associated API should reflect
the re-ordering. That would allow charts to be drawn on the fly based
on the user's preferred order.

> 2. What values should the sort property allow. One idea is that it takes a
> JS function similar to what JavaScript's sort function takes. If you leave
> it out then it just does alphanumeric sort.

I agree that it should take a JS "callback" function similar to the
one JavaScript's sort function takes. The callback function could
receive the DOM cell elements as parameters.
Instead of alphanumeric order by default, it would be more relevant to
sort numerically first, then alphabetically. That sort could be based
on the first word in the cell and then, if required, move on to the
following words to complete the comparison.
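
A minimal sketch of such a default comparator, under assumptions that
are mine and not part of any spec: the function name compareWords is
hypothetical, it receives the cells' text as plain strings rather than
DOM elements, and a "word" is any whitespace-separated token.

```javascript
// Hypothetical default comparator: numbers sort before text,
// and cells are compared word by word.
function compareWords(a, b) {
  const wordsA = a.trim().split(/\s+/);
  const wordsB = b.trim().split(/\s+/);
  const length = Math.max(wordsA.length, wordsB.length);
  for (let i = 0; i < length; i++) {
    const wa = wordsA[i];
    const wb = wordsB[i];
    // When all earlier words are equal, the shorter cell sorts first.
    if (wa === undefined) return -1;
    if (wb === undefined) return 1;
    const na = Number(wa);
    const nb = Number(wb);
    const aIsNum = wa !== "" && !Number.isNaN(na);
    const bIsNum = wb !== "" && !Number.isNaN(nb);
    if (aIsNum && bIsNum) {
      if (na !== nb) return na - nb; // numeric comparison
    } else if (aIsNum !== bIsNum) {
      return aIsNum ? -1 : 1; // a number sorts before text
    } else {
      if (wa < wb) return -1; // plain string comparison
      if (wa > wb) return 1;
    }
    // Words are equal: go on to the next word.
  }
  return 0;
}
```

With this sketch, ["10 apples", "apple", "2 pears", "2 apples"] sorted
with compareWords gives "2 apples", "2 pears", "10 apples", "apple".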

> 3. What elements does it go on? I don't see what it would do on a td. I
> could see putting it on a th though.

I also do not see the added value of applying sort to a td element.
Applying sort to th seems an interoperable solution for data tables
that use the proper markup and are accessible.

> Also, it's not clear to me what would
> get sorted. For example, in some tables, you would group trs inside tbodys
> and want to sort those.

In my opinion, that depends on the actual scope of the "th" element.

If the "th" is an empty cell or is used for "layout", the sorting
functionality would not be available.
If the "th" is a "group header", the sorting functionality would be
applied to the header cell with its data kept fixed to it, where the
header cell is a subgroup header and/or a header that represents one
or more rows or columns.
If the "th" is a "header", the sorting functionality could be applied
to the associated data cells, and by default the sorting action would
extend to the other axis [row|col].

I think it is very important to always keep the data relationships
with their associated header cells. This could be required, for
accessibility, by a screen reader.

I think the vectors (col|tr) get sorted within the respective group
where the sort occurs. No group (tbody & colgroup) re-ordering should
happen during a sort applied to a header cell related to a vector
(tr|col). A group can be re-ordered only when the sorting action is
applied to the group header cell that represents that group.
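
To illustrate the idea of sorting inside each group only, here is a
rough sketch. It assumes a simplified model that is mine: each tbody
is an array of rows, each row is an array of cell values, and
sortWithinGroups is a hypothetical helper name.

```javascript
// Sort the rows of every group on one column, without ever
// changing the order of the groups themselves.
function sortWithinGroups(groups, columnIndex, compare) {
  return groups.map((rows) =>
    // slice() copies the group so the input is left untouched.
    rows.slice().sort((r1, r2) => compare(r1[columnIndex], r2[columnIndex]))
  );
}

// Usage: two groups (think of two tbody elements), sorted on column 0.
const byString = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
const groupedRows = [
  [["z"], ["m"]], // first group
  [["b"], ["a"]], // second group
];
const sortedGroups = sortWithinGroups(groupedRows, 0, byString);
// The first group still comes first, even though "a" and "b"
// would sort before "m" and "z" in a flat sort.
```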

As the concept was already defined in a proposal I made
(http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-October/037679.html),
I would recommend that
no sorting occur on vectors (col|tr) inside a summary group. But
the summary group would always follow its associated data group
during group sorting.

Also, there is currently no spec and/or official best practice on how
to get the full potential of column grouping and row grouping.
Before enabling any kind of official table sorting, the specification
should be able to properly handle complex table relationships,
in an accessible manner, without needing to take the "scope" and
"headers" attributes into consideration.
See the proposal to remove the headers and scope attributes:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-September/037475.html



On Tue Nov 6 11:55:06 PST 2012, Boris Zbarsky wrote:
>
> [snip]
> Another obvious question: how does (or should) sorting interact with
> rowspans?

Maybe data cells with a rowspan and/or colspan could force
the related column and/or row to be fixed.
That would not break the cell relationships, but it could make the
table impossible to sort. Also, when a sort would affect
a rowspan/colspan cell, maybe only the first cell value in the
sort order would be considered and the other cells would be ignored.

Use case: A data table that has row headers and column headers.
Rows and columns that are in the scope of a data cell (td) with a
rowspan and colspan would be fixed.

Use case: A data table that only has row headers.
Rows that are in the scope of a rowspan data cell (td) would be fixed.

Use case: A data table that only has column headers.
Columns that are in the scope of a colspan data cell (td) would be fixed.
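
As a rough sketch of the row-only use case, and assuming a simplified
model that is mine (rows are arrays of cell objects with an optional
rowspan property, and the helper names are hypothetical), the fixed
rows could be found and kept in place like this:

```javascript
// A row is treated as fixed when a cell with rowspan > 1 covers it.
function findFixedRows(rows) {
  const fixed = new Set();
  rows.forEach((cells, rowIndex) => {
    for (const cell of cells) {
      const span = cell.rowspan || 1;
      if (span > 1) {
        const end = Math.min(rowIndex + span, rows.length);
        for (let r = rowIndex; r < end; r++) fixed.add(r);
      }
    }
  });
  return fixed;
}

// Sort only the movable rows; fixed rows keep their position.
function sortMovableRows(rows, compareRows) {
  const fixed = findFixedRows(rows);
  const movable = rows.filter((_, i) => !fixed.has(i)).sort(compareRows);
  let next = 0;
  return rows.map((row, i) => (fixed.has(i) ? row : movable[next++]));
}

// Usage: rows 1 and 2 share a rowspan cell, so only rows 0 and 3 move.
const spanRows = [
  [{ text: "3" }, { text: "x" }],
  [{ text: "9" }, { text: "span", rowspan: 2 }],
  [{ text: "8" }],
  [{ text: "1" }, { text: "y" }],
];
const byFirstCell = (a, b) => Number(a[0].text) - Number(b[0].text);
const spanSorted = sortMovableRows(spanRows, byFirstCell);
```

Only rowspan is handled here; colspan and fixed columns would need the
same treatment on the other axis.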



On Tue Nov 6 16:17:09 PST 2012, Christoph Päper wrote:
>
> [snip]
> >> Note that ‘col’ and ‘colgroup’ elements are hardly supported.
> But they’re essential for assigning sort properties.
>
>   <col key=…>
>   <colgroup key=…>
>
> A ‘col’ inherits the ‘key’ from a parent ‘colgroup’, but may override it.

I like the concept, but it needs to be replicated for 'tbody' and
'tr'. The same would apply to the group header cell defined
by the 'th' element.

> [snip]
> The default ‘key’ is ‘auto’ for explicit columns and ‘none’ for implicit columns.

I think the default 'key' should be also applied on implicit columns.

> [snip]
> >> Therefore columns should bear (…) what kind of content their cells have.
>
> Authors will mess this up of course, but then it’s their fault. Let’s not overload ‘title’ or ‘abbr’.

By default, the sort can be based on the inner text of the cell,
without considering any CSS effects or other flow content
elements.
E.g.:
<table sortable>
<tr><th>Column
<tr><td><a href="#">DDD</a>
<tr><td><span style="display:none">AAA</span> HHH
<tr><td><table><tr><td>BBB</table>
</table>
could give the following result after an ascending sort is applied to
the header cell "Column":
<table sortable>
<tr><th>Column
<tr><td><span style="display:none">AAA</span> HHH
<tr><td><table><tr><td>BBB</table>
<tr><td><a href="#">DDD</a>
</table>

If the data cell 'td' has specialized structured content, then the
callback function would be useful for sorting.
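
A crude sketch of that default, assuming the cells are available as
HTML strings (an assumption of mine so the snippet runs outside a
browser). The helper names are hypothetical, and the tag-stripping
regular expression is only an approximation; in a browser one would
read the cell's textContent instead.

```javascript
// Approximate a cell's inner text by removing tags, ignoring CSS.
function plainText(cellHtml) {
  return cellHtml
    .replace(/<[^>]*>/g, "") // drop every tag, keep the text
    .replace(/\s+/g, " ")
    .trim();
}

// Ascending sort of cells on their plain text.
function sortCellsByText(cells) {
  return cells.slice().sort((a, b) => {
    const ta = plainText(a);
    const tb = plainText(b);
    return ta < tb ? -1 : ta > tb ? 1 : 0;
  });
}

// The three data cells from the example above.
const exampleCells = [
  '<a href="#">DDD</a>',
  '<span style="display:none">AAA</span> HHH',
  "<table><tr><td>BBB</table>",
];
const exampleSorted = sortCellsByText(exampleCells);
// Order: the "AAA HHH" cell, then the "BBB" cell, then the "DDD" cell.
```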


> >> Several columns may be used for sorting by some kind of priority.
>
> This is a UI question, though.

Is there a use case for that? If so, I think that should be
explicitly specified in the same place as the proposed 'key'
attribute.

> [snip]
> >> Cell content may not consist of the string that should be used verbatim (…).
> >> Cells should have an optional attribute indicating their sort key.
>
>   <th value="Rolling Stones, The">The Rolling Stones
>   <td value="0.454">1 lb

I do not see the added value of having a 'value' attribute on th/td
elements. That would just duplicate information already
contained in the cell. A callback function can be a good
alternative for that issue. I do not see overloading the HTML code
as a best practice.


> [snip]
> To support this, cells must be splittable!
>
>   td {color: green;}
>   #split {color: red;}
>
>   <tr><td>3 <td id=split colspan=2> red
>   <tr><td>1
>   <tr><td>2 <td> green
>
> after sorting by the first column should look like
>
>   <tr><td>1 <td id=split> red
>   <tr><td>2 <td> green
>   <tr><td>3 <td id=split> red
>
> would if duplicate IDs were legal. The DOM tree, however, would not change! The value of the cell at position (1,1), i.e. second row and column since we count from zero, is always undefined, but the value of the slot at (1,1) changes from “red” to “green”.

That would break the tabular data integrity, and the cells (data cells
and header cells) could lose their relationships.

> Once we have splittable cells, one could imagine additions to the CSS table model and Selectors that allowed arbitrary partial repositioning of slots, think 15-puzzle. Let’s not go there yet, and not here.

I do not have actual numbers, but my common sense and my experience
tell me that the probability of an author creating a puzzle from a
data table is very, very low compared with that of an author creating
a complex table.

> User agents should not be required to sort tables that contain malformed slots,

I would say: User agents should not be required to sort tables that
contain malformed tabular cells, where the table width is inconsistent
between rows
or the height is inconsistent between columns.
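
A sketch of such a check, under a simplified model that is mine: rows
are arrays of cell objects with optional colspan and rowspan
properties, and hasConsistentShape is a hypothetical name. Treating a
rowspan that runs past the last row as malformed is also my
assumption.

```javascript
// Return true when every row covers the same number of columns.
function hasConsistentShape(rows) {
  const widths = new Array(rows.length).fill(0);
  for (let r = 0; r < rows.length; r++) {
    for (const cell of rows[r]) {
      const colspan = cell.colspan || 1;
      const rowspan = cell.rowspan || 1;
      // Assumption: a span running past the last row is malformed.
      if (r + rowspan > rows.length) return false;
      // The cell adds its width to every row it covers.
      for (let k = r; k < r + rowspan; k++) widths[k] += colspan;
    }
  }
  return widths.every((w) => w === widths[0]);
}

// A two-column table whose second row has only one cell.
const raggedTable = [[{}, {}], [{}], [{}, {}]];
// The same table with the short row filled by a colspan cell.
const filledTable = [[{}, {}], [{ colspan: 2 }], [{}, {}]];
const raggedOk = hasConsistentShape(raggedTable); // false
const filledOk = hasConsistentShape(filledTable); // true
```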

A data table like this would be ignored because the second row is
missing a data cell in the second column:

<table>
<tr><th>Column 1 <th>Column 2
<tr><td>Data 1
<tr><td>Data 1 <td>Data 2
</table>

> There may be rows that should always go to the top or the bottom of their group. This should be handled by table headers and footers in most cases, otherwise it is not covered yet.

See my proposal:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-September/037475.html
and the associated working example available here:
http://wet-boew.github.com/wet-boew/demos/tableparser/index-eng.html



On Wed Nov 7 01:54:19 PST 2012, Jirka Kosek wrote:
>
> [snip]
> It would be very difficult to support sorting on dates and numbers as in
> HTML they are usually present formatted using specific locale. So there
> should be additional attribute added to td/th which can hold sort key
> which will override cell contents, something like
>
> <td sortas="2012-11-07">11. listopadu 2012</td>

Not if it is possible to use a callback function to sort the cells.



On Wed Nov 7 02:34:54 PST 2012, Stuart Langridge wrote:

>
> [snip]
> Therefore, it attempts to guess the type of a
> table column: if a column looks like it contains numbers, sorttable will
> use numeric sort (1 before 2 before 100) rather than alphanumeric sort (1
> before 100 before 2); if a column looks like it contains date information,
> then sorttable will sort by date (for formats DD/MM/YYYY and MM/DD/YYYY).
> The algorithm used for this guessing is pretty naive (check the first cell
> in a column; if it's blank, check the next one; etc). I think that this, by
> itself, has accounted for sorttable's popularity, because in most cases, it
> Just Works; you add a <script> element pointing to the script, and
> class="sortable" to the <table>, and do *nothing else*, and your table is
> sortable without any configuration.
>
> Everything else below here is configuration-based: something you'd have to
> do explicitly as an author. The above point is the critical one; guessing
> column types to make table sorting be zero-config. Some alternative scripts
> require you to explicitly tag date or numeric columns, and I think that
> authors see that as annoying. Anecdata, of course.
>
> Sorttable also allows authors to specify "alternate content" for a cell.
> That is (ignore the invalid HTML attribute here; I didn't know any better,
> and we didn't have data-* attributes when I wrote this stuff)
>
> <td sorttable_customkey="11">eleven</td>
>
> This is basically useful for when you have table data which has a definite
> order but it can't be autoguessed, or (more usefully still) when it could
> be autoguessed but that would be hard. The canonical example of this is
> dates: it would be exceedingly annoying, given
> <td>Wed 7th November, 10.00am GMT</td>
> to have to parse that cell content in JavaScript to turn it back into a
> Date() so it can be placed in sort order with other dates. The sorttable.js
> solution is to specify a "custom key", which sorttable pretends was the
> cell content for the purposes of sorting, so
> <td sorttable_customkey="20121107-100000">Wed 7th November, 10.00am GMT</td>
> and then the script can sort it. This feature is basically the get-out
> clause, an author hook for saying "I know what I want, but your fancy
> sorting thing can't handle it; how do I override that?" They can specify
> custom keys for all their TDs and then sorting will work fine. (Obviously,
> dates are less of a problem in theory today with <date> elements, but...
> how does the script know to use the datetime attribute of the <date> in
> <td><date>...</date></td>?)

That seems a good use case for adding a JavaScript callback to sort
complex data cells.

> [snip]
> 3. Multiple header rows. Many authors have two or more <tr>s in the
> <thead>, one of which contains rowspanned <th>s, to group columns together.
> If this happens, which <th>s are clickable to sort the table? Which are
> not? This is hard to autodiagnose (and indeed sorttable punts on it and
> picks the first one, which is almost certainly wrong; even naively picking
> the last <tr> inside <thead> would be better, but still imperfect).

That is one thing the Table Usability Algorithm resolves.

> 4. Handling colspans and rowspans in the table. Sorttable.js basically
> punts on this, because what's expected to happen when you sort a column
> which contains only half a cell (because the other half's in another
> column, with rowspan=2) is wildly author-specific. But a properly specced
> solution doesn't get to punt and say "unsupported". This will need some
> thought.

As already proposed above, a data cell that has a colspan and/or
rowspan would fix (freeze) the rows and columns related to it.
That would preserve
the integrity of the tabular data. So a table could have cells with
colspan and rowspan and still be sortable.

> 5. Numeric sort handling exponented numbers such as 1.5e6 (which do not
> match a naive "is this a number" regexp such as /^[0-9]+$/ )

Another use case for adding a JavaScript callback.
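
For instance, a callback along these lines would cope with exponent
notation, because JavaScript's Number() accepts strings such as
"1.5e6". The name compareNumeric is hypothetical, and placing
non-numeric cells after numeric ones is my assumption.

```javascript
// Compare two cell texts numerically, accepting forms like "1.5e6".
function compareNumeric(a, b) {
  const ta = a.trim();
  const tb = b.trim();
  const na = Number(ta);
  const nb = Number(tb);
  const aOk = ta !== "" && Number.isFinite(na);
  const bOk = tb !== "" && Number.isFinite(nb);
  if (aOk && bOk) return na - nb;
  if (aOk) return -1; // numbers before anything else
  if (bOk) return 1;
  return ta < tb ? -1 : ta > tb ? 1 : 0; // fall back to text order
}

const numericSorted = ["1.5e6", "200", "abc", "3e2"].sort(compareNumeric);
// Order: "200", "3e2", "1.5e6", "abc"
```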

> 6. Specifying how to display that a column is sorted. This would likely be
> done in this specification by leaving it to CSS and th::sorted-forward {
> after: content("v"); } or some such thing (I have no policy suggestions
> here), but authors want to be able to specify this, along with different
> styles for a sorted column. This is mildly more awkward because there's no
> real concept of a column in the DOM of an HTML table, but perhaps all the
> TDs could grow a pseudo ::sorted-forward or something (handwaving here like
> mad, obviously).

Maybe instead of styling the data cell 'td' it would be more relevant
to style the header cell 'th' where the last sort action happened.
Knowing where the last
sort happened can be useful information for users of screen
readers.

> 7. Case sensitivity in alphannumeric sorting. Some people like it, some
> people don't; it's good to have some sort of author-controllable switch.
> (Obviously solveable with <td
> sorttable_customkey="INSENSITIVE">Insensitive</td> in the limit case, and
> this, like many other things on this list, suggests that some sort of "here
> is the JavaScript function I want you to use to produce sort keys for table
> cells in this column" function is a useful idea. Sorttable allows this, and
> people use it a lot.)

Which is more usable: a default sort that is case-insensitive
or one that is case-sensitive? If authors do not want the default
case sensitivity,
they can use a JavaScript callback.
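
Such a callback can be very small. This sketch assumes that lowercasing
both texts is an acceptable way to ignore case; the function name is
hypothetical.

```javascript
// Compare two cell texts while ignoring letter case.
function compareIgnoringCase(a, b) {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  return la < lb ? -1 : la > lb ? 1 : 0;
}

const caseSorted = ["cherry", "Banana", "apple"].sort(compareIgnoringCase);
// Order: "apple", "Banana", "cherry"
```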

> [snip]
> but there's no <ol> concept for <tr>s).

'ol' is an ordered list. 'ul' is an unordered list. Because this
discussion is about adding sortable functionality to tables, 'tr'
and 'col' need to be
considered unordered.

> 9. A commonly requested type of things to know how to automatically sort is
> IP addresses. (I solve this by forwarding people the email explaining how
> to add a new sort type function to sorttable, because I've never got around
> to adding it to the script.)

Another use case for adding a JavaScript callback.
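
A sketch of what such a callback could look like for IPv4 addresses.
The function names are hypothetical, and sorting cells that are not
valid addresses after the valid ones is my assumption.

```javascript
// Turn "a.b.c.d" into a single number, or NaN if it is not valid IPv4.
function ipv4ToNumber(text) {
  const parts = text.trim().split(".");
  if (parts.length !== 4) return NaN;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return NaN;
    const octet = Number(part);
    if (octet > 255) return NaN;
    value = value * 256 + octet;
  }
  return value;
}

// Compare two cell texts as IPv4 addresses.
function compareIPv4(a, b) {
  const na = ipv4ToNumber(a);
  const nb = ipv4ToNumber(b);
  const aBad = Number.isNaN(na);
  const bBad = Number.isNaN(nb);
  if (aBad && bBad) return a < b ? -1 : a > b ? 1 : 0;
  if (aBad) return 1; // invalid addresses go last
  if (bBad) return -1;
  return na - nb;
}

const ipSorted = ["10.0.0.2", "192.168.1.1", "10.0.0.10", "9.255.255.255"]
  .sort(compareIPv4);
// Order: "9.255.255.255", "10.0.0.2", "10.0.0.10", "192.168.1.1"
```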

> [snip]
> 13. What happens if a table has multiple <tbody> elements? Do they sort as
> independent units, or mingle together? Sorttable just sorts the first one
> and ignores the rest, because multiple tbodies are uncommon, but that's not
> really acceptable ;-)

The spec should take into consideration the grouping relationships as
proposed here: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-September/037475.html

> 14. Fixed-position rows. Many authors have a "totals" row at the bottom of
> their table which should remain at the bottom of the table even after
> sorting, which is easily handled (that's what <tfoot> is for), but some
> authors also have rows midway through the table which are "headers": this
> especially shows up in long tables, where the column headers from <thead>
> are repeated midway down the table and should remain in position even when
> the table is sorted. In general this means that they should remain the same
> number of rows away from <thead>. This case is odd, and sorttable.js
> doesn't handle it, but lots of people ask for it.

For me, having midway "headers" used for columns could be an
accessibility/usability issue. Instead, I think it would be more
relevant to have another option set on
the 'table' element to make it possible to keep the relevant
header cells (column and row group header cells) in the available
viewport.

On Thu Nov 8 08:09:17 PST 2012, Christoph Päper wrote:
>
> [snip]
> The sorting algorithm should not work on cells, but on slots (or slot values rather).
>
> Cells spanning multiple rows or columns may have to be split into one cell per slot and should be rejoined afterwards if possible. Note that ‘rowspan’ itself is safe for vertical sorting, unless it spans a ‘fixed’ column. Also, ‘colspan‘ is safe when it appears in the column to be sorted by.

It is important that cells always keep their relationships
during the sort. So even if a cell is split into slots during the
sort, those slots need
to be rejoined as they were before the sort. That is to keep the
integrity of the tabular data.



Cheers

:-)

Pierre Dubois


