[whatwg] Sortable Tables
Ian Hickson
ian at hixie.ch
Thu Dec 27 18:04:25 PST 2012
I've added a feature to HTML to enable users (and authors) to sort tables.
The basic design of the feature is that if a column's <th> has a sorted=""
attribute, the UA will sort the table every time the mutation observers
would fire (before they fire). A table can have a sortable="" attribute,
which lets the user tell the user agent to add sorted="" attributes to
columns to sort them.
On Tue, 6 Nov 2012, Ojan Vafai wrote:
> On Tue, Nov 6, 2012 at 11:25 AM, Ian Hickson <ian at hixie.ch> wrote:
> > On Thu, 1 Jul 2010, Christoph Päper wrote:
> > >
> > > For starters, only rows inside tbodys shall be reordered. For now
> > > columns don't have to be reordered, ie. only vertical, no horizontal
> > > sorting.
Done.
> > > Nevertheless the design should make it possible to add the other
> > > direction later.
Well I guess nothing would stop us supporting sorted="" on <th>s at the
front of a row, but boy, that would be a lot more complicated to do. You'd
have to be moving cells around all over the place.
> > > Not every table has content that makes sense to be sorted in a
> > > different order. So sortable tables should be marked as such. Note
> > > that col and colgroup elements are hardly supported.
<table sortable>.
> > > Not every column has content that makes sense to be sorted in a
> > > different order. So non-sortable columns inside sortable tables
> > > should be marked as such.
Any column with a <th> is sortable, for now. We can add a "nosort" column
or something later if this becomes a problem.
> > > There are different ways to sort, eg. numeric, temporal or
> > > alphabetic and ascending or descending. Therefore columns should
> > > bear information how they should be sorted, ie. what kind of content
> > > their cells have.
Ascending/descending is supported (sorted="reversed").
Any temporal syntax supported by <time> can be used by putting <time> as
the only child of the cells to sort.
I intend to spec some sort of algorithm for doing numeric/string
comparison, but haven't yet come up with a good solution. If you have any
suggestions, this is the bug tracking this issue:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20524
> > > Several columns may be used for sorting by some kind of priority.
You can set sorted="" on multiple columns' headers, and give a sort key
cardinality in each, as in sorted="1", sorted="2", etc.
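As a rough illustration only (not spec text), here is how several sort keys with priorities might combine. The `columns` descriptors, `compareCells` and `sortRows` are hypothetical names made up for this sketch, rows are modelled as plain arrays of strings, and the cell comparison is plain string ordering because the real comparison algorithm is still undecided (see the bug above).

```javascript
// Hypothetical model: one descriptor per column whose <th> carries sorted="".
// priority is the sort key cardinality (1 = primary key);
// reversed stands for a descending column.
const columns = [
  { index: 1, priority: 2, reversed: false },
  { index: 0, priority: 1, reversed: true },
];

// Placeholder comparison (an assumption): plain string ordering.
function compareCells(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function sortRows(rows, columns) {
  // Consult the keys in priority order: primary first, then tie-breakers.
  const keys = [...columns].sort((a, b) => a.priority - b.priority);
  // Array.prototype.sort is stable (ES2019 and later), so rows that tie
  // on every key keep their original relative order.
  return [...rows].sort((rowA, rowB) => {
    for (const { index, reversed } of keys) {
      const result = compareCells(rowA[index], rowB[index]);
      if (result !== 0) return reversed ? -result : result;
    }
    return 0;
  });
}

const rows = [
  ["b", "2"],
  ["a", "1"],
  ["b", "1"],
];
const sortedRows = sortRows(rows, columns);
```

Here the first column is the primary key (descending) and the second column only breaks ties between rows whose first cells are equal.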
> > > The original order must be restorable.
This I have not supported. I don't see how to support it sanely.
> > > Cell content may not consist of the string that should be used
> > > verbatim for sorting purposes, eg. leading articles or similar
> > > numbers with different units (g, kg, t …). Cells should have
> > > an optional attribute indicating their sort key. The time element
> > > already provides the necessary metadata features for temporal
> > > sorting; maybe there should be more of such elements instead.
I've used <data> for this, alongside <time>.
> > > There may be columns that shall remain stable, eg. rank numbers.
I haven't supported this. I've no idea how to do this sanely, especially
given cells with column and row spans.
> 1. Would sorting actually reorder the DOM nodes or just change their
> visual order? It's not clear to me which one is better. I think the
> former is what you'd want most of the time.
I've gone with reordering the DOM nodes. Things like :nth-child styling
become nigh on impossible without doing it at the DOM level, not to
mention the confusion that would reign from having such a dramatic
disconnect between rendering and DOM (e.g. with abs pos, etc).
> 2. What values should the sort property allow. One idea is that it takes
> a JS function similar to what JavaScript's sort function takes. If you
> leave it out then it just does alphanumeric sort.
I was going to have a comparator function, but I couldn't see a sane way
to make it work in the face of hostile functions that mutate the DOM, so
I dropped it. You can do custom sort orders by giving a key in the <data>
element's value="" attribute, though.
> 3. What elements does it go on? I don't see what it would do on a td. I
> could see putting it on a th though. Also, it's not clear to me what
> would get sorted. For example, in some tables, you would group trs
> inside tbodys and want to sort those.
sorted="" goes on a column-heading <th>, ideally in a <thead> but you can
also put it on the first row of your <tbody> if you don't have a <thead>.
Rows are sorted on a per-group basis. Rows that span each other are
treated as one row for sorting.
On Tue, 6 Nov 2012, Boris Zbarsky wrote:
>
> Another obvious question: how does (or should) sorting interact with
> rowspans?
The sort algorithm groups rows that span each other together and treats
them as one (using the data in their top row for sorting).
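A minimal sketch of that grouping, under stated assumptions: rows are modelled as arrays of `{ text, rowspan }` objects rather than DOM nodes, and `groupSpanningRows` and `sortGroups` are hypothetical helper names, not anything defined by the spec.

```javascript
// Split rows into groups; a group keeps growing while any cell in it
// still spans into the following row.
function groupSpanningRows(rows) {
  const groups = [];
  let current = [];
  let groupEnd = 0; // index one past the last row covered by the current group
  rows.forEach((row, i) => {
    if (current.length > 0 && i >= groupEnd) {
      groups.push(current);
      current = [];
    }
    current.push(row);
    for (const cell of row) {
      groupEnd = Math.max(groupEnd, i + (cell.rowspan || 1));
    }
  });
  if (current.length > 0) groups.push(current);
  return groups;
}

// Sort whole groups, using only the top row of each group as the key.
// Plain string ordering is a stand-in for the real comparison.
function sortGroups(rows, columnIndex) {
  return groupSpanningRows(rows)
    .sort((a, b) => {
      const x = a[0][columnIndex].text;
      const y = b[0][columnIndex].text;
      return x < y ? -1 : x > y ? 1 : 0;
    })
    .flat();
}

const spanRows = [
  [{ text: "m", rowspan: 2 }, { text: "x" }],
  [{ text: "y" }],
  [{ text: "a" }, { text: "z" }],
];
const groupedResult = sortGroups(spanRows, 0);
```

The first two rows travel together because the "m" cell spans both, so the "a" row moves above the pair rather than between them.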
On Wed, 7 Nov 2012, Silvia Pfeiffer wrote:
>
> http://tympanus.net/codrops/2009/10/03/33-javascript-solutions-for-sorting-tables/
Interesting, thanks.
> Also, a sortable table's header needed some indication of the sortability,
> so some default CSS like this:
> th.sortable {
> &:after { content: " ▲▼"}
> &.current{
> &[data-direction="asc"]:after { content: " ▼"}
> &[data-direction="desc"]:after { content: " ▲"}
> }
> }
I haven't defined the styling in detail, pending both user agent
implementation experience and the addition of :sorted to CSS.
On Wed, 7 Nov 2012, Silvia Pfeiffer wrote:
> On Wed, Nov 7, 2012 at 8:37 PM, Jirka Kosek <jirka at kosek.cz> wrote:
> >
> > It would be very difficult to support sorting on dates and numbers as
> > in HTML they are usually present formatted using specific locale. So
> > there should be additional attribute added to td/th which can hold
> > sort key which will override cell contents, something like
> >
> > <td sortas="2012-11-07">11. listopadu 2012</td>
<td><time datetime="2012-11-07">11. listopadu 2012</time>
On Wed, 7 Nov 2012, Stuart Langridge wrote:
>
> I'm the author of http://www.kryogenix.org/code/browser/sorttable/, a
> moderately popular JavaScript table sorting script. As such, I have
> about nine years worth of anecdata about how authors want their HTML
> tables to be sorted, the sorts of things they request, and issues that
> may be worth taking into consideration. These are not particularly in
> order; they're just things that I think are relevant.
Thank you very much for your input, it was invaluable.
> Sorttable.js, my script, has the guiding principle of not needing
> configuration in most cases. Therefore, it attempts to guess the type of
> a table column: if a column looks like it contains numbers, sorttable
> will use numeric sort (1 before 2 before 100) rather than alphanumeric
> sort (1 before 100 before 2); if a column looks like it contains date
> information, then sorttable will sort by date (for formats DD/MM/YYYY
> and MM/DD/YYYY). The algorithm used for this guessing is pretty naive
> (check the first cell in a column; if it's blank, check the next one;
> etc). I think that this, by itself, has accounted for sorttable's
> popularity, because in most cases, it Just Works; you add a <script>
> element pointing to the script, and class="sortable" to the <table>, and
> do *nothing else*, and your table is sortable without any configuration.
I intend to do something along those lines for HTML's sorting algorithm
also, though that is still up in the air (see above).
> Everything else below here is configuration-based: something you'd have
> to do explicitly as an author. The above point is the critical one;
> guessing column types to make table sorting be zero-config. Some
> alternative scripts require you to explicitly tag date or numeric
> columns, and I think that authors see that as annoying. Anecdata, of
> course.
>
> Sorttable also allows authors to specify "alternate content" for a cell.
> That is (ignore the invalid HTML attribute here; I didn't know any
> better, and we didn't have data-* attributes when I wrote this stuff)
>
> <td sorttable_customkey="11">eleven</td>
<td><data value="11">eleven</data></td>
> This is basically useful for when you have table data which has a
> definite order but it can't be autoguessed, or (more usefully still)
> when it could be autoguessed but that would be hard. The canonical
> example of this is dates: it would be exceedingly annoying, given
> <td>Wed 7th November, 10.00am GMT</td> to have to parse that cell
> content in JavaScript to turn it back into a Date() so it can be placed
> in sort order with other dates. The sorttable.js solution is to specify
> a "custom key", which sorttable pretends was the cell content for the
> purposes of sorting, so <td sorttable_customkey="20121107-100000">Wed
> 7th November, 10.00am GMT</td> and then the script can sort it.
<td><time datetime="2012-11-07T10:00Z">Wed 7th November, 10.00am GMT</time></td>
> This feature is basically the get-out clause, an author hook for saying
> "I know what I want, but your fancy sorting thing can't handle it; how
> do I override that?" They can specify custom keys for all their TDs and
> then sorting will work fine. (Obviously, dates are less of a problem in
> theory today with <date> elements, but... how does the script know to
> use the datetime attribute of the <date> in <td><date>...</date></td>?)
In the case of the spec, if the <td> element's only child is a <time> or a
<data>, it knows to use the datetime="" or value="" attributes respectively.
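Roughly, and only as a sketch: the cell model below (`{ text, children }` with `{ tag, attributes }` children) and the name `sortKeyForCell` are assumptions for illustration, and falling back to the cell's text when the attribute is missing is also an assumption of this example.

```javascript
// Pick the value a cell should be sorted by.
function sortKeyForCell(cell) {
  if (cell.children.length === 1) {
    const [child] = cell.children;
    if (child.tag === "time" && "datetime" in child.attributes) {
      return child.attributes.datetime;
    }
    if (child.tag === "data" && "value" in child.attributes) {
      return child.attributes.value;
    }
  }
  // Assumed fallback: use the visible text of the cell.
  return cell.text;
}

const dataCell = {
  text: "eleven",
  children: [{ tag: "data", attributes: { value: "11" } }],
};
const timeCell = {
  text: "11. listopadu 2012",
  children: [{ tag: "time", attributes: { datetime: "2012-11-07" } }],
};
const plainCell = { text: "plain", children: [] };
```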
> In roughly descending order of popularity, here is what I've been asked
> questions about, over the last decade or so:
>
> 1. Sorting tables inserted after page load. This is obviously not a
> problem (sorting a table created with JS rather than in the base HTML),
> and sorttable should handle it without explicit action from the author
> to "mark" a table as sortable, but it doesn't because of laziness from
> me. I include it for completeness because sorttable not handling it
> generates probably a third of all the sorttable complaint email I
> receive; a properly specced sortable tables implementation in browsers
> would obviously handle this and wouldn't need to even have it specified.
Supported.
> 2. Sorting a table on page load. That is: a table in HTML containing
> unsorted data should be sorted by the browser when the page loads,
> without user action. Sorttable doesn't do this because I think it's
> wrong (if you want sorted data when the page loads, serve it as sorted
> in the HTML), but lots of people ask for it.
Supported, though I'm not sure how good an idea this will end up being.
> 3. Multiple header rows. Many authors have two or more <tr>s in the
> <thead>, one of which contains rowspanned <th>s, to group columns
> together. If this happens, which <th>s are clickable to sort the table?
> Which are not? This is hard to autodiagnose (and indeed sorttable punts
> on it and picks the first one, which is almost certainly wrong; even
> naively picking the last <tr> inside <thead> would be better, but still
> imperfect).
The spec picks the highest non-spanning <th> in a column, if there's a
<thead>. (If there's not, it uses the top row's <th>, if it doesn't span
columns.)
> 4. Handling colspans and rowspans in the table. Sorttable.js basically
> punts on this, because what's expected to happen when you sort a column
> which contains only half a cell (because the other half's in another
> column, with rowspan=2) is wildly author-specific. But a properly
> specced solution doesn't get to punt and say "unsupported". This will
> need some thought.
For column spanning, the spec's model basically just acts as if the cell
isn't spanning, but is in each column it spans.
So e.g. <td colspan=2>X</td> is treated as <td>X</td><td>X</td>, for the
purposes of sorting.
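In code terms the idea is something like the following sketch, where a row is modelled as an array of `{ text, colspan }` objects and `expandColspans` is a hypothetical helper name:

```javascript
// Repeat each cell's text once for every column the cell covers, so that
// every column has a value to sort by.
function expandColspans(row) {
  return row.flatMap((cell) => Array(cell.colspan || 1).fill(cell.text));
}

const expandedRow = expandColspans([{ text: "X", colspan: 2 }, { text: "Y" }]);
```

Only the sort keys are duplicated here; the cell itself is not split or cloned.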
> 5. Numeric sort handling exponented numbers such as 1.5e6 (which do not
> match a naive "is this a number" regexp such as /^[0-9]+$/ )
I'd like to support this as part of the algorithm mentioned before:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=20524
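Until that algorithm exists, the following is only a guess at the kind of check involved, not what the spec will say. `NUMBER_PATTERN` and `parseNumericKey` are hypothetical names; the pattern accepts an optional sign, a decimal part and an exponent, which the naive /^[0-9]+$/ check rejects.

```javascript
// Optional sign, digits with optional fraction (or a bare fraction),
// optional exponent such as e6 or E-3.
const NUMBER_PATTERN = /^[+-]?(\d+(\.\d*)?|\.\d+)(e[+-]?\d+)?$/i;

// Returns a number for numeric-looking text, or null otherwise.
function parseNumericKey(text) {
  const trimmed = text.trim();
  return NUMBER_PATTERN.test(trimmed) ? Number(trimmed) : null;
}

const exponentKey = parseNumericKey("1.5e6");
const notANumber = parseNumericKey("1.5e6 widgets");
```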
> 6. Specifying how to display that a column is sorted. This would likely
> be done in this specification by leaving it to CSS and
> th::sorted-forward { after: content("v"); } or some such thing (I have
> no policy suggestions here), but authors want to be able to specify
> this, along with different styles for a sorted column. This is mildly
> more awkward because there's no real concept of a column in the DOM of
> an HTML table, but perhaps all the TDs could grow a pseudo
> ::sorted-forward or something (handwaving here like mad, obviously).
I haven't specced this yet but once CSS has the :sorted pseudo (bug 20522)
I expect we'll be able to do something like:
th:sorted(ascending)::after { content: "v"; }
> 7. Case sensitivity in alphanumeric sorting. Some people like it, some
> people don't; it's good to have some sort of author-controllable switch.
> (Obviously solveable with <td
> sorttable_customkey="INSENSITIVE">Insensitive</td> in the limit case,
I intend to only support insensitive comparisons initially, but if that's
a problem we can definitely revisit it somehow. (It can't be worked around
easily, unlike the other way around.)
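For illustration, a case-insensitive comparison could look something like this. It is a simplification that lowercases both strings with `toLowerCase()`; the name `compareInsensitive` is hypothetical and real collation would need more care than this.

```javascript
// Compare two strings while ignoring case differences.
function compareInsensitive(a, b) {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  return x < y ? -1 : x > y ? 1 : 0;
}

const words = ["cherry", "Banana", "apple"];
const insensitiveOrder = [...words].sort(compareInsensitive);
const defaultOrder = [...words].sort();
```

With the default sort, "Banana" comes first because uppercase letters sort before lowercase ones; with the insensitive comparison, "apple" comes first.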
> and this, like many other things on this list, suggests that some sort
> of "here is the JavaScript function I want you to use to produce sort
> keys for table cells in this column" function is a useful idea.
> Sorttable allows this, and people use it a lot.)
I tried to do this but couldn't figure out a sane way to do it. A
comparator can totally destroy the table we're sorting, and I don't know
what to do if that happens.
> 8. Mark a column as not sortable. Note: this does not mean that clicking
> on that column doesn't sort it; it means that that column does not get
> sorted *even when the rest of the table does*. This gets requested for a
> sort of "left-hand header" concept, where the first column contains
> numbers, 1, 2, 3, 4 etc, one per row, to show which is row 1, row 2, row
> 3 etc of the table. Obviously this column should not be sorted when the
> rest of the table is. I'm not sure there's any good markup for this in
> HTML (<ol>s do it, but there's no <ol> concept for <tr>s).
I haven't supported this. To some extent, it's presentational, and thus
can be done using something like:
tr::before { display: table-cell; content: counter(row); }
...or some such.
> 9. A commonly requested type of things to know how to automatically sort
> is IP addresses. (I solve this by forwarding people the email explaining
> how to add a new sort type function to sorttable, because I've never got
> around to adding it to the script.)
This is something that should end up supported by the sorting algorithm
automatically.
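In the meantime, one way an author might build such a key by hand, purely as a sketch: `ipv4SortKey` is a hypothetical helper that only handles dotted-quad IPv4 addresses and returns null for anything else.

```javascript
// Turn "a.b.c.d" into a single number so that addresses compare numerically
// octet by octet instead of character by character.
function ipv4SortKey(text) {
  const parts = text.trim().split(".");
  if (parts.length !== 4) return null;
  let key = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    key = key * 256 + octet;
  }
  return key;
}

const addresses = ["10.0.0.10", "9.255.255.255", "10.0.0.2"];
const byAddress = [...addresses].sort((a, b) => ipv4SortKey(a) - ipv4SortKey(b));
```

A key like this could be placed in the value="" attribute of a <data> element inside each cell.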
> 10. Zebra-striped tables are a problem. Well, they're not a problem if
> you're striping with CSS (#mytable tr:nth-child(2n) td { background:
> #eee; }) but an awful lot of people bake the stripes into their HTML
> (<tr class="even">), and this gets screwed up if you sort the table. The
> solution here obviously might be to poke authors to do presentational
> stuff with CSS instead and then their problems go away, but *lots* of
> people complain about this.
:nth-child() is more widely supported than this feature, so I think it
makes sense to rely on the former if you're relying on the latter.
> 11. Authors like the idea of having script callbacks before and after a
> user action to sort, so they can do things to the table, show progress
> or an hourglass, etc. This would presumably be neatly handled by firing
> a "sort" event on the table or similar.
I've made 'sort' get fired at the table before the sort starts. Nothing is
fired after currently.
> 12. Stable sort: I recommend that the sort that's implemented be
> specified as being a stable sort, because people who care really want it
> and write me annoyed emails that it's not there, and no-one explicitly
> wants unstable sort. :)
Done.
> 13. What happens if a table has multiple <tbody> elements? Do they sort
> as independent units, or mingle together? Sorttable just sorts the first
> one and ignores the rest, because multiple tbodies are uncommon, but
> that's not really acceptable ;-)
Independent.
> 14. Fixed-position rows. Many authors have a "totals" row at the bottom
> of their table which should remain at the bottom of the table even after
> sorting, which is easily handled (that's what <tfoot> is for), but some
> authors also have rows midway through the table which are "headers":
> this especially shows up in long tables, where the column headers from
> <thead> are repeated midway down the table and should remain in position
> even when the table is sorted. In general this means that they should
> remain the same number of rows away from <thead>. This case is odd, and
> sorttable.js doesn't handle it, but lots of people ask for it.
<tfoot> is supported as suggested. Haven't done it for the mid-rows. Not
sure how to make that work while sorting around them. I mean, you'd have
to count the number of rows before each one so that you put back the right
number of rows or something...
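To make that concrete, here is a sketch of the counting approach on a flat list of rows. It is not something the spec does: `sortAroundPinnedRows`, `isPinned` and `compare` are hypothetical names, rows are plain strings, and row spans are ignored entirely.

```javascript
// Sort only the movable rows, then put them back into the slots that the
// pinned rows leave free, so each pinned row keeps its original index.
function sortAroundPinnedRows(rows, isPinned, compare) {
  const movable = rows.filter((row) => !isPinned(row)).sort(compare);
  let next = 0;
  return rows.map((row) => (isPinned(row) ? row : movable[next++]));
}

const tableRows = ["c", "HEAD", "a", "b"];
const pinnedResult = sortAroundPinnedRows(
  tableRows,
  (row) => row === "HEAD",
  (a, b) => (a < b ? -1 : a > b ? 1 : 0)
);
```

The repeated header stays one row below the top, and the remaining rows are sorted around it.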
On Thu, 8 Nov 2012, Cameron Jones wrote:
> > <time> exists, and <data> exists for non-time machine-readable data;
> > maybe they can be utilized in some way?
>
> I have done some investigation in this area too and having concrete
> datatypes would make this more utilizable, ie from the proposal for
> <data type="" value=""/>
>
> http://www.w3.org/wiki/User:Cjones/ISSUE-184
>
> The other area of integration would be with BCP-47 language tags and the
> CLDR which include i18n collation information, for example british
> numeric collation:
>
> en-GB-*u-kn-true*
>
> The significant benefit with this is that this standard is already
> universal across server\client and is of course fully internationalized.
>
> The other aspect of this is that there is a distinction between server
> pagination including sort ordering defining the content of a page and
> the client-based sorting which would be more of a presentational
> customization and outside the scope of pagination. As such, it may be
> better for the HTML to markup the structure of the content with sorting
> and collation but for this to be configurable through CSS without the
> structural DOM changes.
>
> This could also apply to HTML lists: <ul> <ol>, <dl>.
I haven't added this. I'm curious as to the use cases and how much
implementation interest there is (I guess this would primarily be for
validators?).
On Thu, 8 Nov 2012, Alex Russell wrote:
>
> I'm much more inclined to solve this from the data axis. Asking the
> table itself to do the sorting is weird. Instead, you most often want to
> have some data source return you rows in sorted order (or indicate row
> order). If you do something like MDV, sorting the table is applying a
> sort to the template that stamped out the view. That works with
> DOM-table backed tables as well as server or JS-backed tables.
I'm happy to strip out the current text in the spec and add in something
more like this model if there's implementation and author interest, but I
don't really understand what you are proposing. Can you elaborate?
On Wed, 7 Nov 2012, Christoph Päper wrote:
>
> >> Note that “col” and “colgroup” elements are hardly supported.
>
> But they’re essential for assigning sort properties.
>
> <col key=…>
> <colgroup key=…>
I ended up using <th> for this instead.
> To support this, cells must be splittable!
>
> td {color: green;}
> #split {color: red;}
>
> <tr><td>3 <td id=split colspan=2> red
> <tr><td>1
> <tr><td>2 <td> green
>
> after sorting by the first column should look like
>
> <tr><td>1 <td id=split> red
> <tr><td>2 <td> green
> <tr><td>3 <td id=split> red
>
> would if duplicate IDs were legal. The DOM tree, however, would not
> change! The value of the cell at position (1,1), i.e. second row and
> column since we count from zero, is always undefined, but the value of
> the slot at (1,1) changes from “red” to “green”.
That's an interesting idea, but I don't think it's the right approach.
Some elements are not elements you want to clone (e.g. <audio>, <embed>,
<input>). And it's not clear how you remerge them.
On Fri, 9 Nov 2012, Pierre Dubois wrote:
>
> My opinion is that depends of the real scope of the "th" element.
>
> If the "th" is an empty cell or used for "layout", the sorting
> functionality would not be available.
> If the "th" is an "group header", the sorting functionality would be
> applied to the header cell along with their data fixed. Where the
> header cell is a
> subgroup header or/and an header that represent one or more row or column.
> If the "th" is an "header", the sorting functionality could be applied
> to the data cell associated and by default the sorting action would be
> extended to the other axis [row|col].
That's an interesting idea. I'm dubious about overloading the logic like
this, though, lest it make authors set invalid scope values just to get
sorting enabled/disabled.
I'd rather just add an attribute that says "this can't be a sort column",
if that's really a need.
When is it a need, though? I'd love to study a table that has a column
that it doesn't make sense to sort by.
> Use case: A data table that have row headers and column headers.
> Row and column that is in the scope of an rowspans and colspans data
> cell (td) would be fixed.
Not sure what you mean, but for what it's worth, the spec as written will
skip over any rows at the top of <tbody>s that consist of only <th>s.
> Use case: A data table that only have row headers.
> Row that is in the scope of an rowspans data cell (td) would be fixed.
If a data table only has row headers, I'm not sure how to sort it.
> Use case: A data table that only have column headers.
> Column that is in the scope of a colspans data cell (td) would be fixed.
Not sure what this means.
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'