[whatwg] <time>

Thu Mar 12 04:05:09 PDT 2009

Summary: We should allow at least proleptic Gregorian dates before
0001-01-01 and we should allow ranges. These are low-hanging fruit with
clear use cases. (I don't even get into the question of different
calendars - I will save that for another discussion).

Details below.

On Thu, 12 Mar 2009 14:20:32 +0100, Lachlan Hunt
<lachlan.hunt at lachy.id.au> wrote:

> Bruce Lawson wrote:
>> On Thu, 12 Mar 2009 17:05:38 +0530, Lachlan Hunt  
>> <lachlan.hunt at lachy.id.au> wrote:
>>
>>> I think the design principles that are applicable here include Solve  
>>> Real Problems [2],

Yes, I agree this is something we should be doing. As an historian I can
tell you that there are many many sites that provide dates for things, and
many of those dates are in the past (although more will probably be future
dates for simple mathematical reasons).

I also agree with the "Baby Steps" idea. That's why this mail doesn't  
suggest stepping beyond the Gregorian proleptic system for now. (I think  
that referring to the principles, however, is a bad idea in general.  
People think that their interpretation of what is a big or small problem  
or change is obviously the right one - but people clearly disagree on the  
details to the point that "invoking the principles" has come to seem more  
often counter-productive, or even in extremis indicative of a refusal to  
engage with the issues at hand, than the contrary :( ).

A big producer of content is the world of libraries and document
archivists. Examples I already gave, such as the British Parliament,
maintain collections that they are putting online, of documents whose
dates go back past the English conversion to the Gregorian calendar two
and a half centuries ago. Similarly, while I don't think anyone from the
Chinese National Library system is on this list (there are drawbacks to
having a high-volume English-language list and thinking it might be
representative) I can tell you that they have a huge number of documents
that date to before the Chinese conversion to a Gregorian calendar in
1949, many that date to before the proleptic Gregorian date 0001-01-01,
and many whose dating is a range.

They are not alone. Many libraries around the world, and musea, collect
things, publish information about them, and already have carefully
developed metadata stores. These are the kind of institutions who asked
for the Semantic Web and use it. They also often publish to the public
some of their high-quality data, with a lot more typically available to
researchers or people who have otherwise been given access. (You can find
this stuff with search engines - it isn't rocket science. A random query
to ask.com got me to http://www.davidrumsey.com/directory/ which uses all
kinds of dates, including pre-Gregorian and ranges. I believe that this
isn't just one edge case, but I am not going to spend the week searching
the web to prove it).

> ... It already supports dates back to 0001-01-01, which covers a  
> significant proportion of historic dates already.  It's just not really  
> optimised for such uses.

Indeed. It has been pointed out that allowing for negative years would not
be difficult (in the parsing-rules mold of the spec, it is "if the string
starts with a hyphen, it's negative") and would allow for a number of use
cases such as
http://www.britishmuseum.org/explore/highlights/highlight_objects/cm/g/gold_stater_in_the_name_of_tit.aspx
or http://en.wikipedia.org/wiki/List_of_Roman_consuls#First_century_BC and
so on.
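
To make the "starts with a hyphen" rule concrete, here is a rough sketch in
Python. It is not taken from the spec: the name parse_partial_date and the
choice to return a plain tuple are my own assumptions for illustration. It
also accepts year-only and year-month values, and it deliberately leaves open
whether a year 0000 exists (ISO 8601 says it does, and that it means 1 BC).

```python
import re

# Hypothetical pattern, not from the HTML5 spec: an optional leading hyphen,
# at least four year digits, then an optional month and an optional day.
_PARTIAL_DATE = re.compile(r"(-?)(\d{4,})(?:-(\d{2})(?:-(\d{2}))?)?")


def parse_partial_date(text):
    """Return (year, month, day); month and day are None when absent."""
    match = _PARTIAL_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised date: {text!r}")
    sign, year_text, month_text, day_text = match.groups()
    # "If the string starts with a hyphen, it's negative."
    year = -int(year_text) if sign else int(year_text)
    month = int(month_text) if month_text else None
    day = int(day_text) if day_text else None
    if month is not None and not 1 <= month <= 12:
        raise ValueError(f"month out of range: {text!r}")
    # Only a coarse day check; month lengths are ignored in this sketch.
    if day is not None and not 1 <= day <= 31:
        raise ValueError(f"day out of range: {text!r}")
    return (year, month, day)
```

A tuple is used rather than a date object because Python's own datetime
module stops at year 1, which is rather the point of this mail.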

>> 2) microformats are already used "in the wild" to mark up past events.  
>> sometimes ancient and sometimes without DDMMYYYY precision. People who  
>> wish to do that won't be able to use <time>, so it perpetuates the  
>> accessibility problems it wishes to solve and fragments the way dates  
>> are marked up on the Web; some will use time, some will use microformats
>
> Yes, examples of such imprecise dates using microformats have already  
> been provided for those, which is a good start, and personally I think  
> allowing support for years only (YYYY) or year and month only (YYYY-MM)  
> dates is a reasonably easy thing to do.  But the issue still needs  
> further investigation to understand what useful functionality consumers  
> would gain from such markup.

The examples I gave show cases where this precision is what is available
or relevant. Some further comments on your use cases below.

> Specifically:
> * Investigation of how it would help users of assistive technology to
>    have imprecise dates marked up.  (I mentioned one potential benefit in
>    my last email, but, as I said, I'm not certain about it and it needs
>    research to confirm it).
>
> * Investigation of how browsers could expose the date to users in a
>    useful way, and an understanding of how and why this would be useful
>    for such historic and/or imprecise dates.

This would be useful for the original use case. We describe years by
various means, as you note below.

> * Investigation of other potential consuming applications, such as
>    - The SIMILE timeline application that can create timelines from
>      marked up events in a page;
>    - Date based news searching applications (e.g. searching for news from
>      a particular time period).

Or objects, or people, or events...

> * Investigation of how imprecise dates affect the ability to import such
>    events into a calendar.  e.g. The Sydney Royal Easter show scheduled
>    for 2009-04, and takes place over a period of a few weeks in the
>    month.  Is it enough to simply say:
>
> <time datetime="2009-04">9–22 April 2009</time>
>
>    Or would it be better to give the precise date range, as
>
> <time datetime="2009-04-09">9</time>–<time datetime="2009-04-22">22  
> April, 2009</time>
>
>    Or would supporting a range directly in the datetime field support
>    this better:
>
> <time datetime="2009-04-09/2009-04-22">9–22 April 2009</time>

It would be better to have it as a range, is the clear conclusion from
working in this area. The event does not take place over the whole of
April, and it doesn't happen only on the start and end days. My calendar
lists what country I
will be in, and those are date ranges - nothing else is useful.
Conferences I go to require a date range if they run for more than one
day, in order to process them intelligently.
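
As a sketch of what a consumer could do with the "start/end" form from the
quoted example, assuming the two dates are simply separated by a slash (as
ISO 8601 intervals are), something like the following Python would do. The
function name parse_date_range is made up for illustration, and since it
leans on Python's datetime it only handles years from 1 onwards.

```python
from datetime import date


def parse_date_range(value):
    """Split a hypothetical 'start/end' datetime value into two dates."""
    start_text, separator, end_text = value.partition("/")
    if not separator:
        raise ValueError(f"not a range: {value!r}")
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)
    if end < start:
        raise ValueError(f"range ends before it starts: {value!r}")
    return start, end


start, end = parse_date_range("2009-04-09/2009-04-22")
# A calendar can now ask whether a given day falls inside the event...
print(start <= date(2009, 4, 15) <= end)  # True
# ...and how many days it spans, counting both end points.
print((end - start).days + 1)  # 14
```

Neither question can be answered from "2009-04" alone, nor from two
unrelated single-day values.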

People's lives are measured as a date range - in some cases (e.g. Julius
Caesar) they can be measured to a given day, in some cases, a year, and in
some cases, one terminal of the range is unknown.

Note that this introduces a complication. What to do when comparing an
unknown date as one terminal of a range ("Centurion Crismus Bonus,
??/-0042") to an open-ended range ("the period before the introduction of
the Gregorian calendar in anglophone North America, .../1752"), if
anything.
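
One possible way to keep the two cases apart, sketched in Python and using
the informal "??" and "..." markers from my examples above (they are not a
proposed syntax, and every name here is hypothetical): store the kind of
terminal explicitly, and let comparisons be conservative, only answering
"no overlap" when the known years prove it.

```python
from enum import Enum


class Terminal(Enum):
    UNKNOWN = "??"   # a date exists, but we do not know it
    OPEN = "..."     # the range is unbounded on this side


def parse_terminal(text):
    """Return a Terminal marker, or the year as an int (may be negative)."""
    for marker in Terminal:
        if text == marker.value:
            return marker
    return int(text)  # year-only in this sketch; int() accepts "-0042"


def parse_year_range(value):
    start_text, separator, end_text = value.partition("/")
    if not separator:
        raise ValueError(f"not a range: {value!r}")
    return parse_terminal(start_text), parse_terminal(end_text)


def may_overlap(a, b):
    """True unless the known years prove the two ranges are disjoint."""
    def lower(terminal):
        return float("-inf") if isinstance(terminal, Terminal) else terminal

    def upper(terminal):
        return float("inf") if isinstance(terminal, Terminal) else terminal

    return lower(a[0]) <= upper(b[1]) and lower(b[0]) <= upper(a[1])
```

Here an unknown and an open terminal compare the same way, but because they
are stored as different markers a browser or timeline could still present
them differently. Whether that is the right answer is exactly the open
question.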

> Another case for an imprecise date might be:
>
> <p><time>2009</time> is The International Year of Astronomy.</p>
>
> For this, we would need to understand what real benefit consuming  
> applications would gain from that.  It's not really a date that someone  
> would want to import directly into their calendar.

Why on earth not? I have imported such things into calendars before. There
is a lot of money spent on calendars explaining what Chinese year we are
in, or what holidays and festivals are expected, and some of this is "what
did the UN or the CWA or the Secret Cabal or the HR department declare
this year to be?"

> But understanding what other potential applications, such as those  
> mentioned above, might want to do with it would be useful.

Understanding the universe would be useful. But it turns out that we ship  
software (and ships, and spacecraft) even before we have understood the  
grand unified field theory. Understanding *enough* to justify a decision  
to do something (and that is a very subjective judgement) is sufficient.  
In this case, there is a lot of practical experience of calendaring. Many  
people made a lot of money from it in the late 90s, for example. It would  
seem that, by proposing to take a small, conservative and widely-used part
of a common standard practice, we could easily enough discover whether
there are known problems we should note and plan to avoid.

>> What advantage is there for authors and consumers by *not* extending  
>> the range of dates that can be described with <time> ?
>
> That's the wrong question to ask because it places the burden of proof  
> on the wrong side.  But by not addressing every possible little use case  
> under the sun, we keep the language simplified and easier for authors to  
> learn and use, and we can focus on really optimising for the top ~80-90%  
> of the use cases, without spending a disproportionate amount of time  
> trying to optimise for the remaining ~10% of edge cases too.

Actually, both questions are worth asking. Because if it turns out that for
1% incremental effort, we can effectively optimise for 99% of cases
instead of 80% (I am making these figures up, just as you are) then it
would appear worthwhile. Satisfying the demands of the braying public
isn't a design principle. It is one of the things that W3C specs try to do
in order to get the consensus that makes them respected specifications,
and where the effort required, and the risk of making mistakes, are
minimal (for example, because you are copying stuff that is used every day
in practice by lots of people all over the world, rather than inventing
something new).

cheers

Chaals

-- 
Charles McCathieNevile  Opera Software, Standards Group
      je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals       Try Opera: http://www.opera.com