[whatwg] SRT research: timestamps
Simon Pieters
simonp at opera.com
Thu Oct 6 01:58:29 PDT 2011
On Thu, 06 Oct 2011 01:45:13 +0200, Ralph Giles <giles at mozilla.com> wrote:
> On 05/10/11 10:22 AM, Simon Pieters wrote:
>
>> I did some research on authoring errors in SRT timestamps to inform
>> whether WebVTT parsing of timestamps should be changed.
>
> This is completely awesome, thanks for doing it.
>
>> hours too many '(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+'
>> 834
>
> As Silvia mentioned, the WebVTT spec currently leaves the number of
> digits in the hour field as implementation defined, so long as it's at
> least two.
>
> I asked previously[1] if we could agree on and specify a limit. Would
> you mind checking what the histogram of digit numbers is in the hours
> field? Especially if you can separate cases like
>
>> 34500:24:01,000 --> 00:24:03,000
>
> either because the index is missing, or because the interval is
> negative (for which the WebVTT spec would reject the entire cue).
I don't know how many have a negative interval; I'd need to run a new
script over the 52,000,000 lines to find out. (If you want me to check
this, please contact me with details about what you want to count as a
"negative interval".)
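For illustration, one possible definition of "negative interval" is that the end timestamp, converted to milliseconds, is smaller than the start timestamp. The Python sketch below assumes that definition. The names (TS, CUE, to_ms, is_negative_interval) and the lenient pattern are hypothetical, not the script used for the numbers in this mail, and the fractional part is read as written rather than padded to three digits.

```python
import re

# Lenient SRT-style timestamp: hours:minutes:seconds,fraction with any
# number of digits in each field (an assumption, not the WebVTT grammar).
TS = r"(\d+):(\d+):(\d+)[,.](\d+)"
CUE = re.compile(TS + r"\s*-->\s*" + TS)


def to_ms(h, m, s, ms):
    """Convert string fields to milliseconds (fraction taken as written)."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def is_negative_interval(line):
    """Return True if the line is a timing line whose end precedes its start."""
    match = CUE.search(line)
    if match is None:
        return False
    fields = match.groups()
    return to_ms(*fields[4:]) < to_ms(*fields[:4])
```

Under this definition, a line with a leading id fused to the hours, such as the 34500:24:01,000 example quoted above, counts as negative, so the two cases would still need to be separated by some other heuristic.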
The cases where there were 3 or more digits in the hours field are
distributed as follows:

leading id e.g.
  10300:11:53,891 --> 00:11:56,155
33

hours set to 255 (these seem to all come from the same file and the
minutes are evenly distributed between 0 and 46; maybe the hours were
actually intended to be 00) e.g.
  255:46:18,058 --> 255:46:25,191
671

hours in the first timestamp much greater than the second timestamp e.g.
  244:00:13,320 --> 00:00:13,320
10

hours in the second timestamp much greater than the first timestamp e.g.
  00:00:33,010 --> 415:54:55,400
3

leading zero (in first and/or second timestamp) e.g.
  000:09:40,300 --> 00:09:45,519
150

other (garbage) e.g.
  8247,711,7nsuacer :56:20,0071:15 -->ddar vid18
9

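A histogram of the number of digits in the hours field, as asked for above, could be computed along the following lines. This is only a sketch: TOO_MANY_HOURS is the "hours too many" pattern quoted earlier in this thread, while HMS and hour_digit_histogram are hypothetical names introduced here. It counts timestamps rather than lines, so a line with two long hour fields contributes twice.

```python
import re
from collections import Counter

# "hours too many" pattern quoted earlier in the thread: three or more
# digits in the first field of a timestamp.
TOO_MANY_HOURS = re.compile(r"(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+")

# Lenient hours:minutes:seconds pattern (an assumption) used to read out
# the hours field; the lookbehind stops a match from starting mid-number.
HMS = re.compile(r"(?<!\d)(\d+):(\d+):(\d+)")


def hour_digit_histogram(lines):
    """Map hour-field length to a count, for timestamps with 3+ hour digits."""
    hist = Counter()
    for line in lines:
        if not TOO_MANY_HOURS.search(line):
            continue
        for match in HMS.finditer(line):
            digits = len(match.group(1))
            if digits >= 3:
                hist[digits] += 1
    return hist
```

Run over the example lines in this mail, the leading-id case lands in the five-digit bucket and the 255, 415 and leading-zero cases land in the three-digit bucket.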
> Cheers,
> -r
>
> [1]
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-September/033271.html
--
Simon Pieters
Opera Software