[whatwg] SRT research: timestamps

Simon Pieters simonp at opera.com
Thu Oct 6 01:58:29 PDT 2011


On Thu, 06 Oct 2011 01:45:13 +0200, Ralph Giles <giles at mozilla.com> wrote:

> On 05/10/11 10:22 AM, Simon Pieters wrote:
>
>> I did some research on authoring errors in SRT timestamps to inform
>> whether WebVTT parsing of timestamps should be changed.
>
> This is completely awesome, thanks for doing it.
>
>> hours too many '(^|\s|>)\d{3,}[:\.,]\d+[:\.,]\d+'
>> 834
>
> As Silvia mentioned, the WebVTT spec currently leaves the number of
> digits in the hour field as implementation defined, so long as it's at
> least two.
>
> I asked previously[1] if we could agree on and specify a limit. Would
> you mind checking what the histogram of digit numbers is in the hours
> field? Especially if you can separate cases like
>
>> 34500:24:01,000 --> 00:24:03,000
>
> either because the index is missing, or because the interval is
> negative (for which the WebVTT spec would reject the entire cue).

I don't know how many have a negative interval; I'd need to run a new
script over the 52,000,000 lines to find out. (If you want me to check
this, please contact me with details about what you want to count as a
"negative interval".)
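
As a starting point for that discussion, one possible definition is that
the end timestamp is strictly earlier than the start timestamp. The
sketch below is only an illustration under that assumption. It is not the
script used for this research; the pattern TIMING and the helper names
to_ms and is_negative_interval are hypothetical, and the fraction is
assumed to be in milliseconds.

```python
import re

# Lenient SRT-style timing line: hours of any length, ':' between fields,
# ',' or '.' before the fraction. Assumed pattern, for illustration only.
TIMING = re.compile(
    r"(\d+):(\d+):(\d+)[,.](\d+)\s*-->\s*(\d+):(\d+):(\d+)[,.](\d+)"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured fields to a millisecond count."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def is_negative_interval(line):
    """Return True if end < start, False otherwise, None if unparseable."""
    match = TIMING.search(line)
    if match is None:
        return None
    fields = match.groups()
    return to_ms(*fields[4:]) < to_ms(*fields[:4])
```

Under this definition a line such as "34500:24:01,000 --> 00:24:03,000"
counts as negative, so missing-index cases would need to be separated out
first if they should not be counted.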

The cases where there were 3 or more digits in the hours field are  
distributed as follows:

leading id e.g.
10300:11:53,891 --> 00:11:56,155

33

hours set to 255 (these seem to all come from the same file and the  
minutes are evenly distributed between 0 and 46; maybe the hours were  
actually intended to be 00) e.g.
255:46:18,058 --> 255:46:25,191

671

hours in the first timestamp much greater than the second timestamp e.g.
244:00:13,320 --> 00:00:13,320

10

hours in the second timestamp much greater than the first timestamp e.g.
00:00:33,010 --> 415:54:55,400

3

leading zero (in first and/or second timestamp) e.g.
000:09:40,300 --> 00:09:45,519

150

other (garbage) e.g.
8247,711,7nsuacer :56:20,0071:15 -->ddar vid18

9
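
For anyone who wants to reproduce a breakdown like the one above, here is
a rough sketch of how timing lines could be bucketed automatically. The
heuristics, the category labels, and the names TIMING and classify are
assumptions made for illustration; they are not necessarily how the
counts above were obtained, and borderline lines may land in a different
bucket than a manual review would put them in.

```python
import re
from collections import Counter

# Capture only the hours field of each timestamp (assumed lenient pattern).
TIMING = re.compile(r"^\s*(\d+):\d+:\d+[,.]\d+\s*-->\s*(\d+):\d+:\d+[,.]\d+")


def classify(line):
    """Assign a timing line to one of the buckets used in the breakdown."""
    match = TIMING.match(line)
    if match is None:
        return "other"
    first, second = match.group(1), match.group(2)
    if len(first) < 3 and len(second) < 3:
        return "ok"  # fewer than 3 hour digits on both sides
    # Three or more digits starting with '0', e.g. 000:09:40,300
    if any(len(h) >= 3 and h.startswith("0") for h in (first, second)):
        return "leading zero"
    if first == "255" or second == "255":
        return "hours set to 255"
    # A cue id glued onto a two-digit hours field, e.g. 103 + 00 -> 10300
    if len(second) == 2 and len(first) > 2 and first.endswith(second):
        return "leading id"
    if int(first) > int(second):
        return "first much greater"
    if int(second) > int(first):
        return "second much greater"
    return "other"


def histogram(lines):
    """Count how many lines fall into each bucket."""
    return Counter(classify(line) for line in lines)
```

Calling histogram() on an iterable of timing lines returns a Counter keyed
by bucket name. The leading-zero check runs before the leading-id check
because a value such as "000" would otherwise look like an id glued onto
"00".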

> Cheers,
>  -r
>
> [1]
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2011-September/033271.html


-- 
Simon Pieters
Opera Software


