[whatwg] Fwd: Discussing WebSRT and alternatives/improvements

Wed Aug 11 06:45:42 PDT 2010

On Wed, 11 Aug 2010 15:09:34 +0200, Silvia Pfeiffer  
<silviapfeiffer1 at gmail.com> wrote:
> HTML and CSS have predefined structures within which their languages grow
> and are able to grow. WebSRT has newlines to structure the format, which
> is clearly not very useful for extensibility. No matter how we turn
> this, the XML background of HTML and the name-value background of CSS
> provide them with in-built extensibility, which WebSRT does not have.

The parser has the "bad cue loop" concept for ignoring supposedly bogus  
lines. Seems extensible to me.
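
To illustrate what I mean, here is a rough sketch of the idea, not the
algorithm from the spec. The TIMING pattern and the parse_cues name are
made up for this example, and the real grammar is more detailed. Any
block that does not look like a cue is skipped up to the next blank
line, so an older parser simply ignores syntax it does not know:

```python
import re

# Hypothetical timing-line pattern, assumed for this sketch only.
TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}[,.]\d{3})\s+-->\s+(\d{2}:\d{2}:\d{2}[,.]\d{3})(.*)$"
)

def parse_cues(text):
    """Return (start, end, payload) tuples, skipping blocks that are not cues."""
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n{2,}", text.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        # An optional identifier line may precede the timing line.
        if lines and "-->" not in lines[0]:
            lines = lines[1:]
        match = TIMING.match(lines[0]) if lines else None
        if match is None:
            # "Bad cue": ignore everything up to the next blank line.
            continue
        cues.append((match.group(1), match.group(2), "\n".join(lines[1:])))
    return cues
```

A file containing a block the parser has never heard of still yields
all the cues around it.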

>> Sure, that's why the tools should be updated to support the standard
>> format rather than each having their own variant of SRT.
>
> They don't have their own variant of SRT - they only have their own  
> parsers.

That comes down to the same thing in my opinion. This is like saying  
browsers did not all have their own variant of HTML4.

> Some will tolerate crap at the end of the "-->" line. Others won't.  
> That's no break of "conformance" to the basic "spec" as given in
> http://en.wikipedia.org/wiki/SubRip#SubRip_text_file_format . They all
> interoperate on the basic SRT format. But they don't interoperate on the
> WebSRT format. That's why WebSRT has to be a new format.

By that reasoning HTML5 would have had to be a new format too. And CSS 2.1  
as opposed to CSS 2, etc.
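
A small sketch of why "their own parsers" amounts to "their own
variant". STRICT and LENIENT are hypothetical patterns standing in for
two tools; I am not claiming any particular player behaves exactly like
either. They agree on the plain timing line and disagree as soon as
something follows the end timestamp:

```python
import re

# Timestamp shape assumed for this sketch: HH:MM:SS,mmm
_TS = r"\d{2}:\d{2}:\d{2},\d{3}"

# Hypothetical tool A: nothing may follow the end timestamp.
STRICT = re.compile(rf"^({_TS}) --> ({_TS})$")

# Hypothetical tool B: anything after whitespace is tolerated and ignored.
LENIENT = re.compile(rf"^({_TS}) --> ({_TS})(?:\s.*)?$")

def accepts(pattern, line):
    """Return True if the given parser pattern accepts the timing line."""
    return pattern.match(line) is not None
```

Whether a file "is SRT" then depends on which tool you ask, which is the
situation browsers were in with HTML4.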

>> (And if they really just take in text like that they should at least run
>> some kind of validation so not all kinds of garbage can get in.)
>
> That's not a requirement of the "spec". Its requirement is to render
> whatever characters are given in cues. That's why it is so simple.

But it is not so simple, because various extensions are out there in the
wild and in use, so the concerns you have with respect to WebSRT already
apply.

> Sure. All I need to do is rename the file. Not much trouble at all.  
> Better than believing I can just copy stuff from others since it's  
> apparently the same format and then it breaks the SRT environment that I  
> already have and that works.

At least with the copy approach you would still see something in your SRT  
environment. The <ruby> bits would just be ignored or some such.

>>> That's already part of Ian's proposal: it already supports multiple
>>> different approaches of parsing cues. No extra complexity here.
>>
>> Actually that is not true. There is only one approach to parsing in  
>> Ian's proposal.
>
> At the moment, cues can have one of two different types of content:
> (see
> http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#syntax-0 )
>
> [...]
>
> So that means in essence two different parsers.

Per the parser section there is only one. See the end of

http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#parsing-0

-- 
Anne van Kesteren
http://annevankesteren.nl/