On Thu, Aug 12, 2010 at 1:26 AM, Philip Jägenstedt <span dir="ltr"><<a href="mailto:philipj@opera.com">philipj@opera.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div><div></div><div class="h5">On Wed, 11 Aug 2010 15:38:32 +0200, Silvia Pfeiffer <<a href="mailto:silviapfeiffer1@gmail.com" target="_blank">silviapfeiffer1@gmail.com</a>> wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt <<a href="mailto:philipj@opera.com" target="_blank">philipj@opera.com</a>>wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer <<br>
<a href="mailto:silviapfeiffer1@gmail.com" target="_blank">silviapfeiffer1@gmail.com</a>> wrote:<br>
<br></blockquote>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Going with HTML in the cues, we either have to drop voices and inner<br>
timestamps or invent new markup, as HTML can't express either. I don't think<br>
either of those is a really good solution, so right now I'm not convinced<br>
that reusing the innerHTML parser is a good way forward.<br>
</blockquote>
<br>
<br>
I don't see a need for the voices - they already have markup in HTML, see<br>
above. But I do wonder about the timestamps. I'd much rather keep the<br>
innerHTML parser if we can, but I don't know enough about how the timestamps<br>
could be introduced in a non-breakable manner. Maybe with a data- attribute?<br>
Maybe <span data-t="00:00:02.100">...</span>?<br>
</blockquote>
<br></div></div>
data- attributes are reserved for use by scripts on the same page, but we *could* of course introduce new elements or attributes for this purpose. However, adding features to HTML only for use in WebSRT seems a bit odd.</blockquote>
<div><br>I'd also rather avoid adding features to HTML only for WebSRT. In the current draft, Ian turned the <timestamps> into ProcessingInstructions: <a href="http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules">http://www.whatwg.org/specs/web-apps/current-work/websrt.html#websrt-cue-text-dom-construction-rules</a>. Could we introduce something like <?t at="00:00:02.100"?> without breaking the innerHTML parser?<br>
<br> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im"><br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
That would make text/srt and text/websrt synonymous, which is kind of<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
pointless.<br>
<br>
</blockquote>
<br>
<br>
No, it's only pointless if you are a browser vendor. For everyone else it<br>
is<br>
a huge advantage to be able to choose between a guaranteed simple format<br>
and<br>
a complex format with all the bells and whistles.<br>
<br>
<br>
<br>
The advantages of taking text/srt is that all existing software to create<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
SRT can be used to create WebSRT<br>
<br>
</blockquote>
<br>
<br>
That's not strictly true. If they load a WebSRT file that was created by<br>
some other software for further editing and that WebSRT file uses advanced<br>
WebSRT functionality, the authoring software will break.<br>
<br>
</blockquote>
<br>
Right, especially settings appended after the timestamps are quite likely<br>
to be stripped when saving the file.<br>
</blockquote>
<br>
<br>
Or may even break the software if it's badly implemented, or may end up<br>
inside the cue text - just like the other control instructions which will<br>
end up as plain text inside the cue. You won't believe how many people have<br>
pointed out to me that my SRT test parser exposed <i> tag markup in the<br>
cue text rather than interpreting it, when I was experimenting with applying<br>
SRT cues in an HTML div without touching the cue text content. Extraneous<br>
markup is really annoying.<br>
</blockquote>
<br></div>
Indeed, but given the option of seeing no subtitles at all and seeing some markup from time to time, which do you prefer? For a long time I was using a media player that didn't handle "HTML" in SRT and wasn't very amused at seeing <i> and similar, but it was sure better than no subtitles at all. I doubt it will take long for popular software to start ignoring things trailing the timestamp and things in square brackets, which is all you need for basic "compatibility". Some of the tested software already does so.</blockquote>
<div><br>Hmm... I'm not sure whether I'd prefer to see the crap or be forced to run the file through a stripping tool first. I think what would happen is that I'd start watching the movie, notice the crap, get annoyed, stop it, run a stripping tool, and restart the movie. I'd rather notice the problem before I start the movie, which would happen if the file were in a different format. But it does take a bit of "expert knowledge" to know that WebSRT can easily be converted to SRT and to have such a stripping tool installed, I'll give you that.<br>
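<br>For illustration, a stripping tool of that kind could be quite small. The following is only a rough sketch, not a reference implementation: the function name <code>strip_to_plain_srt</code> is hypothetical, and it assumes that the extras to remove are (a) anything trailing the end timestamp on a timing line and (b) angle-bracket markup inside the cue text. The exact WebSRT syntax is still under discussion, so both assumptions may not hold.<br>

```python
import re

# Assumed timing-line shape: "HH:MM:SS,mmm --> HH:MM:SS,mmm", optionally
# followed by extra settings. Both "," and "." are accepted before the
# milliseconds; the output always uses the SRT-style ",".
TIMING = re.compile(
    r"^(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})(.*)$"
)

# Anything that looks like <...> in cue text (voices, <i>, inner timestamps).
TAG = re.compile(r"<[^>]*>")


def strip_to_plain_srt(text: str) -> str:
    """Drop trailing settings on timing lines and markup in cue text."""
    out = []
    for line in text.splitlines():
        m = TIMING.match(line)
        if m:
            # Keep only the two timestamps; discard whatever follows them.
            start, end = (t.replace(".", ",") for t in m.group(1, 2))
            out.append(f"{start} --> {end}")
        else:
            # Cue numbers, blank lines and cue text: remove <...> markup only.
            out.append(TAG.sub("", line))
    return "\n".join(out)
```

Such a tool deliberately throws information away (voices, inner timestamps, positioning), which is why it only makes sense as a one-way conversion for players that understand plain SRT.<br>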
<br>OTOH, if you say that it will take only a short time for popular software to start ignoring the extra WebSRT stuff, then that software has in effect implemented WebSRT support in its most basic form, and there is no problem any more anyway. It will then accept the new files with their extensions and MIME types, and there is explicit support rather than the dodgy question of whether these SRT files will contain crap or not. During a transition period, however, we would make all software that currently supports SRT unstable and unreliable. I don't think that's the right way to deal with an existing ecosystem: coming in as the big brother, claiming their underspecified format, throwing in incompatible features, and saying "just deal with it". It's just not the chivalrous thing to do.<br>
<br> </div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im"><br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
and servers that already send text/srt don't need to be updated. In either<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
case I think we should support only one mime type.<br>
<br>
</blockquote>
<br>
<br>
What's the harm in supporting two mime types but using the same parser to<br>
parse them?<br>
<br>
</blockquote>
<br>
Most content will most likely be plain old SRT without voices, <ruby> or<br>
similar. People will create them using existing software with the .srt<br>
extension and serve them using the text/srt MIME type. When they later<br>
decide to add some <ruby> or similar, it will just work without changing the<br>
extension or MIME type. The net result is that text/srt and text/websrt mean<br>
exactly the same thing, making it a wasted effort.<br>
</blockquote>
<br>
<br>
From a Web browser perspective, yes. But not from a caption authoring<br>
perspective. At first, I would author an SRT file. Later, I want to add some<br>
fancy stuff. So, I load it into the application again. Then I add the fancy<br>
stuff. It tells me that I cannot save it as SRT, but have to save it as<br>
WebSRT, so I don't lose the information. Good! Now, the pipeline that I have<br>
set up for SRT files transcoding and burning onto video and which cannot yet<br>
deal with WebSRT will not accept the WebSRT file. Good again! Makes me<br>
extend my pipeline or go to the provider and upgrade my software, so I get<br>
the full feature support and the correct rendering. Excellent.<br>
</blockquote>
<br></div>
I think that as long as WebSRT is mostly compatible with SRT then people will keep using SRT tools, with the occasional mishap and disaster. I won't deny that it breaks expectations of what SRT is, but the alternative is to make WebSRT fundamentally incompatible so that not even media frameworks that rely on sniffing would treat it as SRT. However, unless <track> is a complete failure other applications will eventually want to support the format that browsers support, so inventing something completely new has a high cost too.<br>
</blockquote><div><br><br>We have reduced this cost by building it on an existing format. Let's not pretend here: if all browser vendors support WebSRT, there will be a strong motivation to implement support for it. Supporting the Web is a big argument. So the only problem period we are talking about here is the transition period.<br>
<br>During the transition period, if WebSRT is incompatible, it will motivate people further to implement proper support for it. If it is almost compatible, it will motivate people to make quick patches that merely stop it from breaking their systems. The first is positive motivation: the introduction of a new feature, with great announcements to make. The second is negative motivation: swearing at the Web standards developers for breaking existing systems, apologies to users for not supporting their new files properly, etc. I honestly think we won't be making friends by stealing an existing format. But we can make friends by building a new format on an existing format, such that developers can re-use code and users can learn that they can make use of the new files with simple tools.<br>
<br><br></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>
Since browser vendors get all the benefits and none of the problems, it would be a mistake to only listen to us, of course. It might be worthwhile contacting developers of applications like VLC, Totem or MPlayer and asking precisely how annoyed they would be if suddenly one day they had to tweak their SRT parser because of WebSRT.</blockquote>
<div><br>Some of them have already spoken: <a href="http://forum.doom9.org/showthread.php?p=1396576">http://forum.doom9.org/showthread.php?p=1396576</a> "Extending SRT is a very bad idea", etc. Also, I've had feedback from other subtitle professionals who are also against extending SRT, the main reason being that it would break existing working software environments. But I will ask that question at <a href="http://universalsubtitles.org/opensubtitles2010">http://universalsubtitles.org/opensubtitles2010</a> and at <a href="http://www.foms-workshop.org/foms2010OVC/">http://www.foms-workshop.org/foms2010OVC/</a>, where GStreamer, VLC and other developers will be present.<br>
<br>Cheers,<br>Silvia.<br><br></div></div>