On Wed, Aug 11, 2010 at 10:30 PM, Philip Jägenstedt <span dir="ltr"><<a href="mailto:philipj@opera.com">philipj@opera.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


<div><div></div><div class="h5">On Wed, 11 Aug 2010 01:43:01 +0200, Silvia Pfeiffer <<a href="mailto:silviapfeiffer1@gmail.com" target="_blank">silviapfeiffer1@gmail.com</a>> wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

On Tue, Aug 10, 2010 at 7:49 PM, Philip Jägenstedt <<a href="mailto:philipj@opera.com" target="_blank">philipj@opera.com</a>>wrote:<br>

<br>

I have checked the parsing spec and<br>

<a href="http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state" target="_blank">http://www.whatwg.org/specs/web-apps/current-work/#tag-open-state</a> indeed<br>

implies that a tag starting with a number is a parse error. Both the<br>

timestamps and the voice markers thus seem to be problems when going with an<br>

innerHTML parser. Is there a way to resolve this? I mean: I'd quite happily<br>

drop the voice markers for a <span @class> but I am not sure what to do<br>

about the timestamps. We could do what I did in WMML and introduce a <t><br>

element with the timestamp as a @at attribute, but that is again more<br>

verbose. We could also introduce an @at attribute in <span> which would then<br>

at least end up in the DOM and can be dealt with specially.<br>

</blockquote>

<br></div></div>

What should numerical voices be replaced with? Personally I'd much rather write <philip> and <silvia> to mark up a conversation between us two, as I think it'd be quite hard to keep track of the numbers if editing subtitles with many different speakers. However, going with that and using an HTML parser is quite a hack. Names like <mark> and <li> may already have special parsing rules or default CSS.<br>


</blockquote><div><br></div><div>In HTML that would be <span class="philip">...</span> and <span class="silvia">...</span>. I don't see anything wrong with that, and it's only marginally longer than <philip>...</philip> and <silvia>...</silvia>.</div>


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Going with HTML in the cues, we either have to drop voices and inner timestamps or invent new markup, as HTML can't express either. I don't think either of those are really good solutions, so right now I'm not convinced that reusing the innerHTML parser is a good way forward.</blockquote>


<div><br></div><div>I don't see a need for separate voice markup: HTML can already express voices, see above. But I do wonder about the timestamps. I'd much rather keep the innerHTML parser if we can, but I don't know enough about how the timestamps could be introduced without breaking it. Maybe with a data- attribute? Maybe <span data-t="00:00:02.100">...</span>?</div>
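<div><br></div><div>To make the idea concrete, here is a rough sketch (in Python, using only the standard library's html.parser) of how cue text marked up this way could be read back. It assumes voices are given as a class on a span and inner timestamps as a data-t attribute on a span; the names CueParser and parse_cue are made up for illustration, and none of this is in any spec.</div>

```python
from html.parser import HTMLParser


class CueParser(HTMLParser):
    """Collect (timestamp, voice, text) runs from span-based cue markup.

    Assumption: voices use class="..." and inner timestamps use
    data-t="..." on <span> elements. This is a proposal, not spec text.
    """

    def __init__(self):
        super().__init__()
        self.stack = []  # one (data-t, class) pair per open <span>
        self.runs = []   # collected (timestamp, voice, text) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            a = dict(attrs)
            self.stack.append((a.get("data-t"), a.get("class")))

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The innermost enclosing span that sets a value wins.
        timestamp = next((t for t, _ in reversed(self.stack) if t), None)
        voice = next((v for _, v in reversed(self.stack) if v), None)
        self.runs.append((timestamp, voice, text))


def parse_cue(cue_text):
    """Return the list of (timestamp, voice, text) runs in a cue."""
    parser = CueParser()
    parser.feed(cue_text)
    parser.close()
    return parser.runs


cue = '<span class="philip">Hello <span data-t="00:00:02.100">there</span></span>'
runs = parse_cue(cue)
```

<div>The point of the sketch is only that both pieces of information survive an ordinary HTML parse and can be picked up afterwards, without inventing new element names.</div>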


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div><div></div><div class="h5"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

 Think for example about the case where we had a requirement that a double<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

newline starts a new cue, but now we want to introduce a means where the<br>

double newline is escaped and can be made part of a cue.<br>

<br>

Other formats keep track of their version, such as MS Word files. It is to<br>

be hoped that most new features can be introduced without breaking<br>

backwards<br>

compatibility and we can write the parsing requirements such that certain<br>

things will be ignored, but in and of itself, WebSRT doesn't provide for<br>

this extensibility. Right now, there is for example extensibility with the<br>

"WebSRT settings parsing" (that's the stuff behind the timestamps) where<br>

further "setting:value" settings can be introduced. But for example the<br>

introduction of new "cue identifiers" (that's the <> marker at the start<br>

of<br>

a cue) would be difficult without a version string, since anything that<br>

doesn't match the given list will just be parsed as cue-internal tag and<br>

thus end up as part of the cue text where plain text parsing is used.<br>

<br>

</blockquote>

<br>

The bug I filed suggested allowing arbitrary voices, to simplify the parser<br>

and to make future extensions possible. For a web format I think this is a<br>

better approach than versioning. I haven't done a full review of the<br>

parser, but there are probably more places where it could be more forgiving<br>

so as to allow future tweaking.<br>

</blockquote>


<br>

That's a good approach and will reduce the need for breaking<br>

backwards-compatibility. In an xml-based format that need is 0, while with a<br>

text format where the structure is ad-hoc, that need can never be reduced to<br>

0. That's what I am concerned about and that's why I think we need a version<br>

identifier. If we end up never using/changing the version identifier, so<br>

much the better. But I'd much rather we have it now and can identify what<br>

specification a file adheres to than not being able to do so later.<br>

</blockquote>

<br></div></div>

Perhaps I'm too influenced by HTML and its failed attempts at versioning, but I think that if you want to know which version of a spec a document is written against, you can run it through a parser for each version. This doesn't tell you the author intent, but I'm not sure that's very interesting to know. If the author thinks it's important, perhaps it can be put in a comment in the header.</blockquote>


<div><br></div><div>I was most concerned about non-backwards-compatible changes here, but let's not repeat the discussion I had with Anne. Let's rather focus on making sure we have some means of extending WebSRT in future, should the need arise.</div>


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

 On the other hand, keeping the same extension and (unregistered) MIME type<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

as SRT has plenty of benefits, such as immediately being able to use<br>

existing SRT files in browsers without changing their file extension or<br>

MIME<br>

type.<br>

<br>

</blockquote>

<br>

<br>

There is no harm for browsers to accept both MIME types if they are sure<br>

they can parse old srt as well as new websrt. But these two formats are<br>

different enough that they should be given a different extension and mime<br>

type. I do not see a single advantage in stealing the MIME type of an<br>

existing format for a new specification.<br>

<br>

</blockquote>

<br>

But there's no spec for the old SRT, the only thing one could do is parse<br>

it with a WebSRT parser.<br>

</blockquote>

<br>

<br>

I can write that spec in an afternoon and register the mime type with IANA.<br>

That really isn't a problem. People have managed to write correct SRT files<br>

without having a spec, because it's so trivial. Creating a spec is just a<br>

formality. For now, the wikipedia page really is sufficient.<br>

</blockquote>

<br></div>

Having a separate spec isn't really useful unless we expect people to implement it. Perhaps some new implementations would follow the spec, but browsers sure wouldn't implement two different parsers.</blockquote>


<div><br></div><div>As I also said to Anne: I wouldn't want to implement a separate SRT parser. SRT support would and should just fall out as a side benefit of implementing WebSRT. It's not important for the browsers to make a distinction between SRT and WebSRT, but it is important to everyone else who is trying to manage their data.</div>


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

That would make text/srt and text/websrt synonymous, which is kind of<br>

pointless.<br>

</blockquote>

<br>

<br>

No, it's only pointless if you are a browser vendor. For everyone else it is<br>

a huge advantage to be able to choose between a guaranteed simple format and<br>

a complex format with all the bells and whistles.<br>

<br>

<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

The advantages of taking text/srt is that all existing software to create<br>

SRT can be used to create WebSRT<br>

</blockquote>

<br>

<br>

That's not strictly true. If they load a WebSRT file that was created by<br>

some other software for further editing and that WebSRT file uses advanced<br>

WebSRT functionality, the authoring software will break.<br>

</blockquote>

<br></div>

Right, especially settings appended after the timestamps are quite likely to be stripped when saving the file.</blockquote><div><br></div><div>Or they may even break the software if it's badly implemented, or end up inside the cue text, just like the other control instructions, which will end up as plain text inside the cue. You won't believe how many people have pointed out to me that my SRT test parser exposed <i> tag markup in the cue text rather than interpreting it, when I was experimenting with applying SRT cues in an HTML div without touching the cue text content. Extraneous markup is really annoying.</div>
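<div><br></div><div>For illustration, this is roughly what a tolerant reader of the timing line could look like: it splits off anything after the end timestamp as "setting:value" pairs instead of choking on it or leaking it into the cue text. It is a sketch only. The function name parse_timing_line and the setting names in the example are made up, and the timestamp pattern is my assumption of what old SRT files commonly contain.</div>

```python
import re

# Assumed timestamp shape: hh:mm:ss,mmm (old SRT) or hh:mm:ss.mmm.
TIMING = re.compile(
    r"^\s*(\d{2,}:\d{2}:\d{2}[,.]\d{3})\s+-->\s+"
    r"(\d{2,}:\d{2}:\d{2}[,.]\d{3})(.*)$"
)


def parse_timing_line(line):
    """Return (start, end, settings), or None if this is no timing line."""
    match = TIMING.match(line)
    if match is None:
        return None
    start, end, rest = match.groups()
    settings = {}
    for token in rest.split():
        name, sep, value = token.partition(":")
        if sep and name and value:
            settings[name] = value
        # Tokens that are not "name:value" are skipped rather than
        # treated as an error or as cue text.
    return start, end, settings


# Hypothetical setting names, for illustration only.
result = parse_timing_line("00:00:01,000 --> 00:00:02,500 align:start line:10%")
```

<div>An old SRT tool that expects nothing after the second timestamp has no such step, which is exactly where the stripping or breakage would come from.</div>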


<div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

and servers that already send text/srt don't need to be updated. In either<br>

case I think we should support only one mime type.<br>

</blockquote>

<br>

<br>

What's the harm in supporting two mime types but using the same parser to<br>

parse them?<br>

</blockquote>

<br></div>

Most content will most likely be plain old SRT without voices, <ruby> or similar. People will create them using existing software with the .srt extension and serve them using the text/srt MIME type. When they later decide to add some <ruby> or similar, it will just work without changing the extension or MIME type. The net result is that text/srt and text/websrt mean exactly the same thing, making it a wasted effort.</blockquote>


<div><br></div><div>From a Web browser perspective, yes. But not from a caption authoring perspective. At first, I would author an SRT file. Later, I want to add some fancy stuff, so I load the file into the application again and add it. The application tells me that I cannot save the result as SRT, but have to save it as WebSRT, so I don't lose the information. Good! Now, the pipeline that I have set up for transcoding SRT files and burning them onto video, and which cannot yet deal with WebSRT, will not accept the WebSRT file. Good again! It makes me extend my pipeline or go to the provider and upgrade my software, so I get the full feature support and the correct rendering. Excellent.</div>
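<div><br></div><div>The check such an authoring application would need is small. Here is a hedged sketch: it assumes that "plain SRT" means no extras after the timestamps and only a handful of inline tags (the set below is my guess at what existing SRT tools tolerate, since there is no SRT spec to consult), and the name needs_websrt is made up.</div>

```python
import re

# Assumption: inline tags that existing SRT tools commonly accept.
SRT_TAGS = {"i", "b", "u", "font"}

# Any tag-like marker: <name ...> or </name>.
TAG = re.compile(r"</?([^\s<>/]+)[^<>]*>")

# A timing line with something after the end timestamp.
TIMING_WITH_EXTRAS = re.compile(r"-->\s*\S+\s+\S")


def needs_websrt(text):
    """Return True if the file uses anything beyond the assumed SRT subset."""
    for line in text.splitlines():
        if "-->" in line:
            if TIMING_WITH_EXTRAS.search(line):
                return True  # settings after the timestamps
            continue
        for match in TAG.finditer(line):
            if match.group(1).lower() not in SRT_TAGS:
                return True  # voices, inner timestamps, <ruby>, ...
    return False


plain = "1\n00:00:01,000 --> 00:00:02,000\n<i>Hello</i>\n"
fancy = "1\n00:00:01,000 --> 00:00:02,000\n<ruby>x</ruby>\n"
must_save_as_websrt = needs_websrt(fancy) and not needs_websrt(plain)
```

<div>With a distinct extension and MIME type, the result of such a check has somewhere to go: the tool can pick the right file type on save, and the downstream pipeline can reject what it cannot handle.</div>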


<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<br>

Do you find MPlayer's behavior annoying because by rescaling already<br>

rendered text, the text loses resolution and becomes less readable? This is<br>

definitely not the behaviour I am after.<br>

</blockquote>

<br></div>

Scaling with the video is annoying with small videos, as the text ends up being huge in fullscreen. I assume we're going to do scaling as well as we can, so that's not an argument in either direction.<br>

<br>

I'll have to withdraw any opinion for now, I don't know how to best deal with this.</blockquote><div><br></div><div><br></div><div>Yes, I can imagine that on small videos it's bad to scale the text down with the video, since it becomes unreadable. I thought that a solution would be to define the screen size for which the text was written and then scale the text with the video. But maybe a function needs to be applied that has a minimum font size below which one cannot go and a maximum font size above which it's bad, too. It seems that scaling text at the same rate as the video is not appropriate. I wonder if there is an optimal function that people have found to work best? Worth doing some experiments, I guess.</div>
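<div><br></div><div>As a starting point for such experiments, the simplest function of that kind is linear scaling against the authored video height, clamped at both ends. This is only a sketch: the function name and the 12px and 48px limits are placeholders I picked, not values anyone has proposed.</div>

```python
def scaled_font_size(base_px, authored_height, video_height,
                     min_px=12.0, max_px=48.0):
    """Scale a font size with the video height, clamped to [min_px, max_px].

    base_px is the size the text was written for at authored_height.
    The default limits are placeholders for experimentation.
    """
    if authored_height <= 0:
        raise ValueError("authored_height must be positive")
    scaled = base_px * video_height / authored_height
    return max(min_px, min(max_px, scaled))


# Text authored at 24px for a 480-pixel-high video:
small = scaled_font_size(24, 480, 240)        # scaled down, at the minimum
native = scaled_font_size(24, 480, 480)       # unchanged
fullscreen = scaled_font_size(24, 480, 1920)  # capped at the maximum
```

<div>Whether a linear ramp between the two limits is actually the most readable choice is precisely what would need testing.</div>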


<div><br></div><div>Cheers,</div><div>Silvia.</div><div><br></div><div><br></div><div> </div></div>