[whatwg] Comments about the track element

Wed Jul 25 15:44:05 PDT 2012

On Wed, Jul 25, 2012 at 11:45 PM, Cyril Concolato
<cyril.concolato at telecom-paristech.fr> wrote:
>> Right now it is fully defined how data in a TextTrack (of the defined
>> kinds) is displayed on top of the video. As this is as yet unclear for
>> SVG resources,
>
> I wouldn't say it's unclear, I'd say it needs to be specified ;) meaning
> that it probably doesn't require much specification. I was thinking that we
> could use the CSS box of the video element to position the SVG, as if the
> SVG was put in a div.

Let's work on this basis and see where we get. There are also
positioning issues etc., so it's not as simple as just putting the SVG
in a cue.

>> I would suggest using the @metadata track kind for now
>> and providing the SVG as markup in a TextTrackCue (either from WebVTT
>> cues
>
> I've tried this option but I'm facing several problems (Tested with Chrome
> Version 22.0.1216.0 canary).
>
> The first problem is how to embed SVG in a cue? Should the '<', '>' and
> other characters be escaped or not? According to Anne's validator,

So, I assume you created WebVTT files. (You don't have to - you can
directly use the TextTrack API.)

Anne's validator validates the WebVTT rules for caption and subtitle
kinds. For "metadata" kinds, there should be no parsing of the cues in
browsers. A validator can only decide whether to parse the cues
according to "captions"/"subtitles", or "chapters", or "metadata"
rules if the WebVTT file has such an indicator. I've asked for such
information to be included in WebVTT, but we don't currently have such
markup/metadata.

> they
> should be.

Actually, for @kind=metadata you don't need to escape '<' or '>'.
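To illustrate the unescaped form, here is a minimal sketch (TypeScript, no browser needed) that serialises SVG fragments into a WebVTT file meant to be loaded with kind=metadata. The names SvgCue, formatTimestamp and buildMetadataVtt are made up for this example and are not part of any API. The only rule it relies on is that a blank line ends a cue, so blank lines inside the SVG are dropped.

```typescript
// Hypothetical shape for one cue: times in seconds, payload is raw SVG markup.
interface SvgCue {
  start: number;
  end: number;
  svg: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as a WebVTT timestamp, hh:mm:ss.ttt.
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Build the file text. '<' and '>' are written as-is (no escaping),
// which is the behaviour assumed here for metadata cues.
function buildMetadataVtt(cues: SvgCue[]): string {
  const blocks = cues.map((cue) => {
    // A blank line would terminate the cue early, so remove blank lines.
    const payload = cue.svg
      .split(/\r?\n/)
      .filter((line) => line.trim() !== "")
      .join("\n");
    return formatTimestamp(cue.start) + " --> " + formatTimestamp(cue.end) + "\n" + payload;
  });
  return ["WEBVTT"].concat(blocks).join("\n\n") + "\n";
}
```

A script reading such a track would then see the SVG markup unchanged in cue.text.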

> But if I use them, then the parsing of the escaped string returns
> 'empty document'
> (http://perso.telecom-paristech.fr/~concolat/html5_tests/getcueasSVG-escaped.html).

Which parsing? Anne's validator? Have you tried Chrome directly?
http://perso.telecom-paristech.fr/~concolat/html5_tests/svg-escaped.vtt
does look very ugly.

> However, if I don't escape them, the parsing doesn't fail and returns an SVG
> document
> (http://perso.telecom-paristech.fr/~concolat/html5_tests/getcueasSVG.html).

cue.text is the SVG code? That's what we want, right?
(http://perso.telecom-paristech.fr/~concolat/html5_tests/svg.vtt looks
much nicer)

> In any case, I think embedding the SVG in WEBVTT does not really make sense.

Why not?

> Another problem is in terms of design. SVG has a timing model (similar to
> TTML), WebVTT another. For instance, SVG can express things like repetitions
> of animations that WebVTT cannot. Are you saying that TTML should be carried
> in a WebVTT file?

TTML in WebVTT probably doesn't make sense. But SVG's timing model can
be applied within the timeframe of a cue, so that does make sense.
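One possible reading of "within the timeframe of a cue", sketched in TypeScript: the SVG document timeline starts at zero when the cue becomes active, so the SVG time is the media time minus the cue start. The names CueWindow, cueLocalTime and repeatPhase are invented for this sketch; only startTime and endTime mirror the cue attributes.

```typescript
// Stand-in for the timing part of a cue (seconds on the media timeline).
interface CueWindow {
  startTime: number;
  endTime: number;
}

// Map media time to cue-local time, or null when the cue is not active.
// The end time is treated as exclusive.
function cueLocalTime(cue: CueWindow, mediaTime: number): number | null {
  if (mediaTime < cue.startTime || mediaTime >= cue.endTime) return null;
  return mediaTime - cue.startTime;
}

// Progress (0 inclusive to 1 exclusive) through one iteration of an
// animation of duration `dur` seconds that repeats indefinitely
// from the start of the cue.
function repeatPhase(localTime: number, dur: number): number {
  return (localTime % dur) / dur;
}
```

In a page, the local time could be handed to the SVG root with SVGSVGElement.setCurrentTime(), which would let a repeating SVG animation run inside the cue without WebVTT having to express the repetition itself.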

How would you specify this with TTML? It would run into the same
problems, wouldn't it?

> Similarly, in terms of design, embedding SVG in cues requires repeating a
> lot of SVG content at each cue (see
> http://perso.telecom-paristech.fr/~concolat/html5_tests/svg.vtt), as this
> approach requires parsing an entire document at each cue. You could probably
> envisage overlapping cues but that would require a lot of overhead.
> Leveraging the progressive loading of SVG cannot be done this way either.
> In general, I think it would make sense to leverage the browsers' support
> for SVG and not stack different technologies.

Sure, it should use existing SVG support. I'm not so sure I agree with
not stacking - that depends.
What would your preferred markup for
http://perso.telecom-paristech.fr/~concolat/html5_tests/svg.vtt be?
How would you avoid the duplication?

> Another problem is that I don't know if it's possible to display the SVG
> content in a layer between the video and the UI controls. Currently, I
> display the SVG on top of the video element, therefore the UI controls are
> not accessible for clicks. Having to embed my own UI controls for that is a
> bit of a pain. And, semantically, when reading the spec, 'metadata' tracks
> say "Not displayed by the user agent." so I think this might be a bit
> confusing for users/authors.

All publishers that want the same controls in all browsers make their
own controls anyway. If you make a library for SVG display on top of a
video, you can also make one for the controls (or use one of the many
existing ones).

> The third problem is performance-wise. In my example, the blue line (in
> SVG), when synchronized with the video, should be aligned with the moving
> (white-gray) edge of the pie. As you can see, this is not the case. Only 4-5
> cuechange events seems to be processed properly. I noticed the same problem
> with 'timeupdate' events. Also, I've noticed that even though my WebVTT file
> is designed to have only one active cue at a time, for some cuechange
> events, there are 2. This might be an implementation issue but this might be
> a problem of reentrant code (the cuechange callback being called while it's
> not finished), but in general, I'm not sure it's a good idea to go through
> the Javascript engine to do that, for the processing overhead.

TextTrack support is still very new. I agree that its cues should be
updated more often than timeupdate events fire. Your example is
indeed pushing the boundaries. Basically you are asking it to draw a
clock handle in sync with a video that is updating its clock pie
every video frame. TextTrack was built for relatively "rare" events
along the timeline of a video - certainly not for something that needs
an update with every video frame. Going through WebVTT makes this
particularly slow.

>> or from JavaScript calls to addTextTrack()).
>
> Can you elaborate on this one? However, I suspect it'll have the same
> processing overhead.

I'm not sure. Having to repeatedly parse WebVTT cues and draw the SVG
image makes this particularly slow. Have you tried painting the SVG
just once on the video and using TextTrackCues just to change the
transform value using JavaScript? Upon a cuechange event, you re-draw
the SVG.

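A rough sketch of that idea in TypeScript, under the assumption (mine, not from the test page) that each metadata cue carries only a rotation angle in degrees rather than a whole SVG document. CueLike, TrackLike, TransformTarget, transformFromCue and onCueChange are hypothetical names; the small interfaces stand in for a text track and an SVG element so that the logic runs without a browser.

```typescript
// Stand-ins for the few browser members the sketch needs.
interface CueLike {
  text: string;
}
interface TrackLike {
  activeCues: ArrayLike<CueLike> | null;
}
interface TransformTarget {
  setAttribute(name: string, value: string): void;
}

// Assumption: the cue text is a plain number of degrees, e.g. "45".
// Returns an SVG transform value, or null if the text is not a number.
function transformFromCue(cue: CueLike, cx: number, cy: number): string | null {
  const raw = cue.text.trim();
  if (raw === "") return null;
  const angle = Number(raw);
  return isFinite(angle) ? `rotate(${angle} ${cx} ${cy})` : null;
}

// Handler body for a cuechange event: the SVG stays in the page and only
// the transform attribute of the already painted element is updated.
function onCueChange(track: TrackLike, hand: TransformTarget, cx: number, cy: number): void {
  const cues = track.activeCues;
  if (!cues || cues.length === 0) return;
  // If two cues happen to be active at once, use the last one in the list.
  const transform = transformFromCue(cues[cues.length - 1], cx, cy);
  if (transform !== null) hand.setAttribute("transform", transform);
}
```

In a page this would be called from a cuechange listener on the metadata track, with the clock-hand element as the target. Whether it is fast enough for per-frame updates is something I have not measured.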
>>> for
>>> instance reusing the viewport/viewbox negotiation phase. There would also be
>>> a need to make a more generic Track API or to replace the TextTrack API by
>>> the SVG API when the track is of kind 'graphics'.
>>
>> I don't understand this requirement. What API needs are there aside
>> from the synchronization? Trying to replicate SVG APIs through the
>> TextTrack API seems like a repetition of the API and thus fragile.
>
> Sorry for the confusion here. I didn't mean to replicate the SVG APIs here
> but I just meant that the TextTrack API is very specific to 'pure' text
> tracks (and even to WebVTT text tracks). You might want to expose the SVG
> API when SVG content is used for the overlay to control it.

Can you give an example? How do you think that should look?

Cheers,
Silvia.