Michael A. Peters
mpeters at domblogger.net
Tue Oct 17 08:50:47 PDT 2017
On 10/16/2017 10:08 AM, Roger Hågensen wrote:
> On 2017-10-14 10:13, Michael A. Peters wrote:
>> I use the TextTrack API but its documentation does not specify that it
>> closes open tags within a cue; in fact I'm fairly certain it doesn't,
>> because some people use it for JSON and other related non-tag related
> Looking at https://www.html5rocks.com/en/tutorials/track/basics/
> it seems JSON can be used, no idea if content type is different or not
> for that.
>> Some errors using the tracks in XML were solved by the innerHTML trick
>> where I create a separate html document, append the cue, and then grab
>> the innerHTML but that doesn't always work to close tags when html
>> entities are part of the cue string.
> Mixing XML and HTML is not a good idea. Would it not be easier to have
> the server send out proper XML instead of HTML? Valid XML is also valid
> HTML (the reverse is not always true).
I agree. What I was using the HTML document for is this: the string you
get back from JS innerHTML has closing tags, so the only issue would be
tags that HTML itself does not close (e.g. br). Those do not apply to a
WebVTT cue, which is only supposed to support a very small number of
tags, all of which have closing tags.
The problem is that WebVTT does not require tags to be closed in a cue, e.g.
04:05.000 --> 04:07.250
<c.foo>This is a cue.
That's allowed in WebVTT.
I convert c.foo into
<span class="foo">This is a cue.
and when I add that to the HTML document and read innerHTML back, it has
the closing </span> on it.
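The conversion step described above could be sketched roughly as follows. This is only an illustration, not the code actually in use: the function name vttClassToSpan is hypothetical, and it assumes the class span tag is written as <c> or <c.name.name> with no other annotation.

```javascript
// Hypothetical helper: rewrite WebVTT class spans as HTML spans.
// Assumes the cue uses <c>, <c.foo> or <c.foo.bar> and closes with </c>.
function vttClassToSpan(cueText) {
  return cueText
    // <c> becomes <span>; <c.foo.bar> becomes <span class="foo bar">
    .replace(/<c((?:\.[^\s.<>"&]+)*)>/g, (match, classes) => {
      const names = classes.split(".").filter(Boolean);
      return names.length ? `<span class="${names.join(" ")}">` : "<span>";
    })
    // </c> becomes </span>
    .replace(/<\/c>/g, "</span>");
}

console.log(vttClassToSpan("<c.foo>This is a cue."));
// prints: <span class="foo">This is a cue.
```

Note that the result is still unclosed at this point; the closing </span> only appears after the round trip through innerHTML.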
While it seems to work with some HTML entities, it breaks with others.
So for now I just have to make sure all the tags in my WebVTT cues are
closed, and not use the hack that adds closing tags. But since WebVTT
cues do not have to have closing tags, yet the cues need to work in XML
documents, I think a built-in JS parser that can add missing closing
tags would be a good thing.
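Until something like that exists natively, the missing closing tags can be added with plain string handling. The sketch below is an assumption about how that might look, not an existing API: closeOpenCueTags and CUE_TAGS are made-up names, and the tag list is the set of cue tags I understand WebVTT to define. It keeps a stack of open tags and never hands the text to an HTML parser, so entities pass through untouched.

```javascript
// Assumed set of WebVTT cue tags that take a closing tag.
const CUE_TAGS = new Set(["c", "i", "b", "u", "ruby", "rt", "v", "lang"]);

// Hypothetical helper: return cueText with every open cue tag closed.
function closeOpenCueTags(cueText) {
  const open = []; // stack of currently open tag names
  let out = "";
  let last = 0;
  // Matches <name...> and </name>. Timestamp tags such as <00:01.000>
  // do not match because the name must start with a letter.
  const tagPattern = /<(\/?)([a-z]+)[^<>]*>/g;
  for (const m of cueText.matchAll(tagPattern)) {
    const [whole, slash, name] = m;
    if (!CUE_TAGS.has(name)) continue; // unknown tags are copied as text
    out += cueText.slice(last, m.index);
    last = m.index + whole.length;
    if (!slash) {
      open.push(name);
      out += whole;
    } else if (open.includes(name)) {
      // Close anything opened inside this tag first, then the tag itself.
      while (open.length) {
        const top = open.pop();
        out += `</${top}>`;
        if (top === name) break;
      }
    }
    // A closing tag with no matching opener is dropped.
  }
  out += cueText.slice(last);
  // Close whatever is still open at the end of the cue, innermost first.
  while (open.length) out += `</${open.pop()}>`;
  return out;
}

console.log(closeOpenCueTags("<c.foo>This is a cue."));
// prints: <c.foo>This is a cue.</c>
```

This runs on the raw cue text, before any conversion of c.foo to a span, so the two steps stay independent.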
> And if XML and HTML is giving you issues then use JSON instead.
> I did not see JSON mentioned in the W3C spec though.
I think JSON in WebVTT cues is not in the spec, but some are using it.
Basically the TextTrack API seems to allow almost any string; it really
has to, as WebVTT is not static and the spec changes. I wouldn't mind
JSON being added to WebVTT, as it would be a handy way to encode metadata
about the media, but that's another topic.
A built-in JS HTML parser may also be of benefit in preventing code
injection, e.g. stripping out tags from a WebVTT cue that a website does
The TextTrack API doesn't filter out things like script or other tags
that aren't part of WebVTT which means any site that allows users to
upload WebVTT files is creating a potential code injection vulnerability.
Server-side code should filter it on upload, but it would be nice to
*someday* be able to pass a string through a native JS filter much the
same way we can with htmltidy server-side and remove all but
white-listed tags and attributes and get back a cleaned string with all
It looks like Google has a library that does that but it isn't intended
for client-side JS and may not be fast enough for things like phones to
process time-sensitive cues (I don't know).
I might be wrong but it looked like the Google library I found was
intended for server-side Node.js use.
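To make the kind of filter I have in mind concrete, here is a rough client-side sketch. It is not the Google library and not a vetted sanitizer: filterCueTags, the allow-list and the choice to keep annotations only on v and lang are all my own assumptions. It keeps allow-listed cue tags and timestamp tags, drops every other tag (its text content stays), and escapes stray angle brackets.

```javascript
// Assumed allow-list: the cue tags I understand WebVTT to define.
const ALLOWED_CUE_TAGS = new Set(["c", "i", "b", "u", "ruby", "rt", "v", "lang"]);
// Tags whose annotation text (speaker name, language code) is kept.
const ANNOTATED_TAGS = new Set(["v", "lang"]);

// Escape angle brackets left in plain text so they cannot form a tag later.
function escapeAngles(text) {
  return text.replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical filter: keep allow-listed tags, drop all others.
function filterCueTags(cueText) {
  // Groups: 1 = "/" on closing tags, 2 = tag name, 3 = classes (".a.b"),
  // 4 = annotation. The second alternative matches timestamp tags.
  const tagPattern =
    /<(\/?)([A-Za-z][^\s.<>\/]*)((?:\.[^\s.<>]+)*)([^<>]*)>|<\d[\d:.]*>/g;
  let out = "";
  let last = 0;
  for (const m of cueText.matchAll(tagPattern)) {
    const [whole, slash, name, classes, annotation] = m;
    out += escapeAngles(cueText.slice(last, m.index));
    last = m.index + whole.length;
    if (name === undefined) {
      out += whole; // timestamp tag, kept as is
    } else if (ALLOWED_CUE_TAGS.has(name)) {
      // Rebuild the tag so nothing beyond name, classes and (for v and
      // lang) the annotation survives.
      out += slash
        ? `</${name}>`
        : `<${name}${classes}${ANNOTATED_TAGS.has(name) ? annotation : ""}>`;
    }
    // Any other tag is dropped.
  }
  return out + escapeAngles(cueText.slice(last));
}

console.log(filterCueTags("<c.foo>Hi<script>alert(1)</script></c>"));
// prints: <c.foo>Hialert(1)</c>
```

Whether something like this is fast enough on a phone for time-sensitive cues is exactly the part I have not measured, and server-side filtering on upload is still needed either way.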