[whatwg] WebVTT feedback (was Re: Video feedback)

Silvia Pfeiffer silviapfeiffer1 at gmail.com
Tue Jul 19 20:14:23 PDT 2011

Hi Marc,

On Wed, Jul 20, 2011 at 10:06 AM, Marc 'Tafouk' <wwg at millie.uk.to> wrote:
> Hello folks,
> I've been following the latest developments on the WebVTT specification and
> am making an attempt to write an out-of-browser parser, using Anna
> Cavender's proposed patches to WebKit.

Cool! Is this a new video player app or going into, say, VLC or
something similar?

> First, I filed a request on the bugtracker
> <http://www.w3.org/Bugs/Public/show_bug.cgi?id=13292> regarding the "end-
> of-file marker" that's mentioned in the current draft
> <http://www.whatwg.org/specs/web-apps/current-work/#webvtt-cue-text-
> parsing-rules>

I replied. IIUC, it's just the EOF state that is meant, not an actual character.

> I have another question about self-closing tags in cue text. It seems
> they're not supported at all.

None of the tags that we have mean anything if they self-close (and
the <timestamp> is implicitly closing).

> The U+002F SOLIDUS character (/) is only handled in the WebVTT tag state.
> Test case 1-a):
>   00:00.000 --> 00:02.000
>   Initial <b/> test
> U+0062 (b) triggers "WebVTT start tag state"; U+002F is then handled as
> "Anything else" and is appended to result (tagname = "b/").

Yes. The next character is then a ">", which causes the next pass
through the loop to return an end tag. Then end tags are parsed, and
since it's not in the list that we expect, this happens: "Otherwise,
ignore the token." Thus, <b/> is ignored.
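
To make the name accumulation concrete, here is a rough Python sketch of
how I read that step. It is a simplification, not the spec's algorithm:
read_tag_name and KNOWN_TAGS are hypothetical names, and the set of
recognised tag names is my assumption rather than something quoted in
this thread.

```python
# Assumed set of tag names the cue text parser recognises (not from this thread).
KNOWN_TAGS = {"c", "i", "b", "u", "ruby", "rt", "v"}

def read_tag_name(cue_text, start):
    """Collect the tag name that begins just after '<', at index `start`.

    Simplified: the name ends at space, tab, '.', '>' or end of input.
    Every other character, including '/', is appended ("Anything else").
    """
    name = []
    i = start
    while i < len(cue_text) and cue_text[i] not in " \t.>":
        name.append(cue_text[i])
        i += 1
    return "".join(name)

cue = "Initial <b/> test"
name = read_tag_name(cue, cue.index("<") + 1)
print(name)                # b/
print(name in KNOWN_TAGS)  # False, so the token is ignored
```

Since "b/" is not a name the parser expects, the token is dropped and
only the surrounding text survives.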

> Test case 1-b):
>   00:00.000 --> 00:02.000
>   Initial <b /> test
> U+0062 (b) triggers "WebVTT start tag state"; U+0020 (space) triggers
> "WebVTT start tag annotation state"; U+002F is handled as "Anything else"
> and is appended to buffer (annotation = "/").

Once ">" is reached, this leads to a start tag <b> with an annotation
of "/". From how I read it, the annotation string gets ignored.
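
As a minimal sketch of that split (again a simplification with a
hypothetical helper, split_start_tag, and with classes left out), the
name stops at the first space or tab and everything after it becomes
the annotation:

```python
def split_start_tag(tag_body):
    """Split the text between '<' and '>' into (name, annotation).

    Simplified: the name runs up to the first space or tab; the rest,
    stripped of surrounding whitespace, is treated as the annotation.
    """
    name, annotation = tag_body, ""
    for i, ch in enumerate(tag_body):
        if ch in " \t":
            name, annotation = tag_body[:i], tag_body[i + 1:].strip()
            break
    return name, annotation

name, annotation = split_start_tag("b /")
print(name, repr(annotation))  # b '/'
```

The tag name is a plain "b", so the start tag is recognised; the "/"
only ever lands in the annotation string.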

> I am aware those may be moot atm because there is no void element AFAIK,
> and the current tags make no sense when immediately closed.

They still have to parse correctly. But I think from analysing the
spec they actually do.

> I also found a slight issue when following the parser specs : there is no
> validation of the class attribute.

The spec says to attach the list of classes to the element. Right now,
all characters are allowed for class names bar space, tab, "." and ">".
It might indeed be an idea to restrict these characters to those
allowed for class names in HTML.

> Test case 2):
>   00:00.000 --> 00:02.000
>   Second <c.......... [my annotation]> test
> classes is a list of 10 empty strings.

While possibly a bit of unneeded overhead, when mapping to HTML
happens they just create an additional space in the class attribute,
so they are not harmful.
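
A small Python sketch of this case, under my own simplifying
assumptions: parse_start_tag is a hypothetical helper, the tag body is
split on the first space and then on ".", and the class attribute is
modelled as the classes joined by single spaces.

```python
def parse_start_tag(tag_body):
    """Split the text between '<' and '>' into (name, classes, annotation)."""
    head, _, annotation = tag_body.partition(" ")
    name, *classes = head.split(".")
    return name, classes, annotation

# "c" followed by ten dots, as in test case 2.
tag_body = "c" + "." * 10 + " [my annotation]"
name, classes, annotation = parse_start_tag(tag_body)
print(name, len(classes), set(classes))  # c 10 {''}

# Assumed mapping to HTML: the classes joined by single spaces.
class_attr = " ".join(classes)
print(repr(class_attr))    # only spaces
print(class_attr.split())  # [] -> no usable class names
```

Splitting the attribute on whitespace, the way space-separated class
lists are read, yields no names at all, which is why the empty entries
do no harm.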

