Hi Chris,

This is a very good discussion to have, and I would be curious to
hear people's opinions.

CMML has been developed with an aim to provide "html"-type timed text
annotations for audio/video - in particular hyperlinks and annotations
to temporal sections of videos. This is both more generic than
captions and less generic, in that captions have formatting and are
displayed in a particular way.

One option is to extend CMML to provide the caption functionality
inside CMML. This would not be difficult; in fact, the current
"desc" tag is already being used for such functionality in xine. It is
suboptimal, however, since it mixes aims. A better way would be to
invent a "caption" tag for CMML which would have some formatting
functionality (colours, alignment etc. - the things that the EBU
subtitling standard http://www.limeboy.com/support.php?kbID=12 is

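A rough sketch of what the first option could look like, in Python.
The "clip", "a" and "desc" elements and the npt time values are meant
to follow CMML as it exists today; the "caption" element and its
"lang", "align" and "colour" attributes are purely hypothetical, as are
the URL and the text. This is an illustration under those assumptions,
not a proposed syntax.

```python
import xml.etree.ElementTree as ET


def build_clip(clip_id, start, end, text, lang="en"):
    """Build a CMML-style clip that also carries a caption."""
    clip = ET.Element("clip", {"id": clip_id, "start": start, "end": end})

    # Existing CMML use: a hyperlink out of this temporal section.
    link = ET.SubElement(clip, "a", {"href": "http://example.org/more"})
    link.text = "More on this scene"

    # Existing CMML use: a free-text annotation of the section.
    desc = ET.SubElement(clip, "desc")
    desc.text = "Annotation of the scene"

    # HYPOTHETICAL: a dedicated caption element with formatting hints,
    # so that captions do not have to be squeezed into "desc".
    caption = ET.SubElement(
        clip, "caption", {"lang": lang, "align": "center", "colour": "yellow"}
    )
    caption.text = text
    return clip


clip = build_clip("intro", "npt:0:00:05", "npt:0:00:09", "Hello, world.")
print(ET.tostring(clip, encoding="unicode"))
```

Keeping the caption in its own element leaves "desc" free for its
annotation role, which is the separation of aims argued for above.
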
Another option would be to disregard CMML completely and invent a new
timed text logical bitstream for Ogg which would just have the
subtitles. This could use any existing timed text format and would just
require a bitstream mapping for Ogg, which should not be hard to do at

Now for Ogg Skeleton: Ogg Skeleton will indeed have a part to play in
this, though not directly in specifying the timed text
annotations. Ogg Skeleton is a track that describes what is inside the
Ogg file. So, assuming we would have a multitrack video file with a
video track, an audio track, an alternate audio track (e.g. Speex, as
you suggested, for accessibility for blind people), a CMML track (for
hyperlinking into and out of the video), and several caption tracks
(for different languages), then Ogg Skeleton would explain exactly
that these exist without the need for a program to decode the Ogg file

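The idea can be sketched in Python as a simple track listing that a
player could query before decoding anything. This is only a conceptual
model: the TrackInfo fields, the role names and the content type
strings (in particular "text/x-caption") are assumptions made for the
example and do not reflect the actual Skeleton bitstream layout.

```python
from dataclasses import dataclass


@dataclass
class TrackInfo:
    serial: int          # logical bitstream serial number
    content_type: str    # illustrative content type string
    role: str            # hypothetical role label
    language: str = ""   # only meaningful for language-specific tracks


def caption_languages(tracks):
    """Return the languages for which a caption track is listed."""
    return sorted(t.language for t in tracks if t.role == "caption")


# The multitrack file described above, as a listing.
tracks = [
    TrackInfo(1, "video/theora", "video"),
    TrackInfo(2, "audio/vorbis", "audio"),
    TrackInfo(3, "audio/speex", "alternate audio"),
    TrackInfo(4, "text/cmml", "hyperlinks"),
    TrackInfo(5, "text/x-caption", "caption", "en"),
    TrackInfo(6, "text/x-caption", "caption", "de"),
]

print(caption_languages(tracks))
```

A player could use such a listing to offer a caption language menu
without touching the other logical bitstreams.
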
I think we need to understand exactly what we expect from the caption
tracks before being able to suggest an optimal solution. If, for
example, we want caption tracks with hyperlinks on a temporal basis and some more
metadata around that which is machine readable, then an extension of
CMML would make the most sense.


