[whatwg] Google Feedback on the HTML5 media a11y specifications
philipj at opera.com
Mon Jan 24 13:17:11 PST 2011
On Mon, 24 Jan 2011 12:28:36 +0100, Glenn Maynard <glenn at zewt.org> wrote:
> On Mon, Jan 24, 2011 at 4:32 AM, Philip Jägenstedt <philipj at opera.com>
>> Wouldn't a more sane approach here be to have each language in its own
>> track, each marked up with its own language, so that they can be toggled
>> individually? I'd certainly appreciate not having the screen cluttered
>> with languages I don't understand...
> Personally I'd prefer that, but it would require a good deal of metadata
> support--marking which tracks are meant to be used together, tagging
> auxiliary track types so browsers can choose (e.g. an "English subtitles
> with no song caption tracks" option), and so on. I'm sure that's a
> non-starter (and I'd agree).
Maybe you could enable them all by default and let users disable the ones
they don't like?
> A much more realistic method would be to mark the transcription cues
> with a
> class, and enabling and disabling them with CSS.
That would work too.
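As a sketch of what that could look like (the class names here are made up for illustration): the transcription cues get a WebVTT class span, which a page stylesheet can then target.

```webvtt
WEBVTT

00:00:01.000 --> 00:00:04.000
<c.dialogue>What does the sign say?</c>

00:00:01.000 --> 00:00:04.000
<c.transcription>Sign: "Closed for the holidays"</c>
```

The page could then style the class with something like `video::cue(.transcription) { color: transparent; background: transparent; }`. One caveat: only a limited set of CSS properties applies inside ::cue (`display` is not among them), so fully hiding cues this way may not work in every UA; toggling a separate track from script is the more reliable route.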
>> (Also, we're not going to see <video><track> used for anime fansubbing
>> on the public Web until copyright terms are shortened to below the
>> attention span of anime fans.)
> Maybe so. I don't know if professional subtitles ever do this. I'm
> guessing (and hoping) not, but I'll ask around as a data point--they've
> taken on other practices of fansubbers in the past.
That'd be valuable data to have, and something funny to look at :)
>> Yeah, the monospace Latin glyphs in most CJK fonts look pretty bad. Still,
>> if one wants really fine-grained font control, it should already be
>> possible using webfonts and targeting specific glyphs with <c.foo>, etc.
> I don't think you should need to resort to fine-grained font control to
> get reasonable default fonts. If you need to specify a font explicitly
> because UAs choose incorrectly, something has gone wrong. It doesn't help
> if subtitles are expected to work without CSS, either--I don't know how
> optional CSS support is meant to be for WebVTT.
My main point here is that the use cases are so marginal. If there were
more compelling ones, it's not hard to support intra-cue language settings
using syntax like <lang en>bla</lang> or similar.
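For illustration, a cue using that kind of intra-cue language syntax might look like this (a sketch of the idea under discussion, not syntax the spec defines at this point):

```webvtt
00:00:05.000 --> 00:00:08.000
<lang ja>お早う</lang> <lang en>Good morning</lang>
```

A UA could then pick fonts (or apply per-language styling) based on the language of each span rather than of the whole track.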
> The above--semantics vs. presentation--brings something else to mind. One
> of the harder things to subtitle well is when you have two conversations
> talking on top of each other. This is generally done by choosing a fixed
> spot for each conversation (generally augmented with a color), so the
> viewer can easily follow one or the other. Setting the line position
> *sort of* lets you do this, but that's hard to get right, since you don't
> know how far apart to put them. You'd have to err towards putting them too
> far apart (guessing the maximum number of lines text might be wrapped to,
> and taking up much more of the screen than usually needed), or putting one
> set on top of the screen (making it completely impossible to read both at
> once, rather than just challenging).
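A sketch of the line-position approach with WebVTT cue settings (timings, text and line values are invented for illustration): conversation A is pinned near the top of the video, conversation B near the bottom, so the two never swap places.

```webvtt
WEBVTT

00:00:10.000 --> 00:00:14.000 line:0
A: Did you hear that?

00:00:10.000 --> 00:00:14.000 line:-3
B: ...and then the whole thing fell over.
```

The guesswork described above is visible here: `line:-3` reserves room counted from the bottom edge, which is only right if you guessed correctly how many lines B's text will wrap to at the viewer's font size.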
> If I remember correctly, SSA files do this with a hack: wherever there's a
> blank spot in one or the other conversation, a transparent dummy cue is
> added to keep the other conversation in the correct relative spot, so the
> two conversations don't swap places.
> I mention this because it comes to mind as something well-authored,
> well-rendered subtitles need to get right, and I'm curious if there's a
> reliable way to do this currently with WebVTT. If this isn't handled,
> these scenes just fall apart.
As far as I'm aware no one has experimented with the rendering parts of
WebVTT yet. The rendering rules are defined in the spec and amount to a
little layout engine that tries to avoid overlapping cues.
Implementing that and seeing what happens is the best way to find out if
it's sane or not.
Do you have a working example of such a multi-conversation scene and how
it should be rendered? That would be quite interesting to have a look at.