[whatwg] accessibility management for timed media elements, proposal
bhawkeslewis at googlemail.com
Sun Jun 10 04:05:55 PDT 2007
Dave Singer wrote:
> At 16:35 +0100 9/06/07, Benjamin Hawkes-Lewis wrote:
>> The proposal does not describe how conflicts such as the following
>> would be resolved:
>> User specifies:
>>   captions: want
>>   high-contrast-video: want
>> Author codes:
>>   <video ... >
>>     <source media="all and (captions: want;high-contrast-video: dont-want)" ... />
>>     <source media="all and (captions: dont-want;high-contrast-video: want)" ... />
>>   </video>
> There is no suitable source here; it's best to have something (late)
> in the list which is less restrictive.
But if UAs can apply accessibility preferences to a catch-all <source>
listed last, then what's the advantage of creating multiple <source>
elements in the first place? Current container formats can
include captions and audio descriptions. So is the problem we're trying
to solve that container formats don't contain provision for alternate
visual versions (high contrast and not high contrast)? Or are we trying
to cut down on bandwidth wastage by providing videos containing only the
information the end-user wants?
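To make the conflict concrete, here is a rough Python sketch of first-match <source> selection. The matching rule (a declared feature is compatible when it equals the user's preference, or when either side is "either"), the helper names, and the file names are all assumptions for illustration; the proposal does not specify any of them.

```python
# Hypothetical model of the proposed media features; not a real UA API.

def matches(features, prefs):
    """True when every feature a source declares is compatible with the user's preference."""
    for name, declared in features.items():
        wanted = prefs.get(name, "either")
        if "either" in (declared, wanted):
            continue  # one side does not care about this feature
        if declared != wanted:
            return False
    return True


def pick_source(sources, prefs):
    """Return the src of the first matching source in document order, or None."""
    for src, features in sources:
        if matches(features, prefs):
            return src
    return None


prefs = {"captions": "want", "high-contrast-video": "want"}

# The two sources from the example above (file names are made up).
strict_sources = [
    ("captioned.ogg", {"captions": "want", "high-contrast-video": "dont-want"}),
    ("high-contrast.ogg", {"captions": "dont-want", "high-contrast-video": "want"}),
]

# The same list with a less restrictive catch-all listed last.
with_fallback = strict_sources + [("everything.ogg", {})]

print(pick_source(strict_sources, prefs))  # no source satisfies both preferences
print(pick_source(with_fallback, prefs))   # the catch-all is chosen
```

Under these assumptions the user gets nothing from the two restrictive sources and only the catch-all once it is added, which is the behaviour that makes the earlier <source> elements look redundant.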
>> a) I should think sign-language interpretation needs to be in
>> sign-interpretation: want | dont-want | either (default: want)
>> Unless we want to treat sign interpretation as a special form of
>> subtitling. How is subtitling in various languages to be handled?
> I think we assume that a language attribute can also be specified, as
The lang attribute specifies "the primary language for the element's
contents and for any of the element's attributes that contain text", not
the referenced resource. hreflang "gives the language of the linked
resource" as a single "valid RFC 3066 language code." So we'd need a new
attribute or to change the content model of hreflang to explicitly
specify the separate multiple languages of a resource.
I note in passing that these attributes should be updated to use RFC
4646 not RFC 3066 as per:
> I have to confess I saw the BBC story about sign-language soon after
> sending this round internally. But I need to do some study on the
> naming of sign languages and whether they have ISO codes. Is it true
> that if I say that the human language is ISO 639-2 code XXX, and
> that it's signed, there is only one choice for what the sign language
> is (I don't think so -- isn't american sign language different from
> british)? Alternatively, are there ISO or IETF codes for sign
> languages themselves?
Brian Campbell has eloquently answered some of these questions.
The reason I was thinking of using a CSS property was that signed
interpretation is not the same as signing featured in the original
video. But it's true that information about what sign languages are
available is important, so a CSS property alone wouldn't solve the
problem. Maybe we need new attributes to crack this nut:
<source contentlangs="en,sgn-en" captionlangs="sgn-en-sgnw,fr,de,it,sgn"
This would indicate that the main video content features people talking
in English and people signing in English; the video is captioned in
English, French, German, Italian, and their SignWriting analogues
(American Sign Language in the case of English), dubbed in French,
subtitled in German and Italian, and provided with signed interpretation
in American, French, German and Italian Sign Languages.
Granted it's a sledgehammer, but it does provide the fine-grained
linguistic information we need. It would also seemingly remove the need
for putting a caption media query on <source>. While this markup looks
complicated, most videos currently on the web could be marked up like:
<source hreflang="en" ...>
as all they provide is a single-language spoken track.
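As a rough sketch of how a UA might consume such attributes, the Python below splits the comma-separated values shown above and checks them against a language the user asks for. The attribute names come from the example; the parsing and the simple prefix matching on subtags are assumptions, not part of any specification.

```python
# Hypothetical handling of the sketched contentlangs/captionlangs attributes.

def parse_langs(value):
    """Split a comma-separated attribute value into lowercase language tags."""
    return [tag.strip().lower() for tag in value.split(",") if tag.strip()]


def offers(tags, wanted):
    """True if any tag equals `wanted` or extends it with further subtags."""
    wanted = wanted.lower()
    return any(tag == wanted or tag.startswith(wanted + "-") for tag in tags)


source = {
    "contentlangs": parse_langs("en,sgn-en"),
    "captionlangs": parse_langs("sgn-en-sgnw,fr,de,it,sgn"),
}

print(offers(source["captionlangs"], "fr"))      # a plain French caption track is listed
print(offers(source["captionlangs"], "sgn-en"))  # matched via the longer tag sgn-en-sgnw
print(offers(source["contentlangs"], "de"))      # German is not in the main content
```

Prefix matching means a request for "sgn-en" is satisfied by "sgn-en-sgnw"; whether that is the right behaviour for script subtags is exactly the kind of question the attribute design would have to settle.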
I should add a little note about "sgn-en-sgnw". The IANA language tag
registry includes the following entry:
> Type: script
> Subtag: Sgnw
> Description: SignWriting
> Added: 2006-10-17
One might want to omit the sgnw subtag on the basis that other sign
language transliterations are academic not everyday (just as one omits
the latn subtag for en, fr, and so on). However, those who work on such
things have yet to come up with an entirely settled formulation. See
this thread on the IETF languages mailing list:
Meanwhile, people are already creating SignWriting captions:
>> b) Would full descriptive transcriptions (e.g. for the deafblind)
>> fit into this media feature-based scheme or not?
>> transcription: want | dont-want | either (default: either)
> how are these presented to a deafblind user?
Depends. I think the ideal would be to have transcriptions inside a
container format, so that /everyone/ could access them and so that
deafblind people who still have some sight can see some of the video. The
transcriptions could be dispatched to a braille display. And, yeah, with
my sledgehammer system that would necessitate yet another language
attribute to indicate what languages transcriptions are provided in.
The crudest way of doing this would be to provide transcriptions of
audio descriptions to supplement the captions. I believe one can do that
with SMIL; I don't know what the situation with other container formats
or player UIs is, however.
Alternatively and suboptimally:
<source src="mytranscription.html" hreflang="en" type="text/html"
media="all and (transcription:desired)">
(Here I'm leaning towards the ideal of catering to all comers with a
single container. But sometimes end-users prefer a simple format, e.g.
some PDF haters prefer to be given a version in plain text. This tells
you more about broken authoring practices and UAs, and about the digital
divide between broadband and dialup, than about intrinsic problems with
PDF, however. Still, if it's possible to provide plain text and plain
HTML versions of whatever content you're producing, that's a good idea
as a supplement, not just a fallback.)
>> c) How about screening out visual content dangerous to those with
>> photosensitive epilepsy, a problem that has just made headlines in
>> the UK:
>> max-flashes-per-second: <integer> | any (default: 3)
>> Where the UA must not show visual content if the user is selecting
>> for a lower number of flashes per second. By default UAs should be
>> configured not to display content which breaches safety levels;
>> the default value should be 3 /not/ any.
> I think we'd prefer not to get into quantitative measures here, but a
> boolean "this program is unsuitable for those prone to epilepsy
> induced by flashing lights" might make sense. epilepsy: dont-want
Why? The reason I suggested a quantitative measure was to ensure that
encoded information remains relevant even as our medical understanding
of the condition changes. WCAG 1.0 stipulated that 4 flashes per second
would be dangerous, but WCAG 2.0 stipulates that more than 3 flashes per
second would be dangerous:
So 3.1 flashes per second would pass WCAG 1.0 but flunk WCAG 2.0. (Thinking about
it, this probably indicates the value should be decimal not integer.)
I can see there's an ease-of-use issue for authors who know their
content contains lots of flashes but haven't actually tested it to find
out the flash rate. But if UAs default to 3 as I emphasized they
/should/ do, such authors can use the "any" value to avoid having to
quantify the number of flashes. If in the future we decide 2 flashes
should be the threshold or add further criteria, contemporary UAs could
be reconfigured and future UAs could be revised, but the content would
go on working.
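A minimal Python sketch of the check a UA might make, assuming the author declares a measured flash rate (or "any" when it has not been measured) and the user agent holds a configurable threshold. The function name and the exact comparison are assumptions; only the feature name, the "any" value, and the default of 3 come from the proposal above.

```python
# Hypothetical UA-side check for max-flashes-per-second.

def may_show(content_rate, user_max=3.0):
    """Decide whether to display content.

    content_rate: declared flashes per second, or "any" if the author has not measured.
    user_max:     the user's threshold, or "any" to switch the check off.
    """
    if user_max == "any":
        return True           # the user accepts any flash rate
    if content_rate == "any":
        return False          # unmeasured content is treated as unsafe
    return content_rate <= user_max


print(may_show(3.1, user_max=4.0))  # allowed under a threshold of 4
print(may_show(3.1))                # blocked under the default threshold of 3
print(may_show("any"))              # blocked: the author made no safety claim
print(may_show("any", "any"))       # shown only to users who opted out of the check
```

Because the content carries a number rather than a verdict, lowering the threshold later (say, to 2) only changes `user_max`; the declared rates in existing markup stay valid.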
>> d) Facilitating people with cognitive disabilities within a media
>> query framework is trickier. Some might prefer content which has
>> been stripped down to simple essentials. Some might prefer content
>> which has extra explanations. Some might benefit from a media query
>> based on reading level. Compare the discussion of assessing
>> readability levels at:
>> reading-level: <integer> | basic | average | complex | any
>> (default: any)
>> Where the integer would be how many years of schooling it would
>> take an average person to understand the content: basic could be
>> (say) 9, average could be 12, and complex could be 17.
>> This wouldn't be easily testable, but it might be useful.
> Yes, this isn't testable, and is quantitative.
For verbal content, it /is/ testable (otherwise WCAG 2.0 would not
include it). The Juicy Studio page I referenced included tools to test
it with various formulae. Many of these formulae are good enough for
governmental use. For example, the US government tends to test documents
for readability with Flesch-Kincaid:
The uneasiness comes from the fact that WCAG 2.0 does not stipulate
precisely which formula(e) to use for testing. See:
I assume that's because:
1) The WCAG guideline needs to continue to be relevant even as formulae
2) Different languages need different formulae, so if they were to
mandate a solution they'd have to list a different solution for each
language. Here's one for Japanese, for example:
My guess is that /any/ of these readability indices would be good enough
for our purposes in practice. We don't need the same reproducibility
here as we do when it comes to not triggering epileptic fits!