[whatwg] Volume and Mute feedback on <video>

Fri Sep 24 21:44:10 PDT 2010

On Fri, 20 Aug 2010, Mike Wilcox wrote:
> On Aug 20, 2010, at 2:57 PM, Ian Hickson wrote:
> > On Thu, 10 Jun 2010, Ashley Sheridan wrote:
> >> 
> >> Or you could just raise the volume of the audio track itself. I think 
> >> being able to raise the volume like this (beyond 100% of what it is) 
> >> with script just makes it something more likely to be abused (think 
> >> how the TV adverts always seem twice as loud as the programs they 
> >> surround) and so will end up getting blocked more often.
> > 
> > Yeah.
> 
> I highly disagree with limiting functionality from the standpoint that 
> someone could do something that annoys us. Greater than 100% volume has 
> a very solid use case. Setting a property to 1.5 is much easier than 
> re-encoding a video.

I don't understand what setting the volume to 1.5 would mean. As currently 
defined, 1.0 means "play this as loudly as possible" (i.e. do not 
attenuate the signal at all, just pass it through to the audio subsystem 
at full intensity).
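
To make the attenuator model concrete, here is a minimal sketch of what a volume in the 0.0 to 1.0 range does to decoded samples. The function name and the choice to reject out-of-range values with a RangeError are assumptions for illustration, not spec text.

```typescript
// Hypothetical helper (not part of any spec): models the volume control as a
// pure attenuator applied to decoded PCM samples in the range [-1, 1].
function applyVolume(samples: Float32Array, volume: number): Float32Array {
  // Values outside 0.0..1.0 (including NaN) are rejected in this sketch.
  if (!(volume >= 0 && volume <= 1)) {
    throw new RangeError("volume must be between 0.0 and 1.0");
  }
  // Scaling by at most 1.0 can never push a sample past full scale.
  return samples.map((sample) => sample * volume);
}
```

Under this model 1.0 passes the signal through unchanged, and a value such as 1.5 has no defined meaning, so the sketch refuses it.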

On Fri, 20 Aug 2010, Jonas Sicking wrote:
> On Fri, Aug 20, 2010 at 12:57 PM, Ian Hickson <ian at hixie.ch> wrote:
> > On Mon, 31 May 2010, Silvia Pfeiffer wrote:
> >>
> >> I just came across a curious situation in the spec: IIUC, it seems 
> >> the @volume and @muted attributes are only IDL attributes and not 
> >> content attributes. This means that an author who is creating an 
> >> audio-visual Webpage has to use JavaScript to turn down (or up) the 
> >> loudness of their media elements or mute them rather than just being 
> >> able to specify this through content attributes.
> >
> > What is the use case for overriding the user's defaults in this way?
> >
> > I guess I could see a use case for muting (e.g. video ads often start 
> > off muted), but declaring the default volume seems very strange.
> 
> It doesn't seem to be overriding the users default any more than if the 
> video or audio track had been recorded with a different volume?

If the attribute we're considering has as its only effect changing the 
initial value of the .volume attribute, then it is overriding the user's 
default, since the .volume attribute currently gets initialised to either 
1.0 or the user's default when the element is created.

> One use case is simply wanting to have some background music on a page, 
> but not wanting it to play in a volume as loud as what the track was 
> originally recorded in.

Surely we don't want to do anything to support the <bgsound> use case. 
Isn't navigating to a Web page with background music annoying enough?

I could see an argument for wanting to control the volume of sounds in a 
game, e.g. to have quiet ambient sounds that increase in volume based on 
the user's position, but surely for that kind of thing what you really 
want is full positional audio, or at a minimum control over panning. But 
for these cases, you don't need declarative volume control; scripted 
control is quite enough -- and we should in any case handle this use case 
more seriously at some future point, not just do declarative volume now, 
then panning later, then positional audio, etc, all of which handle the 
same use case.

> >> However, if I have multiple videos on a page, all on autoplay, it 
> >> would be nice to turn off the sound of all of them without 
> >> JavaScript. With all the new CSS3 functionality, I can, for example, 
> >> build a spinning cube of video elements that are on autoplay or a 
> >> marquee of videos on autoplay - all of which would require muting the 
> >> videos to be bearable. If we added @muted to the content attributes, 
> >> it would be easy to set the muted state without having to write any 
> >> JavaScript.
> >
> > I guess that could make sense.
> >
> > Would we want to make .muted simply reflect the content attribute, so 
> > that the user enabling/disabling muting changes how the DOM is 
> > serialised? Or would we go for another attribute, say mute="", as the 
> > default, and have the IDL attribute be set by that attribute when 
> > loading, and then be independent of it? The latter seems better I 
> > guess.
> 
> Having the IDL attribute not reflect the content attribute I think will 
> be a source of confusion. The HTML DOM got this very wrong for form 
> values, and IE got it right. There were a great many people confused 
> about this for forms while gecko was still ramping up market share back 
> a few years ago. I think it would be much simpler to have it behave like 
> attributes like @disabled.

I think it would be even more confusing to have a Web page say <video>, 
then find that its innerHTML immediately after load is <video muted=""> in 
some cases. Yet this is what would happen if we made the DOM attribute 
reflect the content attribute, because any user who has set their UA to 
mute all <video> elements by default would be effectively causing muted="" 
attributes to spontaneously appear on all <video> elements.

It would also be the first time that a page's DOM could spontaneously 
mutate when parsing. We don't have any other case similar to this 
(<details> mutated on user interaction, but not based on user preferences; 
nothing else in the DOM changes in response to anything other than script 
calls, as far as I can remember).

Consider:

   <video src="test1.mp4" muted></video>
   <video src="test2.ogg" muted></video>
   <video src="test3.webm"></video>

What would render for this markup, given the following CSS?

   video[muted] { display: none; }

Is it intuitive to authors that for some users, all three videos would be 
hidden? I would argue that that would be even more confusing than having 
.muted not reflect muted="", even if we named them that way.

Having said that, I do agree that it would be bad to have the names of the 
content and IDL attributes be different from each other (like value="" and 
.defaultValue). So maybe we should have defaultmuted="" (and for its IDL 
attribute, .defaultMuted) rather than muted="" (and anything but .muted as 
its IDL attribute).
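
The non-reflecting design can be sketched as a small model. Everything here (the VideoModel shape, createVideo, serialize) is hypothetical and only illustrates the behaviour argued for above: the content attribute seeds the initial muted state, and neither the user's preference nor later changes to the muted state alter the serialised markup.

```typescript
// Hypothetical model of the proposal, not a real DOM interface.
interface VideoModel {
  defaultMuted: boolean; // reflects the defaultmuted="" content attribute
  muted: boolean; // live state; seeded at creation, then independent
}

// The initial muted state comes from the author's attribute or from the
// user's "mute everything" preference (an assumed UA setting).
function createVideo(hasDefaultMutedAttr: boolean, userMutesByDefault: boolean): VideoModel {
  return {
    defaultMuted: hasDefaultMutedAttr,
    muted: hasDefaultMutedAttr || userMutesByDefault,
  };
}

// Serialisation looks only at the content attribute, never at the live state.
function serialize(video: VideoModel): string {
  return video.defaultMuted ? '<video defaultmuted=""></video>' : "<video></video>";
}
```

With this split, a user who mutes everything by default gets muted playback, but a selector such as video[defaultmuted] still matches only the elements the author marked.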

On Fri, 20 Aug 2010, Eric Carlson wrote:
>
> Additionally the element's setting is always modified by the browser 
> setting (if there is one) and the system setting before it gets to the 
> speakers, so an attribute can't override the user's default.

If it is always overridden by the browser setting, wouldn't that mean that 
it would never have any effect? Unless the browser setting can be "honour 
author setting", but then that means the browser can't just automatically 
remember whatever the user last set.

On Fri, 20 Aug 2010, Roger Hågensen wrote:
>
> I must say that content has no business messing with the mute.
> So whatever the user has chosen as default audio or video behavior should be
> respected at all times, no exceptions.
> I.e. If the browser is "muted" then playback should be muted as well.

I agree. I think there is room for the other way around though (muting 
something that by default wouldn't be muted), since that is only likely to 
be used in cases where not muting really doesn't make sense (e.g. a page 
with a bazillion simultaneous videos playing).
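
For comparison, this is roughly what an author has to write today to get that effect with script. The muteAll helper and the MutableMedia interface are made up for this sketch; it only relies on the elements having a boolean muted property, as media elements do.

```typescript
// Anything with a writable boolean `muted` property, e.g. a media element.
interface MutableMedia {
  muted: boolean;
}

// Hypothetical helper: mutes every element passed in and returns how many
// elements were touched.
function muteAll(elements: Iterable<MutableMedia>): number {
  let count = 0;
  for (const element of elements) {
    element.muted = true;
    count++;
  }
  return count;
}
```

In a page this would be called with the page's video elements once they exist, which is exactly the script dependency a declarative attribute would remove.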

> [...] Multiple on autoplay? Ouch, that's just bad design no matter how 
> you think about it. If anything only the first encountered media should 
> autoplay, the others though should be autopaused but possibly start 
> pre-buffering in the background though. Maybe multiple autoplay behavior 
> could be user configurable. (defaulting to the behavior I just 
> mentioned)

I think there are definitely use cases for multiple autoplay (consider 
youtubedoubler.com for instance).

> One thing is for sure...do not mess with the user's browser defaults. If I
> were to set the playback volume to say 25% (be it globally, for that site or
> page only or the current stream)
> then I do not want something to suddenly crank that up to 100% nor down to 0%
> without my explicit permission.

Well they can do that anyway (with script), the question is can they do it 
without script.

On Sat, 21 Aug 2010, Silvia Pfeiffer wrote:
> 
> I really wouldn't classify volume change as part of "video editing". My 
> TV remote has a volume up and down button that allows me to increase the 
> volume beyond what the video was originally encoded in. Do we really 
> want to refuse such a simple functionality to both users and web 
> developers?

The TV remote controls an amplifier. It's equivalent to the volume control 
on powered computer speakers. The volume control for <video> is only an 
attenuator.

On Sat, 21 Aug 2010, Silvia Pfeiffer wrote:
> 
> @volume is currently an attribute that takes values from 0 to 1, where 1 
> means to play the volume at which the media resource was created and 
> define that as 0dB. Thus, @volume isn't actually expressing what users 
> generally understand under volume, namely to be able to play back the 
> resource at its original level (the level that it was before it got 
> recorded) and be able to manipulate that level up or down. Instead, our 
> @volume expresses relative attenuation and we are only able to 
> manipulate the gain down and not up above what it is stored at.

I think it matches what users are used to -- it's what all the software 
volume controls in OSes and on all Web sites for the last 15 years have 
been like.

> If we lived in an optimal world, all audio resources would be normalized 
> to the same reference range and that range would be given as a perceived 
> loudness level (see http://en.wikipedia.org/wiki/Loudness). Then we 
> would be able to use the exact same setting for all our audio resources 
> and always get them at a level that we can rely on. We would actually 
> display @volume as a value between 0 and 1 where 0 is absence of sound 
> and 1 is the loudest that a human ear can bear without bursting (or even 
> a bit louder than that) and we would be able to represent each audio 
> recording with its exact perceived loudness on that scale, which is 
> identical to what it was recorded at. I believe this would be the 
> optimal solution for a user wrt volume.

I don't know that I agree, but I'm not sure it matters.

> Even if we don't use loudness as a measure, a better situation would 
> already be where we have audio resources follow a normalised sound 
> pressure level range. It would be simple to map an encoded value of x to 
> a fixed sound pressure in Pascal. Instead, the audio world always deals 
> in relative values, namely in dB. And unfortunately most of the time 
> what is 0dB for a digital file 1 is not the same perceived loudness as 
> what is 0dB for file 2 - maybe because the microphone was bad, maybe 
> because the mixer was badly set up, maybe because the recording settings 
> on the computer were screwed, maybe because transcoding settings were 
> screwed - there are a gazillion reasons. The fact is: this is reality 
> and we have to deal with it.
> 
> On TV and Radio, we have a world that has somewhat managed to deal with 
> this situation. When we continue to listen to a single radio station, we 
> expect the music pieces to all be played back at approximately the same 
> perceived loudness, and the same on TV. (Yes, they manipulate it 
> sometimes to make, e.g. advertising louder, but that is conscious 
> manipulation of users and not an inherent problem). This is a big 
> challenge for the TV and radio stations, but they generally manage to 
> stay fairly consistent within themselves. This is because they cannot 
> expect the user to continuously have to change their volume settings on 
> their radio or TV station just to be able to keep the sound within a 
> comfortable range.
> 
> On the Web no such consistency is available.

On the Web the same consistency is available on a per-site basis in the 
same way that across the radio spectrum the same consistency is available 
on a per-station basis.

> And with the current way in which audio and video work it's not even 
> possible to create such a consistency within a single Web page.

Sure it is. Simply encode the videos you broadcast in a consistent manner. 
This is a server-side problem, just like it is for TV and radio.
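
As a rough illustration of what encoding in a consistent manner could involve, the sketch below computes the gain that would bring a clip to a common target level before publishing. RMS is used here only as a crude stand-in for loudness, and the function names and the target value are assumptions; real encoding pipelines typically use perceptual loudness measures instead.

```typescript
// Root-mean-square level of a block of samples in the range [-1, 1].
function rms(samples: number[]): number {
  if (samples.length === 0) return 0;
  const sumOfSquares = samples.reduce((sum, s) => sum + s * s, 0);
  return Math.sqrt(sumOfSquares / samples.length);
}

// Hypothetical helper: the linear gain to apply at encode time so that the
// clip's RMS level matches targetRms. Silent clips are left unchanged.
function gainForTargetRms(samples: number[], targetRms: number): number {
  const current = rms(samples);
  return current === 0 ? 1 : targetRms / current;
}
```

The point is that the adjustment happens once, on the server, so every client can play the result at the same volume setting.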

> There are actually two issues at hand here.
> 
> 1. Amplification
> 
> Firstly, it's the problem that audio and video files are not encoded 
> with the same reference sound pressure, resulting in files that are 
> extremely loud at the @volume=1 setting, while others are almost 
> imperceptible even at @volume=1. We can deal with the first situation: 
> we can turn the knob down on such a file. We can, however, not deal with 
> the second situation. We have no way right now to turn the knob up and 
> amplify the sound pressure beyond what its maximum setting is. I believe 
> the reason for this is that amplification can cause artifacts and that's 
> acceptable.

I run into this problem occasionally -- for example, Mad Men on iTunes is 
encoded more quietly than The Daily Show. I work around it by turning up 
the volume on my amplified speakers.

> We can of course get out of this by introducing an additional attribute 
> that lets us amplify the sound pressure level of the resource (something 
> like a preamp). But that's not really that accessible to the user.

In theory there's no reason the browser couldn't offer such a control. In 
practice, it would be unexpected; e.g. in the iTunes case above, neither 
iTunes nor the OS provide me with such a feature as far as I can tell.

> Or, if it was possible, we could even introduce a @normalize attribute 
> that would normalize the @volume range to a loudness range within human 
> perception. The normalization, however, has to deal with lost 
> information, namely that the maximum sound pressure of the original 
> sound isn't available any more, and thus has to make some assumptions. 
> Trying to do this on a progressively downloading resource will lead to 
> constantly changing volume ranges, so it's not really practical.
> 
> What is most practical is actually to allow the @volume to have higher 
> settings than 1 and to set the slider to 1 for the loaded resource. 
> Anything higher than 1 is amplification beyond the resource's original 
> gain, anything lower just what it is today. Obviously, the question is, 
> what value do you stop at. iTunes takes its amplification up to +12dB. 
> Maybe that can be mapped to "2" and then the increase be done 
> logarithmically. Some value has to be picked - unless we can introduce a 
> slider that dynamically increases its upper level as users keep hitting 
> it.

If we give authors the ability to amplify audio up to the current maximum 
level achievable by the hardware amplification, then I expect we'll just 
see lots of pages amplifying their video to high yet different amounts, 
resulting in the exact same problem that we have today, but with louder 
defaults.
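
For reference, the mapping suggested in the quoted text (values up to 1 behave as today, 2 corresponds to +12dB, with a logarithmic scale in between) could look something like the sketch below. The function names are hypothetical, and it assumes the dB figures refer to amplitude gain, i.e. gain = 10^(dB/20).

```typescript
// Converts a decibel value to a linear amplitude gain.
function dbToGain(db: number): number {
  return Math.pow(10, db / 20);
}

// Hypothetical mapping for an extended 0..2 volume range: 0..1 attenuates
// linearly as today, 1..2 amplifies from 0dB up to maxBoostDb.
function extendedVolumeToGain(volume: number, maxBoostDb: number = 12): number {
  if (!(volume >= 0 && volume <= 2)) {
    throw new RangeError("volume must be between 0.0 and 2.0");
  }
  if (volume <= 1) {
    return volume;
  }
  return dbToGain((volume - 1) * maxBoostDb);
}
```

Under these assumptions a volume of 2 is roughly a fourfold increase in amplitude, which gives a sense of how much louder pages could then make themselves.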

> 2. Web author adjustment
> 
> It's this second issue that I was originally pointing out, even though I 
> got side-tracked with the much bigger problem of loudness.
> 
> As a Web page author you are basically in the same position as a radio 
> channel or a TV station: you want to publish all the video or audio 
> files at the same loudness so that a user can make the volume settings 
> on their computer once and not have to make any more changes for 
> listening to more of your content. This is particularly important if, 
> e.g., you have a playlist of videos and they play one after the other 
> (as a program), or you have all the videos displayed on the page for 
> people to click on and, say, watch in a lightbox.
> 
> Most of the time, you are just the publisher of content and not the 
> author of the content, so you will likely not be able to go back into a 
> studio with the file and make adjustments. Think, for example, of an 
> online radio station that gets user created content sent in to publish, 
> but also Grandma Peters who wants to put all the videos of her 
> grandchildren onto a Web page.
> 
> Assuming they will listen into the piece before publishing, they will 
> determine what volume adjustment would fit with the standard of their 
> other media resources. It would be nice to just be able to remember this 
> setting as the initial setting for when people load this resource. It 
> would be simple to satisfy this need by just exposing the @volume 
> attribute as a content attribute.
> 
> Another example is a Web page that has music playing back as constant 
> background, but allows you to click on talks (e.g. a list of 
> presentations) and the presentation will play in parallel to the music. 
> You'd want the music always to play back more quietly, so setting an 
> initial @volume on the <audio> element would totally make sense. It's 
> very much parallel to what @opacity means to visual content.

Do any sites do this today? How is this problem solved in Flash?

On Mon, 23 Aug 2010, Silvia Pfeiffer wrote:
> 
> I suppose you could throw it into CSS, but we haven't created any 
> audio-related properties in CSS yet and I am not sure we should. It 
> would be much easier to just expose @volume as a content attribute on 
> audio and video.

I'm not really convinced this particular problem needs solving at all, but 
if we solve it we should solve it the right (easy to use) way, not the 
easy (to spec and implement) way.

In conclusion: I see an argument for a defaultmuted="" attribute (though 
that name is suboptimal). I don't really see an argument for a 
defaultvolume="" attribute. I don't see how we could have muted="" or 
volume="" content attributes reflected by .muted and .volume, for reasons 
described above.

Before I add defaultmuted="", does anyone have anything to add here that 
could indicate that we should do something else?

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'