[whatwg] Should <video controls> generate click events?

Ian Hickson ian at hixie.ch
Tue Sep 10 16:35:00 PDT 2013

On Tue, 20 Aug 2013, Edward O'Connor wrote:
> >>>
> >>> [W]e do want users to be able to bring up the native controls via a 
> >>> context menu and be able to use them regardless of what the page 
> >>> does in its event handlers. So, I request that the spec be explicit 
> >>> that interacting with the video controls does not cause the normal 
> >>> script-visible events to be fired.
> >[…]
> >> I've made the spec say this is a valid (and recommended) 
> >> implementation strategy.
> > 
> > The change <http://html5.org/r/8134> looks good to me, thanks!
> I don't see why <video controls> should be any different than, say, 
> <button> here. If I install an event handler on an ancestor of an 
> element, I'm able to capture events and prevent the descendent element 
> from seeing them.

It's more similar to clicking on a context menu (or indeed a menu bar) 
than clicking on a button, IMHO. Or like clicking on a control in the 
popup from <input type=file>. Or clicking in the colour wheel that pops up 
when you activate an <input type=color> control.

A <button> is an in-page control. A <video>'s controls are not an in-page 
control -- there's no in-page element for them, they might not even be 
positioned over the page (they could be on a palette that the user has 
dragged out of the page).

Also, with <button> the UA and the page are working together. The UA 
renders the button and shows how it is being pressed; the author makes it 
do something. The case we're talking about here is where the user presses 
a button and both the user agent and the author want to do something (and 
the author has no way to know what the user agent wants to do, or what the 
user thinks will happen).

> A UI which allows users to activate a control "regardless of what the 
> page does in its event handlers" is a general feature not specific to 
> media elements—and may be worth considering—but we shouldn't make a 
> one-off exception to the basic model of DOM events just for <video>.

I agree that it's not specific to <video>, but I don't think we have to 
mention it for other things because it's only <video> where the spec 
suggests having such a UI model and where browsers have historically also 
fired an event.

On Wed, 21 Aug 2013, Silvia Pfeiffer wrote:
> The paragraph added in <http://html5.org/r/8134> should probably be 
> restricted to the case where the default video controls have been 
> enabled by the user (e.g. through the context menu) rather than by the 
> Web page. It would indeed be bad if the Web page author, who is using 
> the default controls through a <video controls> attribute could not rely 
> on the events firing.


> IMHO, the example that Philip provided in 
> <http://people.opera.com/~philipj/click.html> is not a realistic example 
> of something a JS dev would do.

Seems pretty reasonable to me. How else would you make the video play/pause 
when you click the video frame, yet also have UA controls?

On Wed, 21 Aug 2013, Silvia Pfeiffer wrote:
> What I'm saying is that the idea that the JS developer controls 
> pause/play as well as exposes <video controls> is a far-fetched example.

I don't see why. It's what I'd do.

On Tue, 20 Aug 2013, Bob Lund wrote:
> What about a Web page that uses JS to control pause/play/etc based on 
> external messages, say from a WebSocket? The sender in this case acts as 
> a remote control.

That seems unrelated to this issue, unless I'm missing something.

On Tue, 20 Aug 2013, Rick Waldron wrote:
> Firefox actually implements click-to-play <video> by default. It's 
> unfortunate and all <video> interaction projects that I've worked on 
> directly or consulted for have been forced to include video surface 
> click -> event.preventDefault() calls to stop the behaviour. This may be 
> irrelevant to the current discussion, but I'm trying to get a better 
> understanding for the behavioural changes implied by this spec update, 
> so correction is highly desirable.

The change wouldn't affect this as far as I can tell. We're talking about 
interaction with specific controls enabled by controls="" (or manually by 
the user).

On Tue, 20 Aug 2013, Glenn Maynard wrote:
> It's the behavior users expect when watching videos, which is the case 
> <video> should optimize for.  If you're doing something else where the 
> user interacts with the video in other ways, then it's expected that you 
> need to prevent this behavior explicitly.
> Unlike browser controls, this is visible to scripts and something that 
> affects authors, so this probably should be in the spec if it isn't.

I'm not sure what you want in the spec here. Can you elaborate?

On Wed, 21 Aug 2013, Robert O'Callahan wrote:
> Just to be clear, we only do click-to-play when the "controls" attribute 
> is set. So if that's causing problems for you, I guess you want most of 
> our built-in controls but not all of them?

On Tue, 20 Aug 2013, Rick Waldron wrote:
> Also, at the time, the surface click to play was non-standard and 
> incredibly annoying because it just "showed up" as someone's pet feature 
> in Firefox. (I'm still not sure if it's a "standard" feature, I can't 
> find anything in the spec about it, but I could've just missed it)

It's not documented in the spec, but it seems reasonable.

Should we make this an explicit activation behaviour for the <video> 
element if it has a controls="" attribute?

On Wed, 21 Aug 2013, Robert O'Callahan wrote:
> I think you basically have to assume that if you specify "controls" then 
> the controls may accept clicks anywhere in the video element. There's 
> nothing in the spec to say that the controls must be restricted to a bar 
> of a certain height at the bottom of the element.

True, but there _is_ something now that says that if the browser considers 
the user to be interacting with a control, there shouldn't be events sent 
to the page. It's either a control (no events), or an activation 
behaviour (click events, can be canceled by preventDefault()).
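
As a rough illustration of that distinction (a sketch, not spec text): with 
an activation behaviour the page sees a cancelable click and can veto the 
default action with preventDefault(); with a control interaction the page 
sees no event at all. FakeVideo, clickSurface() and clickControl() are 
hypothetical names, and a plain EventTarget stands in for a <video> element 
so that the sketch runs outside a browser.

```javascript
// Hypothetical stand-in for a <video> element; not a real media element.
class FakeVideo extends EventTarget {
  constructor() {
    super();
    this.paused = true;
  }

  toggle() {
    this.paused = !this.paused;
  }

  // Model 1: activation behaviour. The page gets a cancelable click first,
  // and the default action runs only if nobody called preventDefault().
  clickSurface() {
    const notCanceled = this.dispatchEvent(
      new Event('click', { cancelable: true })
    );
    if (notCanceled) this.toggle();
  }

  // Model 2: UA control. No event is dispatched to the page at all.
  clickControl() {
    this.toggle();
  }
}

const video = new FakeVideo();
let clicksSeen = 0;
video.addEventListener('click', (event) => {
  clicksSeen += 1;
  event.preventDefault(); // the page vetoes the default action
});

video.clickSurface(); // page sees the click and cancels it
const pausedAfterSurface = video.paused;

video.clickControl(); // page sees nothing; playback state toggles
const pausedAfterControl = video.paused;
```

In this model the page's handler runs once, for the surface click only, and 
it can block that click's default action but not the control's.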

On Wed, 21 Aug 2013, Silvia Pfeiffer wrote:
> Indeed. As a JS dev you make a choice: either you roll your own, or you 
> don't.
> If you roll your own, you write the JS to handle the clicks from the 
> controls and do video.pause() and video.play() yourself.
> If you don't roll your own, you write <video controls> and you expect 
> the browser to handle pausing/playing. You don't do what Philip's demo ( 
> http://people.opera.com/~philipj/click.html) does: handle pause and play 
> toggling in JS. Because the browser already does that for you.

I think doing both is perfectly reasonable, especially since we don't 
define an explicit activation behaviour currently.

> This is why I am saying: Philip's example is not a typical use case. It 
> only happens when the developer made the choice to roll their own, but 
> the user activates the default controls (e.g. through the context menu) 
> as well. This can't happen on YouTube, because YouTube hide away the 
> context menu on the video element.

There's no way for a page to definitely prevent the user from enabling 
these controls. A browser could always have some other way to enable them, 
or allow access to the context menu regardless, etc.

> However, the patch has a wider implication: namely that the User agent 
> will suppress all user interaction events from the browser-provided 
> video controls. I.e. if the user clicks on the play button, no click 
> event is raised on the video element and the elements that the video 
> element is in. That's what Edward is objecting to - and I agree.

Why do you object? What sense is there in sending the event to the page?

On Tue, 20 Aug 2013, Rick Waldron wrote:
> Thank you, this is the clarification I was looking for in my previous 
> inquiries. Given this explanation, I absolutely object to any change 
> (such as this) that will effectively cripple the interaction 
> "programmability" of <video> elements. There are commercial products 
> that have been developed and are being developed that rely on the 
> ability to add listeners for events that occur on the <video controls> 
> as part of reach and engagement data collection, eg. Did the user click 
> the Play button on the video and watch it all the way through? Did they 
> click Pause? Did they drag to seek?

How can you tell what they clicked?

On Wed, 21 Aug 2013, Simon Pieters wrote:
> Just listen for the 'play', 'pause', 'seeked', 'ended' etc events for 
> this. The change doesn't cripple the <video> API at all. Listening for 
> 'click' doesn't tell you whether the user clicked play or pause or 
> seeked or none of those, so it's quite useless for that purpose.
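
A minimal sketch of that approach, assuming the standard media event names 
('play', 'pause', 'seeked', 'ended'). The helper trackPlayback() and the 
log array are hypothetical, and a plain EventTarget stands in for the 
<video> element so that the sketch runs outside a browser.

```javascript
// Hypothetical helper: records media state changes on any EventTarget.
function trackPlayback(media, log) {
  const names = ['play', 'pause', 'seeked', 'ended'];
  const record = (event) => log.push(event.type);
  for (const name of names) media.addEventListener(name, record);
  // Return a function that removes the listeners again.
  return () => {
    for (const name of names) media.removeEventListener(name, record);
  };
}

// Stand-in for a <video> element.
const media = new EventTarget();
const log = [];
const stop = trackPlayback(media, log);

media.dispatchEvent(new Event('click')); // ignored: says nothing about state
media.dispatchEvent(new Event('play'));
media.dispatchEvent(new Event('pause'));
stop();
media.dispatchEvent(new Event('ended')); // not recorded after stop()
```

The tracker never looks at 'click', so it works the same whether the state 
change came from the native controls, a page script, or anything else.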


On Tue, 20 Aug 2013, Peter Occil wrote:
> I'm afraid this example doesn't work well in Firefox and Google Chrome.  
> It affects not only the video itself but also the browser-provided 
> controls, and in Firefox it seems to interfere with those controls.  I 
> think that at most the click-to-play behavior should only affect the 
> video itself, not the buttons or other controls (for this to work, this 
> would require hit-testing to see if the video or a control was clicked, 
> and only override the default behavior if the video itself was clicked; 
> the hit-testing, though, will be browser-specific and may require 
> defining a new method in the spec).  In this way, the video controls 
> would remain unaffected or be specially handled in a different way.  
> Another -- less realistic -- solution may be to define new event handlers 
> ("videoclick"? "videopauseclick"?) that only affect parts of the video 
> element and not the entire video element.

On Wed, 21 Aug 2013, Philip Jägenstedt wrote:
> Indeed, the demo was precisely to show how the native controls become 
> unusable when the scripts on the page haven't taken them into 
> consideration.

Right, that's what the spec change has since fixed.

On Tue, 20 Aug 2013, Brian Chirls wrote:
> Rick makes some good points. It seems there is a clear cost to this 
> change, but I'm afraid that there is little benefit, since it won't 
> prevent the proposed control-breaking scenario anyway.
> It seems to me that the danger of Mr. Jägenstedt's proposed scenario is that 
> the user is annoyed by being forced to watch and/or listen to a piece of 
> media against his/her will.

No, the problem is that the page's logic interacts with the browser's 
logic in a way that nobody can predict or handle.

You can't force the user to do anything. They control the computer, at the 
end of the day, and can do whatever they want.

On Wed, 21 Aug 2013, Silvia Pfeiffer wrote:
> On Wed, Aug 21, 2013 at 8:59 PM, Simon Pieters <simonp at opera.com> wrote:
> > 
> > The problem was this: if you want to do something when a user clicks 
> > on a video but not when the user interacts with the native controls, 
> > you're basically out of luck.
> No, you can do as you do below: you can define an onclick handler and do 
> something. As long as you don't do something that the native controls 
> are already taking care of, such as play() and pause(). If you are 
> indeed trying to influence the play/pause state of the video element 
> both from script and native controls, you need to be more creative with 
> your event handlers and use onclick and onpause and onplay event 
> handlers and carefully manage which ones cancel out which other ones.

Since you don't know what the controls can do, I don't see how that makes 

What if you want to mute the volume when the user clicks on a non-control 
part of the video, but do nothing when they click on the controls?

You can't do it without the browser suppressing events when the user is 
interacting with the controls.
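
A sketch of the click-to-mute case, assuming the UA suppresses events for 
interactions with its own controls. attachClickToMute() is a hypothetical 
helper, and a plain EventTarget with a muted property stands in for the 
<video> element so that the sketch runs outside a browser.

```javascript
// Hypothetical helper: toggle mute whenever the page receives a click.
function attachClickToMute(video) {
  video.addEventListener('click', () => {
    video.muted = !video.muted;
  });
}

// Stand-in for a <video> element.
const fakeVideo = new EventTarget();
fakeVideo.muted = false;
attachClickToMute(fakeVideo);

// Click on the video frame: the UA delivers a click event to the page.
fakeVideo.dispatchEvent(new Event('click'));
const mutedAfterFrameClick = fakeVideo.muted;

// Click on a native control: under the suppression model the UA dispatches
// nothing, so the handler never runs and the mute state is unchanged.
const mutedAfterControlClick = fakeVideo.muted;
```

The handler needs no knowledge of where the controls are or what they do; 
it relies entirely on the UA not delivering control clicks to the page.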

> It also means that in the case of Firefox or in the case of Android 
> Chrome, where the native controls cover the full video with an overlay 
> button when not on autoplay, you cannot get any onclick events on the 
> video element at all.

Right, that seems perfectly reasonable.

On Wed, 21 Aug 2013, Simon Pieters wrote:
> It may be the case that the change is suboptimal especially now that 
> some browsers make the whole video a big play/pause button.

I don't see how that affects the issue.

> For instance, I can imagine exposing a property on the click event that 
> tells whether the user clicked on the controls, and maybe even what was 
> being clicked (as a string).
> <video onclick="if (event.controlsTarget == null) { if (paused) play(); 
> else pause(); }" ...></video>

That seems like it would be far more complicated than necessary. What's 
the use case? (Why would you ever want the event when the user did click 
on a control?)

On Wed, 21 Aug 2013, Brian Chirls wrote:
> Yes, I think adding information to the click event is a great approach. 
> Event objects often have additional information, like mouse coordinates 
> or key code, so it wouldn't feel like an unusual or special case. The 
> previous approach removes information, where this one adds it. Let's not 
> forget that the same information should apply to touch and hover events 
> as well.

What is this information for?

We're talking about a lot of complexity here, we shouldn't do that unless 
we have a really solid reason.

On Wed, 21 Aug 2013, Tab Atkins Jr. wrote:
> The problem with trying to use click, even with additional information, 
> is that *the UA-defined controls are unknown*.  Maybe they have a 
> play-pause button.  Maybe they've only got a scrubber, and rely on 
> clicking on the face to play/pause.  Maybe they do something quite 
> different.  The HTML spec makes *zero* guarantees about what's inside of 
> that, which is *intentional*.


> The correct thing to do is listen for the defined events which indicate 
> that a particular state has changed.  I can't think of a reasonable 
> use-case for wanting to know which button was clicked that isn't solved 
> at least as well by just listening for the event for the state change.


The API is very carefully designed to enable multiple independent 
controllers to interact sanely, especially the case of UA controls and 
page controls.

On Wed, 21 Aug 2013, Brian Chirls wrote:
> So, perhaps we need a separate set of events. So, when a user clicks the
> play button, events would fire in this order:
> 1. play requested by user agent from some UI. Cancelable.
> 2. 'play' event. The browser has been asked to play the video, whether by
> the UI or by API.
> 3. 'playing' event. After all the network magic has happened, the video is
> actually playing.

We already have all these events (not precisely how you describe them, but 
more or less). Please do see the spec:


On Thu, 22 Aug 2013, Elliott Sprehn wrote:
> This means that if I have <video controls> on the page and then I click 
> something that shows a non-modal dialog that should dismiss on clicking 
> elsewhere in the page, and then click the video the page popup doesn't 
> disappear.

If you click one of the controls on that video, right. It's similar to 
what happens if you click another window, or in an <iframe>. (It's 
conceptually the same thing.)

> Should authors be listening for mouseup instead to take actions when 
> users click inside <video>?

mouseup would be suppressed also.

On Thu, 22 Aug 2013, Elliott Sprehn wrote:
> This is wrong, it means I have no way to tell if you click inside the 
> <video> to dismiss popups or notifications. I don't think we should be 
> making <video controls> a blackhole to events, it breaks lots of use 
> cases.

What use cases does it break that aren't already broken?

If we want to support "any click outside this non-modal <dialog> should 
close it", then we should just provide that feature directly, we shouldn't 
rely on authors knowing to check for window.onblur and descendant iframe 
mouse events and so on.

> Instead we should expose the controls as a pseudo element on the event, 
> just like TransitionEvent has a String pseudoElement so you can tell if 
> the thing transitioning is the "::before" or the "::after", we should 
> add ::controls and inside the click handler you can take no action if 
> the target is the controls. If we assume the new Shadow DOM spec, we 
> could just use the "part" feature which was designed specifically for 
> this kind of thing and should be exposed on all events.

I don't think this solves the problem -- foolip's example would still 

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
