[whatwg] <video> "await a stable state" in resource selection (Was: Race condition in media load algorithm)

Tue Aug 10 09:39:04 PDT 2010

On 8/10/10 4:40 AM, Philip Jägenstedt wrote:
> Because the parser can't create a state which the algorithm doesn't
> handle. It always first inserts the video element, then the source
> elements in the order they should be evaluated. The algorithm is written
> in such a way that the overall result is the same regardless of whether
> it is invoked/continued on each inserted source element or after the
> video element is closed.

Ah, the waiting state, etc?

Why does the algorithm not just reevaluate any sources after the 
newly-inserted source instead?

> However, scripts can see the state at any point, which is why it needs
> to be the same in all browsers.

I'm not sure which "the state" you mean here.

>> Because changes to the set of <source> elements do not restart the
>> resource selection algorithm, right? Why don't they, exactly? That
>> seems broken to me, from the POV of how the rest of the DOM generally
>> works (except as required by backward compatibility considerations)...
>
> The resource selection is only started once, typically when the src
> attribute is set (by parser or script) or when the first source element
> is inserted. If it ends up in step 21 waiting, inserting another source
> element may cause it to continue at step 22.

Right, ok.

> Restarting the algorithm on any modification of source elements would
> mean retrying sources that have previously failed due to network errors
> or incorrect MIME type again and again, wasting network resources.
> Instead, the algorithm just keeps its state and waits for more source
> elements to try.

Well, the problem is that it introduces hysteresis into the DOM.  Why is 
this a smaller consideration than the other, in the edge case when 
someone inserts sources in reverse order and "slowly" (off the event loop)?

That is, why do we only consider sources inserted after the |pointer| 
instead of all newly inserted sources?

> I'm not sure what you mean by hysteresis

http://en.wikipedia.org/wiki/Hysteresis

Specifically, that the state of the page depends not only on the current 
state of the DOM but also on the path in state space that the page took 
to get there.

Or in other words, that inserting two <source> elements does different 
things depending on whether you do "appendChild(a); appendChild(b)" or 
"appendChild(b); insertBefore(a, b)", even though the resulting DOM is 
exactly the same.
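
To make the order dependence concrete, here is a rough sketch of a toy model, not the spec algorithm. ToyMedia, its "tried" list, and the integer pointer are all hypothetical stand-ins; the model assumes every candidate fails (so the algorithm always ends up waiting) and that each insertion is followed by a trip through the event loop, i.e. the "slow" case:

```javascript
// Toy model only: a stand-in for a media element whose resource selection
// keeps a |pointer| and never looks at nodes inserted before it.
class ToyMedia {
  constructor() {
    this.children = []; // <source> stand-ins, in DOM order
    this.pointer = 0;   // index of the next child the algorithm would try
    this.tried = [];    // order in which sources were actually tried
  }

  appendChild(node) {
    this.children.push(node);
    this.resume();
    return node;
  }

  insertBefore(node, ref) {
    const i = this.children.indexOf(ref);
    if (i === -1) throw new Error("reference node is not a child");
    this.children.splice(i, 0, node);
    // A node inserted before the pointer shifts the pointer along with the
    // rest of the list, so the new node is never considered.
    if (i < this.pointer) this.pointer += 1;
    this.resume();
    return node;
  }

  // Assumption: every candidate fails, so we try each one and then wait.
  resume() {
    while (this.pointer < this.children.length) {
      this.tried.push(this.children[this.pointer]);
      this.pointer += 1;
    }
  }
}

const inOrder = new ToyMedia();
inOrder.appendChild("a");
inOrder.appendChild("b");

const reversed = new ToyMedia();
reversed.appendChild("b");
reversed.insertBefore("a", "b");

console.log(inOrder.children, inOrder.tried);   // [ 'a', 'b' ] [ 'a', 'b' ]
console.log(reversed.children, reversed.tried); // [ 'a', 'b' ] [ 'b' ]
```

Both toy elements end up with the same children, but only the first one ever tried "a".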

Or in your case, the fact that the ordering of the setAttribute and 
appendChild calls matters, say.

Such situations, which introduce order-dependency on DOM operations, are 
wonderful sources of frustration for web developers, especially if 
libraries that abstract away the DOM manipulation are involved (so the 
web developer can't even change the operation order).

>> I have a really hard time believing that you trigger resource
>> selection when the <video> is inserted into the document and don't
>> retrigger it afterward, given that... do you?
>>
>>> 2. Instead of calling the resource fetch algorithm in step 5/9
>>
>> There doesn't seem to be such a step...
>>
>>> 3. In step 21, instead of waiting forever, just return and let inserting
>>> a source element cause it to continue at step 22.
>>
>> Again, the numbering seems to be off.
>
> These are steps in the resource selection algorithm, not in the resource
> fetch algorithm.

Yes.  Step 5 in the resource selection algorithm I see is:

   5. Queue a task to fire a simple event named loadstart at the media
      element.

It has no substeps.

> Mozilla is implementing this now. How are you interpreting "await a
> stable state" when the resource selection algorithm is triggered by the
> parser?

At the moment, given that we don't differentiate between "pause" and 
"spin the event loop" internally, it sounds like we plan to treat this as 
"wait until the next event runs from the event loop".  This means we 
will treat an alert being up as being in a stable state; same for sync 
XHR, showModalDialog, etc.  From the parser we will basically treat it 
as "run asynchronously".

> Will the result be 100% predictable or depend on "random" things
> like how much data the parser already has available from the network?

I don't know about "result".  When the algorithm runs, exactly, will 
depend on the amount of data the parser parses before returning to the 
event loop.  Does that affect "result"?
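
As a rough illustration of what I mean by timing, here is a toy model (not our implementation; sourcesSeenAtFirstRun, the token list, and chunkSize are all made up for the example). It assumes the parser consumes tokens in fixed-size chunks, returns to the event loop after each chunk, and that the deferred step runs at the first such return after the video element is inserted:

```javascript
// Toy model only: reports which <source> stand-ins already exist when the
// deferred step first runs, for a given (hypothetical) parser chunk size.
function sourcesSeenAtFirstRun(tokens, chunkSize) {
  const sources = [];  // <source> stand-ins inserted so far
  let pending = false; // set once the "video" token has been inserted
  let seen = null;     // snapshot taken when the deferred step first runs

  for (let i = 0; i < tokens.length; i += chunkSize) {
    for (const token of tokens.slice(i, i + chunkSize)) {
      if (token === "video") pending = true;
      else sources.push(token);
    }
    // The parser returns to the event loop here, so a pending step can run.
    if (pending && seen === null) seen = sources.slice();
  }
  return seen;
}

const tokens = ["video", "a.webm", "b.ogv", "c.mp4"];
console.log(sourcesSeenAtFirstRun(tokens, 1)); // []
console.log(sourcesSeenAtFirstRun(tokens, 2)); // [ 'a.webm' ]
console.log(sourcesSeenAtFirstRun(tokens, 4)); // [ 'a.webm', 'b.ogv', 'c.mp4' ]
```

The sketch only shows that the set of sources present at the first run varies with the chunk size; whether that counts as a different "result" is the question above.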

-Boris