[whatwg] Script preloading

Tue Sep 3 17:01:11 PDT 2013

Hello folks. Sorry for the late response to several comments in this
mega-thread; I've mostly been traveling/vacationing for the past 2 months.
A teammate asked me to look at this in case I had comments. I don't know
web dev issues very well, so I'm going to restrain myself from offering
many opinions about the new proposals other than wow, all this dependency
stuff looks complicated, but maybe it's worth it? I'll keep to some
observations from a networking performance perspective, in case it's
relevant to the discussion:

* Any advantage the preloader currently gives is probably only going to be
magnified with HTTP/2. Browsers today will in key situations hold back
lower priority resource loads, even after the resource has been discovered
by the parser/preloader, in order to reduce network contention and
prioritize resources. But with HTTP/2, the browser almost never has to do
this since it can express the request priority in the HTTP/2 protocol
itself, and let the server order responses appropriately.
* <link rel=subresource> is great for resource discovery. Given the above
observation, note that it has some deficiencies. Most obviously, it does
not indicate the resource type. Browsers today can heuristically assign a
priority based on the resource type (script/image/stylesheet/etc).
Arguably, browsers could just use the filename extension as a hint to the
resource type, and that'd get us most of the way there. In any case,
Chromium, when it encounters <link rel=subresource>, is going to assign the
resource load the lowest priority level, and only when the parser
encounters the actual resource via a <script> tag or something, will
another resource load be issued with the "appropriate" priority. Almost all
modern browsers will hold back low priority resource loads before first
paint in order to get critical scripts and stylesheets in <head> ASAP
without contention. Anything marked with <link rel=subresource> will be
considered low priority and in all likelihood not requested early. Note
that HTTP/2 currently does not support re-prioritization (and that feature
is being debated), so that means that when the resource load for <link
rel=subresource> gets issued over an HTTP/2 connection, it will have the
lowest priority, which is probably undesirable. FWIW, I think <link
rel=subresource> was a good start, but suffers from key weaknesses
and should be thrown out and replaced.
* Given current browser heuristics for resource prioritization based on
resource type, all <script> resources will have the same priority. Within
HTTP/1.X, that means you'll get some amount of parallelization based on the
per-host connection limit and which origins the script resources are hosted
on, and then get FIFO. New additions like lazyload attributes (and perhaps
leveraging the defer attribute) may affect this. With HTTP/2, there is a
very high (effectively infinite) parallelization limit. With
prioritization, there's no contention across priority levels. But since
script resources today generally all have the same priority, they will all
contend and most naive servers are going to round robin the response bytes,
which is the worst thing you could do with script resources, since current
JS VMs do not incrementally process script resources, but process them as a
whole. So round-robining all the response bytes will just push out the start
time of JS processing for all scripts, which is rather terrible.
* Obviously, given what I've said above, some level of hinting of
prioritization/dependency amongst scripts/resources within the web platform
would be useful to the networking layer since the networking layer can much
more effectively prioritize resources and thus mitigate network contention.
If finer grained priority/dependency information isn't provided in the web
platform, my browser's networking stack is likely going to have to, even
with HTTP/2, do HTTP/1.X style contention mitigation by restricting
parallelization within a priority level. Which is a shame since web
developers probably think that with HTTP/2 they can have as many
fine-grained resources as they want.
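
To put rough numbers on the round-robin point above, here is a toy model, not
a description of how any real server schedules HTTP/2 frames. The script
sizes, the link speed, and both function names are made up for illustration,
and the interleaving is idealized as a perfectly fair share of bandwidth among
the responses still in flight.

```javascript
// Finish time of each script when responses are sent one after another (FIFO).
function sequentialFinishTimes(sizes, bytesPerMs) {
  let sent = 0;
  return sizes.map((size) => {
    sent += size;
    return sent / bytesPerMs;
  });
}

// Finish time of each script when the server interleaves the responses,
// giving every unfinished script an equal share of the bandwidth
// (an idealized round robin with infinitely small chunks).
function roundRobinFinishTimes(sizes, bytesPerMs) {
  // Smaller scripts finish first, so walk them in size order.
  const order = sizes
    .map((size, index) => ({ size, index }))
    .sort((a, b) => a.size - b.size);
  const finish = new Array(sizes.length);
  let now = 0;
  let delivered = 0; // bytes already delivered to each still-active script
  order.forEach((entry, i) => {
    const active = sizes.length - i;
    // All active scripts advance together until the smallest one completes.
    now += ((entry.size - delivered) * active) / bytesPerMs;
    delivered = entry.size;
    finish[entry.index] = now;
  });
  return finish;
}

const sizes = [30000, 30000, 30000]; // three 30 KB scripts (hypothetical)
const bytesPerMs = 100;              // hypothetical link speed

const fifo = sequentialFinishTimes(sizes, bytesPerMs);
const interleaved = roundRobinFinishTimes(sizes, bytesPerMs);

console.log(fifo);        // [ 300, 600, 900 ]
console.log(interleaved); // [ 900, 900, 900 ]
```

In this model the last byte arrives at 900 ms either way, but with FIFO the
first script is complete (and so can start being processed) at 300 ms, while
with interleaving nothing is complete until 900 ms.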

Cheers.

On Wed, Jul 10, 2013 at 3:39 AM, Ian Hickson <ian at hixie.ch> wrote:

>
> A topic that regularly comes up is script loading.
>
> I sent an e-mail responding to related feedback last year, though it
> didn't get any replies to the script loading parts of it:
>
>
> http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Dec/0221.html
>
> It seems that people want something that:
>
>  - Lets them download scripts but not execute them until needed.
>  - Lets them have multiple interdependent scripts and have the browser
>    manage their ordering.
>  - Do all this without having to modify existing scripts.
>
> I must admit to not really understanding these requirements (script
> execution can be made more or less free if they are designed to just
> expose some functions, for example, and it's trivial to set up a script
> dependency mechanism for scripts to run each other in order, and there's
> no reason browsers can't parse scripts off the main thread, etc). But
> since everyone else seems to think these are issues, let's ignore that.
>
> The proposals I've seen so far for extending the spec's script preloading
> mechanisms fall into two categories:
>
>  - provide some more control over the mechanisms already there, e.g.
>    firing events at various times, adding attributes to make the script
>    loading algorithm work differently, or adding methods to trigger
>    particular parts of the algorithm under author control.
>
>  - provide a layer above the current algorithm that provides strong
>    semantics, but that doesn't have much impact on the loading algorithm
>    itself.
>
> I'm very hesitant to do the first of these, because the algorithm is _so_
> complicated that adding anything else to it is just going to result in
> bugs in browsers. There comes a point where an algorithm just becomes so
> hard to accurately test that it's a lost cause.
>
> The second seems more feasible, though.
>
> Would something like this, based on proposals from a variety of people in
> the past, work for your needs?
>
> 1. Add a "dependencies" attribute to <script> that can point to other
>    scripts to indicate that execution of this script should be delayed
>    until all other scripts that are (a) earlier in the tree order and (b)
>    identified by this attribute have executed.
>
>      <script id="jquery" src="jquery.js" async></script>
>      <script id="shims" src="shims.js" async></script>
>      <script dependencies="shims jquery" src="myscript.js" async></script>
>
>    This would download jquery.js, shims.js, and myscript.js ASAP, without
>    blocking anything else, and would then run jquery.js and shims.js ASAP,
>    in any order, and then once both have executed, it would execute
>    myscript.js.
>
> 2. Add a "whenneeded" boolean content attribute, a "markNeeded()" method,
>    and an internal "is-needed flag" (initially false) to the <script>
>    element. When a script is about to execute, if its whenneeded=""
>    attribute is set, but its "is-needed" flag is not, then delay
>    execution. Calling markNeeded() on a script that has a whenneeded
>    boolean but that has not executed yet first causes the markNeeded()
>    method on all the script's dependencies to be called, and then causes
>    this script to become ready to execute.
>
>      <script id="jquery" src="jquery.js" async whenneeded></script>
>      <script id="shims" src="shims.js" async whenneeded></script>
>      <script id="myscript" dependencies="shims jquery" src="myscript.js"
>              async whenneeded></script>
>
>    This would download jquery.js, shims.js, and myscript.js ASAP, and then
>    wait for further instructions.
>
>      document.getElementById('myscript').markNeeded();
>
>    This would then cause the scripts to execute, first jquery.js and
>    shims.js (in any order), and then myscript.js. If any hadn't finished
>    downloading yet, it would first wait for that to finish.
>
>    (We could make markNeeded() return a promise, too.)
>
> Is there a need for delaying the download of a script as well? (If so, we
> could change whenneeded="" to have values, like whenneeded="execute" vs
> whenneeded="download" or something.)
>
> Is there something this doesn't handle which it would need to handle?
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>