[whatwg] Script preloading

Wed Jul 10 11:59:11 PDT 2013

> Which of your use-cases have not been met? So far I've seen only "I want X, Y, Z" but not what you need X, Y, Z to achieve that isn't covered by other simpler proposals or existing features.

You know, I keep relying on the fact that the body of work on this topic for almost 3 years ought NOT have to be re-visited every few months when these threads wake from dormancy. I keep hoping that someone who really cares about actually addressing all the concerns, and not just some of them, will do the due dilligence to look at all the previous stuff before criticizing me for not providing enough detail.

I've written nearly a book's worth on this over all the threads and sites and blog posts over the years. In fact, I think its fair to say at this point that I've spent more time over the last 4+ years obsessing on script loading than any other developer, anywhere, ever.

I don't like the implication that I'm apparently just an impetuous little child demanding my way with no reasoning.

So, fine. Here it is. I'm going to state explicitly the use-cases I care about. This is nothing new. I am saying the same things I've been saying for 3 years. But sure, I'll say them, AGAIN, because now someone wants to hear them again. I doubt anyone is going to read this crazy long message and actually read all these, but I'll put them here nonetheless.

And I'm listing them here because they are not covered fully by any of the other proposals besides the 2 or 3 I keep pushing for. You may think they are covered, but I think the nuances prove that they aren't. The devil is always in the details. Or you may think my use-cases are irrelevant and so you dismiss them as unimportant. Guess there's nothing I can do about that.

-------------

1. Premise: I'm the author of a popular and wide-spread used script loader. It's a general utility that's used in tens of thousands of different sites, under a myriad of conditions and in different ways, and in a huge swath of different browsers and devices. I need the ability inside this general utility to do consistent, 100% reliable, predictable script loading for sites, without making ANY assumptions about the site/markup/environment itself. I need to be as unintrusive as possible. It needs to be totally agnostic to where it's used.

2. Premise: I need a solution for script (pre)loading that works not JUST in markup at page-load time, but in on-demand scenarios long after page-load, where markup is irrelevant. Markup-only solutions that ignore on-demand loading are insufficient, because I have cases where I load stuff on-demand. Lots of cases. Bookmarklets, third-party widgets, on-demand loading of heavy resources that I only want to pay the download penalty for if the user actually goes to a part of the page that needs it (like a tab set, for instance). In fact, most of the code I write ends up in the on-demand world. That's why I care so much about it.

3. Premise: this is NOT just about deferring parsing. Some people have argued that parsing is the expensive part. Maybe it is (on mobile), maybe not. Frankly, I don't care. What I care about is deferring EXECUTION, not parsing (parsing can happen after-preload or before-execution, or anywhere in between, matters not to me). Why? Because there's still lots of legacy content on the web that has side-effects when it runs. I need a way to prevent those side effects through my script loading, NOT just hoping someday they rewrite their code so that it has no side effects upon execution.

NOTE: there ARE people who care about the expense of parsing. Gmail-mobile (at one point, anyway) was doing the /* here's my code */ comment-execute trick to defer parsing. So me not caring about it doesn't make it not an important use-case. Perhaps it IS something to consider. But it doesn't change any of my proposals or opinions -- it leaves the door open for the browser to decide when parsing is best.

4. Use-case: I am dynamically loading one of those social widgets that, upon load, automatically scans a page and renders social buttons. I need to be able to preload that script so it's ready to execute, but decide when I want it to run against the page. I don't want to wait for true on-demand loading, like when my user clicks a button, because of the loading delay that will be visible to the user, so I want to pre-load that script and have it waiting, ready at a moment's notice to say "it's ok to execute, do it now! now! now!".

There is not "this script has loaded, so run the other one" scenario here. It's some run-time environment condition, such as user-interaction, which needs to be the trigger. For instance, it might be when I finish using a templating engine client-side to render the markup that the social widget will search for and attach to. Or it might be like clicking to open a tab, which isn't rendered until made visible, and so we can't run the social widget code until that tab is rendered and made visible.

Any solution which triggers a preloaded script to execute ONLY by some other script finishing loading is insufficient. Full stop.

5. Premise: "cache preloading" is not a suitable technique (it's a hack). Cache preloading essentially says "just load something into the cache", and then when you're ready for it, **re-request** it, and hope and cross your fingers that it's still in the cache. Jake, you suggest something later in your email with <link rel="subresource"> and then later doing <script>. THAT is just a more exotic form of "cache preloading", and is insufficient for the same reasons.

There's a big difference between TRYING to have something primed in the browser cache, and having that thing already in memory, in a resource element, parsed, and ready to execute.

Recent studies showed, IIRC, that almost 50% of all scripts served across the top of the web were being sent without proper caching headers. If the browser is doing what it should do, it won't cache those. Therefore, the attempt to cache-preload will fail to actually effectively preload.

6. Use-case: I want to preload a script which is hosted somewhere that I don't control caching headers, and to my dismay, I discover that they are serving the script with incorrect/busted/missing caching headers. If I use a cache-preload technique, it will fail to work as I had hoped. I will pay the double-download penalty because the browser didn't actually cache it the first time, and my user will pay the extra UX penalty of having to wait longer for the second load, when my whole goal was to remove that UX visible delay.

Unless we invent something where a resource tag can be marked with a "cacheItAnywaysDamnit" attribute that forces the caching regardless of bad/missing caching headers, any solution which relies on cache preloading is unreliable, and thus insufficient for general script loader usage as anything more than a fallback hack.

WORSE: what if the server sending the script is doing its job of sending good caching headers, but something else interfers with the caching, like a busted corporate proxy, or a router on airplane wifi, or an anti-virus program on the user's computer, or the developer-tools settings are disabling caching, or any other such thing. Regardless of how it happens, a script can in theory be cacheable and actually not get cached, which causes any "cache preload" technique to break.

Insufficient. Full stop.

7. Premise: my script loader might be used in one centralized and coordinated location on the page, or it might be used many times indepedently in many parts of the page, such as many different CMS plugins requesting their own sets of scripts to load. Those plugins are ignorant of anything else going on in the page, and as far as they're concerned, they'll expect their script loading to be independent of any other script loading.

8. Use-case: One CMS plugin wants to load "A.js" and "B.js", where B relies on A. Both need to load in parallel (for performance), but A must execute before B executes. I don't control A and B, so changing them is not an option. This CMS plugin doesn't want to auto-execute its code, however. It wants to wait for some user-interaction, such as a button click, before executing the code. We don't want there to be any big network-loading delay visible to the user between their click of the button and the running of that plugin's code. Preloading is desired to minimize the delays.

Another CMS plugin on this same page wants to load "A.js", "C.js", and "D.js". This plugin doesn't know or care that the other plugin also requests "A.js". It doesn't know if there is a script element in the page requesting it or not, and it doesn't want to looking for it. It just wants to ask for A as a pre-requisite to C and D. But C and D have no dependency on each other, only a shared dependency on A. C and D should be free to run ASAP (in whichever order), assuming that A has already run. That means C shouldn't wait for D, nor should D wait for C.

Again, as in the other plugin, there's some user-interaction that initiates the load of A, C, and D. This user interaction may be before the other plugin requested A, or after it requested A. Neither CMS plugin cares or knows about the other or their shared usage of A.

Any solution which relies on some markup or script-load event to trigger the loading of either plugin's scripts is insufficient, as these plugins both want have their code execute only in on-demand user interaction scenarios.

Any solution which relies on the plugins coordinating themselves in how or when they ask for shared dependency A is insufficient for general script loading.

Further complicating my use-case: "A.js" can be requested relatively (via a <base> tag or just relative to document root) or absolutely, or it might be requested with the leading-// protocol-relative from, taking on whatever http or https protocol the page has, whereas other references to it may specify the prototcol. Point being, any solution which relies on matching a script's `src` property/attribute and not against its actual resolved URL is insufficient.

NOTE: by contrast, LABjs resolves this relative/protocol-relative/absolute URL problem by implementing a URL canonicalization routine in it, so ALL the URL's requested of LABjs are de-duplicated based on their canonical/absolute form regardless of how requested. That's something a script loader can do, but not something that a pattern matching against markup could take advantage of.

Any solution which relies on those plugins having some shared convention for what ID or what class name, to do the hooking up, is insufficient for general script loading, because of all the possible ways that those things can fail in an unknown page hosting them. These plugins can't guarantee what ID's or classes they use are reliably unique without undue processing burden.

So any solution which relies on these sorts of things MIGHT fail, and is thus unreliable and insufficient. Full stop.

9: Premise: sometimes you load more scripts than you actually use. While that can be considered a bad thing in many cases, there are cases where it's a desired thing. Any suitable solution needs to be able to preload a script but not be required to actually ever execute it.

10. Use-case: I have two different calendar widgets. I want to pop one of them up when a user clicks a button. The user may never click the button, in which case I don't want the calendar widget to have ever executed to render. I don't just want the calendar widget rendered but hidden, I don't want it rendered into the DOM at all if the user doesn't click the button.

It'd be nice if both calendar widgets were built sensibly so that loading the code didn't automatically render. One of them IS, the other is unfortunately mis-behaving, and it will render itself as soon as its code is run. This is not a question of concern about the parsing time of the script, or the display of the widget. It's a question of, for whatever reason, not wanting the actual DOM elements until I want them.

Furthermore, these two widgets are not "equal". Perhaps one is better for smaller browser window sizes, and the other is better for larger browser windows. Or pehaps one is preferred if a user is using a mouse, the other is more optimized for touch. Or perhaps one is better for handling date ranges, and one is better for single dates, and the user might want to do one or the other, but not both. Or maybe one calendar widget is better for smaller text font sizes, and one is better for larger font sizes. Or maybe I decide the user gets neither widget, and just gets a text-entry box. Or……

Regardless, the point is, there's run-time conditions which are going to determine if I want to execute calendar widget A or B, or maybe I never execute either. But I want them both preloaded and ready to go, just in case, so that if the user DOES need one, it's free and ready to execute with nearly no delay, instead of having to wait to request as I would with only on-demand techniques.

Any solution which forces a script that is preloaded to eventually be executed is insufficient. Full stop.

11. Use-case: I have a set of script "A.js", "B.js", and "C.js". B relies on A, and C relies on B. So they need to execute strictly in that order.

But, I want to ensure that there's as little delay between A executing and B executing and C executing as possible. For instance, imagine they progressively render different parts of a widget. I don't want A to run just because it's done loading, if B and C are not ready to execute "immediately" thereafter. In other words, I only want to execute A, B and C once all 3 are preloaded and ready to go. It's not just about their order preservation, but also about minimizing delays between them, for performance PERCEPTION.

Sometimes, progressive rendering is nicer for users. But sometimes, it's uglier to a user and less desired. Imagine progressively rendering a modal dialog box where the box pops up first, then some content, then the close button, and finally, a few seconds later, the action button to dismiss the box. The user will be annoyed that the box didn't pop up fully rendered and interactable.

So, I want to delay the execution of these scripts which render these different parts until all of them are present. And no, I can't just change all those scripts to not have execution side-effects.

So, what do I need to do? I need to preload all 3 scripts in parallel, but wait for all to be done before I execute any of them. Here's where the "markNeeded()" / promise solution falls down. As soon as I call "markNeeded()" to get a promise, that script will be eligible to run, which means it may automatically run earlier than I want it to.

What I need is a way to observe that it has finished preloading, but not conflate that observation with "go ahead and execute". Otherwise, A may run well before B and C finish and have a chance to run.

What I need is an event on each of A, B, and C that tells me that each has finished preloading, such that I can be reasonably confident that if I then execute each of them, in order, they will run right together, one after the other, with minimal delay between them. Hence, the need for `onpreload` event, or something like it.

Any solution that relies on something having been executed (`onload`) or being marked eligible for auto-execution (`markNeeded()`) earlier than I want it to be, so that I can be notified that something has been only preloaded so that others can be made eligible for execution… that takes away my control to ensure they all execute "together" with minimal interruption. It's insufficient. Full stop.

----------------------------------

OK, deep breath. Sigh. That was way more than I wanted to write, and probably way more than anyone wanted to read. But that should cover all my big concerns. Hopefully I don't have to recount those use cases over and over again, now.

Every single one of those premises and use-cases are trivial to accomplish with a real preloading solution that gives ME full control over execution. I noted a whole bunch of places where I see shortcomings of the other proposals compared to the nuances of those use-cases.

I'm not just making these use-cases up to be contrarian. I'm drawing from actual real-world experience in every single one of them.

I also want to point out that just because these are my use-cases doesn't mean there aren't other complex use-cases which require real full preloading and are insufficiently served by these other proposals. There's lots of other complexity that developers dream up in the real world. Just because I haven't run across it yet doesn't mean it doesn't exist.

But nevertheless, I hope that list is exhaustive enough (certainly exhausting to write/read) to show that there's a lot more than just "I really want this". There's real complexities to these use-cases.

As far as I'm concerned, the only "silver bullet" that solves every single one, no matter how complex, stated here or not, is real preloading like my proposal suggests.

--Kyle