[whatwg] Proposal: Loading and executing script as quickly as possible using multipart/mixed

Mon Dec 3 22:02:09 PST 2012

It might be good to use a custom MIME type instead of multipart/mixed. multipart/mixed can represent arbitrary heterogeneous sequences of types, which is not the desired semantic here: you want a sequence in which every part is text/javascript. It also has a syntactic affordance for conveying a MIME type per chunk, which is unnecessary in this case. Since browsers will likely need custom logic for this case anyway, I think it might be better to have a multipart/javascript type. Note: if this feature is needed for other script types, say VBScript, you could mint distinct types like multipart/vbscript, or use a MIME parameter: "multipart/script; type=text/javascript".
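
As a rough illustration of the two spellings suggested above, here is a minimal sketch of how a consumer might recognise them from a Content-Type header. Neither multipart/javascript nor multipart/script is a registered type. The function names are hypothetical, and the parser is deliberately simplified: it splits on ";" and so does not handle quoted parameter values that themselves contain a semicolon.

```javascript
// Split a Content-Type value into its type/subtype and its parameters.
// Simplification (assumption): no ";" appears inside a quoted parameter value.
function parseContentType(contentType) {
  const [essence, ...pairs] = contentType.split(";").map((s) => s.trim());
  const params = {};
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue; // ignore malformed parameters
    const name = pair.slice(0, eq).trim().toLowerCase();
    // Strip one pair of surrounding double quotes, if present.
    const value = pair.slice(eq + 1).trim().replace(/^"(.*)"$/, "$1");
    params[name] = value;
  }
  return { essence: essence.toLowerCase(), params };
}

// True for either hypothetical spelling of "a sequence of JavaScript chunks".
function isChunkedJavaScript(contentType) {
  const { essence, params } = parseContentType(contentType);
  if (essence === "multipart/javascript") return true;
  const inner = (params.type || "").toLowerCase();
  return essence === "multipart/script" && inner === "text/javascript";
}
```

With either spelling the consumer knows the type of every part up front, so a plain multipart/mixed response would be rejected by this check rather than inspected part by part.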

On Dec 3, 2012, at 1:14 PM, Adam Barth <w3c at adambarth.com> wrote:

> == Use case ==
> 
> Load and execute script as quickly as possible.
> 
> == Discussion ==
> 
> Currently, there are a number of ways to load a script from the
> network and execute it, but none of them will actually load and
> execute the script as fast as physically possible.  Consider the
> following markup:
> 
> <script async src="path/to/script.js"></script>
> 
> In this case, the user agent will wait until it receives the last byte
> of script.js from the network before executing the first byte of
> script.js.  In principle, the user agent could finish executing
> script.js sooner if it could overlap some of the execution time with
> some of the network latency, for example by executing a chunk of the
> script while waiting for the bytes for the next chunk to arrive from
> the network.
> 
> Unfortunately, without additional information, the user agent doesn't
> know where "safe" chunk boundaries are located.  Picking an arbitrary
> byte boundary is likely to cause a syntax error, and even picking an
> arbitrary JavaScript statement boundary will change the semantics of
> the script.  The user agent needs some sort of signal from the author
> to know where the safe chunk boundaries are located.
> 
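
To make the point about statement boundaries concrete, here is a minimal sketch of one way the semantics can change: function declarations are hoisted within a single script, but not across separately executed chunks. The helper names and the demo function names are hypothetical, and indirect eval is used only as a stand-in for running each chunk as its own global script.

```javascript
// Run source text as global code, roughly the way a classic script runs.
// Indirect eval is an approximation chosen for this sketch.
function runAsScript(source) {
  return (0, eval)(source);
}

// Run chunks in order and report the first chunk that throws.
function runChunks(chunks) {
  for (let index = 0; index < chunks.length; index++) {
    try {
      runAsScript(chunks[index]);
    } catch (error) {
      return { ok: false, failedChunk: index, errorName: error.name };
    }
  }
  return { ok: true };
}

// Split at a statement boundary: the call runs before the declaration exists.
const splitResult = runChunks([
  "demoGreetSplit();",
  "function demoGreetSplit() {}",
]);

// Same shape of code as one script: the declaration is hoisted, so the call works.
const wholeResult = runChunks([
  "demoGreetWhole(); function demoGreetWhole() {}",
]);
```

Here the split version fails in its first chunk with a ReferenceError while the whole script runs cleanly, which is why the author, not the user agent, has to say where the safe boundaries are.
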
> == Workarounds ==
> 
> The simplest workaround is to break your script into several pieces:
> 
> <script async src="path/to/script-part1.js"></script>
> <script async src="path/to/script-part2.js"></script>
> <script async src="path/to/script-part3.js"></script>
> 
> Now, script-part1.js will execute before the user agent has received
> the last byte of script-part3.js.  Unfortunately, this approach does
> not make efficient use of the network.  Specifically, if the three
> parts are retrieved from the network in parallel, then the user agent
> might receive a byte from script-part3.js before receiving all the
> bytes of script-part1.js, wasting network bandwidth (because the bytes
> from script-part3.js are not useful until all of script-part1.js is
> received and executed).
> 
> A more sophisticated workaround is to use an <iframe> element rather
> than a <script> element to load the script:
> 
> <iframe src="path/to/script-in-markup.html"></iframe>
> 
> In this approach, script-in-markup.html is the following HTML:
> 
> <script>
> [... text of script-part1.js ...]
> </script>
> <script>
> [... text of script-part2.js ...]
> </script>
> <script>
> [... text of script-part3.js ...]
> </script>
> 
> Now the bytes of the script are retrieved from the network in the
> proper order (making efficient use of bandwidth) and the user agent
> can overlap execution of the script with network latency (because the
> <script> tags delineate the safe chunks).
> 
> This approach is used in production web applications, including Gmail,
> to load and execute script as quickly as possible.  If you inspect a
> running copy of Gmail, you can find this frame---it's the one with ID
> "js_frame".
> 
> Unfortunately, this approach has a number of disadvantages:
> 
> (1) Creating an extra <iframe> for loading JavaScript is not resource
> efficient.  The user agent needs to create a number of extra data
> structures and an extra JavaScript environment, which wastes time as
> well as memory.
> 
> (2) Authors need to write their scripts with the understanding that
> the primary callers of their code will do so from another frame.  For
> example, the instanceof operator might not work as expected if they
> ask whether an object from the caller (i.e., from the parent frame) is
> an instance of a constructor from the callee's environment (i.e., from
> the child frame).
> 
> (3) This approach requires the author who loads the script to use
> different syntax than normally used for loading script.  For example,
> this prevents this technique from being applied to the JavaScript
> libraries that Google hosts (as described by
> <https://developers.google.com/speed/libraries/>).
> 
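
Disadvantage (2) can be reproduced outside a browser. The following is a minimal sketch that uses a Node.js vm context as a stand-in for the child frame. That substitution is an assumption made for illustration: a real <iframe> has its own global object in much the same way, but this code does not touch the DOM.

```javascript
const vm = require("node:vm");

// A separate context plays the role of the child frame's JavaScript environment.
const childFrame = vm.createContext({});
const ChildArray = vm.runInContext("Array", childFrame);
const arrayFromChild = vm.runInContext("[4, 5]", childFrame);

// An object created by the caller (the "parent frame").
const arrayFromParent = [1, 2, 3];

// The parent's array is not an instance of the child's Array constructor.
const parentIsChildArray = arrayFromParent instanceof ChildArray;

// Within the child environment, instanceof behaves as usual.
const childIsChildArray = arrayFromChild instanceof ChildArray;

// Array.isArray does not depend on which environment created the array.
const parentIsArray = Array.isArray(arrayFromParent);
```

Checks that do not depend on constructor identity, such as Array.isArray, avoid the problem for built-ins, but authors' own constructors have no such escape hatch.
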
> == Proposal ==
> 
> The <script> element should support multipart/mixed.
> 
> == Details ==
> 
> The main ingredient that we're missing is a way for the author to
> signal to the user agent which chunks of scripts are safe to execute
> in parallel with loading subsequent chunks from the network.
> Fortunately, the web platform already has a mechanism for breaking a
> single HTTP response body into chunks that are processed sequentially:
> multipart/mixed.
> 
> For example, if an HTTP server provides a multipart/mixed response to
> a request for an image, the <img> element will display each part of
> the response in sequence, animating the image.  Similarly, if an HTTP
> server provides a multipart/mixed response to a request for an HTML
> document, the user agent will display each part of the response
> sequentially.
> 
> One way to address this use case is to add multipart/mixed support to
> the <script> element.  Upon receiving a multipart/mixed response to a
> request for a script, the <script> element must execute each part of
> the response as it becomes available.  This behavior appears to be
> consistent with the definition of multipart/mixed
> <http://tools.ietf.org/html/rfc2046#section-5.1.3>.
> 
> To load and execute a script as quickly as possible, the author would
> use the following markup:
> 
> <script async src="path/to/script.js"></script>
> 
> The HTTP server would then break script.js into chunks that are safe
> to execute sequentially and provide each chunk as a separate MIME part
> in a multipart/mixed response.
> 
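
For reference, here is a minimal sketch of what the server-side packaging and the matching split on the receiving side could look like, following the basic multipart layout from RFC 2046. The function names and the boundary string are hypothetical. The parser is simplified: it assumes the delimiter never occurs inside a chunk (note that "--" is also a JavaScript operator, so the boundary has to be chosen with care), and it ignores part headers entirely.

```javascript
const CRLF = "\r\n";

// Wrap each script chunk in its own MIME part and add the closing delimiter.
function buildMultipartBody(chunks, boundary) {
  const parts = chunks.map(
    (chunk) =>
      `--${boundary}${CRLF}Content-Type: text/javascript${CRLF}${CRLF}${chunk}${CRLF}`
  );
  return parts.join("") + `--${boundary}--${CRLF}`;
}

// Recover the chunks, in order, from a complete multipart body.
function splitMultipartBody(body, boundary) {
  const segments = body.split(`--${boundary}`);
  const chunks = [];
  // segments[0] is the preamble; the segment after the last part starts with "--".
  for (const segment of segments.slice(1)) {
    if (segment.startsWith("--")) break; // closing delimiter reached
    const headerEnd = segment.indexOf(CRLF + CRLF);
    if (headerEnd === -1) continue; // malformed part: no header/body separator
    const content = segment.slice(headerEnd + 2 * CRLF.length);
    // The CRLF before the next delimiter belongs to the delimiter, not the chunk.
    chunks.push(content.endsWith(CRLF) ? content.slice(0, -CRLF.length) : content);
  }
  return chunks;
}

// Hypothetical chunks of script.js, each safe to execute on its own in order.
const exampleChunks = ["var a = 1;", "var b = a + 1;"];
const exampleBody = buildMultipartBody(exampleChunks, "script-chunk-7f3a");
const roundTripped = splitMultipartBody(exampleBody, "script-chunk-7f3a");
```

A real user agent would process the body incrementally as bytes arrive rather than splitting a complete string. The sketch only shows where the chunk boundaries live in the response.
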
> Adam