[whatwg] Caching of identical files from different URLs using checksums

Ashley Sheridan ash at ashleysheridan.co.uk
Thu Jun 7 15:33:31 PDT 2012



Ian Hickson <ian at hixie.ch> wrote:

>On Fri, 17 Feb 2012, Sven Neuhaus wrote:
>> Hello,
>> 
>> as of 2012, some websites are including popular javascript libraries from CDNs, like
>> Google's. The benefits are:
>> 
>> * Traffic savings for the site operator because the javascript libraries are downloaded from
>>   the CDN and not from the site that uses them
>> * If enough sites refer to the same external file, the browser will cache the file and even if
>>   it's a first visit, the (potentially large) javascript file will not have to be downloaded.
>> 
>> There are however some drawbacks to this approach:
>> 
>> * Security: The site operator is trusting an external site.  If the CDN serves a malicious file
>>   it will directly lead to code execution in browsers under the domain settings of the site
>>   including it (a form of cross site scripting).
>> * Availability: The site depends on the CDN to be available. If the CDN is down the site may not
>>   be available at all.
>> * Privacy: The CDN will see requests for the file with HTTP referer headers for every visitor
>>   of the site.
>> * Extra DNS lookup if file is not already cached
>> * Extra HTTP connection (can't use persistent connection because it's a different site) if file is not cached
>> 
>> I am proposing a solution that will solve all these problems, keep the
>> benefits and offers some extra advantages:
>> 
>> 1. The site stores a copy of the library file(s) on its own site.
>> 2. The web page includes the library from the site itself instead of from the CDN
>> 3. The script tag specifies a checksum calculated using a cryptographic hash function.
>
>This kind of thing has been proposed a number of times. Unfortunately, 
>each time it has not gotten traction from browser vendors. I recommend 
>approaching browser vendors directly and encouraging them to implement 
>a solution along these lines.
>
>-- 
>Ian Hickson               U+1047E                )\._.,--....,'``.    fL
>http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
>Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Isn't this already tackled by externally hosted resources? The jQuery library, for example, can be included from jQuery's own domain, and Google hosts many such libraries and resources (fonts, etc.) that are cached once yet used on multiple sites.

The only thing a caching mechanism such as the one proposed would add is letting a resource be included from the same domain, thereby avoiding the issues of relying on an external host. But aren't there ways to prevent those issues now too?
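
To make sure I am reading the proposal correctly, here is a rough sketch in Python of how I understand it. Everything named here is my own assumption and not part of the proposal: the choice of SHA-256, the `checksum` attribute name, the helpers `sha256_hex` and `script_tag`, the `ChecksumCache` class, and the example.org style URLs. The idea is that the cache is keyed by the file's checksum instead of its URL, so a copy fetched for one site can be reused by another.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def script_tag(src: str, checksum: str) -> str:
    """Build a script tag carrying a checksum.

    The attribute name "checksum" is hypothetical, chosen for illustration.
    """
    return f'<script src="{src}" checksum="sha256:{checksum}"></script>'


class ChecksumCache:
    """Toy cache keyed by content checksum rather than by URL."""

    def __init__(self):
        self._by_checksum = {}
        self.fetch_count = 0  # number of times the network was actually hit

    def load(self, url, checksum, fetch):
        """Return the file body for url, reusing any copy with the same checksum.

        fetch is a callable taking a URL and returning bytes; it stands in
        for the real HTTP request.
        """
        cached = self._by_checksum.get(checksum)
        if cached is not None:
            return cached  # same content already seen, possibly from another URL
        body = fetch(url)
        self.fetch_count += 1
        # Refuse the download if it does not match the checksum in the page.
        if sha256_hex(body) != checksum:
            raise ValueError("checksum mismatch for " + url)
        self._by_checksum[checksum] = body
        return body
```

With this, two sites that each host their own copy of the same library file, and publish the same checksum in their script tags, would trigger only one download between them. A file that does not match its declared checksum is rejected, which is where the security benefit would come from.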

Thanks,
Ash
http://ashleysheridan.co.uk


