<div class="gmail_quote">On Mon, Aug 9, 2010 at 1:40 PM, Justin Lebar <span dir="ltr"><<a href="mailto:justin.lebar@gmail.com">justin.lebar@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im">> Can you provide the content of the page which you used in your whitepaper?<br>

> (<a href="https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820" target="_blank">https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820</a>)<br>

<br>

</div>I'll post this to the bug when I get home tonight.  But your comments<br>

are astute -- the page I used is a pretty bad benchmark for a variety<br>

of reasons.  It sounds like you probably could hack up a much better<br>

one.<br>

<div class="im"><br>

>    a) Looks like pages were loaded exactly once, as per your notes?  How<br>

> hard is it to run the tests long enough to get to a 95% confidence interval?<br>

<br>

</div>Since I was running on a simulated network with no random parameters<br>

(e.g. no packet loss), there was very little variance in load time<br>

across runs.<br></blockquote><div><br></div><div>I suspect you are right.  Still, it's good due diligence - especially for a whitepaper :-)  The good news is that if it really is consistent, then it should be easy...</div>
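<div><br></div><div>A minimal sketch of what that check could look like, assuming the load times from repeated runs are already collected in a list.  It is not part of the original measurements: the sample numbers and the function name are hypothetical, and it uses a normal approximation (a Student's t critical value would give a slightly wider interval for a handful of runs).</div>

```python
import math
import statistics


def confidence_interval(samples, confidence=0.95):
    """Return (mean, half_width) of a normal-approximation confidence interval."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate variance")
    mean = statistics.fmean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation (n - 1)
    # Two-sided critical value, roughly 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev / math.sqrt(len(samples))
    return mean, half_width


# Hypothetical page-load times in milliseconds, one entry per run.
load_times_ms = [2010.0, 1995.0, 2004.0, 2001.0, 1990.0]
mean, half_width = confidence_interval(load_times_ms)
print(f"PLT = {mean:.1f} ms +/- {half_width:.1f} ms (95% CI, normal approx.)")
```

<div>If the variance really is as low as expected, the half-width shrinks to a small fraction of the mean after only a few runs.</div>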

<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<div class="im"><br>

>    d) What did you do about subdomains in the test?  I assume your test<br>

> loaded from one subdomain?<br>

<br>

</div>That's correct.<br>

<div class="im"><br>

> I'm betting time-to-paint goes through the roof with resource bundles:-)<br>

<br>

</div>It does right now because we don't support incremental extraction,<br>

which is why I didn't bother measuring time-to-paint.  The hope is<br>

that with incremental extraction, we won't take too much of a hit.<br></blockquote><div><br></div><div>Well, here is the crux then.  </div><div><br>What should browsers optimize for?  Should we accept performance features that optimize for page load time (PLT), for time-to-first-paint, or for something else?  I have spent a *ton* of time trying to answer this question (as have many others), and it is a tough one to answer.</div>

<div><br>For now, I believe the Chrome/WebKit teams agree that sacrificing time-to-first-render to decrease PLT is a bad idea.  What is the Firefox philosophy here?</div><div><br></div><div>One thing we can do to better evaluate features is to always measure both metrics.  If both metrics get better, then it is a clear win.  But without recording both metrics, we can't really tell whether a feature is good or bad.</div>
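<div><br></div><div>As a rough illustration of the "measure both" rule, here is a sketch that compares a baseline against a candidate build on both metrics.  The type, the function, and the timings are all hypothetical; a real harness would also need to account for run-to-run noise before calling anything a win.</div>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    first_paint_ms: float  # time to first paint
    page_load_ms: float    # total page load time (PLT)


def evaluate(baseline, candidate):
    """Classify a change by looking at both metrics together."""
    paint_delta = candidate.first_paint_ms - baseline.first_paint_ms
    load_delta = candidate.page_load_ms - baseline.page_load_ms
    if paint_delta == 0 and load_delta == 0:
        return "no change"
    if paint_delta <= 0 and load_delta <= 0:
        return "clear win"    # nothing got worse, something got better
    if paint_delta >= 0 and load_delta >= 0:
        return "clear loss"   # nothing got better, something got worse
    return "trade-off"        # one metric improved at the other's expense


# Hypothetical numbers: PLT improves, but first paint regresses.
baseline = Timing(first_paint_ms=500.0, page_load_ms=2000.0)
with_feature = Timing(first_paint_ms=900.0, page_load_ms=1500.0)
print(evaluate(baseline, with_feature))
```

<div>Recording only PLT would report the hypothetical change above as an improvement; recording both surfaces it as a trade-off that needs a judgment call.</div>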

<div><br></div><div>Sorry to put you through more work - I am not trying to nix your feature :-(  I think it is great you are taking the time to study all of this.</div><div><br></div><div>Mike</div><div><br></div><div><br>

</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

<font color="#888888"><br>

-Justin<br>

</font><div><div></div><div class="h5"><br>

On Mon, Aug 9, 2010 at 1:30 PM, Mike Belshe <<a href="mailto:mike@belshe.com">mike@belshe.com</a>> wrote:<br>

> Justin -<br>

> Can you provide the content of the page which you used in your whitepaper?<br>

> (<a href="https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820" target="_blank">https://bug529208.bugzilla.mozilla.org/attachment.cgi?id=455820</a>)<br>

> I have a few concerns about the benchmark:<br>

>    a) Looks like pages were loaded exactly once, as per your notes?  How<br>

> hard is it to run the tests long enough to get to a 95% confidence interval?<br>

>    b) As you note in the report, slow start will kill you.  I've verified<br>

> this so many times it makes me sick.  If you try more combinations, I<br>

> believe you'll see this.<br>

>    c) The 1.3MB of subresources in a single bundle seems unrealistic to me.<br>

>  On one hand you say that it's similar to CNN, but note that CNN has<br>

> JS/CSS/images, not just thumbnails like your test.  Further, note that CNN<br>

> pulls these resources from multiple domains; combining them into one domain<br>

> may work, but certainly makes the test content very different from CNN.  So<br>

> the claim that it is somehow representative seems incorrect.   For more<br>

> accurate data on what websites look like,<br>

> see <a href="http://code.google.com/speed/articles/web-metrics.html" target="_blank">http://code.google.com/speed/articles/web-metrics.html</a><br>

>    d) What did you do about subdomains in the test?  I assume your test<br>

> loaded from one subdomain?<br>

>    e) There is more to a browser than page-load-time.  Time-to-first-paint<br>

> is critical as well.  For instance, in WebKit and Chrome, we have specific<br>

> heuristics which optimize for time-to-render instead of total page load.<br>

>  CNN is always cited as a "bad page", but it's really not - it just has a<br>

> lot of content, both below and above the fold.  When the user can interact<br>

> with the page successfully, the user is happy.  In other words, I know I can<br>

> make WebKit's PLT much faster by removing a couple of throttles.  But I also<br>

> know that doing so worsens the user experience by delaying the time to first<br>

> paint.  So - is it possible to measure both times?  I'm betting<br>

> time-to-paint goes through the roof with resource bundles:-)<br>

> If you provide the content, I'll try to run some tests.  It will take a few<br>

> days.<br>

> Mike<br>

><br>

> On Mon, Aug 9, 2010 at 9:52 AM, Justin Lebar <<a href="mailto:justin.lebar@gmail.com">justin.lebar@gmail.com</a>> wrote:<br>

>><br>

>> On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor <<a href="mailto:Simetrical%2Bw3c@gmail.com">Simetrical+w3c@gmail.com</a>><br>

>> wrote:<br>

>> > If UAs can assume that files with the same path<br>

>> > are the same regardless of whether they came from a resource package<br>

>> > or which, and they have all but a couple of the files cached, they<br>

>> > could request those directly instead of from the resource package,<br>

>> > even if a resource package is specified.<br>

>><br>

>> These kinds of heuristics are far beyond the scope of resource<br>

>> packages as we're planning to implement them.  Again, I think this<br>

>> type of behavior is the domain of a large change to the networking<br>

>> stack, such as SPDY, not a small hack like resource packages.<br>

>><br>

>> -Justin<br>

>><br>

>> On Mon, Aug 9, 2010 at 9:47 AM, Aryeh Gregor <<a href="mailto:Simetrical%2Bw3c@gmail.com">Simetrical+w3c@gmail.com</a>><br>

>> wrote:<br>

>> > On Fri, Aug 6, 2010 at 7:40 PM, Justin Lebar <<a href="mailto:justin.lebar@gmail.com">justin.lebar@gmail.com</a>><br>

>> > wrote:<br>

>> >> I think this is a fair point.  But I'd suggest we consider the<br>

>> >> following:<br>

>> >><br>

>> >> * It might be confusing for resources from a resource package to show<br>

>> >> up on a page which doesn't "opt-in" to resource packages in general or<br>

>> >> to that specific resource package.<br>

>> ><br>

>> > Only if the resource package contains a different file from the real<br>

>> > one.  I suggest we treat this as a pathological case and accept that<br>

>> > it will be broken and confusing -- or at least we consider how many<br>

>> > extra optimizations we could make if we did accept that, before<br>

>> > deciding whether the extra performance is worth the confusion.<br>

>> ><br>

>> >> * There's no easy way to opt out of this behavior.  That is, if I<br>

>> >> explicitly *don't* want to load content cached from a resource<br>

>> >> package, I have to name that content differently.<br>

>> ><br>

>> > Why would you want that, if the files are the same anyway?<br>

>> ><br>

>> >> * The avatars-on-a-forum use case is less convincing the more I think<br>

>> >> about it.  Certainly you'd want each page which displays many avatars<br>

>> >> to package up all the avatars into a single package.  So you wouldn't<br>

>> >> benefit from the suggested caching changes on those pages.<br>

>> ><br>

>> > I don't see why not.  If UAs can assume that files with the same path<br>

>> > are the same regardless of whether they came from a resource package<br>

>> > or which, and they have all but a couple of the files cached, they<br>

>> > could request those directly instead of from the resource package,<br>

>> > even if a resource package is specified.  So if twenty different<br>

>> > people post on the page, and you've been browsing for a while and have<br>

>> > eighteen of their avatars (this will be common, a handful of people<br>

>> > tend to account for most posts in a given forum):<br>

>> ><br>

>> > 1) With no resource packages, you fetch two separate avatars (but on<br>

>> > earlier page views you suffered).<br>

>> ><br>

>> > 2) With resource packages as you suggest, you fetch a whole resource<br>

>> > package, 90% of which you don't need.  In fact, you have to fetch a<br>

>> > resource package even if you have 100% of the avatars on the page!  No<br>

>> > two pages will be likely to have the same resource package, so you<br>

>> > can't share cache at all.<br>

>> ><br>

>> > 3) With resource packages as I suggest, you fetch only two separate<br>

>> > avatars, *and* you got the benefits of resource packages on earlier<br>

>> > pages.  The UA gets to guess whether using resource packages would be<br>

>> > a win on a case-by-case basis, so in particular, it should be able to<br>

>> > perform strictly better than either (1) or (2), given decent<br>

>> > heuristics.  E.g., the heuristic "fetch the resource package if I need<br>

>> > at least two files, fetch the file if I only need one" will perform<br>

>> > better than either (1) or (2) in any reasonable circumstance.<br>

>> ><br>

>> > I think this sort of situation will be fairly common.  Has anyone<br>

>> > looked at a bunch of different types of web pages and done a breakdown<br>

>> > of how many assets they have, and how they're reused across pages?  If<br>

>> > we're talking about assets that are used only on one page (image<br>

>> > search) or all pages (logos, shared scripts), your approach works<br>

>> > fine, but not if they're used on a random mix of pages.  I think a lot<br>

>> > of files will wind up being used on only particular subsets of pages.<br>

>> ><br>

>> >> In general, I think we need something like SPDY to really address the<br>

>> >> problem of duplicated downloads.  I don't think resource packages can<br>

>> >> fix it with any caching policy.<br>

>> ><br>

>> > Certainly there are limits to what resource packages can do, but we<br>

>> > can wind up closer to the limits or farther from them depending on the<br>

>> > implementation details.<br>

>> ><br>

><br>

><br>

</div></div></blockquote></div><br>