[whatwg] AppCache-related e-mails

Felix Halim felix.halim at gmail.com
Sat Jul 2 12:16:49 PDT 2011


On Sat, Jul 2, 2011 at 8:14 AM, Bjartur Thorlacius <svartman95 at gmail.com> wrote:
> On Fri, Jul 1, 2011 at 03:22, Felix Halim wrote:
>>
>> I'm looking for a solution that doesn't require modifying anything
>> except adding a manifest.
>>
> I recommend fixing your website. As others have stated, this has practical
> benefits, in the online as well as the offline case.

I don't mind fixing my website if I really have to! But if AppCache had
an option to always show the main page "online", I wouldn't have to do
anything.


>> however, if we don't have "pageStorage", then even if we have a clean
>> dynamic separation, it will quickly run out of space if we use
>> "localStorage", since the localStorage quota is per domain.
>>
> Nobody's forcing you to use localStorage. How do you figure using
> pageStorage or localStorage will be less work than using iframes or other
> linking methods already proposed?

It's not the amount of work that matters; it's the quota I'm
talking about.


>> Let's see an example:
>>
>> I have a dynamic page with this url:
>>
>> http://bla/page?id=10
>>
>> The content inside is changing very frequently, let's say every hour.
>> Of course, I want the browser to cache the latest version.
>
> Then specify the applicable HTTP headers with informative values. HTTP
> caching hasn't stopped working, nor is it barred from improving. There is
> space for implementations to improve while complying with current
> specifications. All you have to do is split dynamic resources from static,
> read the RFC and send the appropriate headers.
> Of course this method has the drawback of requiring a request/response pair
> for every resource transferred over HTTP.

Remember that I also want those URLs to be available even if the user is offline.
The HTTP cache is not that powerful; AppCache is.


>> In that case, my cleanly separated static and dynamic will have no effect!
>> Because all the statics get duplicated for each App Cache.
>> It will be the same as if I don't have the framework!
>>
> I'm not following your line of thinking. Why do you insist on using an App
> Cache for each page rather than a shared cache for all your resources?

I do want to use a shared cache for shared resources and a "page cache"
for non-shared resources (unique to that page). However, the
non-shared resources will become too large to fit in the 5MB quota.
Remember, I have different non-shared content for id=10, id=11, ...,
id=100000. I don't think that will fit in localStorage.
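
To make the quota concern concrete, here is a rough back-of-the-envelope
sketch. It assumes "5MB" means 5 * 1024 * 1024 bytes and that the quota is
split evenly across pages; real browsers differ in how they measure and
enforce the limit, and the function name is only illustrative.

```python
# Rough per-page budget when every page's data shares one per-domain quota.
# Assumption: 5MB = 5 * 1024 * 1024 bytes; actual limits vary by browser.
QUOTA_BYTES = 5 * 1024 * 1024


def per_page_budget(pages, quota_bytes=QUOTA_BYTES):
    """Bytes available to each page if `pages` pages split the quota evenly."""
    return quota_bytes // pages


# All ids up to 100000 sharing one domain quota.
print(per_page_budget(100_000))  # 52

# Only the few hundred pages a user actually views.
print(per_page_budget(300))  # 17476
```

Even in the second case each page gets only about 17KB, which is why a
per-domain quota is the limiting factor here.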


> Are you certain that users wish to archive every single dynamic resource
> they fetch from your site? Disposition of any significant amount of storage
> should be in the hands of the user, if indirectly through the user agent.
> Take handhelds.

Users only view the resources they want.

When they viewed it, I want it to be there for offline use or for
performance reasons.
I expect users to view (and cache) only a few hundred of them.
They cannot cache what they didn't view / open.
It is OK for the browser not to cache a page if it doesn't have any storage left.

I am satisfied if there is a "page storage" quota of 5MB given per
page (not per domain).
This will solve all my problems (of course by restructuring my site).



>> If only I can store the dynamic content into a pageStorage (assuming
>> different URL ->  including the shebang bookmark has different
>> pageStorage), then I won't be running out of storage if I keep one
>> page within 5MB. So
>
> And you're sure this is a good thing, because?

Because currently, browsers can handle a page content < 5MB very fast.
I think it is OK for a page (not a domain) to have 5MB data quota.

If you are building games, you may need more than that (the app would
have to go to the web store to get unlimited permission). However, for
regular pages, 5MB is "currently" more than enough. 5MB per domain is
too small!


>> http://bla/page#!id=10
>>
> You *can't* allocate a quota per URI fragment, as a script in the page could
> create new ones as wanted.

Yes, I know; that was only an example to point out that:

If I use a shared cache:

http://bla/page

I will run out of quota quickly.

If I use parameters like this:

http://bla/page?id=10

I will have to refresh TWICE to get the latest content (annoying).

If I can use:

http://bla/page#!id=10

I get the best of both worlds: I have a shared static cache, and
I won't run out of quota for the non-shared dynamic cache, since the
quota is 5MB per hash value. I know that this has a security hole:
a script can just generate random URLs to get more quota.

My suggestion is to assign the quota to the hash value present when the
page is first loaded, so that any later change of the hash by a script
is still charged to the original hash value's quota.
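
A small model of that suggestion, as a sketch only: "pageStorage" does not
exist, so the class, method names, and the 5MB limit below are all
hypothetical. The point it illustrates is that the quota key is frozen at
load time, so changing the fragment from a script does not earn a new quota.

```python
PAGE_QUOTA = 5 * 1024 * 1024  # hypothetical per-page limit, in bytes


class PageSession:
    """Models one page load; the quota key is fixed when the page loads."""

    def __init__(self, url, usage):
        self.quota_key = url    # frozen for the lifetime of this load
        self.current_url = url
        self.usage = usage      # shared map: quota key -> bytes used

    def set_fragment(self, fragment):
        # A script may change the fragment, but the quota key stays the same.
        base = self.current_url.split("#", 1)[0]
        self.current_url = base + "#" + fragment

    def store(self, nbytes):
        """Charge nbytes to the original URL's quota; False if it won't fit."""
        used = self.usage.get(self.quota_key, 0)
        if used + nbytes > PAGE_QUOTA:
            return False
        self.usage[self.quota_key] = used + nbytes
        return True


usage = {}
page = PageSession("http://bla/page#!id=10", usage)
first = page.store(4 * 1024 * 1024)   # fits within the 5MB quota
page.set_fragment("!id=11")           # script switches to another hash
second = page.store(2 * 1024 * 1024)  # rejected: still charged to #!id=10
print(first, second, page.current_url)
```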


>> So, we have seen how the AppCache fails to satisfy certain use cases and
>> how pageStorage is needed to make the alternative solution work.
>>
> Show how either the HTTP specification or common practice forbids HTTP
> caches from satisfying your use cases.

I think it's clear that the HTTP cache is inferior to AppCache.
Whatever the HTTP cache does can be overridden by AppCache.
AppCache complements the HTTP cache.

The HTTP cache has satisfied my use cases for a long time.
But now that AppCache exists, I want better control.


> You're asking for user agents archiving websites for offline viewing (in the
> offline mode). I fetch a bunch of unread pages for offline reading from time
> to time, but keeping a copy of read pages when I'm most likely going to be
> online may be considered a waste of storage space. Users who have a lot of
> storage at their disposal may do this already. I see no reason to do so only
> if requested by the author.

AppCache cannot cache what has not been read/viewed.
What has been viewed can be safely deleted by the browser without warning.
Except, perhaps, for special "installable" web-apps in the store?


> There are two situations:
>  a) A user agent runs in a constrained environment with limited storage
> space. The user agent caches static resources only, or doesn't cache at all.
>  b) A user agent runs in a plentiful environment with great volumes of
> storage space at its disposal. The user agent archives most if not all
> resources it fetches, throwing stale resources away according to a cache
> algorithm when a quota set in proportion to the available storage space is
> reached.
>
> How does your proposal help the user agent in archiving, or not archiving
> resources?

The decision of archiving is entirely up to the browser.
Pages cached in the AppCache will still work fine if the browser deletes the cache.
The browser will just re-fetch everything, and it will be as slow as the
first view of the page.

The archiving problem is not relevant for my proposal.


>> All these discussions only beg for one feature to be added to AppCache:
>> - Only show the cache when the network / server is offline, otherwise,
>> show the online version of the page.
>>
> This requires a validation roundtrip (as is common), but is otherwise fair.

Yes of course.

However, if the AppCache is set to cache the main page + everything,
no roundtrip is needed.

My proposal is to include an option to ALWAYS make a roundtrip for the
main page and show the latest main page (rather than showing the cached
main page, which avoids the roundtrip up front but performs it later and
then needs a giant refresh).
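
The behavior I'm asking for can be sketched as "network first, cache only
as a fallback". This is only a model, not how any browser implements
AppCache: the function name, the `fetch` callable, and the dict used as a
cache are all assumptions made for illustration.

```python
def load_main_page(url, fetch, cache):
    """Always try the network first; use the cached copy only on failure.

    `fetch` is any callable that returns the page body or raises OSError
    when the network or server is unreachable. `cache` is a plain dict.
    Returns a (body, source) pair, where source is "network" or "cache".
    """
    try:
        body = fetch(url)
    except OSError:
        if url not in cache:
            raise  # offline and never viewed: nothing to show
        return cache[url], "cache"
    cache[url] = body  # remember the latest version for offline use
    return body, "network"


def fake_online(url):
    return "latest content"


def fake_offline(url):
    raise OSError("network unreachable")


demo_cache = {}
print(load_main_page("http://bla/page?id=10", fake_online, demo_cache))
print(load_main_page("http://bla/page?id=10", fake_offline, demo_cache))
```

When online, the user always sees the latest page after one roundtrip;
when offline, the last viewed version is shown instead.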


>> The current AppCache doesn't care whether the network/server is online
>> or offline, it BLINDLY shows the cache every time. This is good for the
>> default, however, we should HAVE an option to not show the cache if we
>> are ONLINE (this is what people meant when they say "DON'T CACHE THE
>> MAIN PAGE").
>>
> HTTP caches may do what you want.

Yes, but I'm no longer satisfied with HTTP Caches when I can use a
more reliable AppCache.

Felix Halim


