[whatwg] Unlimited pageStorage for App Cached web pages

Bjartur Thorlacius svartman95 at gmail.com
Tue May 31 18:46:08 PDT 2011


On 5/31/11, Felix Halim <felix.halim at gmail.com> wrote:
> On Mon, May 30, 2011 at 10:39 PM, Bjartur Thorlacius
> <svartman95 at gmail.com> wrote:
>
> The dynamic resources are only updated if the user visits the particular
> app cached web-page.
>
Yeah, that's logical. Caches should still be allowed to refetch
resources just before they're expected to be used. I might want my
home computer to fetch the latest news in the morning and evening, so
I can start reading when I wake up and when I get home from school.

> Remember that the dynamic resources I'm talking about here are NOT
> shared between other web-cached pages (even if they are in the same
> domain).
>
That's fine. I don't think caches need to know that, but I'll get back
to you after some sleep. It may be hard to get quotas right; but
multiple HTML documents *could* link to the same resources. I think
quotas should only be enforced per resource and on the user agent as a
whole, leaving the user agent to use the quota (for small files only) as
effectively as it can, e.g. by keeping only frequently used resources.
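
To illustrate what I mean by keeping only frequently used resources, here
is a rough Python sketch. It assumes a single user-agent-wide quota and a
plain use counter per URL; QuotaCache and its methods are names I made up
for the example, not anything from a spec or a real browser.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaCache:
    """Hypothetical cache that evicts least-frequently-used entries."""
    quota_bytes: int
    bodies: dict = field(default_factory=dict)  # url -> bytes
    hits: dict = field(default_factory=dict)    # url -> use count

    def used(self) -> int:
        return sum(len(b) for b in self.bodies.values())

    def get(self, url):
        if url in self.bodies:
            self.hits[url] += 1
            return self.bodies[url]
        return None

    def put(self, url, body: bytes) -> bool:
        if len(body) > self.quota_bytes:
            return False  # can never fit; don't evict anything for it
        self.bodies.pop(url, None)    # replace any older version
        self.hits.setdefault(url, 0)
        # Evict the least frequently used entries until the new body fits.
        while self.used() + len(body) > self.quota_bytes:
            victim = min(self.bodies, key=lambda u: self.hits[u])
            del self.bodies[victim]
            del self.hits[victim]
        self.bodies[url] = body
        return True
```

A real user agent would presumably weigh size and age as well as use
count; the point is only that the eviction policy can stay an
implementation detail.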

>> The former is easy to achieve, but user agents tend to throw away stale
>> versions as to not present outdated information to the user and to save
>> storage space.
>
> The user agent only needs to keep the latest version.
> It's fine to throw away the outdated one if you have the latest.
>
Sorry, I meant potentially stale. User agents should of course not
keep obsolete versions of resources when they have fresh ones, but
they may end up with versions of resources that are not fresh. In this
case they SHOULD validate or refetch the resource, except when "working
offline".

>> You want user agents to fetch the latest version whenever possible, but
>> keep
>> an old copy for when your servers are unreachable.
>
> The "whenever possible" is when the user revisits the cached page.
...and a cache has a fresh copy or an authoritative server for the
resource is reachable and responsive.

> The "old" here means the latest version that was cached.
>
Yes.

>> 5MB ought to be enough for anyone.
>
> You were joking, right? :D
>
Yeah, I'm kidding. :P
I believe quota size should be decided on a case-by-case basis (unlike
localStorage where it's probably useful to make assumptions as to the
available storage space).

> 5MB for each App Cached web-page is probably OK.
> However, 5MB localStorage quota for each domain is NOT OK.
>
The right amount to reserve for caching depends on the scarcity of
space on the user's machine.

>> If all you want to do is store mutable resources for offline use and
>> validate them if possible, but returning the cached entry (or entity) if
>> validation is impossible, simply serve the resources with an Expires
>> header
>> set to a date in the past.
>
> Here is an example of how I want the App Cache to behave:
>
> First, it always tries to fetch the main page, and all its
> static/dynamic resources and display it (just like normal web page).
> Then some time later, if the user wants to visit the SAME page again
> but the user is offline or the server is unavailable, then the latest
> cache of the page is displayed.
>
HTTP user agents MAY implement the behaviour you describe; i.e. use
potentially stale entries when validation (checking if it's fresh) is
impossible - as long as that doesn't happen "normally". I don't
consider "working offline" normal. Caches are free to serve
potentially stale entries as long as they disclose how old they
are (so the user agent can determine if it's usable, or warn the
user).
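
A minimal Python sketch of that fallback, under my own assumptions: Entry,
serve and the revalidate callback are hypothetical names, and "server
unreachable" is modelled as the callback raising OSError.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Entry:
    body: bytes
    stored_at: float   # when the response was stored (seconds since epoch)
    expires_at: float  # expiry time; may already be in the past


def serve(entry: Entry,
          revalidate: Callable[[Entry], Entry],
          now: Optional[float] = None):
    """Return (body, age_seconds, is_stale).

    `revalidate` is a hypothetical callback: it returns a fresh Entry, or
    raises OSError when no authoritative server is reachable.
    """
    now = time.time() if now is None else now
    if now < entry.expires_at:
        return entry.body, now - entry.stored_at, False  # still fresh
    try:
        fresh = revalidate(entry)
    except OSError:
        # Validation is impossible: fall back to the stale copy, but
        # disclose its age so the user agent can warn the user.
        return entry.body, now - entry.stored_at, True
    return fresh.body, max(0.0, now - fresh.stored_at), False
```

The age and the stale flag stand in for what HTTP/1.1 conveys with the Age
header and a Warning such as 110 ("Response is stale").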

My impression is that HTTP caching fulfills your needs and that the
HTTP specification doesn't forbid the behaviour you prescribe. Thus
you're free to implement your ideal behaviour without modifying any
specifications. Are you unable to use HTTP caching?

> In the sense, it's exactly like "working offline mode" whenever the
> user is offline or the server is not responding.
>
Why can't you use "offline mode"? Serve the dynamic content with the
expiration date set to the past. That way the UA can store it for
offline use if it has enough storage space at its disposal (making
space for it using an implementation-defined cache algorithm such as
LFU).
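
For example, the server side could look roughly like this in Python. The
function name dynamic_headers is hypothetical, and the ETag is my addition
so the cache has something to validate against; only the Expires date in
the past is what I'm actually suggesting.

```python
import hashlib
from email.utils import formatdate


def dynamic_headers(body: bytes) -> dict:
    """Headers for a dynamic resource: stale on arrival, but storable."""
    return {
        # Epoch 0 is well in the past, so caches treat the entry as stale
        # and validate it before reuse whenever they can.
        "Expires": formatdate(0, usegmt=True),
        # A validator lets the cache revalidate with If-None-Match.
        "ETag": '"%s"' % hashlib.sha1(body).hexdigest(),
        "Content-Length": str(len(body)),
    }
```

Static resources would instead get an Expires date in the future, which is
how the cache learns which is which.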

> The current App Cache design updates the cache to the latest version
> in the background when the user visits the page for the second time and
> then it needs to refresh the page to actually update the display. This
> is annoying since the user will first see stale data, then a few
> seconds later, it's updated with a giant refresh (including all the
> static resources). This is because the App Cache is too COARSE
> grained. It doesn't know what actually changes (which data are static,
> which data are dynamic). That is another reason why we need
> pageStorage: to separate the dynamic and the static resources.
>
Disclaimer: I'll have to do some reading on App Cache; I don't even
understand why good ol' HTTP caching doesn't do the job.
HTTP caches know which resources are static or dynamic, as the HTTP
server tells them. A networked cache will return the static resources
immediately for rendering, but stall the dynamic resources until
they've been validated or refetched. As far as I can see, this is
exactly the behaviour you want.


