[whatwg] HTML5 Offline Web Applications

Mon Sep 15 14:25:35 PDT 2008

On Mon, Aug 25, 2008 at 11:54 AM, Michael Nordman <michaeln at google.com> wrote:

> Manifest file section headers:
> * BYPASS: list of url [namespaces/filters]
> * CACHE: list of exact [urls]
> * INTERCEPT: list of [urlnamespaces, namespace-handler url]
> * AUTOCACHE: list of [urlnamespaces, namespace-handler url]
> * FALLBACK: list of [urlnamespaces, namespace-handler url]

Using namespaces for bypass URIs, and separating auto-caching from
fallback, both seem like big wins.  We'd need to specify what to do
when namespaces collide, but I figure "most specific namespace wins"
would be fine.
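
To make "most specific namespace wins" concrete, here is a rough sketch.
It assumes namespaces are matched as plain path prefixes and that "most
specific" means "longest matching prefix"; neither is settled anywhere in
this thread. The manifest entries and the resolve_section name are made up
for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical manifest contents: (section name, namespace prefix).
NAMESPACES: List[Tuple[str, str]] = [
    ("AUTOCACHE", "/"),
    ("FALLBACK", "/docs/"),
    ("BYPASS", "/docs/live/"),
]


def resolve_section(path: str,
                    namespaces: List[Tuple[str, str]] = NAMESPACES) -> Optional[str]:
    """Return the section whose namespace is the longest prefix of path.

    Assumption: namespaces are simple string prefixes. If the same prefix
    is listed in two sections, the first one listed wins, which is an
    arbitrary choice a real spec would have to pin down.
    """
    matches = [(section, prefix) for section, prefix in namespaces
               if path.startswith(prefix)]
    if not matches:
        return None
    # max() returns the first entry with the greatest prefix length.
    section, _prefix = max(matches, key=lambda m: len(m[1]))
    return section
```

With these entries, /docs/live/feed resolves to BYPASS even though it also
falls under the FALLBACK and AUTOCACHE namespaces.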

> One idea is to rephrase this feature in terms closer to std http caching for
> all entries that do not explicitly appear in the manifest file. In
> effect, closer to telling the http cache to not purge the resource.
>
> * at initial cache time
>   - cache the resource
>
> * at appCache update time
>   - validate all non-explicit entries per usual http caching semantics
>      (so 404s will remove these entries at update time)
>   - network/server errors do not fail the larger update
>   - beyond that, not sure what to do on network/server errors... remove or
>     retain the resources?
>   - perhaps maintain a list of 'failed to update' items that the webapp can
>     access via script

This all makes sense.
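
As a rough sketch of the update-time rules quoted above: the code below
walks the non-explicit entries, drops the ones that 404, and records the
ones that could not be refreshed without failing the whole update. The
"remove or retain" question is left open in the proposal; this sketch
retains the stale copy, purely as an assumption. The function name, the
dict-based cache, and the validate callback are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical validator: returns (HTTP status, body or None), or raises
# OSError when the network is unreachable.
Validator = Callable[[str], Tuple[int, Optional[str]]]


def update_implicit_entries(cache: Dict[str, str],
                            validate: Validator) -> List[str]:
    """Revalidate non-explicit entries; return the 'failed to update' list."""
    failed: List[str] = []
    for url in list(cache):          # copy keys, since entries may be removed
        try:
            status, body = validate(url)
        except OSError:
            failed.append(url)       # network error: keep the stale copy
            continue
        if status == 404:
            del cache[url]           # 404 removes the entry
        elif status == 200 and body is not None:
            cache[url] = body        # fresh copy replaces the old one
        elif status == 304:
            pass                     # not modified: keep as-is
        else:
            failed.append(url)       # 5xx and anything else: keep stale copy
    return failed
```

The returned list is what a script-visible "failed to update" collection
could be built from; a failure on one entry never aborts the loop.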

> * at resource load time
>   - validate per usual http caching rules going forward
>     (so 404s will remove these entries)
>   - with the following exceptions
>      - use the cached resource as a fallback for network or server (5xx)
>        errors
>      - do not purge the resource upon expiration

This seems reasonable, but it is a bit strange that resources added
via applicationCache.add() will behave differently from explicitly
listed manifest entries: on a particularly slow or flaky wireless
network, parts of the application will be quick and others won't.
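
The load-time rules quoted above could be sketched roughly as follows.
This is a simplification: it always goes to the network instead of
honouring freshness lifetimes, and it never evicts by age, which is how it
models "do not purge the resource upon expiration". The load name, the
dict-based cache, and the fetch callback are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical fetcher: returns (HTTP status, body), or raises OSError
# when the network is unreachable.
Fetcher = Callable[[str], Tuple[int, str]]


def load(url: str, cache: Dict[str, str], fetch: Fetcher) -> Optional[str]:
    """Load a non-explicit resource, falling back to the cache on errors."""
    try:
        status, body = fetch(url)
    except OSError:
        return cache.get(url)        # network error: use the cached copy, if any
    if 500 <= status < 600:
        return cache.get(url)        # server error: use the cached copy, if any
    if status == 404:
        cache.pop(url, None)         # 404 removes the entry
        return None
    if status == 200 and url in cache:
        cache[url] = body            # refresh an existing entry
    return body
```

Every load here pays a network round trip before the cached copy is used,
which is the source of the slow/flaky-network asymmetry with explicit
entries mentioned above.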

On the subject of fallbacks, I don't think the spec is quite clear on
how the fallbacks are meant to be loaded.  There seem to be two
possible interpretations:

1) The fallback resource is loaded by the client as though it were
loaded from the original URI - security decisions are made with the
original URI, and window.location, bookmarks, history, etc. all
reflect the original URI.  This is somewhat analogous to the real
server returning fallback content at the original URI.

2) The fallback resource is loaded by the client as though it were
loaded from the fallback URI for purposes of security decisions,
window.location, etc.  But bookmarks, history, etc. all reflect the
original URI.  This is somewhat analogous to a server redirect (with
bookmark/history changes to reflect the original URI), or to a frame
at the original URI including the fallback URI (but without the
intermediate window object).
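
To lay the two interpretations side by side, here is a small model of
which URI each one would expose in each place. It is only an illustration
of the two readings above, not of any browser's behaviour; the
LoadedDocument fields and the load_fallback function are invented for the
sketch.

```python
from dataclasses import dataclass


@dataclass
class LoadedDocument:
    security_uri: str   # URI used for security (origin) decisions
    location: str       # what window.location would report
    history_uri: str    # what bookmarks and history would record


def load_fallback(original_uri: str, fallback_uri: str,
                  interpretation: int) -> LoadedDocument:
    """Model which URI is visible where under interpretation 1 or 2."""
    if interpretation == 1:
        # Everything reflects the original URI, as if the server had
        # returned the fallback content at that URI.
        return LoadedDocument(original_uri, original_uri, original_uri)
    if interpretation == 2:
        # Security and window.location follow the fallback URI; only
        # bookmarks and history keep the original URI.
        return LoadedDocument(fallback_uri, fallback_uri, original_uri)
    raise ValueError("interpretation must be 1 or 2")
```

Under the second reading, the original URI survives only in
history_uri, so a script that reads location alone cannot recover it.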

We need to decide which of these behaviors makes the most sense.  The
first seems the most straightforward, though I think we'd want a few
changes:

a) The fallback URI should be required to have the same origin as the
namespace.
b) Maybe there should be some way for the page to know that it was
loaded as a fallback.
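
For (a), a same-origin check at manifest-parse time might look something
like the sketch below. It assumes both URIs are already absolute http or
https URIs and compares scheme, host, and port with default ports filled
in; the origin and same_origin helpers are hypothetical names, and the
exact comparison rules would be whatever the spec's origin definition
says.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Default ports for the schemes this sketch handles.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(uri: str) -> Tuple[str, Optional[str], Optional[int]]:
    """Return (scheme, host, port), filling in the scheme's default port."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)   # hostname is already lowercased


def same_origin(namespace_uri: str, fallback_uri: str) -> bool:
    """True if the fallback URI shares the namespace's origin."""
    return origin(namespace_uri) == origin(fallback_uri)
```

A manifest parser could reject, or simply ignore, any fallback entry for
which same_origin returns False.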

If we settle on the second approach, we need to give the page some way
to find out what the original URI was (since window.location will
point to the fallback URI).

-dave