[whatwg] HTML5 Offline Web Applications

Mon Sep 15 16:56:09 PDT 2008

Hi Dave,
Thanks for taking a look.

On Mon, Sep 15, 2008 at 2:25 PM, Dave Camp <dave.camp at gmail.com> wrote:

> On Mon, Aug 25, 2008 at 11:54 AM, Michael Nordman <michaeln at google.com>
> wrote:
>
> > Manifest file section headers:
> > * BYPASS: list of url [namespaces/filters]
> > * CACHE: list of exact [urls]
> > * INTERCEPT: list of [urlnamespaces, namespace-handler url]
> > * AUTOCACHE: list of [urlnamespaces, namespace-handler url]
> > * FALLBACK: list of [urlnamespaces, namespace-handler url]
>
> Using namespaces for bypass URIs, and separating auto-caching from
> fallback, both seem like big wins.  We'd need to specify what to do
> when namespaces collide, but I figure "most specific namespace wins"
> would be fine.
>

I like splitting them up too. Everyone I've spoken with thinks the
"auto-cache" feature is a convenience that really doesn't have to be in
the spec. But fallback and intercept are more than a convenience. The
split lets us make decisions about these independently.

Regarding collisions, another possibility is to check each type in order:
   bypass, intercept, autocache, fallback.
This is in keeping with how the current draft treats the relative
priorities of 'bypass' vs 'opportunistic-caching'.
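
A minimal sketch of that ordered check, combined with Dave's "most specific
namespace wins" rule inside a single type. It assumes namespaces match by
simple URL prefix; the names PRIORITY and resolve, and the dict layout, are
made up for illustration and are not from the draft.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical order in which namespace types are consulted.
PRIORITY = ["bypass", "intercept", "autocache", "fallback"]


def resolve(url: str, namespaces: Dict[str, List[str]]) -> Optional[Tuple[str, str]]:
    """Return (type, namespace) for the first type that has a matching prefix."""
    for kind in PRIORITY:
        # Assumption: a namespace matches when it is a prefix of the URL.
        matches = [ns for ns in namespaces.get(kind, []) if url.startswith(ns)]
        if matches:
            # Within one type, the longest (most specific) prefix wins.
            return kind, max(matches, key=len)
    return None


namespaces = {
    "bypass": ["http://example.com/api/"],
    "fallback": ["http://example.com/", "http://example.com/api/v2/"],
}
# bypass is checked first, so it wins even though a longer fallback prefix exists.
print(resolve("http://example.com/api/v2/x", namespaces))
```

With type order taking precedence, specificity only breaks ties between
namespaces of the same type.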

>
>
> > One idea is to rephrase this feature in terms closer to std http caching
> > for all entries that do not explicitly appear in the manifest file. In
> > effect, closer to telling the http cache to not purge the resource.
> >
> > * at initial cache time
> >   - cache the resource
> >
> > * at appCache update time
> >   - validate all non-explicit entries per usual http caching semantics
> >      (so 404s  will remove these entries at update time)
> >   - network/server errors do not fail the larger update
> >   - beyond that, not sure what to do on network/server errors... remove or
> >     retain the resources?
> >   - perhaps maintain a list of 'failed to update' items that the webapp
> >     can access via script
>
> This all makes sense.
>
> > * at resource load time
> >   - validate per usual http caching rules going forward
> >     (so 404s will remove these entries)
> >   - with the following exceptions
> >      - use the cached resource as a fallback for network or server (5xx)
> >        errors
> >      - do not purge the resource upon expiration
>
> This seems reasonable, but it seems a bit strange that
> applicationCache.add() resources will behave differently than
> explicitly-listed manifest entries (on a particularly slow/flaky
> wireless network, parts of the application will be quick and others
> won't).
>

I think as currently spec'd, the update / validation scheme is a
non-starter; we have to do something different in this area.

Also, I can't speak for them... but in working with app teams at Google, I
think this would be a preferred way of dealing with 'extra' resources
pinned in the cache outside of the core set of js/css/html/images required
to bootstrap an app.
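
To make the update-time rules quoted above concrete, here is a rough sketch
for the non-explicit ('extra') entries. The function name, the validate
callback and the cache-as-dict layout are all hypothetical. Keeping the old
copy on network/server errors is just one answer to the open "remove or
retain" question, chosen here for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple


def update_extra_entries(
    cache: Dict[str, bytes],
    validate: Callable[[str], Tuple[int, Optional[bytes]]],
) -> List[str]:
    """Revalidate non-explicit entries; return the 'failed to update' URLs.

    validate(url) is a stand-in for a conditional HTTP request. It returns
    (status, body) and raises OSError on a network failure.
    """
    failed: List[str] = []
    for url in list(cache):  # copy the keys, since entries may be deleted
        try:
            status, body = validate(url)
        except OSError:
            failed.append(url)  # network error: does not fail the larger update
            continue
        if status == 404:
            del cache[url]  # 404s remove the entry at update time
        elif status == 200 and body is not None:
            cache[url] = body  # a fresh copy replaces the cached one
        elif status >= 500:
            failed.append(url)  # server error: keep the old copy (assumption)
        # 304 and anything else: keep the cached copy untouched
    return failed
```

The returned list is what a page could be given access to via script as the
'failed to update' items.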

>
>
> On the subject of fallbacks, I don't think the spec is quite clear on
> how the fallbacks are meant to be loaded.  There seem to be two
> possible interpretations:
>
> 1) The fallback resource is loaded by the client as though it were
> loaded from the original URI - security decisions are made with the
> original URI, and window.location, bookmarks, history, etc. all
> reflect the original URI.  This is somewhat analogous to the real
> server returning fallback content at the original URI.
>
> 2) The fallback resource is loaded by the client as though it were
> loaded from the fallback URI for purposes of security decisions,
> window.location, etc.  But bookmarks, history, etc all reflect the
> original URI.  This is somewhat analogous to a server redirect (with
> bookmark/history changes to reflect the original URI), or to a frame
> at the original URI including the fallback URI (but without the
> intermediate window object).
>
> We need to decide which of these behaviors makes the most sense.  The
> first seems the most straightforward, though I think we'd want a few
> changes:
>
> a) The fallback URI should be required to have the same origin as the
> namespace.
> b) Maybe there should be some way for the page to know that it was
> loaded as a fallback.
>
> If we settle on the second approach, we need to give the page some way
> to find out what the original URI was (since window.location will
> point to the fallback URI).
>

I think the spec is clear that #1 should be done... or should I say that
after you've transcoded Ian's English pseudo-code into something with curly
braces, this becomes apparent :)

I also think the spec covers the same source and origin matrix
appropriately. While handler resources may be from a different origin, all
of the namespaces that could be handled by a local resource are constrained
to be within the same origin as the manifest file. So siteA's code will not
run in siteB's context without siteB's consent.

This info was scattered about in the spec. To spare people from having to
decrypt what the spec says, here are my notes on those constraints, with
additions for the new namespace type (intercept) I'm advocating.

*same-origin, same-scheme constraints*

// by entry category
* manifest - the manifestUrl is the location of the resource after
  following redirects
* toplevel - same origin as manifestUrl
* explicit - same scheme
* namespace-handler - same scheme
* auto-cached - same origin
* manually-cached - same scheme

// by namespace type
* bypass - same scheme
* intercept - same origin [new] (although the handler only needs the same
  scheme)
* autocache - same origin
* fallback - same origin (although the handler only needs the same scheme)
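
As a rough sketch, the "by namespace type" table could be checked like this.
The function names and the NAMESPACE_RULES table are hypothetical. Origin is
taken to mean scheme + host + port, with default ports filled in for http
and https only.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def same_scheme(a: str, b: str) -> bool:
    return urlsplit(a).scheme == urlsplit(b).scheme


def _origin(url: str):
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a: str, b: str) -> bool:
    return _origin(a) == _origin(b)


# Constraint on each namespace type, relative to the manifest URL.
NAMESPACE_RULES = {
    "bypass": same_scheme,
    "intercept": same_origin,
    "autocache": same_origin,
    "fallback": same_origin,
}


def namespace_allowed(kind: str, namespace_url: str, manifest_url: str) -> bool:
    return NAMESPACE_RULES[kind](namespace_url, manifest_url)


def handler_allowed(handler_url: str, manifest_url: str) -> bool:
    # Handlers only need the same scheme as the manifest.
    return same_scheme(handler_url, manifest_url)
```

For a manifest at http://a.example/app.manifest, a fallback namespace of
http://b.example/ would be rejected, while a handler at
http://b.example/h.html would be accepted.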

>
> -dave
>