[whatwg] HTML5 Offline Web Applications

Fri Aug 29 16:36:17 PDT 2008

Hello again all,

A couple more comments.

*When is anything ever deleted?*

Maybe I missed it, but where does appCache deletion happen?

Something that Gears users have done is to serve an empty manifest file.
The result is a close approximation to having deleted the resource store.
I would vote to have some syntax for expressing 'delete me' in the manifest
file for an appCache. A new type of event may be warranted for completion of
such an update, and when swapCache() is called there would no longer be an
appCache associated with the context.

*Should we revisit the caching semantics for any resource not explicitly
listed in the manifest?*

Unless I missed something, I think the appCache update/validation logic is
fundamentally flawed with regard to resources that are not explicitly
listed. As presently spec'd, a failure to update/validate any of these
resources causes the entire update to fail, and the old version will
remain pinned in the cache. Now suppose the app changes its URL space such
that some of the resources that got picked up by one of the mechanisms to
add new resources (autocaching namespace or manually .add()ed or <html
manifest=x>) no longer make sense... I think this means the appCache is
stuck in time.

One idea is to rephrase this feature in terms closer to standard HTTP caching
for all entries that do not explicitly appear in the manifest file. In
effect, closer to telling the HTTP cache to not purge the resource.

* at initial cache time
  - cache the resource

* at appCache update time
  - validate all non-explicit entries per usual http caching semantics
     (so 404s  will remove these entries at update time)
  - network/server errors do not fail the larger update
  - beyond that, not sure what to do on network/server errors... remove or
retain the resources?
  - perhaps maintain a list of 'failed to update' items that the webapp can
access via script

* at resource load time
   - validate per usual http caching rules going forward
    (so 404s will remove these entries)
  - with the following exceptions
     - use the cached resource as a fallback for network or server(5xx)
errors
     - do not purge the resource upon expiration
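
The rules above can be sketched roughly as follows. This is only an
illustration of the proposal, not spec text: the names Entry, UpdateResult,
update_non_explicit and load_resource are made up, fetch() is a stand-in for
the network that returns (status, body) with status 0 meaning a network
error, and the open question about server errors at update time is answered
here by retaining the entry.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for the network: returns (status, body).
# Status 0 represents a network error (an assumption of this sketch).
Fetch = Callable[[str], Tuple[int, Optional[bytes]]]


@dataclass
class Entry:
    url: str
    body: bytes


@dataclass
class UpdateResult:
    removed: List[str] = field(default_factory=list)
    # The 'failed to update' list that script could inspect.
    failed_to_update: List[str] = field(default_factory=list)


def update_non_explicit(cache: Dict[str, Entry], fetch: Fetch) -> UpdateResult:
    """Revalidate entries that are not explicitly listed in the manifest."""
    result = UpdateResult()
    for url in list(cache):
        status, body = fetch(url)
        if status == 200 and body is not None:
            cache[url] = Entry(url, body)        # refreshed copy
        elif status == 304:
            continue                             # still valid, keep as-is
        elif status == 404:
            del cache[url]                       # 404s remove the entry
            result.removed.append(url)
        else:
            # Network/server error (or anything else): do not fail the
            # larger update. Retaining the entry is this sketch's choice.
            result.failed_to_update.append(url)
    return result


def load_resource(cache: Dict[str, Entry], fetch: Fetch, url: str) -> Optional[bytes]:
    """Load-time behaviour: network first, cached copy on errors."""
    status, body = fetch(url)
    if status == 200 and body is not None:
        if url in cache:
            cache[url] = Entry(url, body)
        return body
    if status == 404:
        cache.pop(url, None)                     # 404s remove the entry
        return None
    if status == 0 or status >= 500:
        entry = cache.get(url)                   # fallback; never purged on expiry
        return entry.body if entry else None
    return body
```

Note that nothing here ever purges an entry on expiration; entries only leave
the cache on a 404.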

Comments?

On Mon, Aug 25, 2008 at 11:54 AM, Michael Nordman <michaeln at google.com> wrote:

> Hello all, I have many comments on the Offline Web Applications corner of
> the HTML5 spec. This is the first round of comments you'll see coming from
> me. This one is mostly top-level comments.
> 5.7.2 Application caches
> I found the terminology used to describe the contents of the
> cache sometimes contradictory and confusing, and it doesn't correspond
> directly with the terminology used in the manifest file syntax. FWIW, some
> word smithing and reconciling the differences could add clarity to the spec.
>
> *cached resource categories*
>
> * implicit category
> This categorization applies to html docs which explicitly contain a
> reference to the manifest file via the 'manifest' attribute of their <html>
> tag. I understand they are not necessarily explicitly listed in the manifest
> file, but they may also be explicitly listed. The end result is that a
> resource can be categorized as both 'implicit' and 'explicit'. This is
> confusing. I'd vote to have a different name for clarity's sake... some
> ideas... 'toplevel', 'manifest referencing', 'native' (an awkward play on
> foreign).
>
> * manifest category
> Perfect.
>
> * explicit category
> Ok provided 'implicit' is renamed.
>
> * fallback category
> The term 'fallback' refers to the prescribed use of these resources for the
> opportunistic-caching namespace in particular. As part of pulling apart
> namespaces vs how to handle hits within a namespace, I'd vote to change the
> name for this category... some ideas... 'namespace-handler'. I'll say
> more about the different types of 'namespaces' below.
>
> * opportunistically cached category
> A mouthful, but ok. Another possibility is 'auto-cached' which would work
> well with the 'manually-cached' terminology below.
>
> * dynamic category
> I'd like to reserve the term 'dynamic' for a different use of that term
> (more on that in a moment). Some name possibilities for this category...
> 'manually-cached' or 'script-added' or 'programmatically-added'.
>
> *flavors of namespaces*
>
>  * online whitelist
> As mentioned in previous messages, this would need to be some form of
> namespacing or filtering to be useful. A better term for this might be
> 'bypass' since with respect to the appcache, hits here bypass the cache. It's
> not clear if path prefix matching is the best option for filtering out
> requests that should bypass the cache. In working with app developers using
> Gears, the idea of specifying a particular query argument to filter on in
> addition to a path prefix has come up. http://server/pathprefix   +
> &bypassAppCache
>
> * opportunistic caching namespaces
> A mouthful but ok. Whatever terminology used for the category of resulting
> entries should be used here... perhaps 'auto-caching namespace'.
>
>  * fallback namespace [factored out of opportunistic-caching]
> This form of namespace is addressed by the spec at present, but is
> co-mingled with the auto-caching feature. This is a proposal to detangle
> them from one another. The basic idea is to load the resource as usual, and
> only upon failure fall back to a cached 'namespace-handler'... no
> auto-caching involved.
>
> * intercept namespaces [new]
> This form of namespace is not in the spec at present. This is a proposal to
> add it. It is a heavily used feature of the Gears LocalServer. The basic
> idea is to intercept requests into this namespace and satisfy them with a
> cached 'namespace-handler'  without consulting the server.
>
>  *summary of the above change requests*
>
> Cached resource categories (just name changes):
> * toplevel - pages which <html manifest='manifesturlforthisappcache'>
> * manifest - the manifest file
> * explicit - explicitly listed in the manifest file
> * namespace-handler - resource which is utilized by a namespace
> * auto-cached - resources that have been cached via the auto-cache
> namespace
> * manually-cached - resources that have been cached via a javascript call
> to appCache.add()
>
> Namespaces (name changes, refactored things a bit, and introduced the
> 'intercept' namespace)
>  * bypass - bypasses further lookup within the appcache and resorts to the
> usual resource loading
> * intercept - doesn't hit server, serves a cached namespace-handler
> resource
> * autocache - hits server, caches successful response for future use, on
> server errors serves a cached namespace-handler resource
> * fallback - hits server, does NOT cache successful responses, on server
> errors serves a cached namespace-handler resource
>
> Manifest file section headers:
>  * BYPASS: list of url [namespaces/filters]
> * CACHE: list of exact [urls]
> * INTERCEPT: list of [urlnamespaces, namespace-handler url]
> * AUTOCACHE: list of [urlnamespaces, namespace-handler url]
> * FALLBACK: list of [urlnamespaces, namespace-handler url]
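
To make the four namespace flavors summarized above concrete, here is a
rough sketch of how a request might be dispatched. It is illustrative only:
Kind, Namespace and resolve are invented names, fetch() is a stand-in for the
network returning (status, body) with status 0 meaning a network error,
matching is by plain path prefix with the longest prefix winning (an
assumption; the proposal does not fix an order), and explicit CACHE entries
and the later serving of auto-cached copies are not modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for the network: returns (status, body).
Fetch = Callable[[str], Tuple[int, Optional[bytes]]]


class Kind(Enum):
    BYPASS = "bypass"
    INTERCEPT = "intercept"
    AUTOCACHE = "autocache"
    FALLBACK = "fallback"


@dataclass
class Namespace:
    kind: Kind
    prefix: str
    handler_url: Optional[str] = None  # unused for BYPASS


def _handler(ns: Namespace, cache: Dict[str, bytes]) -> Optional[bytes]:
    """Return the cached namespace-handler resource, if there is one."""
    return cache.get(ns.handler_url) if ns.handler_url else None


def resolve(url: str, namespaces: List[Namespace],
            cache: Dict[str, bytes], fetch: Fetch) -> Optional[bytes]:
    matches = [ns for ns in namespaces if url.startswith(ns.prefix)]
    if not matches:
        return fetch(url)[1]                  # usual resource loading
    ns = max(matches, key=lambda n: len(n.prefix))

    if ns.kind is Kind.BYPASS:
        return fetch(url)[1]                  # skip the appcache entirely
    if ns.kind is Kind.INTERCEPT:
        return _handler(ns, cache)            # never hits the server

    # AUTOCACHE and FALLBACK both hit the server first.
    status, body = fetch(url)
    if status == 0 or status >= 500:
        return _handler(ns, cache)            # serve the namespace-handler
    if ns.kind is Kind.AUTOCACHE and 200 <= status < 300 and body is not None:
        cache[url] = body                     # keep the response for future use
    return body
```

The only difference between autocache and fallback in this sketch is the
single line that stores the successful response.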
>
> *Scriptlets - or dynamic namespace-handlers [new idea]*
>
> Something we wrestled with in the process of putting together the Gears
> LocalServer was the distinction between intercepting requests for urls and
> identifying the appropriate cached resource for that request. We ended up
> with a declarative manifest file, similar to but different from what is
> contained in this spec. This wasn't an altogether satisfying answer. The
> expressiveness of the language to match/filter requested urls is limited in
> Gears and this spec shares that same characterization.
>
> Something else we've wrestled with in Gears was having to do awkward
> redesigns in corners of a web application in order to 'take it offline',
> single-sign-on for example. In general, anywhere an application relies on
> HTTP features more than HTML to influence navigation or conditional resource
> loading, it's difficult to address with a static cache.
>
> So I'd like to propose extending this spec to incorporate 'dynamically
> generated responses'. I think this capability fits into this corner of the
> HTML5 spec because this is most directly useful in the "Offline Web
> Application" scenario. The basic idea is to execute application code
> (script) to produce responses to intercepted resource loads. The application
> code is executed in the background and can formulate a response
> asynchronously.
>
> Some handwaving where this could hang off of this spec
> * Modify namespace-handler entries to have an additional attribute to
> indicate that they are to be executed rather than returned
>
> And some handwaving at what a scriptlet can do...
> * Can read the request headers and POST body
> * Can set response status code and headers (redirects)
> * Can generate a textual response body
>  * Can designate a non-executable cached resource to be returned in
> response
> * Can decide to 'bypass' handling of a request and defer to the usual
> resource loading
> * Can decide to perform the usual resource loading, but to have the
> response added to the appCache
> * Can access HTML5Database APIs
> * Can utilize XMLHttpRequest to communicate with a server
>
> This would obviously be a significant addition to the spec, but I do think
> this is worth consideration in the context of 'offline applications'. Based
> on observations of app developers wrestling with Gears, there have been
> several pain points. The HTML5ApplicationCache addresses one of them
> with per-application caches. This addition would address the second of
> them.  (Another pain point has been application deployment).
>
> Am interested in seeing what others think of an addition along these lines.
>