[whatwg] HTML5 Offline Web Applications

Michael Nordman michaeln at google.com
Mon Aug 25 11:54:21 PDT 2008

Hello all, I have many comments on the Offline Web Applications corner of
the HTML5 spec. This is the first round of comments you'll see coming from
me. This one is mostly top-level comments.

5.7.2 Application caches
I found the terminology used to describe the contents of the cache sometimes
contradictory and confusing, and it doesn't correspond directly with the
terminology used in the manifest file syntax. FWIW, some wordsmithing and
reconciling the differences could add clarity to the spec.

*cached resource categories*

* implicit category
This categorization applies to html docs which explicitly contain a
reference to the manifest file via the 'manifest' attribute of their <html>
tag. I understand they are not necessarily explicitly listed in the manifest
file, but they may also be explicitly listed. The end result is that a
resource can be categorized as both 'implicit' and 'explicit'. This is
confusing. I'd vote to have a different name for clarity's sake... some
ideas... 'toplevel', 'manifest referencing', 'native' (an awkward play on

* manifest category

* explicit category
Ok provided 'implicit' is renamed.

* fallback category
The term 'fallback' refers to the prescribed use of these resources for the
opportunistic-caching namespace in particular. As part of pulling apart
namespaces vs how to handle hits within a namespace, I'd vote to change the
name for this category... some ideas... 'namespace-handler'.  I'll have
more to say about different types of 'namespaces' below.

* opportunistically cached category
A mouthful, but ok. Another possibility is 'auto-cached' which would work
well with the 'manually-cached' terminology below.

* dynamic category
I'd like to reserve the term 'dynamic' for a different use of that term
(more on that in a moment).  Some name possibilities for this category...
'manually-cached' or 'script-added' or 'programmatically-added'.

*flavors of namespaces*

* online whitelist
As mentioned in previous messages, this would need to be some form of
namespacing or filtering to be useful. A better term for this might be
'bypass' since with respect to the appcache, hits here bypass the cache. It's
not clear if path prefix matching is the best option for filtering out
requests that should bypass the cache. In working with app developers using
Gears, the idea of specifying a particular query argument to filter on in
addition to a path prefix has come up. http://server/pathprefix   +

* opportunistic caching namespaces
A mouthful but ok. Whatever terminology is used for the category of resulting
entries should be used here... perhaps 'auto-caching namespace'.

* fallback namespace [factored out of opportunistic-caching]
This form of namespace is addressed by the spec at present, but is
commingled with the auto-caching feature. This is a proposal to untangle
them from one another. The basic idea is to load the resource as usual, and
only upon failure fall back to a cached 'namespace-handler'... no
auto-caching involved.

* intercept namespaces [new]
This form of namespace is not in the spec at present. This is a proposal to
add it. It is a heavily used feature of the Gears LocalServer. The basic
idea is to intercept requests into this namespace and satisfy them with a
cached 'namespace-handler'  without consulting the server.

*summary of the above change requests*

Cached resource categories (just name changes):
* toplevel - pages which <html manifest='manifesturlforthisappcache'>
* manifest - the manifest file
* explicit - explicitly listed in the manifest file
* namespace-handler - resource which is utilized by a namespace
* auto-cached - resources that have been cached via the auto-cache namespace
* manually-cached - resources that have been cached via a javascript call to

Namespaces (name changes, refactored things a bit, and introduced the
'intercept' namespace)
* bypass - bypasses further lookup within the appcache and resorts to the
usual resource loading
* intercept - doesn't hit server, serves a cached namespace-handler resource
* autocache - hits server, caches successful response for future use, on
server errors serves a cached namespace-handler resource
* fallback - hits server, does NOT cache successful responses, on server
errors serves a cached namespace-handler resource
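
To make the four namespace flavors concrete, here is a minimal TypeScript
sketch of how a lookup might dispatch on them. All names (NamespaceEntry,
Action, resolve) and the sample URLs are hypothetical. The precedence rules
(exact entries first, then the longest matching prefix) are an assumption;
the proposal above does not state any.

```typescript
// Hypothetical model of the proposed namespaces; not taken from the spec or Gears.
type NamespaceEntry =
  | { kind: "bypass"; prefix: string }
  | { kind: "intercept" | "autocache" | "fallback"; prefix: string; handlerUrl: string };

type Action =
  | { type: "load-normally" }                  // usual resource loading
  | { type: "serve-cached"; url: string }      // no server round trip
  | { type: "fetch"; cacheOnSuccess: boolean; onErrorServe: string };

function resolve(url: string, explicitUrls: Set<string>, namespaces: NamespaceEntry[]): Action {
  // Assumption: exact (explicit) entries win over any namespace.
  if (explicitUrls.has(url)) {
    return { type: "serve-cached", url };
  }
  // Assumption: the longest matching prefix decides which namespace applies.
  let best: NamespaceEntry | undefined;
  for (const ns of namespaces) {
    if (url.startsWith(ns.prefix) && (best === undefined || ns.prefix.length > best.prefix.length)) {
      best = ns;
    }
  }
  if (best === undefined || best.kind === "bypass") {
    return { type: "load-normally" };
  }
  if (best.kind === "intercept") {
    // intercept: never hits the server, serves the namespace-handler.
    return { type: "serve-cached", url: best.handlerUrl };
  }
  // autocache and fallback both hit the server and serve the handler on
  // errors; only autocache stores successful responses.
  return { type: "fetch", cacheOnSuccess: best.kind === "autocache", onErrorServe: best.handlerUrl };
}

// Sample data (hypothetical URLs).
const namespaces: NamespaceEntry[] = [
  { kind: "bypass", prefix: "/api/live/" },
  { kind: "intercept", prefix: "/docs/", handlerUrl: "/docs/viewer.html" },
  { kind: "autocache", prefix: "/images/", handlerUrl: "/images/missing.png" },
  { kind: "fallback", prefix: "/api/", handlerUrl: "/offline.html" },
];
const explicitUrls = new Set(["/index.html", "/app.js"]);

console.log(resolve("/docs/42", explicitUrls, namespaces));
console.log(resolve("/api/live/feed", explicitUrls, namespaces));
```

With this data, /api/live/feed is loaded normally because the longer
'bypass' prefix beats the 'fallback' prefix /api/, while /api/users would be
fetched and fall back to /offline.html on a server error.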

Manifest file section headers:
* BYPASS: list of url [namespaces/filters]
* CACHE: list of exact [urls]
* INTERCEPT: list of [urlnamespaces, namespace-handler url]
* AUTOCACHE: list of [urlnamespaces, namespace-handler url]
* FALLBACK: list of [urlnamespaces, namespace-handler url]
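
As an illustration of these section headers, the following TypeScript sketch
parses a manifest that uses them. The line syntax is an assumption: a leading
'CACHE MANIFEST' signature and '#' comments as in the draft's manifest
syntax, one entry per line, and whitespace-separated 'namespace handler-url'
pairs. parseManifest, ParsedManifest and the sample URLs are hypothetical.

```typescript
type Section = "BYPASS" | "CACHE" | "INTERCEPT" | "AUTOCACHE" | "FALLBACK";
const SECTIONS: readonly Section[] = ["BYPASS", "CACHE", "INTERCEPT", "AUTOCACHE", "FALLBACK"];

interface ParsedManifest {
  bypass: string[];                     // url namespaces/filters
  cache: string[];                      // exact urls
  intercept: Array<[string, string]>;   // [namespace, namespace-handler url]
  autocache: Array<[string, string]>;
  fallback: Array<[string, string]>;
}

function parseManifest(text: string): ParsedManifest {
  const result: ParsedManifest = { bypass: [], cache: [], intercept: [], autocache: [], fallback: [] };
  const lines = text.trim().split(/\r?\n/).map((l) => l.trim());
  if (lines[0] !== "CACHE MANIFEST") {
    throw new Error("missing CACHE MANIFEST signature");
  }
  // Assumption: entries that precede any header are treated as exact urls.
  let section: Section = "CACHE";
  for (const line of lines.slice(1)) {
    if (line === "" || line.startsWith("#")) continue;   // skip blanks and comments
    const header = SECTIONS.find((s) => line === s + ":");
    if (header !== undefined) {
      section = header;
      continue;
    }
    const tokens = line.split(/\s+/);
    if (section === "BYPASS") {
      result.bypass.push(tokens[0]);
    } else if (section === "CACHE") {
      result.cache.push(tokens[0]);
    } else {
      if (tokens.length < 2) {
        throw new Error(`expected "namespace handler-url" in ${section}: ${line}`);
      }
      const key = section.toLowerCase() as "intercept" | "autocache" | "fallback";
      result[key].push([tokens[0], tokens[1]]);
    }
  }
  return result;
}

// Hypothetical manifest using the proposed section names.
const sample = [
  "CACHE MANIFEST",
  "# exact urls",
  "CACHE:",
  "/index.html",
  "/app.js",
  "BYPASS:",
  "/api/live/",
  "INTERCEPT:",
  "/docs/ /docs/viewer.html",
  "AUTOCACHE:",
  "/images/ /images/missing.png",
  "FALLBACK:",
  "/api/ /offline.html",
].join("\n");

const parsed = parseManifest(sample);
console.log(parsed);
```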

*Scriptlets - or dynamic namespace-handlers [new idea]*

Something we wrestled with in the process of putting together the Gears
LocalServer was the distinction between intercepting requests for urls and
identifying the appropriate cached resource for that request. We ended up
with a declarative manifest file, similar to but different from what is
contained in this spec. This wasn't an altogether satisfying answer. The
expressiveness of the language to match/filter requested urls is limited in
Gears, and this spec shares that limitation.

Something else we've wrestled with in Gears was having to do awkward
redesigns in corners of a web application in order to 'take it offline',
single-sign-on for example. In general, anywhere an application relies on
HTTP features more than HTML to influence navigation or conditional resource
loading, it's difficult to address with a static cache.

So I'd like to propose extending this spec to incorporate 'dynamically
generated responses'. I think this capability fits into this corner of the
HTML5 spec because this is most directly useful in the "Offline Web
Application" scenario. The basic idea is to execute application code
(script) to produce responses to intercepted resource loads. The application
code is executed in the background and can formulate a response

Some handwaving where this could hang off of this spec
* Modify namespace-handler entries to have an additional attribute to
indicate that they are to be executed rather than returned

And some handwaving at what a scriptlet can do...
* Can read the request headers and POST body
* Can set response status code and headers (redirects)
* Can generate a textual response body
* Can designate a non-executable cached resource to be returned in response
* Can decide to 'bypass' handling of a request and defer to the usual
resource loading
* Can decide to perform the usual resource loading, but to have the response
added to the appCache
* Can access HTML5Database APIs
* Can utilize XMLHttpRequest to communicate with a server
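
To give a feel for the shape such a scriptlet might take, here is a rough
TypeScript sketch. Nothing here is a real or proposed API: ScriptletRequest,
ScriptletResult, handle, the URLs, and the cookie check are all hypothetical,
and database and XMLHttpRequest access are left out to keep it short. It
uses the single-sign-on case as the motivating example.

```typescript
// Hypothetical request/response shapes for a scriptlet.
interface ScriptletRequest {
  method: string;
  url: string;
  headers: Record<string, string>;
  body?: string;   // POST body, if any
}

type ScriptletResult =
  | { kind: "respond"; status: number; headers: Record<string, string>; body: string }
  | { kind: "serve-cached"; url: string }   // return a non-executable cached resource
  | { kind: "bypass" }                      // defer to the usual resource loading
  | { kind: "load-and-cache" };             // usual loading, response added to the appcache

type Scriptlet = (req: ScriptletRequest) => ScriptletResult;

const handle: Scriptlet = (req) => {
  const path = req.url.split("?")[0];

  // Live data is never answered from the cache.
  if (path.startsWith("/live/")) {
    return { kind: "bypass" };
  }
  // Sign-on: redirect when no session cookie is present (assumed check).
  if (!("cookie" in req.headers)) {
    return { kind: "respond", status: 302, headers: { Location: "/login.html" }, body: "" };
  }
  // Generate a textual response for POSTs, e.g. to acknowledge queued work.
  if (req.method === "POST") {
    const size = req.body?.length ?? 0;
    return {
      kind: "respond",
      status: 202,
      headers: { "Content-Type": "text/plain" },
      body: `queued ${size} characters`,
    };
  }
  if (path.startsWith("/images/")) {
    return { kind: "load-and-cache" };
  }
  return { kind: "serve-cached", url: "/app-shell.html" };
};

console.log(handle({ method: "GET", url: "/inbox", headers: {} }));
console.log(handle({ method: "GET", url: "/inbox", headers: { cookie: "sid=1" } }));
```

The first call yields a 302 redirect to /login.html; the second serves the
cached /app-shell.html.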

This would obviously be a significant addition to the spec, but I do think
this is worth consideration in the context of 'offline applications'. Based
on observations of app developers wrestling with Gears, there have been
several pain points. The HTML5ApplicationCache addresses one of them
with per-application caches. This addition would address the second of
them.  (Another pain point has been application deployment).

I am interested in seeing what others think of an addition along these lines.