[whatwg] size limits on web databases

Wed Nov 25 17:18:24 PST 2009

On Sat, 29 Aug 2009, Rob Kroeger wrote:
> On Saturday, August 29, 2009, Ian Hickson <ian at hixie.ch> wrote:
> > On Thu, 13 Aug 2009, Rob Kroeger wrote:
> >>
> >> From http://dev.w3.org/html5/webdatabase/:
> >>
> >> "The openDatabase() method on the Window and WorkerUtils interfaces 
> >> must return a newly constructed Database object that represents the 
> >> database requested."
> >>
> >> The spec does not make it clear what the UA on an extremely 
> >> resource-constrained device (e.g. a mobile phone) should do if the 
> >> requested database size cannot be satisfied. Some 
> >> implementations return a null Database object if something has gone 
> >> wrong in the openDatabase() call but (at least to me) the spec does 
> >> not seem to permit this and simply returning null does not 
> >> particularly help an application adapt gracefully to the availability 
> >> of only a small database.
> >>
> >> Consequently, I would hope that this could be improved in some 
> >> fashion. Three possible modifications to the spec occur to me. From 
> >> the viewpoint of a web database developer, I prefer (1), could work
> >> with (2), and would greatly dislike (3). Is this reasonable?
> >>
> >> 1. Retain the existing def'n of openDatabase but add a property on
> >> interface Database:
> >>   unsigned long minimumCapacity;
> >> which returns the amount of storage that the UA guarantees to be
> >> present in the database at the time of opening. The UA should try to
> >> set minimumCapacity so that QUOTA_ERR will be extremely unlikely if
> >> the database client code never writes more than minimumCapacity bytes
> >> to the database.

The problem with this is that the units of capacity are more or less 
meaningless. The same data on a device with disk compression using UTF-8 
with byte-aligned fields in storage will report a very different number 
than a device with redundant storage using UTF-32 with kilobyte-aligned 
fields, even if they are both able to store only one more block of data.
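
To make the units problem concrete, here is a rough sketch that computes
the space the same two fields would occupy under two storage layouts. The
helper names (utf8Bytes, utf32Bytes, storedSize) are hypothetical and exist
only for illustration; the sketch ignores compression and redundancy, which
would skew the numbers further.

```javascript
// Illustrative only: these helper names are hypothetical, not part of any API.
const utf8Bytes = (text) => new TextEncoder().encode(text).length;
const utf32Bytes = (text) => Array.from(text).length * 4; // 4 bytes per code point

// Sum the encoded size of each field, padding every field up to the next
// multiple of `alignment` bytes.
function storedSize(fields, encodedBytes, alignment) {
  return fields.reduce((total, field) => {
    const raw = encodedBytes(field);
    return total + Math.ceil(raw / alignment) * alignment;
  }, 0);
}

const fields = ["hello", "world!"];
const compact = storedSize(fields, utf8Bytes, 1);    // UTF-8, byte-aligned
const padded = storedSize(fields, utf32Bytes, 1024); // UTF-32, kilobyte-aligned
console.log(compact, padded); // prints "11 2048"
```

The same eleven characters account for 11 bytes in one layout and 2048 in
the other, so a single "capacity in bytes" figure says little about how
much application data will actually fit.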

> >> 2. A language change:
> >>
> >> "The user agent may raise a SECURITY_ERR exception instead of 
> >> returning a Database object if the request violates a policy decision 
> >> (e.g. if the user agent is configured to not allow the page to open 
> >> databases)."
> >>
> >> to something like this:
> >>
> >> "The user agent must raise a SECURITY_ERR exception instead of
> >> returning a Database object if the request violates a policy decision
> >> (e.g. if the user agent is configured to not allow the page to open
> >> databases) or the estimatedSize of the database cannot currently be
> >> satisfied (e.g. the UA is running from a read-only volume or the
> >> estimatedSize exceeds the free space on the volume.)"

That can be a policy decision, if you like. However, I wouldn't recommend 
it. Preventing reads from a readonly medium seems unnecessary, and you 
never know when the disk space might become available. For example, there 
might only be 1023KB left now, but what if the user then deletes a 10GB 
video file? Should he have to reload the Web app?

> >> 3. An alternative language change:
> >>
> >> "The openDatabase() method on the Window and WorkerUtils interfaces
> >> must return a newly constructed Database object that represents the
> >> database requested."
> >>
> >> to
> >>
> >> "The openDatabase() method on the Window and WorkerUtils interfaces
> >> must return a newly constructed Database object that represents the
> >> database requested or null if the openDatabase() call has failed."

I don't see why it would fail because of quota issues.

> It makes it extremely difficult to build an application that both starts 
> up quickly and operates reliably.

It makes it no harder to do that than it is anyway, given that disk space 
availability can fluctuate wildly and in unpredictable ways. I have disks 
(Drobos) that claim to have 14TB free but really will fail if I go past 
2TB, unless I swap in a new 2TB disk, in which case it will hot-swap to a 
true capacity of about 3TB. I have network disks that claim to have 100GB 
free but where if I write 200GB to the disk it'll work fine, because the 
server administrator will free up disk space as I need it. The disk could 
be removable, disappearing at any time. The user could asynchronously and 
independently create or delete gigabytes of data as the script is writing 
to the database.

> Consider a mobile web application for reading email (Gmail for Mobile
> for example) where the database caches email locally.  Startup on a
> cellular network proceeds roughly like this:
> 
> 1. load the app from the Application cache
> 2. create the Database object
> 3. query the Database for some email
> 4. (Ideally) do some app work while waiting for the statement callback
> 5. display some email on the screen
> 6. request new emails from the server
> 7. interact with the user...
> 8.1 persist user changes to the database
> 8.2 receive new email from the server and write that into the database
> 
> The user perceives the app's startup time to be steps 1 through 5. But
> with notification on QUOTA_ERR, the app only knows if it has a fully
> functioning database several seconds after step 5 at the unsuccessful
> conclusion of step 8.1 or 8.2.

A mobile device today is actually one of the least difficult environments 
for this feature, as it typically has no direct user interaction with the 
file system, limited multitasking, no network-attached storage, no 
removable storage, and the browser vendor tends to be, or be associated 
with, the handset manufacturer, leading to a much closer integration.

If someone can write an app that works with databases on the desktop in 
the face of the weird storage issues there, then mobile phones will be 
trivial.

> Several choices exist to handle this:
> 
> * insert 2.1: write to the database and 2.2: get the success callback
> so that the app can adjust itself early for not having a working
> database. This works but adds several hundred ms to the time of 1..5
> on a mobile phone.
> * fail at 8.1 and relaunch the app in no-database mode. This is easy
> to implement correctly but users greatly dislike losing changes from
> step 7. At the frequency of occurrence of filling up the disk on a
> desktop, this would be a perfectly fine solution. At the frequency of
> occurrence of filling up the storage on an under-resourced mobile
> phone, it is not a fine solution.
> * Seamlessly switch at step 8.1 to no-database mode. Experience has
> shown that this is frustratingly hard to get right in real
> implementations and still cannot guarantee saving user changes from
> step 7 if the network has failed.

I think a quality implementation will have to do the last of these anyway.
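
A minimal sketch of such a switch, under stated assumptions: `db` exposes
the draft's Database interface (transaction(callback, errorCallback)), a
table named `changes` already exists, and createChangeStore is a
hypothetical name. Any failed write, QUOTA_ERR included, moves the app to
no-database mode and keeps the affected change in memory rather than
dropping it.

```javascript
// Hypothetical wrapper: persist changes through `db`, but fall back to an
// in-memory list as soon as a write fails, so that no user change is lost.
function createChangeStore(db) {
  let databaseUsable = true;
  const unsaved = []; // changes held in memory only

  function save(change) {
    if (!databaseUsable) {
      unsaved.push(change);
      return;
    }
    db.transaction(
      (tx) => {
        // Assumes a `changes` table with a single `body` column exists.
        tx.executeSql("INSERT INTO changes (body) VALUES (?)", [
          JSON.stringify(change),
        ]);
      },
      () => {
        // The transaction failed (for example with QUOTA_ERR): keep the
        // change in memory and stop using the database for this session.
        databaseUsable = false;
        unsaved.push(change);
      }
    );
  }

  return {
    save,
    pending: () => unsaved.slice(),
    isUsingDatabase: () => databaseUsable,
  };
}
```

The app would still need to send the pending changes to the server once
the network is available; that part is not shown.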

> So, mostly what I'm requesting is that the UA provide some feedback as 
> early as possible (at stage 2 say) that makes a "highly likely" promise 
> of how much space is available in the database so that the app can 
> extend a similarly likely promise to its users that the app will operate 
> correctly with or without a network connection.

You could always try to write 5MB of data and see what happens. :-)
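
For illustration, a probe along those lines might look like the sketch
below. It assumes `db` exposes the draft's Database interface
(transaction(callback, errorCallback, successCallback)); probeCapacity and
the `probe` table are hypothetical names. How many bytes the filler string
really occupies depends on the UA's storage encoding, and some UAs may
prompt the user instead of failing outright.

```javascript
// Try to write roughly `bytes` characters of filler inside one transaction
// and report whether the transaction committed. `db` is assumed to follow
// the Web SQL Database interface.
function probeCapacity(db, bytes, onResult) {
  const filler = "x".repeat(bytes);
  db.transaction(
    (tx) => {
      tx.executeSql("CREATE TABLE IF NOT EXISTS probe (filler TEXT)");
      tx.executeSql("INSERT INTO probe (filler) VALUES (?)", [filler]);
      // Clean up so the probe leaves no data behind on success.
      tx.executeSql("DROP TABLE probe");
    },
    (error) => onResult(false, error), // transaction failed and rolled back
    () => onResult(true, null)         // transaction committed
  );
}
```

A caller could run probeCapacity(db, 5 * 1024 * 1024, callback) at startup
and switch to a no-database mode when the first callback argument is false.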

> For what it's worth, I believe that the option (1) choice of 
> minimumCapacity in my original email would be quite easy to implement on 
> WebKit for iPhone and Safari UAs: just return 5MB.

The iPhone has GBs of storage, though, so it's probably not the issue.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'