<div>Great analysis. I only have a few comments/questions:</div><div><br></div>On Wed, Sep 9, 2009 at 1:41 PM, Chris Jones <span dir="ltr"><<a href="mailto:cjones@mozilla.com">cjones@mozilla.com</a>></span> wrote:<br>
<div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Jeremy Orlow wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
On Wed, Sep 9, 2009 at 4:39 AM, Chris Jones <<a href="mailto:cjones@mozilla.com" target="_blank">cjones@mozilla.com</a> <mailto:<a href="mailto:cjones@mozilla.com" target="_blank">cjones@mozilla.com</a>>> wrote:<br>
<br>
Aaron Boodman wrote:<br>
<br>
On Tue, Sep 8, 2009 at 11:23 AM, Chris Jones<<a href="mailto:cjones@mozilla.com" target="_blank">cjones@mozilla.com</a><br></div><div class="im">
<mailto:<a href="mailto:cjones@mozilla.com" target="_blank">cjones@mozilla.com</a>>> wrote:<br>
<br>
In general, I agree with Rob about this proposal. What<br>
problem with storage<br>
mutex as spec'd today does your proposal solve?<br>
<br>
<br>
The spec requires a single storage mutex for the entire UA.<br>
Therefore<br>
in a MELUA a web page can become unresponsive while waiting for some<br>
other page to give up the lock. This is not good and something<br>
we have<br>
tried to avoid everywhere else in the spec.<br>
<br>
Attempts to address this by doing per-origin locks wind up with<br>
deadlocks being possible.<br>
<br>
Aaron Boodman wrote:<br>
<br>
On Tue, Sep 8, 2009 at 1:41 AM, Robert<br>
O'Callahan<<a href="mailto:robert@ocallahan.org" target="_blank">robert@ocallahan.org</a><br></div>
<mailto:<a href="mailto:robert@ocallahan.org" target="_blank">robert@ocallahan.org</a>>><div><div></div><div class="h5"><br>
wrote:<br>
<br>
What is the intended semantics here? Chris' explicit<br>
commitTransaction<br>
would<br>
throw an exception if the transaction was aborted<br>
due to data<br>
inconsistency,<br>
leaving it up to the script to retry --- and making<br>
it clear to script<br>
authors that non-storage side effects during the<br>
transaction are not<br>
undone.<br>
How would you handle transaction aborts?<br>
<br>
Calls to transaction() are queued and executed serially<br>
per-origin<br>
with exclusive access. There is no such thing as a<br>
transaction abort<br>
because there cannot be consistency problems because of<br>
the serialized<br>
access.<br>
<br>
No, transactions can still fail. They can fail in ways<br>
immediately hidden<br>
from the script that requested them if the UA has to<br>
interrupt the<br>
conceptually executing transaction in the ways enumerated in<br>
a separate<br>
branch of this thread. Later script executions can observe<br>
inconsistent<br>
state unless more is specified by your proposal.<br>
<br>
Transactions can also fail visibly if write-to-disk fails<br>
(probably also in<br>
other ways I haven't considered). It's not clear what<br>
should happen wrt<br>
your proposal in this case.<br>
<br>
<br>
If so, I agree with roc's responses to them that they could probably<br>
be handled without surfacing errors to the developer.<br>
<br>
OTOH, I'm not really against adding the concept of fallibility here.<br>
<br>
In fact, I believe that the "Synchronous database API"<br>
describes the same<br>
transaction semantics as I proposed in the OP. That spec<br>
adds implicit<br>
begin/commitTransaction and read-only transactions, but<br>
otherwise the<br>
semantics are the same.<br>
<br>
So I'd like to amend my original proposal to be<br>
<br>
Use Synchronous Web Database API transaction semantics.<br>
Except do not<br>
offer readTransaction: a transaction is implicitly a<br>
read-only transaction<br>
if only getItem() is called on localStorage from within<br>
localStorage.transaction().<br>
<br>
<br>
Agree. That is what I was trying to propose, too. I'm not sure where<br>
we disagree :). Is it just that my proposal has no concept of<br>
errors?<br>
I'm not against adding them, mainly I was trying to keep my proposal<br>
simple for purposes of discussion.<br>
<br>
<br>
Ay, there's the rub: I think the disagreement is between "mutex" vs.<br>
"transaction" semantics. So far, I think perhaps "mutex" has been<br>
used as shorthand for "transaction." But they aren't the same.<br>
<br>
I think we all agree that a script may fail to modify localStorage<br>
in some situations (irrespective of global mutex vs. per-domain<br>
mutex). One camp, wanting "mutex" semantics, would prefer to pretend<br>
that the failures never happen and let scripts clean up the mess<br>
(partially-applied changes) if they do occur. This is semantically<br>
broken, IMHO.<br>
<br>
The second camp, wanting "transaction" semantics, explicitly<br>
acknowledge to web authors that localStorage is fallible, guarantee<br>
that modifications to localStorage are atomic, and notify scripts<br>
when modifications can't be made atomically. This is the same<br>
approach taken by Web Database. IMHO, this is much better<br>
semantically because (i) it gives web apps stronger guarantees; and<br>
(ii) it makes the discussion about global mutex/per-domain<br>
mutex/non-blocking an implementation issue rather than a semantic issue, as<br>
it should be.<br>
<br>
Can those in the first camp explain why "mutex" semantics is better<br>
than "transaction" semantics? And why it's desirable to have one DB<br>
spec specify "transaction" semantics (Web Database) and a second<br>
specify "mutex" semantics (localStorage)?<br>
<br>
<br>
The way I understand it, there are 3 camps... and I think they've been abusing both the words "transaction" and "mutex". We should probably all start being more precise with our wording in this respect. :-)<br>
<br>
</div></div></blockquote>
<br>
I'd like to refine the above description of the design space. I think there are three main design decisions: what ACID properties are guaranteed and at what granularity, sync and/or async API, and whether or not scripts can be notified when modifications to localStorage fail.<br>
<br>
In the current localStorage spec, the unit of atomicity/consistency is each modification (setItem()/removeItem()/clear()) of localStorage. But the unit of isolation is all operations to localStorage between acquiring the storage mutex and releasing it. And durability isn't specified AFAICT. And AFAICT, scripts can observe some failed modifications to localStorage, but not all.<br>
<br>
In the current Web Database spec, the unit of A/C/I is each transaction, i.e., all executeSql() statements invoked on a Transaction object. Durability isn't defined, but it seems reasonable to assume that successful Transactions should be durable (best effort). So a Transaction object is (best-effort) ACID. Scripts *can* observe failed transactions and thus "rolled-back" changes.<br>
<br>
The first point on which the new proposals for localStorage in this thread differ is whether to guarantee ACID (best effort) at a *uniform* granularity or not. All the proposals have some notion of "begin" and "end". All of the proposals seem to want all operations between begin and end to be isolated (although some implementations in the wild do not guarantee this). Some choose individual operations (get/set/remove/clear) of localStorage as the unit of atomicity/consistency. This allows for some modifications between begin and end to be applied even if all changes couldn't be applied. Others choose all modifications between begin and end as the unit of atomicity/consistency. For this last group, "end" really means "commit", because begin/commit define a transaction in the sense of Web Database's Transaction objects.<br>
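<br>
A minimal sketch of the two granularities, assuming a hypothetical in-memory store with an item quota (createStore, applyEach and applyAll are illustrative names, not part of any spec or proposal):<br>
<pre>
```javascript
// Hypothetical store with a fixed item quota, so that writes can fail.
function createStore(maxItems) {
  const items = new Map();

  function setItem(key, value) {
    if (!items.has(key) && items.size === maxItems) {
      throw new Error("quota exceeded");
    }
    items.set(key, value);
  }

  return {
    items,
    // Per-operation atomicity: every write that succeeds stays applied,
    // even if a later write in the same begin/end span fails.
    applyEach(writes) {
      for (const [key, value] of writes) setItem(key, value);
    },
    // Begin/commit atomicity: either all writes apply or none do.
    applyAll(writes) {
      const snapshot = new Map(items);
      try {
        for (const [key, value] of writes) setItem(key, value);
      } catch (err) {
        // Roll back to the state at "begin", then report the failure.
        items.clear();
        for (const [key, value] of snapshot) items.set(key, value);
        throw err;
      }
    },
  };
}
```
</pre>
With a quota of two items and three writes to new keys, applyEach() leaves the first two writes in place when the third fails, while applyAll() leaves the store as it was at "begin".<br>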
<br>
Semantically, an async vs. sync API doesn't change anything. It does, however, affect the optimizations available to implementations. An async callback might only be invoked by a SELUA when localStorage was loaded from disk into memory, so that the app could handle events in the meantime rather than blocking on disk. In addition, a MELUA with a mutex implementation might only invoke the localStorage callback when the mutex could be acquired (e.g. only when a trylock() succeeded). I'm beginning to be convinced that async callbacks are superior because they allow more flexible (and possibly more performant) implementations.<br>
<br>
Finally there's observable vs. unobservable "failures." What "failure" means depends on the subset of ACID preserved, and at what granularity. Some proposals do not allow scripts to observe failures. For any proposal wishing to expand the unit of atomicity/consistency beyond single modifications (single set/remove/clear), I believe that the proposal must immediately terminate web apps if all changes between begin/end could not be applied. Otherwise the UA has the non-option of either exposing non-atomic or inconsistent changes to localStorage, or allowing side-effecty script statements to complete in between attempted modifications to localStorage that fail. Other proposals explicitly *allow* scripts to be notified of failures, with the intention that a script could retry failed modifications. One use for such an API is a localStorage implementation with optimistic transactions, i.e. transactions implemented with STM-like techniques (which is what I had in mind with the OP).<br>
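<br>
A rough sketch of observable failures with script-side retry, assuming a hypothetical versioned store (createOptimisticStore and withRetry are made-up names; a real UA would detect conflicts across event loops rather than through a version counter in one script):<br>
<pre>
```javascript
// Hypothetical optimistic store; not any UA's actual implementation.
function createOptimisticStore() {
  let committed = new Map();
  let version = 0;

  return {
    get(key) {
      return committed.get(key);
    },
    // Runs `body` against a private copy and commits only if nobody else
    // committed in the meantime. Throws on conflict so the caller can retry.
    transaction(body) {
      const startVersion = version;
      const working = new Map(committed);
      body(working);
      if (version !== startVersion) {
        throw new Error("transaction aborted: concurrent commit");
      }
      committed = working;
      version += 1;
    },
  };
}

// Re-runs the whole transaction body after a failure, up to maxAttempts.
// Non-storage side effects of an aborted attempt are NOT undone.
function withRetry(store, body, maxAttempts) {
  for (let attempt = 1; ; attempt += 1) {
    try {
      store.transaction(body);
      return attempt; // number of attempts it took to commit
    } catch (err) {
      if (attempt === maxAttempts) throw err;
    }
  }
}
```
</pre>
Note that the retry loop re-executes the entire body, which is why authors have to be told that only storage changes are rolled back.<br>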
<br>
(For the latter, Rob O'Callahan proposed a very interesting "localStorage developer/debug mode" in which the UA would always fail a transaction at least once before succeeding. This would allow authors to ensure that they uniformly handled failed transactions. This could even be exposed as localStorage.__debug__ or somesuch rather than through UA-specific preferences.)<div class="im">
<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Those who want pessimistic transactions. I.e. using locking so that you never need to do a rollback (because it can never "fail"). This would be compatible with either a sync or an async interface.<br>
<br>
</blockquote>
<br></div>
By the above characterization: { uniform granularity of ACID (traditional transactions), async/sync unspecified, unobservable transaction failures }.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Those who want optimistic transactions. I.e. rollback may happen. Either we need to restrict what can be done during a localStorage transaction or we need to have an exception that tells the script to undo itself. This was the original proposal, AFAICT. It would work with either a sync or an async interface.<br>
<br>
</blockquote>
<br></div>
{ Traditional transactions, sync/async unspecified, observable transaction failures }.<br>
<br>
I should note that I'm now of the opinion that { traditional transactions, async, observable transaction failures } is the way to go.<div class="im"><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Those who want a queue. I.e. those who want an asynchronous, callback-based interface in which the UA will only call one callback at a time. Perhaps on a per-origin basis. Note that this can never "fail", never needs to be rolled back, etc.<br>
<br>
</blockquote>
<br></div>
This sounds to me like { traditional transactions, async, unobservable transaction failures } which is the same as your first camp above except async only. Or are you proposing that the unit of atomicity/consistency is not all operations performed in the callback; i.e., that modifications done in the callback can be partially applied?</blockquote>
<div><br></div><div>It's just an implementation difference. A queue means that the event loop can continue processing other events while waiting for the 'lock' (which is maybe better described as an 'update token' or something). If you implement it as a lock (which you would for a synchronous interface), then the event loop is blocked.</div>
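<div><br></div><div>Roughly what I have in mind, as a sketch only: createStorageQueues and its schedule parameter are made-up names, and schedule stands in for "post a task to the event loop".</div>
<pre>
```javascript
// Hypothetical sketch: one FIFO of transaction callbacks per origin.
function createStorageQueues(schedule) {
  const stores = new Map();  // origin -> Map of key/value pairs
  const pending = new Map(); // origin -> array of queued callbacks
  const running = new Set(); // origins that already have a drain task posted

  function drain(origin) {
    const queue = pending.get(origin);
    const callback = queue.shift();
    try {
      // The callback holds the "update token": nothing else touches this
      // origin's store until it returns.
      callback(stores.get(origin));
    } finally {
      if (queue.length !== 0) {
        schedule(function () { drain(origin); });
      } else {
        running.delete(origin);
      }
    }
  }

  return {
    // Returns immediately; the caller's event loop is never blocked.
    transaction(origin, callback) {
      if (!stores.has(origin)) {
        stores.set(origin, new Map());
        pending.set(origin, []);
      }
      pending.get(origin).push(callback);
      if (!running.has(origin)) {
        running.add(origin);
        schedule(function () { drain(origin); });
      }
    },
  };
}
```
</pre>
<div>Passing something like function (task) { setTimeout(task, 0); } as schedule lets other events run between callbacks, and callbacks for different origins never wait on each other.</div>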
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I believe Aaron is in the queue camp with me. I'm becoming more and more convinced that Chromium should/will not implement the storage mutex at all (even for LocalStorage) unless we can come up with a way for event loops to not be blocked. And, as far as I can tell, Async interfaces are the only way to accomplish this.<br>
</blockquote>
<br></div>
In general, agreed. I still believe that a sync API</blockquote><div><br></div><div>The problem with a sync interface, especially one whose lock can be held beyond the top-level script context, is the risk of deadlock with WebDatabase (and possibly others). What's there now doesn't have this issue because you'd never hold the lock when calling the database transaction callback.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">with exposed transaction failures</blockquote><div><br></div><div>You'll only have transaction failures in an optimistic transaction model, right? So is that what you're suggesting?</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">(as I proposed in the OP) and the right implementation could do quite well. But I now think that an async version of that same API could perform even better. In addition, that API is most flexible in terms of possible UA implementations.</blockquote>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><br></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
IOW, I think that { traditional transactions, async, observable failures } subsumes both { traditional transactions, sync, observable failures } (OP's proposal) *and* { traditional transactions, async, unobservable failures } (your and Aaron's proposal).<br>
<br>
<br>
IMHO there are two remaining questions: first, whether the "ideal" localStorage transactional API should allow observable transaction failures. I believe that it should, as this allows for the widest variety of efficient implementations without changing ACID (best effort) guarantees given to authors or significantly complicating the localStorage API.<br>
</blockquote><div><br></div><div>What failures could there be in a pessimistic/queue model?</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Second, what is the best way to go forward with transactional localStorage while remaining backwards-compatible with current implementations. One option would be to deprecate localStorage in favor of a future, transactional window.domainStorage or somesuch.<br>
</blockquote><div><br></div><div>If we do this, we might as well just adopt something like the WebSimpleDatabase proposal (which I still haven't gotten around to reading), which seems much more powerful in many other ways.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Another, probably better, option is Maciej's proposal, a two-headed localStorage. The non-transactional localStorage would be deprecated and remain spec'd as today { single-modification AC/storage-mutex I/undefined D, sync, some observable failures }.</blockquote>
<div><br></div><div>This is how I'd lean.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">In addition, for cases like "clear private data", UAs would be allowed to silently break storage-mutex isolation for apps using the non-transactional API.<br>
</blockquote><div><br></div><div>I think it'd be better if they waited for the lock to be freed.</div></div>