On Fri, Sep 4, 2009 at 4:02 PM, Chris Jones <span dir="ltr">&lt;<a href="mailto:cjones@mozilla.com">cjones@mozilla.com</a>&gt;</span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
I'd like to propose that HTML5 specify different schemes than a conceptual global storage mutex to provide consistency guarantees for localStorage and cookies.<br>
<br>
Cookies would be protected according to Benjamin Smedberg's post in the "[whatwg] Storage mutex and cookies can lead to browser deadlock" thread. Roughly, this proposal would give scripts a consistent view of document.cookie until they completed. AIUI this is stronger consistency than Google Chrome provides today, and anecdotal evidence suggests even their weaker consistency hasn't "broken the web."<br>
</blockquote><div><br></div><div>To be fair, IE is in the same boat...which makes this argument even stronger, I think.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
localStorage would be changed in a non-backwards-compatible way. I believe that web apps can be partitioned into two classes: those that have planned for running concurrently (single-event-loop or not) in multiple "browsing contexts", and those that haven't. I further posit that the second class would break when run concurrently in multiple contexts regardless of multiple event loops, and thus regardless of the storage mutex. Even in the single-event-loop world, sites not prepared to be loaded in multiple tabs can stomp each other's data even though script execution is atomic. (I wouldn't dare use my bank's website in two tabs at the same time in a single-event-loop browser.) In other words, storage mutex can't help the second class of sites.<br>
<br>
(I also believe that there's a very large, third class of pages that work "accidentally" when run concurrently in multiple contexts, even though they don't plan for that. This is likely because they don't keep quasi-persistent data on the client side.)<br>
<br>
Based on that, I believe localStorage should be designed with the first class of web apps (those that have considered data consistency across multiple concurrent contexts) in mind, rather than the second class. Is a conceptual global storage mutex the best way for, say, gmail to guarantee consistency of its e-mail/contacts database? I don't believe so: I think that a transactional localStorage is preferable. Transactional localStorage is easier for browser vendors to implement and should result in better performance for web apps in multi-process UAs. It's more of a burden on web app authors than the hidden storage mutex, but I think the benefits outweigh the cost.<br>
<br>
I propose adding the functions<br>
<br>
window.localStorage.beginTransaction()<br>
window.localStorage.commitTransaction()<br>
or<br>
window.beginTransaction()<br>
window.commitTransaction()<br>
<br>
(The latter might be preferable if we later decide to add more resources with transactional semantics.)<br>
<br>
localStorage.getItem(), .setItem(), .removeItem(), and .clear() would remain specified as they are today. beginTransaction() would do just that, open a transaction. Calling localStorage.*() outside of an open transaction would cause a script exception to be thrown; this would unfortunately break all current clients of localStorage. There might be cleverer ways to mitigate this breakage by a UA pretending not to support localStorage until a script called beginTransaction().<br>
<br>
yieldForStorageUpdates() would no longer be meaningful and should be removed.<br>
<br>
A transaction would successfully "commit", atomically applying its modifications to localStorage, if localStorage was not modified between beginTransaction() and commitTransaction(). Note that a transaction consisting entirely of getItem() calls could fail just like one that actually modifies localStorage. If a transaction failed, the UA would throw a TransactionFailed exception to script. The UA would be allowed to throw this exception at any time between beginTransaction() and commitTransaction().<br>
<br>
There are numerous ways to implement transactional semantics. Single-event-loop UAs could implement beginTransaction() and commitTransaction() as no-ops. Multi-event-loop UAs could reuse the global storage mutex if they had already implemented that (beginTransaction() == lock, commitTransaction() == unlock).<br>
<br>
Some edge cases:<br>
<br>
* calling commitTransaction() without beginTransaction() would throw an exception<br>
<br>
* transactions would not be allowed to be nested, even on different localStorage DBs. E.g. if site A's script begins a transaction on A.localStorage, and calls into site B's script embedded in an iframe which begins a transaction on B.localStorage, an exception would be thrown.<br>
<br>
* transactions *could* be spread across script executions, alert() dialogs, sync XHR, or anywhere else the current HTML5 spec requires the storage mutex be released. Note that UAs wishing to forbid that behavior could simply throw a TransactionFailed exception where the storage mutex would have been released in the current spec. Or this could be made illegal by the spec.<br>
<br>
* it's not clear to me how to handle async XHRs and Worker messages sent from within a failed transaction. They could be specified to be sent or not and either behavior implemented easily. My gut tells me that they *should* be sent regardless.<br>
<br>
Feedback very much desired.<br>
<br>
Cheers,<br>
Chris<br>
<br>
Addendum: I think that a past argument against a transactional approach was that scripts can cause side effects during transactions that can't be (easily, performantly) rolled back. This is true, and troubling in that it deviates from SQL semantics, but because this proposal is designed for the first class of web apps I don't believe it's a compelling argument. Further, a script can only corrupt its browsing-context-local state by mishandling failed transactions. Using gmail as a convenient example, if a transaction failed but gmail wasn't prepared to handle the failure, that particular gmail instance would just break. No e-mails or contacts would be corrupted, and the user could reload gmail and regain full functionality. Servers should already be prepared to deal with clients behaving unpredictably.<br>
</blockquote></div><br><div>Very interesting. Some of the details I'm not sure about, but I think this is much better than what already exists. Enough better that I think it's worth breaking backwards compatibility.</div>
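<div><br></div><div>To check that I'm reading the proposal correctly, here is a rough model of the semantics in plain JavaScript. It is only a sketch under my own assumptions: SharedStore and TransactionalStorage are hypothetical names, a version counter stands in for "localStorage was not modified", and removeItem()/clear() are left out for brevity. A real UA could implement this very differently.</div>

```javascript
// Hypothetical model of the proposed semantics; none of this exists in any browser.
class TransactionFailed extends Error {}

// Stands in for the storage area shared by every browsing context of one origin.
class SharedStore {
  constructor() {
    this.data = new Map();
    this.version = 0; // bumped by every commit that writes something
  }
}

// One instance per browsing context (tab, iframe, ...).
class TransactionalStorage {
  constructor(store) {
    this.store = store;
    this.tx = null;
  }
  beginTransaction() {
    if (this.tx) throw new Error("nested transactions are not allowed");
    this.tx = { startVersion: this.store.version, writes: new Map() };
  }
  requireTx() {
    // Any access outside an open transaction throws, as in the proposal.
    if (!this.tx) throw new Error("no open transaction");
    return this.tx;
  }
  getItem(key) {
    const tx = this.requireTx();
    if (tx.writes.has(key)) return tx.writes.get(key);
    return this.store.data.has(key) ? this.store.data.get(key) : null;
  }
  setItem(key, value) {
    // Writes are buffered and only become visible to others on commit.
    this.requireTx().writes.set(key, String(value));
  }
  commitTransaction() {
    const tx = this.requireTx();
    this.tx = null;
    // Fails if anyone committed a write since beginTransaction().
    if (tx.startVersion !== this.store.version) {
      throw new TransactionFailed("storage changed during the transaction");
    }
    if (tx.writes.size === 0) return;
    for (const [key, value] of tx.writes) this.store.data.set(key, value);
    this.store.version += 1;
  }
}

// Two contexts race on the same key; the first commit wins.
const store = new SharedStore();
const tabA = new TransactionalStorage(store);
const tabB = new TransactionalStorage(store);
tabA.beginTransaction();
tabB.beginTransaction();
tabA.setItem("count", 1);
tabB.setItem("count", 2);
tabA.commitTransaction();
let conflictDetected = false;
try {
  tabB.commitTransaction();
} catch (e) {
  conflictDetected = e instanceof TransactionFailed;
}
```

<div>In this model a read-only transaction fails the same way, because the version check doesn't care whether the transaction wrote anything. That matches the note in the proposal about getItem()-only transactions.</div>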
<div><br></div><div>I mostly agree with your assertions about the type of developer who's using localStorage. It sure would be nice if we could give developers powerful APIs and keep them simple and make it possible to implement them in a performant manner. Unfortunately, I think the current design cannot be changed to meet "possible to implement in a performant manner" without breaking backwards compatibility.</div>
<div><br></div><div>Part of me thinks that this API should match the WebDatabase API more. For example, you'd call a function with a callback. That callback would be given the localStorage object which it'd use to do manipulations. Etc. But part of me likes what you're suggesting here. I actually think the idea of throwing an exception whenever there's a serialization problem could be very compelling, and could keep the door wide open for future performance enhancements. It's even possible that JavaScript engines could embed elements of software transactional memory in the future to eliminate the need to make such calls. That seems really exciting.</div>
<div><br></div><div>It might also be possible to combine the 2 ideas: you call a function with your callback and the callback is given a localStorage object which is only valid within the callback, but an exception can be thrown when there's a problem with the transaction. Of course, the benefit to explicitly starting and ending a transaction is that it can span setTimeouts, event handlers, etc. On the other hand, I wonder if the cases where an app would do this and still be able to recover from a transaction failure would be limited.</div>
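<div><br></div><div>As a rough illustration of the combined idea, the callback form could be layered on top of explicit begin/commit calls and retry when the transaction fails. This is only a sketch: withTransaction, makeFlakyStorage and maxAttempts are hypothetical names of mine, the storage object is a stand-in that simulates conflicts, and I'm assuming a failed transaction is closed by the UA so a new one can be opened right away.</div>

```javascript
// Hypothetical helper: callback style layered on explicit begin/commit calls.
class TransactionFailed extends Error {}

function withTransaction(storage, body, maxAttempts = 3) {
  let attempt = 0;
  while (true) {
    attempt += 1;
    storage.beginTransaction();
    try {
      const result = body(storage);
      storage.commitTransaction();
      return result;
    } catch (e) {
      // Only transaction failures are retried. The proposal defines no abort
      // call, so any other exception is simply rethrown here.
      const retryable = e instanceof TransactionFailed;
      if (!retryable || attempt >= maxAttempts) throw e;
    }
  }
}

// Stand-in storage whose first `failures` commits fail, to simulate conflicts.
function makeFlakyStorage(failures) {
  const data = new Map();
  let pending = null;
  return {
    beginTransaction() {
      pending = new Map();
    },
    getItem(key) {
      if (pending.has(key)) return pending.get(key);
      return data.has(key) ? data.get(key) : null;
    },
    setItem(key, value) {
      pending.set(key, String(value));
    },
    commitTransaction() {
      const writes = pending;
      pending = null;
      if (failures > 0) {
        failures -= 1;
        throw new TransactionFailed("simulated conflict");
      }
      for (const [key, value] of writes) data.set(key, value);
    },
  };
}

// The callback runs once per attempt, so it runs three times here.
let runs = 0;
const storage = makeFlakyStorage(2);
const visits = withTransaction(storage, (ls) => {
  runs += 1;
  const next = Number(ls.getItem("visits") || 0) + 1;
  ls.setItem("visits", next);
  return next;
});
```

<div>The callback gets re-run on every retry, so any side effects it has outside of localStorage need to be safe to repeat. That is the same rollback concern raised in the addendum, just made more visible.</div>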
<div><br></div><div>Another thing we might want to consider is making transactions optional. This would satisfy groups 1 and 2, but would put the group 3 you mentioned at more risk. In other words, not calling beginTransaction would not be fatal. It would just mean localStorage works as currently spec'ed. But using localStorage within a transaction (be it a callback or between ___Transaction calls) would give you additional guarantees.</div>
<div><br></div><div>Note that if we do decide to break backwards compatibility, there are some other things we should consider...but I won't bring those up unless we do decide to move in this direction.</div><div><br>
</div><div>Btw, I want to make it clear that I take the idea of breaking compatibility VERY seriously. I know localStorage is fairly well adopted and that changing this would be pretty major. But having a cross-event-loop, synchronous API is really a terrible idea. And changing it now will be easier than changing it later.</div>