<div class="gmail_quote">On Wed, Apr 1, 2009 at 3:17 PM, Robert O'Callahan <span dir="ltr"><<a href="mailto:robert@ocallahan.org">robert@ocallahan.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div class="im">On Thu, Apr 2, 2009 at 11:02 AM, Robert O'Callahan <span dir="ltr"><<a href="mailto:robert@ocallahan.org" target="_blank">robert@ocallahan.org</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204, 204, 204);padding-left:1ex">
(Note that you can provide hen read-only scripts are easy to optimize for full parallelism using )</blockquote><div> </div></div></div>Oops!<br><br>I was going to point out that you can use a reader/writer lock to implement serializability while allowing read-only scripts to run in parallel, so if the argument is that most scripts are read-only then that means it shouldn't be hard to get pretty good parallelism.</blockquote>
<div><br></div><div>The problem is escalating the lock. If your script does a read and then a write, and you do this in 2 workers/windows/etc you can get a deadlock unless you have the ability to roll back one of the two scripts to before the read which took a shared lock. If both scripts have an 'alert("hi!");' then you're totally screwed, though.</div>
<div><br></div><div>There's been a LOT of CS research done on automatically handling the details of concurrency. The problem has to become pretty constrained (especially in terms of stuff you can't roll back, like user input) before you can create something halfway efficient.</div>
<div><br></div><div><span class="Apple-style-span" style="font-family: 'Times New Roman'; font-size: 16px; "><div style="margin-top: 8px; margin-right: 8px; margin-bottom: 8px; margin-left: 8px; font: normal normal normal small/normal arial; ">
<br><div class="gmail_quote">On Wed, Apr 1, 2009 at 3:02 PM, Robert O'Callahan <span dir="ltr"><<a href="mailto:robert@ocallahan.org">robert@ocallahan.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
<div class="im">On Thu, Apr 2, 2009 at 7:18 AM, Michael Nordman <span dir="ltr"><<a href="mailto:michaeln@google.com" target="_blank">michaeln@google.com</a>></span> wrote:<br></div><div class="gmail_quote"><div class="im">
<blockquote class="gmail_quote" style="margin-top: 0pt; margin-right: 0pt; margin-bottom: 0pt; margin-left: 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex; ">
I suggest that we can come up with a design that makes both of these camps happy and that should be our goal here.<div><br></div><div>To that end... what if...</div><div><br></div><div>interface Store {</div><div> void putItem(string name, string value);</div>
<div> </div><div> string getItem(string name); </div><div> // calling getItem multiple times prior to script completion with the same name is guaranteed to return the same value</div><div> // (unless the current script had called putItem, if a different script had called putItem concurrently, the current script won't see that)</div>
<div><br></div><div> void transact(func transactCallback);</div><div> // is not guaranteed to execute if the page is unloaded prior to the lock being acquired</div><div> // is guaranteed to NOT execute if called from within onunload</div>
<div> // but... really... if you need transactional semantics, maybe you should be using a Database?</div><div><br></div><div> attribute int length;</div><div> // may only be accessed within a transactCallback, otherwise throws an exception</div>
<div> </div><div> string getItemByIndex(int i);</div><div> // may only be accessed within a transactCallback, otherwise throws an exception</div><div>};</div></blockquote><blockquote class="gmail_quote" style="margin-top: 0pt; margin-right: 0pt; margin-bottom: 0pt; margin-left: 0.8ex; border-left-width: 1px; border-left-style: solid; border-left-color: rgb(204, 204, 204); padding-left: 1ex; ">
<div></div><div><br></div><div><br></div><div>document.cookie;</div><div>// has the same safe to read multiple times semantics as store.getItem()</div><div><br></div><div><br></div><div>So there are no locking semantics (outside of the transact method)... and multiple reads are not error prone.</div>
<div><br></div><div>WDYT?</div></blockquote></div><div><br>getItem stability is helpful for read-only scripts but no help for read-write scripts. For example, outside a transaction, two scripts doing putItem('x', getItem('x') + 1) can race and lose an increment.</div>
</div></blockquote><div><br></div><div>Totally agree that it doesn't quite work yet.</div><div><br></div><div>But what if setItem were to watch for unserializable behavior and throw a transactCallback when it happens? This solves the silent data corruption problem, though reproducing the circumstances that'd cause this is obviously racy. Of course, reproducing the deadlocks or very slow script execution behavior is also racy.</div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
<div class="gmail_quote"><div>Addressing the larger context ... More than anything else, I'm channeling my experiences at IBM Research writing race detection tools for Java programs ( <a href="http://portal.acm.org/citation.cfm?id=781528" target="_blank">http://portal.acm.org/citation.cfm?id=781528</a> and others), and what I learned there about programmers with a range of skill levels grappling with shared memory (or in our case, shared storage) concurrency. I passionately, violently believe that Web programmers cannot and should not have to deal with it. It's simply a matter of implementing what programmers expect: that by default, a chunk of sequential code will do what it says without (occasional, random) interference from outside.</div>
</div></blockquote><div><br></div><div>I definitely see pros and cons to providing a single-threaded version of the world to all developers (both advanced and beginner), but this really isn't what we should be debating right now.</div>
<div><br></div><div>What we should be debating is whether advanced, cross-event-loop APIs should be kept simple enough that any beginner web developer can use them (at the expense of performance and simplicity within the browser) or if we should be finding a compromise that can be kept fast, simple (causing fewer bugs!), and somewhat harder to program for.</div>
<div><br></div><div>If someone wants to cross the event loop (except in the document.cookie case, which is a pretty special one), they should have to deal with more complexity in some form. Personally, I'd like to see a solution that does not involve locks of any sort (software transactional memory?).</div>
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex; ">
<div class="gmail_quote"><div>I realize that this creates major implementation difficulties for parallel browsers, which I believe will be all browsers. "Evil", "troubling" and "onerous" are perhaps understatements... But it will be far better in the long run to put those burdens on browser developers than to kick them upstairs to Web developers. If it turns out that there is a compelling performance boost that can *only* be achieved by relaxing serializability, then I could be convinced ... but we are very far from proving that.</div>
</div></blockquote><div><br></div><div>Like I said, a LOT of research has been done on concurrency. Basically, if you're not really careful about how you construct your language and the abstractions you have for concurrency, you can really easily back yourself into a corner that you semantically can't get out of (no matter how good a programmer you are).</div>
</div></div></span></div></div>