[whatwg] WebWorkers vs. Threads
robert at ocallahan.org
Thu Aug 14 04:47:56 PDT 2008
On Thu, Aug 14, 2008 at 10:06 PM, Shannon <shannon at arc.net.au> wrote:
> On second thoughts I withdraw these claims. I don't have the statistics to
> know one way or the other why "portable threads" are more prevalent than
> "share nothing" ones. There may be many reasons but latency probably isn't
> one of them. It could just be fashion or convenience.
Excuse my rant:
Shared-memory threads and locks are a relatively easy model to implement,
they *appear* to be a straightforward extension to existing platforms and
programming languages, and they don't require much implementation wizardry
to provide good performance in simple apps written by really smart people.
So they're the natural concurrency extension that everyone adds to their
platform first. Then people build really complicated apps that don't scale
well and have strange problems with intermittent race conditions and
deadlocks and everyone wishes they had a better model, but by then it's too
late.
This isn't the right forum to have a big discussion about concurrent
programming models. But for the record, here are some of the problems with
the shared-memory, threads-and-locks model:
1) Choosing the scope of locks in time (when you lock and for how long) and
space (how much data is covered by each lock) is hard. Mistakes in one
direction lead to race conditions where unexpected thread interleavings
produce errors, and these are basically impossible to test for. Mistakes in
the other direction lead to deadlocks, which are also very difficult to test
for. There are tools that can help detect potential errors but they're no
panacea (I did research in this area at IBM).
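To make (1) concrete, here's a minimal sketch in Python (the account names and amounts are invented for illustration). If one code path took these two locks in one order and another path took them in the opposite order, the threads could deadlock; the conventional discipline is to pick one global lock order and make every path follow it.

```python
import threading

# Two locks guarding two pieces of state. A deadlock-prone variant would
# acquire lock_a-then-lock_b on one path and lock_b-then-lock_a on the
# other; here both transfer directions use the same fixed order.
lock_a = threading.Lock()
lock_b = threading.Lock()
balance_a, balance_b = 100, 100

def transfer_ab(amount):
    global balance_a, balance_b
    with lock_a:              # always lock_a first...
        with lock_b:          # ...then lock_b
            balance_a -= amount
            balance_b += amount

def transfer_ba(amount):
    global balance_a, balance_b
    with lock_a:              # same order as transfer_ab, so no deadlock
        with lock_b:
            balance_b -= amount
            balance_a += amount

threads = [threading.Thread(target=transfer_ab, args=(10,)),
           threading.Thread(target=transfer_ba, args=(5,))]
for t in threads: t.start()
for t in threads: t.join()
assert balance_a + balance_b == 200   # money is conserved
```

Note how the ordering rule is invisible in the functions' signatures; nothing but convention stops a third function from violating it.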
2) Locks actually scale poorly. It's very hard to achieve high levels of
parallelism because you keep hitting lock contention and have to refine lock
granularity and then implement various esoteric optimizations to eliminate
false sharing etc.
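As one illustration of the granularity-refinement treadmill in (2), here's a toy "striped" counter in Python; the sharding scheme and names are my own illustrative sketch, not any particular library's API.

```python
import threading

# Instead of one counter behind one hot lock, shard the count across
# N independently locked stripes. Writers hash to a stripe (so they
# rarely contend with each other); readers sum all stripes.
N_STRIPES = 8

class StripedCounter:
    def __init__(self):
        self._locks = [threading.Lock() for _ in range(N_STRIPES)]
        self._counts = [0] * N_STRIPES

    def add(self, n):
        i = threading.get_ident() % N_STRIPES   # this thread's stripe
        with self._locks[i]:
            self._counts[i] += n

    def value(self):
        total = 0
        for i in range(N_STRIPES):              # lock each stripe in turn
            with self._locks[i]:
                total += self._counts[i]
        return total

c = StripedCounter()
threads = [threading.Thread(target=lambda: [c.add(1) for _ in range(1000)])
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert c.value() == 4000    # no updates lost
```

Correct, but already noticeably more code and subtlety than the single-lock version, and it still says nothing about false sharing of the stripes' cache lines.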
3) It's really hard to do basic stuff well. Maurice Herlihy uses an example
of implementing a double-ended queue with concurrent access to both ends;
getting it to work correctly for lengths 0, 1, and > 1 is rocket science.
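For contrast, here's the easy-but-serialized version of that deque in Python: one lock around every operation. It's trivially correct for lengths 0, 1, and > 1 precisely because the two ends can never run concurrently, which is exactly the property the lock-free version has to give up.

```python
import collections
import threading

# The "easy" answer: a deque made thread-safe by one coarse lock.
# Correct, but pushes and pops on opposite ends serialize on that lock;
# letting the two ends proceed concurrently without it is the hard part.
class LockedDeque:
    def __init__(self):
        self._d = collections.deque()
        self._lock = threading.Lock()

    def push_left(self, x):
        with self._lock:
            self._d.appendleft(x)

    def push_right(self, x):
        with self._lock:
            self._d.append(x)

    def pop_left(self):
        with self._lock:
            return self._d.popleft() if self._d else None

    def pop_right(self):
        with self._lock:
            return self._d.pop() if self._d else None

d = LockedDeque()
d.push_left(1)
d.push_right(2)
assert d.pop_left() == 1
assert d.pop_right() == 2
assert d.pop_left() is None
```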
4) Threads and locks just don't support compositional reasoning about
programs. To avoid deadlocks and races you have to expose a lot of
information about the internals of functions --- what locks they might take,
what they might wait for, and what data they might access. For specific
domains like OS kernels you can impose rigid rules across the code and get
away with it. For more complex apps, especially ones with dynamic
extensibility, that doesn't work well. Especially when not everyone on your
team is a genius.
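One concrete composition hazard, sketched in Python with invented names: a module that invokes caller-supplied callbacks while holding its internal lock. If a callback re-enters the module or takes another lock, you get deadlock, and nothing in the function's signature warns you. A common fix is to snapshot state under the lock and run the callbacks outside it.

```python
import threading

# Notifier calls subscribers with its lock RELEASED. If fire() held the
# (non-reentrant) lock while calling out, a callback that re-enters
# subscribe() would block forever.
class Notifier:
    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = []

    def subscribe(self, fn):
        with self._lock:
            self._subscribers.append(fn)

    def fire(self, event):
        with self._lock:
            snapshot = list(self._subscribers)   # copy under the lock
        for fn in snapshot:                      # call with lock released
            fn(event)

n = Notifier()
seen = []
# A re-entrant callback: it subscribes another callback from inside fire().
n.subscribe(lambda e: (seen.append(e), n.subscribe(seen.append)))
n.fire("boot")
assert seen == ["boot"]
```

The point is that "does this function call foreign code while holding a lock?" is exactly the kind of internal detail compositional reasoning shouldn't need, yet here it decides correctness.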
Shared-nothing message passing systems like Workers have problems too and
won't be a good fit for some applications, but for the applications that
fit, they avoid a lot of problems, and they're really easy to implement.
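Here's the share-nothing discipline in miniature, sketched with Python threads and queues standing in for a Worker and postMessage (the mechanism differs, but the point is the discipline): the worker owns all of its state and communicates only by messages, so no locks appear anywhere.

```python
import queue
import threading

# The "worker" owns `total` outright; the only shared objects are the
# two message queues, which do their own internal synchronization.
def worker(inbox, outbox):
    total = 0                      # private state, never shared
    while True:
        msg = inbox.get()
        if msg is None:            # shutdown sentinel
            break
        total += msg
        outbox.put(total)          # reply with a message, not shared memory

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
for n in (1, 2, 3):
    inbox.put(n)
results = [outbox.get() for _ in range(3)]
inbox.put(None)
t.join()
assert results == [1, 3, 6]       # no locks needed anywhere in user code
```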
There are other models, like replacing locks with a notion of atomic code
blocks --- gets rid of many problems, but a much harder model to implement
efficiently so you don't see it in production systems yet. A related popular
idea is adding transactions and transactional memory to the programming
language, but again, hard to implement well.
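To show the shape of the transactional idea, here's a deliberately toy Python sketch: run the block against a snapshot, commit only if nothing changed underneath, otherwise retry. A real STM tracks read/write sets across many locations; this single-cell version just shows the optimistic retry loop.

```python
import threading

# A versioned cell with a compare-and-commit primitive. The internal
# lock guards only the commit point; user code never takes a lock.
class TCell:
    def __init__(self, value):
        self._lock = threading.Lock()
        self.value, self.version = value, 0

    def read(self):
        with self._lock:
            return self.value, self.version

    def commit(self, expected_version, new_value):
        with self._lock:
            if self.version != expected_version:
                return False        # someone else committed first: retry
            self.value, self.version = new_value, self.version + 1
            return True

def atomically(cell, fn):
    while True:                     # optimistic retry loop
        snapshot, version = cell.read()
        if cell.commit(version, fn(snapshot)):
            return

cell = TCell(0)
threads = [threading.Thread(target=atomically, args=(cell, lambda v: v + 1))
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
assert cell.read()[0] == 4          # all four increments land
```

Making this perform well when transactions touch many locations and conflict often is the hard part alluded to above.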
So for the programming model problem, we don't know what the right answer is
for all applications, but we do know that threads and locks are the wrong
answer.
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah 53:5-6]