[whatwg] WebWorkers vs. Threads
Shannon
shannon at arc.net.au
Wed Aug 13 11:50:31 PDT 2008
Kristof Zelechovski wrote:
> A background task invoked by setTimeout has to be split to small chunks;
> _yielding_ occurs when each chunk ends (having called setTimeout to execute
> the next chunk). It is very hard to code in this way; you have to maintain
> an explicit stack and create an exit/entry point at every chunk boundary.
> This technique is interesting as an academic exercise only, real-world
> developers will be right to stay away from it.
>
I'm not sure I get your meaning. If this is how current browsers
implement setTimeout then how is it "academic"? Also, since nobody is
talking about deprecating setTimeout, I don't see how it's relevant.
Whatever happens, setTimeout remains an issue that real-world developers
can't stay away from.
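For reference, the chunking technique described in the quoted text can be sketched roughly as follows. This is a minimal illustration, not code from the thread: sumInChunks, its parameters and the state object are hypothetical names, and the schedule parameter exists only so the sketch can be driven without real timers (it defaults to setTimeout).

```javascript
// Hypothetical helper: sums 0..n-1 in chunks, yielding between chunks.
// `schedule` defaults to a setTimeout wrapper; it is a parameter only so
// the sketch can be driven by hand when experimenting.
function sumInChunks(n, chunkSize, onDone, schedule) {
    schedule = schedule || function (fn) { setTimeout(fn, 0); };

    // The "explicit stack": every value that must survive a chunk
    // boundary has to be kept here rather than in local variables.
    var state = { i: 0, total: 0 };

    function runChunk() {
        var end = Math.min(state.i + chunkSize, n);
        for (; state.i < end; state.i++) {
            state.total += state.i;
        }
        if (state.i < n) {
            schedule(runChunk);   // exit point: yield to other events
        } else {
            onDone(state.total);  // all chunks finished
        }
    }

    runChunk();                   // the first chunk runs immediately
}
```

Anything else queued by the page (other timeouts, event handlers) can run between two chunks, which is where the interleaving discussed in this thread comes from.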
> Guarding concurrent access to global variables is not enough if those
> variables hold references to objects because an object can end up in a
> logically inconsistent state if two threads try modifying its properties
> concurrently. The objects would have to be lockable to avoid corrupting
> global state.
> Even if you limit yourself to scalar variables, there is nothing to prevent
> a script to define a compound state as a set of scalar variables, each one
> with its own name. While it is not a good programming practice, old code
> does it a lot because it is (or was) more efficient to say 'gTransCount'
> than 'gTrans.count'.
> Chris
OK, I'm clear on that; these are good arguments for providing explicit
locking. I'm still not clear on how variable race conditions in multiple
interleaved setTimeout chunks would be different for true threads, but
I'll take your word for it that automated locking is hard or impossible
to implement.
What I really don't understand is how the WebWorkers proposal solves
this. As far as I can tell it does some hand-waving with MessagePorts to
pretend it goes away but what happens when you absolutely DO need
concurrent access to global variables - say for example the DOM - from
multiple threads? How do you perform any sort of synchronisation?
Take the example given:
{ var la = g.i; g.i = la + 1 }
The WebWorkers implementation (scary! hide your children!!):
--- worker.js ---
updateGlobalLa = function (e) {
    // e.data carries the value of la sent back by the main context
    var localLa = someLongRunningFunction(e.data);
    workerGlobalScope.port.postMessage("set la = " + localLa);
};
workerGlobalScope.port.addEventListener("message", updateGlobalLa, false);
workerGlobalScope.port.postMessage("get la");
--- main.js ---
// global object or variable
var la = 0;
handleMessage = function (e) {
    if (e.data.indexOf("set la = ") === 0) {
        la = parseInt(e.data.substr("set la = ".length), 10);
    } else if (e.data === "get la") {
        worker.postMessage(la.toString());
    }
};
var worker = new Worker("worker.js");
worker.addEventListener("message", handleMessage, false);
Unlike the one-line example above, we increment the global value based on
some long-running calculation on its original value (rather than just
adding 1). This shows a more realistic use case for threading.
Unfortunately our potentially dangerous one-liner is now an equally
dangerous 18-line monster spread over 2 files and we STILL haven't
solved the issue of another worker or the main context updating 'la'
between our original postMessage query and our response.
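The remaining race can be illustrated with a single-threaded model of the message flow. This is only a sketch under stated assumptions: makeMain and runInterleaved are hypothetical names, plain objects stand in for the string messages, and direct calls stand in for postMessage round trips. No real Worker is involved.

```javascript
// Simulated main context: owns `la` and answers "get"/"set" messages.
// A single-threaded model of the message flow, not real Worker code.
function makeMain() {
    var la = 0;
    return {
        handle: function (msg) {
            if (msg.type === "get") { return la; }
            if (msg.type === "set") { la = msg.value; }
        },
        value: function () { return la; }
    };
}

// Two simulated workers, A and B, each do: get la, compute, set la.
// Their messages arrive interleaved in the main context's queue.
function runInterleaved() {
    var main = makeMain();
    var seenByA = main.handle({ type: "get" });        // A reads 0
    var seenByB = main.handle({ type: "get" });        // B reads 0 before A replies
    main.handle({ type: "set", value: seenByA + 1 });  // A writes 1
    main.handle({ type: "set", value: seenByB + 1 });  // B writes 1; A's update is lost
    return main.value();
}
```

Two increments were requested but runInterleaved() returns 1: every individual message is handled atomically, yet the read-modify-write sequence as a whole is not.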
I should also point out that even this simple, naive and probably
incorrect example still took me nearly 2 hours to write - largely due to
the complexity of the WebWorkers spec and the lack of any decent
examples. Honestly anyone who thinks this interface is supposed to make
things easier is kidding themselves.
Regardless of the kind of Getters/Setters/Managers/Whatever paradigm you
use in your main thread, you can never escape the possibility that two
workers might want exclusive access to an essential global object (e.g.
a DOM node or global setting). So far I have not found any real-world
programming language or hardware that can do this without some kind of
side-effect or programming construct (e.g. locks, mutexes, semaphores).
What WebWorkers is really doing is requiring the author to
write their own.
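A hand-written lock of the kind an author would end up building might look something like the sketch below. It is an assumption-laden illustration, not part of any spec: LockManager, acquire and release are hypothetical names, the manager is assumed to live in the main context, and the "acquire"/"release" messages a worker would send are modelled as direct method calls.

```javascript
// Hypothetical lock manager living in the main context. Workers would send
// "acquire"/"release" messages; here those are modelled as direct calls.
function LockManager() {
    this.owner = null;   // id of the worker currently holding the lock
    this.waiting = [];   // queued { id, grant } requests, in arrival order
}

// Request the lock; `grant` is called once the lock is held by `id`.
LockManager.prototype.acquire = function (id, grant) {
    if (this.owner === null) {
        this.owner = id;
        grant();
    } else {
        this.waiting.push({ id: id, grant: grant });
    }
};

// Release the lock and hand it to the next waiter, if there is one.
LockManager.prototype.release = function (id) {
    if (this.owner !== id) {
        throw new Error("release by non-owner: " + id);
    }
    var next = this.waiting.shift();
    this.owner = next ? next.id : null;
    if (next) { next.grant(); }
};
```

Even this toy version says nothing about a worker that never sends its release message, which is exactly the kind of problem locks and mutexes already have.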
In other words despite all the complexity and limitations of workers all
that's actually achieved is:
a.) Synchronisation problems simply promoted to the message queue level.
b.) Decrease in performance due to horrible string-only messaging interface.
c.) Increase in browser and JavaScript bugs due to API complexity.
d.) Decrease in programmer interest in using threads (I certainly
wouldn't use them in their current state).
I don't think I can stress enough how many important properties and
functions of a web page are ONLY available as globals. DOM nodes, style
properties, event handlers, window.status ... the list goes on. These
can't be duplicated because they are properties of the page all workers
are sharing. Without direct access to these the only useful thing a
worker can do is "computation" or more precisely string parsing and
maths. I've never seen a video encoder, physics engine, artificial
intelligence or gene modeller written in JavaScript and I don't really
think I ever will. Apart from being slow, there is the obvious
correlation that anything that complex is:
a.) The realm of academics and science geeks using highly parallel
specialist systems and languages, not web developers.
b.) Valuable enough to be commercial software - and therefore requiring
protection against illicit copying (something JavaScript can't provide).
Shannon