[whatwg] WebWorkers vs. Threads
Shannon
shannon at arc.net.au
Wed Aug 13 01:14:14 PDT 2008
Jonas Sicking wrote:
> Shannon wrote:
>> I've been following the WebWorkers discussion for some time trying to
>> make sense of the problems it is trying to solve. I am starting to
>> come to the conclusion that it provides little not already provided by:
>>
>> setTimeout(mainThreadFunc,1)
>> setTimeout(workThreadFunc,2)
>> setTimeout(workThreadFunc,2)
>
> Web workers provide two things over the above:
>
> 1. It makes it easier for the developer to implement heavy complex
> algorithms while not hanging the browser.
I suppose the limitations of the current approaches depend largely on
which Javascript actions actually block a setTimeout or callback
"thread". I keep being told WebWorkers solves this problem, but I don't
know of any examples of code or functions that block the running of
other callbacks. As with Lua, I have always treated setTimeout as a
means of executing code in parallel with the main "thread", and I have
never had an issue with the callback or main loop not running or being
delayed.
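For what it's worth, the pattern I have in mind looks roughly like the
sketch below: a long computation sliced into short chunks so that other
callbacks get a chance to run between slices (the array contents and
chunk size are just made-up illustration values).

  // Sketch: sum a large array in small slices via setTimeout so that
  // other callbacks (and the UI) can run between slices.
  var data = [];
  for (var i = 0; i < 1000000; i++) data.push(i);

  var total = 0;
  var pos = 0;
  function sumChunk() {
    var end = Math.min(pos + 10000, data.length);
    for (; pos < end; pos++) total += data[pos];
    if (pos < data.length) {
      setTimeout(sumChunk, 2);      // reschedule the "work thread"
    } else {
      alert('sum = ' + total);      // finished
    }
  }
  setTimeout(sumChunk, 2);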
> What you describe above is also known as cooperative multithreading.
> I.e. each "thread" has to manually stop itself regularly and give
> control to the other threads, and eventually they must do the same and
> give control back.
Actually I was referring to the browser forcibly interleaving the
execution of callbacks so that they appear to run simultaneously. I was
under the impression that this is how they behave now. I don't see how
Javascript callbacks can be cooperative, since they have no yield
statement or equivalent.
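A simple experiment should settle the question. The sketch below
schedules a busy callback and a second timer: if callbacks really are
forcibly interleaved, the second timer should fire at roughly +100ms
even while the first is still spinning; if instead a callback must
return before anything else runs, it will fire only after the busy
loop finishes.

  // Sketch: does a busy callback delay other timers?
  var t0 = new Date().getTime();
  function busy() {
    var start = new Date().getTime();
    while (new Date().getTime() - start < 2000) {
      // spin for two seconds without returning to the browser
    }
  }
  setTimeout(busy, 0);
  setTimeout(function() {
    alert('second timer fired at +' +
          (new Date().getTime() - t0) + 'ms');
  }, 100);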
> I'm also unsure which mozilla developer has come out against the idea
> of web workers. I do know that we absolutely don't want the
> "traditional" threading APIs that include locks, mutexes,
> synchronization, shared memory etc. But that's not what the current
> spec has. It is a much much simpler "shared nothing" API which already
> has a basic implementation in recent nightlies.
He wasn't against WebWorkers; he was, as you say, against full
threading (with all the mutexes, locks, etc. exposed to the JS
author). I can't find the reference site, but it doesn't really matter
except from the point of view that many people (including myself)
aren't convinced a full pthread-like API is the way to go either. I
just don't see why locking can't be handled transparently by the
interpreter, given that the language only ever touches real memory
indirectly.
In other news...
Despite the feedback I've been given, I find the examples of potential
applications pretty unconvincing. Most involve creating workers to wait
on or manage events like downloads or DB access. However, Javascript
has evolved a fairly complex event system that already appears to
provide a reasonable simulation of parallelism (yes, it isn't _true_
parallel processing, but as with Lua's coroutines that isn't really
apparent to the end user). In practice this means long-running actions
like downloading, and presumably DB interaction, are already reasonably
"parallel" to the main execution thread and/or any setTimeout
"subprocesses". I would suggest it is even possible for future browsers
to shift some of these activities to a "true" thread without any need
for the author's explicit permission.
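To illustrate (the URL below is made up): an asynchronous download
already proceeds alongside the main execution, and nothing stops a
browser from servicing it on a real OS thread behind the scenes.

  // Sketch: an asynchronous download "parallel" to the main thread.
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/some/large/file', true);   // true = asynchronous
  xhr.onreadystatechange = function() {
    if (xhr.readyState == 4) {
      alert('downloaded ' + xhr.responseText.length + ' characters');
    }
  };
  xhr.send(null);
  // execution continues here immediately; the browser is free to do
  // the network I/O on another OS thread without asking us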
I would really prefer that WebWorkers were, at a minimum, a kind of
syntactic sugar for custom callbacks (i.e. setTimeout but with
arguments and a more appropriate name). Leaving aside the use of real
OS threads, it seems to me that WebWorkers is at best syntactic sugar
for existing operations, but with zero DOM access and serious IO
limitations. Also, unlike existing options and normal threading
conventions, a WebWorker is forced to download its code and import its
arguments as strings, rather than having its code passed in as a
function reference and its arguments passed by reference or true
value. I know all the reasons why these limits exist; I'm just saying
I think they render the whole proposal mostly useless; kind of like a
car that only runs on carrots.
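As I read the current draft, the round trip looks something like this
('crunch.js' and the message contents are made-up illustration
values); note that the code has to live at a URL and everything
crosses the boundary as text:

  // main page -- code comes from a URL, arguments cross as strings:
  var w = new Worker('crunch.js');
  w.onmessage = function(event) {
    alert('result: ' + event.data);   // event.data is a string
  };
  w.postMessage('42');                // serialised text, not a
                                      // reference or true value

  // crunch.js -- no DOM, no shared state, everything via messages:
  onmessage = function(event) {
    var n = parseInt(event.data, 10);
    postMessage(String(n * n));
  };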
I have come up with one valid use case of my own for the current
proposal: distributed computing like SETI or Folding at Home in
Javascript. This would allow you to contribute ALL of your multi-core
or SMP computer's resources to the project(s) just by visiting their
site.
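A purely hypothetical sketch of what such a page might do (the worker
script, work-unit format and core count are all invented; notably,
there is no API to even ask how many cores the machine has):

  // Hypothetical: a SETI-style page spawning one worker per core.
  var CORES = 4;                      // a guess; not queryable
  for (var i = 0; i < CORES; i++) {
    var w = new Worker('fold.js');    // made-up worker script
    w.onmessage = function(event) {
      // POST event.data (a finished work unit) back to the project
    };
    w.postMessage('work-unit-' + i);  // made-up work unit id
  }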
However, on further consideration, this use case has two major flaws:
1.) Being an interpreted language with no direct access to MMX, GPUs
or hardware RNGs makes Javascript a poor choice for intensive
mathematical applications like these. I would expect a plugin or
standalone version of these tools to show anywhere from a 10x to
10,000x improvement in performance, depending on the calculations
being performed and the hardware available. Yes, there are a few more
risks and a few more clicks, but I wonder whether just having access
to a few more threads will sway these groups to start a separate
web-only codebase (with all the extra maintenance involved).
2.) Computing power is a resource. It can be bought and sold. However,
it can also be stolen and used by malicious sites for key/password
cracking, CAPTCHA solving, DDoS attacks and other schemes. Downloading
a plugin or application for distributed computing is generally (on a
secure browser) an opt-in process. However, it is debatable whether
visiting a website counts as opting in, especially if the workers are
being spawned by ad banner iframes rather than the primary site. What
(other than political implications) stops hacked sites and banner
networks from selling processing time on my computer each time I visit
a sponsored site? True, this could be happening now using only one
thread, but since we are talking about unlocking more resources, the
issue of how they are allocated becomes more relevant. Could we end up
opening access to all system CPUs just to have sites abuse it and
browser vendors throttle it anyway due to end-user complaints?
Anyway, I ask why it is necessary to provide a crippled threading
library for non-existent applications when the potential for a better
model surely exists. Right now I would like to see coroutines with
full global and DOM access adopted, with an eye to one day running
them across multiple cores (using automated serialisation of
reads/writes on global variables). This should greatly simplify moving
legacy global/DOM-dependent code to our shiny new "multi-threaded"
environment.
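To be concrete, here is a purely hypothetical sketch of the style I
have in mind; nothing below exists in Javascript today, and 'spawn'
and 'yield' are invented names for illustration only:

  // Hypothetical coroutine: ordinary globals and DOM access, with
  // explicit yield points the interpreter could one day exploit to
  // run the body on another core (serialising global reads/writes).
  var counter = 0;                    // a plain global, not copied
  spawn(function() {                  // 'spawn' is invented
    for (var i = 0; i < 1000000; i++) {
      counter++;                      // direct global access
      if (i % 10000 == 0) yield();    // 'yield' is invented
    }
    document.title = 'count: ' + counter;   // direct DOM access
  });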
If we adopt WebWorkers in its current form, I think we'll just have to
deprecate it in a few years because it is clumsy and limited compared
to coroutines or future alternatives. Naturally this will mean that
all the "top 100" corporate sites that rushed to implement it will
hold up the actual deprecation for many years beyond that (aka "quirks
mode"). I have no doubt that true multi-core web applications will
happen, and I welcome that. I just don't want to see it implemented
like WebWorkers in anything like its current form.
Shannon