[whatwg] Web Workers feedback

Wed Mar 24 17:51:42 PDT 2010

On Thu, 10 Dec 2009, Simon Pieters wrote:
>
> Web Workers says in the SharedWorker constructor algorithm:
> 
> "Otherwise, if name is the empty string and there exists a 
> SharedWorkerGlobalScope object whose closing flag is false, and whose 
> location attribute is exactly equal to scriptURL, then let worker global 
> scope be that SharedWorkerGlobalScope object."
> 
> The location attribute is an object implementing WorkerLocation, which 
> is not a URL, so with a literal interpretation it will never be equal to 
> scriptURL. I guess it should say something like "whose location 
> attribute represents an absolute URL that is exactly equal to scriptURL" 
> instead.

Fixed. Thanks.

On Wed, 16 Dec 2009, Jan Fabry wrote:
> 
> Has it been considered to pass more than JSON data to workers? I could 
> not find a rationale behind this in the FAQ, or in other places I 
> looked. I understand the need for separation because of concurrency 
> issues, but aren't there other ways to accomplish this?
> 
> Would it be possible to do a deep copy of the function (object) you pass 
> to the constructor? So copy everything (or mark it for 
> copy-on-write), but remove references to DOM elements if they exist. 
> This way, I think you can create a parallel data structure, so the 
> original one remains untouched (avoiding concurrency issues).
> 
> The important difference between this and the usual JSON-serializing of 
> objects that the examples talk about, is that functions can be passed 
> through too in an easy manner. If you have to simulate this using only 
> Javascript, you have to somehow bind the free variables, which requires 
> some introspection, and thus is not easy (if even possible?) to simulate 
> in "user space".

Passing functions doesn't really work, because as you say, you have to 
figure out how to rebind variables, the global scope, etc.
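
As a rough illustration of the rebinding problem (a sketch only, with 
made-up names such as makeGetter, and using toString() plus the Function 
constructor as a stand-in for whatever serialization a real mechanism 
would use), a function loses the variables it closed over once only its 
source text is carried across:

```javascript
// Hypothetical example: makeGetter and the variable names are illustrative only.
function makeGetter() {
  const foo = 1; // private to this closure
  return function getFoo() { return foo; };
}

const getFoo = makeGetter();

// Serialize the function the only way script can today: as source text.
const source = getFoo.toString();

// Rebuild it from the text. The rebuilt function is created in the global
// scope, so it no longer sees the closure's `foo`.
const revived = new Function("return (" + source + ");")();

let outcome;
try {
  outcome = revived();
} catch (e) {
  outcome = e.name; // the free variable `foo` cannot be resolved
}

console.log(getFoo()); // the original still works
console.log(outcome);  // the copy fails, assuming no global `foo` exists
```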

On Wed, 16 Dec 2009, Drew Wilson wrote:
>
> I'm not certain what a "deep copy" of the function means - would you 
> need to copy the entire lexical scope of the function? For example, 
> let's say you do this:
> 
> var foo = 1;
> 
> function setFoo(val) { foo = val; }
> function getFoo() { return foo; }
> 
> worker.postMessage(setFoo);
> worker.postMessage(getFoo);
> 
> foo = 2;
> 
> Then, from worker code, I call the copy of getFoo() - what should it 
> return (undefined? Does it pull over a copy of foo from the original 
> lexical scope, in which case it's 1)? What if foo is defined in a 
> lexical closure that is shared by both setFoo() and getFoo() - it seems 
> like the separate copies of setFoo() and getFoo() passed to the worker 
> would need to reconstruct a shared closure on the worker side, which 
> seems difficult if not impossible.
> 
> I think that some variant of data URLs and/or eval() gets you most of 
> what you really need here without requiring extensive JS VM gymnastics.

On Wed, 16 Dec 2009, Jan Fabry wrote:
> 
> In my idea, the free variables get bound at the moment you call 
> postMessage. The worker receives two different objects, they have 
> nothing in common.
> 
> After the first postMessage, the worker receives an object, that happens 
> to be a function, bound to some invisible variable (similar to the 
> "trick" to create private variables in JS). When it is called, nothing 
> visible happens, because the foo variable is not visible on the outside.
> 
> After the second postMessage, the worker receives a new object, also a 
> function, bound to a variable with the value 1. If you call it, it 
> returns 1. If I call the function that the first postMessage delivered, 
> this does not affect the function that the second passed, since they are 
> bound to different copies of the same origin variables, and thus in 
> effect bound to different variables.
> 
> Imagine if we would create a JSONOnSteriods function. If you pass it 
> anything that a regular JSON serializer can handle, it gives the same 
> output (the regular serializer is recursive, so if you reverse the 
> process, you have in effect created a deep copy). My JSONOnSteriods 
> function would also be able to serialize functions (like you can do 
> using setFoo.toString()), but also notice the free variables and bind 
> them. If there is a client-side (userland?) function that would accept a 
> function and return the names of the free variables, I think you could 
> even simulate this in regular Javascript, but a solution in the VM would 
> be less kludgy and probably much faster.

This sounds very complicated and bug-prone!
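
To make the proposal concrete, here is a minimal userland sketch of the 
snapshot semantics described above. It assumes the caller lists the free 
variables by hand, since script has no way to discover them; packFunction 
and unpackFunction are hypothetical helpers, not part of any API, and the 
packed string stands in for what would be sent through postMessage().

```javascript
// Hypothetical helpers; the caller must name every free variable explicitly.
function packFunction(fn, freeVars) {
  // Source text plus a JSON snapshot of the free variables at this moment.
  return JSON.stringify({ source: fn.toString(), env: freeVars });
}

function unpackFunction(packed) {
  const { source, env } = JSON.parse(packed);
  const names = Object.keys(env);
  // Rebuild the function inside a fresh scope whose parameters hold the
  // snapshotted values, so each copy gets its own private variables.
  const factory = new Function(...names, "return (" + source + ");");
  return factory(...names.map((name) => env[name]));
}

let foo = 1;
function setFoo(val) { foo = val; }
function getFoo() { return foo; }

const packedSet = packFunction(setFoo, { foo }); // snapshot: foo is 1
const packedGet = packFunction(getFoo, { foo }); // snapshot: foo is 1

foo = 2; // changes only the original variable

const workerSet = unpackFunction(packedSet);
const workerGet = unpackFunction(packedGet);

workerSet(42);            // writes to workerSet's private copy only
console.log(workerGet()); // 1
console.log(foo);         // 2
```

Even in this small form the caller has to know which names are free, the 
values have to survive JSON, and the two copies silently stop sharing 
state, which is the kind of subtlety that makes the idea bug-prone.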

On Wed, 16 Dec 2009, Boris Zbarsky wrote:
> On 12/16/09 1:27 PM, Jan Fabry wrote:
> > > function setFoo(val) { foo = val; }
> > > function getFoo() { return foo; }
> ...
> > After the second postMessage, the worker receives a new object, also a
> > function, bound to a variable with the value 1.
> 
> What if getFoo were:
> 
>   function getFoo() { return this["foo"]; }
> 
> What if it were:
> 
>   function getFoo() { return this["fo" + "o"]; }
> 
> What about:
> 
>   var o = "o";
>   function getFoo() { return this["fo" + o]; }
> 
> ?

On Thu, 17 Dec 2009, Jan Fabry wrote:
> On 16 Dec 2009, at 13:47, Boris Zbarsky wrote:
> > 
> >   function getFoo() { return this["foo"]; }
> > 
> >   function getFoo() { return this["fo" + "o"]; }
> > 
> >   var o = "o";
> >   function getFoo() { return this["fo" + o]; }
> 
> These three functions are equivalent to me. They will return this.foo, 
> but 'this' is a keyword that refers to the scope the function is called 
> in, it is not a regular variable.
> 
> Ignoring web workers, say we execute the following in a current 
> Javascript environment:
> 
> getFoo.call({'foo': 'otherFoo'})
> 
> will return 'otherFoo'. 
> 
> getFoo.call({})
> 
> will return undefined.
> 
> If no scope is given, the global scope is used, and then it depends on 
> the state of the variables on the worker side. If no 'foo' variable has 
> been defined, it will return undefined.
> 
> > Maybe a better question is: What problem are you trying to solve?
> 
> I do not have a concrete problem now, but I am imagining libraries that 
> currently use the nice features of Javascript, like functions being 
> passed around as parameters, to delegate certain behavior to code 
> written by users of their libraries. It took a while before the good 
> parts of Javascript were discovered, and we are happy that they exist, 
> so I think we should try to make web workers as good as possible too.
> 
> Much of this can probably be emulated, but, as Simon said in a related 
> discussion, regarding data: urls: [ 
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-September/023195.html 
> ]
> 
> "In particular, though, I suspect that people will work around this 
> limitation by one of the means we've come up with so far, or maybe 
> people will come up with even uglier workarounds. If we remove the 
> limitation, people will have no reason to come up with ugly hacks but 
> instead use the obvious supported way to do it, and it will be easier to 
> debug and follow code."
> 
> (btw Jonathan, I think the last reply in that discussion was from Ian: [ 
> http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-October/023588.html 
> ])
> 
> 
> When this discussion is over, I want to know why it is not implemented: 
> because it leads to some undefinable situations, because it would be too 
> complicated to teach to developers what does and what doesn't work, or 
> because it is too difficult for implementors to do it right. But when I 
> look at what browsers can do these days, I have not yet seen a limit to 
> the intelligence of their developers :-)

It seems to me to be a very complicated and bug-prone idea for something 
that doesn't have concrete use cases. It might make sense to support it 
later if a clear need appears (which we determine by looking at the hacks 
that people use to work around the lack of support), but in the absence of 
such a need, I don't see much benefit to adding this.

On Wed, 16 Dec 2009, Jan Fabry wrote:
> 
> The Google Gears API seems to provide both createWorker(scriptText) and 
> createWorkerFromUrl(scriptUrl). Why was only the URL variant retained in 
> the Web Workers spec? With the script variant, there would have been at 
> least a little basis for more dynamic programming.

Based on feedback from the Gears team, the string form was dropped because 
few people needed it.

On Wed, 16 Dec 2009, Jonathan 'J5' Cook wrote:
> 
> Did anything ever come out of the earlier request-less web worker 
> discussion?

Not yet, but we're mostly waiting for authoring experience before adding 
more features. It depends what people really need, but it's hard to 
determine that without a crowd of authors trying to use what we have.

On Fri, 18 Dec 2009, JOSE MANUEL CANTERA FONSECA wrote:
> 
> I was wondering whether with HTML5 Workers it is possible to have the same 
> functionality on Cross Origin Workers as in Google Gears [1].
> 
> Any help would be appreciated
> 
> [1] http://code.google.com/intl/es-ES/apis/gears/gears_faq.html#crossOriginWorker

On Fri, 18 Dec 2009, Anne van Kesteren wrote:
> 
> FYI: They're not HTML5 Workers, just Web Workers. And no, they do not 
> have cross-origin functionality currently.

On Fri, 18 Dec 2009, JOSE MANUEL CANTERA FONSECA wrote:
> 
> Any plans for adding such functionality or alternatively are there 
> other mechanisms proposed or foreseen?

On Fri, 18 Dec 2009, Anne van Kesteren wrote:
> 
> Pretty sure we'll add it in due course. You can use postMessage() and 
> <iframe> or XMLHttpRequest (not everywhere yet) for cross-origin 
> communication if that is your main use case.

Indeed, this is something we'll probably add relatively soon. I'm mostly 
just waiting for implementations to be more widely available and used.

On Sun, 20 Dec 2009, ATSUSHI TAKAYAMA wrote:
> 
> I'm wondering if calling (postMessage-ing to) Web Workers from a worker 
> thread is possible.
> 
> The use case I have in mind is to do a recursive calculation. So we're 
> not only able to do this;
> 
> - Main Thread (waits for results from workers)
> -- Worker 1
> -- Worker 2
> -- Worker 3
> 
> but also able to do this kind of thing;
> 
> - Main Thread (waits for results from its own workers)
> -- Worker 1 (waits for results from its own workers)
> --- Worker 1-1
> --- Worker 1-2
> --- Worker 1-3 (waits for results from its own workers)
> ---- Worker 1-3-1
> ---- Worker 1-3-2
> ---- Worker 1-3-3
> -- Worker 2 (waits for results from its own workers)
> --- Worker 2-1
> --- Worker 2-2
> --- Worker 2-3
> -- Worker 3 (no more recursion)

On Mon, 21 Dec 2009, Simon Pieters wrote:
> 
> Sure. The spec has an example of this (1.2.5 Delegation).

Indeed.

On Tue, 23 Feb 2010, Simon Pieters wrote:
>
> The Web Workers spec's first example of shared workers is quite involved and 
> not so easy to follow if you haven't dealt with shared workers before. 
> For someone wanting to experiment with shared workers, it's easier to 
> grasp how things work by doing something very basic first. It would be 
> useful if the spec had an example for this.

I've added the examples you suggested. Thanks.

On Thu, 25 Feb 2010, Simon Pieters wrote:
> 
> The worker script could be modified in step 3 as follows to make it clear that
> the script is in fact shared:
> 
> test.js
> var i = 0;
> onconnect = function(e) {
>  i++;
>  var port = e.ports[0];
>  port.postMessage('hello, ' + i);
>  port.onmessage = function(e) {
>    port.postMessage('pong');
>  }
> }

I applied this idea to the demo.

On Thu, 25 Feb 2010, Drew Wilson wrote:
>
> BTW, I think it's valuable to point out in the example that 
> MessageEvent.target == the port that received the message (so we don't 
> need to use a closure as in the example below - just use 
> event.target.postMessage()).

I have mentioned this in the second "step" of this example.

> This is slightly outside the scope of this discussion, but I've heard 
> rumblings about the (w3c?) community collectively developing some sort 
> of unified test suite for HTML5 APIs like SharedWorkers. Is someone 
> driving that effort forward? I say this because I've tried to put 
> together an ad hoc test suite for the WebKit implementation, but I'm 
> sure it's missing a number of obscure cases, so having a canonical suite 
> would be really valuable to ensure compatibility.

I am not aware of any common test suite work on this, but I certainly 
encourage people interested in this to work on one! If hosting space on 
the whatwg.org site would be useful I can easily set that up.

On Thu, 25 Feb 2010, ATSUSHI TAKAYAMA wrote:
> 
> Right now, in the spec "2.7.5 Safe passing of structured data", it says
> 
> If input is another native object type (e.g. Error) Return the null 
> value.
> 
> but if we want to debug workers, it's more convenient to be able to pass 
> the error directly rather than
> 
> postMessage({name: err.name, message: err.message})
> 
> which loses all information like line number, etc. or we will just start 
> cloning every property of the Error (stack, lineNumber, stacktrace, etc 
> depending on the browser).
> 
> I think that's an unnecessary chore for all developers. We should just 
> be able to postMessage an error.
> 
> (of course, the best solution would be to be able to console.debug from 
> a worker thread, but it's not a standard way yet)

On Fri, 26 Feb 2010, Drew Wilson wrote:
>
> BTW, the spec says that unhandled exceptions are either propagated to 
> the parent Worker object (in the case of dedicated workers) or reported 
> to the user via the console (for shared workers).
> 
> So I'm not certain why passing Error objects via postMessage() would be 
> necessary for a spec-compliant UA (note that some UAs have bugs in their 
> implementation such that not all exceptions in Workers are logged to the 
> console, but that shouldn't motivate a change in the spec).

On Mon, 1 Mar 2010, Simon Pieters wrote:
> On Sun, 28 Feb 2010 23:03:31 +0100, ATSUSHI TAKAYAMA 
> <taka.atsushi at googlemail.com> wrote:
> > 
> > This is good to know. As far as I tested, Firefox and Safari actually 
> > support worker.onerror.
> > 
> > It also turns out that Firefox and Chromium actually send a clone of 
> > an Error object. Safari turns it into a string. Are they going to 
> > change the behavior in the future?
> 
> Internal builds of Opera convert Error to null as the spec says, but 
> we'd be happy to see the spec changed to say that Error, DOMException, 
> and similar things get cloned.
> 
> If you want to make sure your current code will continue to work, you 
> should probably toString() explicitly before postMessage()ing.

I haven't changed this. Exception objects are somewhat magical in some 
languages and it seems unwise for the spec to try to clone that magic 
across thread or process boundaries. We could, however, define some 
simpler cloning mechanism that, for Error and DOMException objects, clones 
only specific fields, returning a simple Object with specific properties 
(in particular, 'message'). However, it's not clear to me what the use 
case is here. As others have noted, for debugging, the error information 
is already expected to be transmitted across without any help from the 
author -- and debugging tools are likely to be much more useful here than 
being able to send Errors easily.
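
For authors who do want to forward error details by hand, a minimal sketch 
of the "specific fields only" idea might look like the following. The 
helper name errorToPlainObject is made up for illustration, the JSON round 
trip merely stands in for postMessage(), and the 'stack' property is 
non-standard, so it is copied only when the engine provides it as a string.

```javascript
// Hypothetical helper: copy a fixed set of fields into a plain Object.
function errorToPlainObject(err) {
  return {
    name: String(err.name),
    message: String(err.message),
    // 'stack' is engine-specific; fall back to null when it is missing.
    stack: typeof err.stack === "string" ? err.stack : null
  };
}

const original = new RangeError("index out of bounds");
const payload = errorToPlainObject(original);

// Stand-in for worker.postMessage(payload): a plain Object made of strings
// survives serialization, so the receiver sees the same fields.
const received = JSON.parse(JSON.stringify(payload));

console.log(received.name);    // "RangeError"
console.log(received.message); // "index out of bounds"
```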

On Mon, 1 Mar 2010, ben turner wrote:
> 
> I'm implementing the structured clone algorithm and this part bothers me 
> a little bit:
> 
>   - If input is a host object (e.g. a DOM node)
>       Return the null value.
> 
> Seems like this has the potential to confuse web programmers somewhat. 
> If I were to write code like this:
> 
>   worker.postMessage(window);
> 
> I would expect something meaningful to happen as long as no exception 
> was generated. According to the spec, though, we would send null to the 
> worker and not generate any exception. Is that really desirable?
> 
> I like the idea of making the structured clone as friendly as possible 
> but maybe we should add some teeth to this case just like we do for 
> recursive objects?

We can't send "true" DOM objects across the divide, since implementations 
don't support DOMs in their worker implementations. I'm not really sure 
what else we should do. If we do something other than send null today, 
it's going to make it very difficult to later start cloning real DOM 
objects if we ever support that.

On Thu, 11 Mar 2010, Mikko Rantalainen wrote:
> timeless wrote:
> > On Tue, Mar 2, 2010 at 12:50 AM, ben turner <bent at mozilla.com> wrote:
> >>  - If input is a host object (e.g. a DOM node)
> >>      Return the null value.
> > 
> > The general reason, I believe for this behavior is if you have:
> > 
> > a=[x,y,z,q,r,s]; worker.postMessage(a) and r turns out to be window, 
> > you don't want to trigger an exception just because one value in a 
> > list is a native object.
> 
> Why do you think so? I'd expect an exception instead of potential data 
> loss (due to not being able to post the actual data to the worker). 
> I'd be happy to filter the "r" out of the list if I need to, but I'd 
> hate to try to figure why *some* of the data I was posting does not show 
> up at the worker. Obviously, if I know that I cannot post "r" and I 
> don't want to do the filtering myself, it would be nice to have an extra 
> parameter for postMessage() telling that it's okay to drop some data if 
> it cannot be transferred but that should not be the default. However, I 
> would consider that a special case and API could do just fine without 
> such feature.

We could throw an exception, but that would make migrating from this not 
being supported to this being supported later a lot harder (you'd have to 
catch exceptions and then remove the nodes, rather than just doing null 
checks in the worker). I don't know that that's worth it.
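
The trade-off can be sketched with a deliberately simplified model of the 
rule quoted above. This is not the structured clone algorithm: modelClone 
and FakeHostObject are invented for illustration, cycles are not handled, 
and every object that is not a plain Object or Array (including types the 
real algorithm does clone) is treated as if it were a host object.

```javascript
// Simplified model only: primitives pass through, arrays and plain objects
// are copied recursively, and anything else becomes null.
function modelClone(input) {
  const type = typeof input;
  if (input === null || (type !== "object" && type !== "function")) {
    return input;
  }
  if (Array.isArray(input)) {
    return input.map(modelClone);
  }
  if (Object.getPrototypeOf(input) === Object.prototype) {
    const out = {};
    for (const key of Object.keys(input)) {
      out[key] = modelClone(input[key]);
    }
    return out;
  }
  return null; // stand-in for "host object: return the null value"
}

// Stand-in for something like `window` or a DOM node.
class FakeHostObject {}

const a = [1, "two", new FakeHostObject(), { ok: true }];
const received = modelClone(a);

// Receiver side: a null check is enough to skip what could not be sent.
const usable = received.filter((item) => item !== null);

console.log(received[2]);   // null
console.log(usable.length); // 3
```

With an exception instead of null, the sender would have to strip such 
values before posting, and that code would need changing again if these 
objects ever became cloneable.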

On Mon, 15 Mar 2010, ATSUSHI TAKAYAMA wrote:
> > 
> > The only option that comes to mind that doesn't expose compatibility 
> > issues would be to only issue onclose events if close() is explicitly 
> > called on the entangled port, but if you're doing that you might as 
> > well just have the code calling close() post a "I'm closing" message 
> > first. -atw
> 
> This would mean that all web pages using a SharedWorker (and keeping a 
> reference to the MessagePort inside the SharedWorker) have to set "unload" 
> event handlers to call port.close() so that references to the ports in 
> the SharedWorker don't get accumulated. That is not desirable. 
> "pagehide" handler may not be sufficient for this purpose.
> 
> I think, as Hixie suggested, an array-like object to track references of 
> all connected ports would be nice *for SharedWorkers*. For ports created 
> dynamically by new MessageChannel, it doesn't seem to work well.

Well, an array mechanism could just be a generic "port array" that you can 
add and remove ports from dynamically, it need not be just for received 
ports. I haven't added this yet, but would appreciate more feedback from 
vendors on whether they think it's worth adding this now or waiting.
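
As a purely hypothetical sketch of what such a generic "port array" might 
feel like to script (PortList and its methods are invented here, nothing 
like this is specified, and the fake ports only imitate postMessage()):

```javascript
// Hypothetical container: ports can be added and removed dynamically.
class PortList {
  constructor() {
    this.ports = new Set();
  }
  add(port) {
    this.ports.add(port);
  }
  remove(port) {
    return this.ports.delete(port);
  }
  broadcast(message) {
    for (const port of this.ports) {
      port.postMessage(message);
    }
  }
  get size() {
    return this.ports.size;
  }
}

// Fake ports for illustration; real code would use e.ports[0] from the
// connect event or the ports of a MessageChannel.
function makeFakePort(log, label) {
  return { postMessage(message) { log.push(label + ":" + message); } };
}

const log = [];
const ports = new PortList();
const portA = makeFakePort(log, "a");
const portB = makeFakePort(log, "b");

ports.add(portA);
ports.add(portB);
ports.broadcast("hello");

ports.remove(portA); // in this sketch removal is manual
ports.broadcast("bye");

console.log(log); // ["a:hello", "b:hello", "b:bye"]
```

The sketch leaves open the actual problem from this thread: script cannot 
tell when the other side of a port has gone away, so the useful part of a 
built-in mechanism would be the user agent dropping dead ports itself.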

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'