[html5] JSON parsing in Web Worker

Tue Dec 28 11:21:20 PST 2010

Drew,

I tested Safari 5.0.2 (6533.18.5), and while it's one of the faster
browsers out there, my tests show that parsing a 650 KB JSON string takes
3x longer when I use a web worker than when I parse it in the main
thread.

Parsing alone takes an equivalent amount of time in both cases; it's the
async messaging, and mainly the transfer of data back from the worker,
that adds the 2x overhead.
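
A minimal sketch of how the parse cost and the copy cost can be told
apart on the main thread. The payload generator and the helper names
(makePayload, timed) are made up for illustration, and the
serialize-and-reparse step is only a stand-in for whatever copying the
browser does internally when a message crosses threads; it is not a
measurement of the real worker messaging path.

```javascript
// Build a JSON string of roughly the requested size (hypothetical test data).
function makePayload(targetChars) {
  var items = [];
  var size = 2; // the enclosing brackets
  for (var i = 0; size < targetChars; i++) {
    var item = { id: i, name: 'item-' + i, tags: ['a', 'b', 'c'], value: i * 1.5 };
    size += JSON.stringify(item).length + 1; // +1 for the separating comma
    items.push(item);
  }
  return JSON.stringify(items);
}

// Run fn once and return how long it took plus what it returned.
function timed(fn) {
  var start = Date.now();
  var value = fn();
  return { ms: Date.now() - start, value: value };
}

var text = makePayload(650 * 1024);

// Cost of parsing alone.
var parse = timed(function () { return JSON.parse(text); });

// Assumed copy strategy: serialize the parsed object and parse it again.
var copy = timed(function () { return JSON.parse(JSON.stringify(parse.value)); });

console.log('payload length: ' + text.length + ' chars');
console.log('parse: ' + parse.ms + ' ms, simulated copy: ' + copy.ms + ' ms');
```

Date.now() only has millisecond resolution, so single runs on a payload
this size are noisy; repeating each step and averaging would give more
trustworthy numbers.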

I use JSON.parse to do the parsing, and while this method is snappy,
with payloads bigger than 500 KB I can make the UI freeze just long
enough to be noticeable.

I think what I really want is for JSON.parse to be implemented as
async and executed in its own thread. I would then just pass in a
callback that would handle the parsed object when it's ready. Web
workers get pretty close to allowing me to do something similar, but
the messaging overhead is killing all the benefits I'm getting from
the async parsing in the worker thread.
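
Roughly, the callback-style interface described above could be layered
on a worker like this. It is only a sketch: createAsyncParser,
handleParseRequest, the { id, text } message shape and the
parse-worker.js file name are all hypothetical, and the parsed result is
still copied on its way back to the main thread, so this does not remove
the overhead, it only hides the plumbing.

```javascript
// Main-thread side: wrap a worker-like object (anything with postMessage
// and an onmessage slot) and hand back a callback-style parse function.
function createAsyncParser(worker) {
  var nextId = 0;
  var pending = {}; // request id -> callback waiting for the reply

  worker.onmessage = function (event) {
    var msg = event.data;
    var callback = pending[msg.id];
    if (!callback) return; // reply for an unknown request, ignore it
    delete pending[msg.id];
    if ('error' in msg) {
      callback(new Error(msg.error));
    } else {
      callback(null, msg.result);
    }
  };

  return function parseAsync(text, callback) {
    var id = nextId++;
    pending[id] = callback;
    worker.postMessage({ id: id, text: text });
  };
}

// Worker side: turn one request into one reply. Parse errors are reported
// as a string because exceptions do not cross the thread boundary as-is.
function handleParseRequest(data) {
  try {
    return { id: data.id, result: JSON.parse(data.text) };
  } catch (e) {
    return { id: data.id, error: String(e.message) };
  }
}

// In a browser, the worker script (hypothetical parse-worker.js) would
// contain handleParseRequest plus:
//   self.onmessage = function (event) {
//     self.postMessage(handleParseRequest(event.data));
//   };
// and the page would do:
//   var parseAsync = createAsyncParser(new Worker('parse-worker.js'));
//   parseAsync(jsonText, function (err, obj) { /* use obj */ });
```

The request id lets several parses be in flight at once without the
replies getting mixed up.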

/i

On Tue, Dec 28, 2010 at 10:51 AM, Drew Wilson <atwilson at chromium.org> wrote:
> Hi Igor,
> Objects passed via message ports (including the intrinsic port for dedicated
> workers) are cloned. I can't speak for other implementations, but in WebKit
> I believe cloned objects aren't JSON encoded/decoded, but instead there is
> another native mechanism for cloning these objects that will likely be
> faster than JSON encoding.
> That said, I'm not sure that "parsing large JSON files" is the best
> WebWorker use case, depending on how you're doing the parsing and how large
> the files are.
> -atw
>
> On Tue, Dec 28, 2010 at 10:35 AM, Igor Minar <iiminar at gmail.com> wrote:
>>
>> Hello,
>>
>> I'm exploring the possibilities of using web workers for parsing large
>> JSON files outside of the main UI thread.
>>
>> I found several references that this could be one of the use cases for
>> web workers (e.g. O'Reilly's intro to web workers [1]). However, the
>> more I read about web workers, the less attractive they are for this
>> purpose, mainly because of how data is passed from the worker to the
>> main thread.
>>
>> Please correct me if I'm wrong, but my understanding is that any data
>> returned in the message from the worker is copied rather than
>> shared, and it seems that this is often implemented by serializing the
>> data into a JSON string and then deserializing it in the main script.
>> Is this right? Because if it is, then what's the point of parsing the
>> JSON string in the worker thread, just to serialize it and then parse
>> it again in the main thread?
>>
>> I'd love to be wrong about this because the concept of workers looks
>> like a perfect match for my use case (parsing large JSON payloads
>> quickly without affecting the UI), but my trivial microbenchmarks show
>> that the overhead of passing the data to, as well as from, the
>> web worker is just too big to use it for this purpose.
>>
>> thanks,
>> Igor
>>
>>
>> [1] http://answers.oreilly.com/topic/1358-introducing-the-web-workers-api/
>> _______________________________________________
>> Help mailing list
>> Help at lists.whatwg.org
>> http://lists.whatwg.org/listinfo.cgi/help-whatwg.org
>
>