[html5] JSON parsing in Web Worker

Igor Minar iiminar at gmail.com
Wed Dec 29 23:16:50 PST 2010


I modified the code to deal with most/all of the issues below.

I still see significant overhead in all browsers:

chrome 9: +240%
chrome 10: +180%
safari 5: +125%
firefox 4beta: +61%
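
A sketch of the arithmetic, assuming each figure is the extra time of the async path relative to the sync path, i.e. (async - sync) / sync. The function name is made up for illustration. The sample timings are the Chrome 9 "took:" values from the earlier run quoted further down; they predate the harness changes, so the result differs from the list above.

```javascript
// Relative overhead of async parsing over sync parsing, in percent.
// Assumption: "overhead" means (async - sync) / sync.
function overheadPercent(syncMs, asyncMs) {
  if (!(syncMs > 0)) throw new RangeError('syncMs must be positive');
  return ((asyncMs - syncMs) / syncMs) * 100;
}

// Chrome 9, earlier run: the sync test logged 187 and 193 ms,
// the preinitialized worker with inlined payload logged 341 ms.
const syncMs = (187 + 193) / 2;
const chrome9 = overheadPercent(syncMs, 341);
console.log('chrome 9 (earlier run): +' + chrome9.toFixed(0) + '%');
// prints "chrome 9 (earlier run): +79%"
```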

Here I'm comparing synchronous parsing to async parsing implemented
via a preinitialized web worker (Test #1 vs Test #5 in my code [1]).
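
"Preinitialized" here means the worker is already running before the clock starts. One way to arrange that is sketched below. It is not the code in [1]: the helper name, the message shapes ({type: 'ready'}, {id, text}, {id, result}) and the worker file name in the comment are all assumptions made for illustration.

```javascript
// Hypothetical helper: wraps a worker-like object (anything exposing
// postMessage and an onmessage slot) and hands the client to onReady
// only after the worker has announced that it is up.
// Assumed protocol: the worker posts {type: 'ready'} once, then answers
// each {id, text} request with {id, result}.
function createJsonClient(worker, onReady) {
  var nextId = 0;
  var callbacks = {}; // request id -> callback waiting for the result

  var client = {
    parse: function (text, callback) {
      var id = nextId++;
      callbacks[id] = callback;
      worker.postMessage({ id: id, text: text });
    }
  };

  worker.onmessage = function (event) {
    var msg = event.data;
    if (msg.type === 'ready') {
      onReady(client); // safe to start timing from here
      return;
    }
    var callback = callbacks[msg.id];
    delete callbacks[msg.id];
    if (callback) callback(msg.result);
  };

  return client;
}

// Browser usage (assumed file name and payload variable):
// createJsonClient(new Worker('jsonWorker.js'), function (client) {
//   var start = Date.now();
//   client.parse(payload, function (obj) {
//     console.log('took: ' + (Date.now() - start));
//   });
// });
```

Starting the timer inside onReady keeps worker startup out of the measured duration.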

I'm going to play with this a bit more tomorrow, but at the moment it
seems that JSON parsing is one of the activities that should not be
done in a web worker, except in cases when the parsed object doesn't
need to be passed back to the main thread.
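
The exception could look roughly like the sketch below: the worker parses the payload and posts back only a small derived value, so the large object is never cloned across the thread boundary. The function name and the shape of the summary are hypothetical.

```javascript
// Hypothetical worker-side handler: parse inside the worker and return
// only a small summary instead of the whole parsed object.
function summarize(text) {
  var data = JSON.parse(text);
  var isContainer = typeof data === 'object' && data !== null;
  return {
    chars: text.length, // UTF-16 code units, not bytes
    topLevelEntries: isContainer ? Object.keys(data).length : 0
  };
}

// Wire the handler up only when this script really runs inside a worker.
if (typeof WorkerGlobalScope !== 'undefined' && self instanceof WorkerGlobalScope) {
  self.onmessage = function (event) {
    self.postMessage(summarize(event.data));
  };
}
```

Only the two-field summary crosses back to the main thread, so the return trip stays cheap regardless of payload size.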

/i


[1] https://github.com/IgorMinar/angular.js/blob/json-webworker/perf/jsonPerfSpec.js


On Tue, Dec 28, 2010 at 8:22 PM, Igor Minar <iiminar at gmail.com> wrote:
> Ricardo, Drew,
>
> My code is here:
> https://github.com/IgorMinar/angular.js/blob/json-webworker/perf/jsonPerfSpec.js
>
> The harness is not perfect, but it should be good enough. The code is
> exploratory and still a work in progress :)
>
> The main issues are:
> - I should wait for a signal from the worker that it's ready before
> I send the first request to it. If initializing the worker takes a
> long time, I might be partially including the startup time in the
> measured duration.
> - I should repeat the test a hundred or several hundred times and
> calculate the average, and possibly use bigger payloads or a slower
> computer, because the results I'm seeing are in the tens to hundreds
> of ms range.
> - I'm using JS Test Driver as my harness, so the text output sometimes
> looks weird or doesn't make sense. I look at the times printed on the
> "[LOG] took:" lines.
>
>
> Out of all the tests, Test #1 and Test #5 are the most interesting.
>
> #1 tests synchronous parsing and #5 tests async parsing where the
> web worker processes a payload that originates in the worker context
> (simulating an XHR executed from within the worker, with the worker
> parsing the response before handing it over to the main thread).
>
>
> Currently I'm getting results like these:
>
> Total 16 tests (Passed: 16; Fails: 0; Errors: 0) (11586.00 ms)
>
>  Safari 533.18.5 Mac OS: Run 8 tests (Passed: 8; Fails: 0; Errors 0)
> (6447.00 ms)
>    json.test that it Test #0: native json passed (5878.00 ms)
>      [LOG] 58.74 ms per iteration
>    Test #1: Synchronous Json parser.testParsing passed (42.00 ms)
>      [LOG] took: 19
>      [LOG] took: 27
>    Test #2: WebWorker Json parser.test passed (85.00 ms)
>      [LOG] took: 82
>    Test #3: Preinitialized WebWorker Json parser.test passed (104.00 ms)
>      [LOG] took: 101
>    Test #4: WebWorker Json parser with inlined payload.test passed (110.00 ms)
>      [LOG] took: 107
>    Test #5: Preinitialized WebWorker Json parser with inlined
> payload.test passed (95.00 ms)
>      [LOG] took: 92
>    Test #6: WebWorker Json parser with inlined payload without return
> value.test passed (69.00 ms)
>      [LOG] took: 66
>    Test #7: Preinitialized WebWorker Json parser with inlined payload
> without return value.test passed (64.00 ms)
>      [LOG] took: 61
>
>  Chrome 9.0.572.0 Mac OS: Run 8 tests (Passed: 8; Fails: 0; Errors 0)
> (11586.00 ms)
>    json.test that it Test #0: native json passed (9260.00 ms)
>      [LOG] 92.59 ms per iteration
>    Test #1: Synchronous Json parser.testParsing passed (198.00 ms)
>      [LOG] took: 187
>      [LOG] took: 193
>    Test #2: WebWorker Json parser.test passed (554.00 ms)
>      [LOG] took: 551
>    Test #3: Preinitialized WebWorker Json parser.test passed (297.00 ms)
>      [LOG] took: 294
>    Test #4: WebWorker Json parser with inlined payload.test passed (459.00 ms)
>      [LOG] took: 457
>    Test #5: Preinitialized WebWorker Json parser with inlined
> payload.test passed (344.00 ms)
>      [LOG] took: 341
>    Test #6: WebWorker Json parser with inlined payload without return
> value.test passed (232.00 ms)
>      [LOG] took: 230
>    Test #7: Preinitialized WebWorker Json parser with inlined payload
> without return value.test passed (242.00 ms)
>      [LOG] took: 240
>
> cheers,
> Igor
>
>
>
> On Tue, Dec 28, 2010 at 5:06 PM, Drew Wilson <atwilson at chromium.org> wrote:
>> Forgive what's probably a very naive suggestion, but I'm assuming you're
>> measuring just the parse + messaging time, and not the thread startup time
>> in your 3x measurement below (i.e. you're doing the measurements on an
>> already-running worker)?
>> -atw
>>
>> On Tue, Dec 28, 2010 at 11:21 AM, Igor Minar <iiminar at gmail.com> wrote:
>>>
>>> Drew,
>>>
>>> I tested Safari 5.0.2 (6533.18.5) and while it's one of the faster
>>> browsers out there, my tests show that parsing a 650 KB JSON string
>>> takes 3x longer when I use a web worker than when I parse it in the
>>> main thread.
>>>
>>> Parsing alone takes an equivalent amount of time; it's the async
>>> messaging, and mainly the transfer of data from the worker, that adds
>>> the 2x overhead.
>>>
>>> I use JSON.parse to do the parsing, and while this method is snappy,
>>> with payloads bigger than 500kb, I can make the UI freeze just long
>>> enough to make it noticeable.
>>>
>>> I think what I really want is for JSON.parse to be implemented as
>>> async and executed in its own thread. I would then just pass in a
>>> callback that would handle the parsed object when it's ready. Web
>>> workers get pretty close to allowing me to do something similar, but
>>> the messaging overhead is killing all the benefits I'm getting from
>>> the async parsing in the worker thread.
>>>
>>> /i
>>>
>>>
>>>
>>> On Tue, Dec 28, 2010 at 10:51 AM, Drew Wilson <atwilson at chromium.org> wrote:
>>> > Hi Igor,
>>> > Objects passed via message ports (including the intrinsic port for
>>> > dedicated workers) are cloned. I can't speak for other
>>> > implementations, but in WebKit I believe cloned objects aren't JSON
>>> > encoded/decoded, but instead there is another native mechanism for
>>> > cloning these objects that will likely be faster than JSON encoding.
>>> > That said, I'm not sure that "parsing large JSON files" is the best
>>> > WebWorker use case, depending on how you're doing the parsing and how
>>> > large the files are.
>>> > -atw
>>> >
>>> > On Tue, Dec 28, 2010 at 10:35 AM, Igor Minar <iiminar at gmail.com> wrote:
>>> >>
>>> >> Hello,
>>> >>
>>> >> I'm exploring the possibilities of using web workers for parsing large
>>> >> JSON files outside of the main UI thread.
>>> >>
>>> >> I found several references that this could be one of the use cases for
>>> >> web workers (e.g. O'Reilly's intro to web workers [1]). However, the
>>> >> more I read about web workers, the less attractive they are for this
>>> >> purpose, mainly because of how data is passed from the worker to the
>>> >> main thread.
>>> >>
>>> >> Please correct me if I'm wrong, but my understanding is that any data
>>> >> that is returned in the message from the worker is copied rather than
>>> >> shared, and it seems that this is often implemented by serializing the
>>> >> data into a JSON string and then deserializing it in the main script.
>>> >> Is this right? Because if it is, then what's the point of parsing the
>>> >> JSON string in the worker thread, just to serialize it and then parse
>>> >> it again in the main thread?
>>> >>
>>> >> I'd love to be wrong about this because the concept of workers looks
>>> >> like a perfect match for my use case (parsing large json payloads
>>> >> quickly without affecting the UI), but my trivial microbenchmarks show
>>> >> that the overhead of passing the data to, as well as from the
>>> >> webworker is just too big to use it for this purpose.
>>> >>
>>> >> thanks,
>>> >> Igor
>>> >>
>>> >>
>>> >> [1]
>>> >> http://answers.oreilly.com/topic/1358-introducing-the-web-workers-api/
>>> >> _______________________________________________
>>> >> Help mailing list
>>> >> Help at lists.whatwg.org
>>> >> http://lists.whatwg.org/listinfo.cgi/help-whatwg.org
>>> >
>>> >
>>
>>
>


