[html5] JSON parsing in Web Worker

Igor Minar iiminar at gmail.com
Tue Dec 28 20:22:05 PST 2010


Ricardo, Drew,

My code is here:
https://github.com/IgorMinar/angular.js/blob/json-webworker/perf/jsonPerfSpec.js

The harness is not perfect, but it should be good enough. The code is
primarily exploratory and still a work in progress :)

The main issues are:
- I should wait for a signal from the worker that it's ready before I
send the first request to it. If initializing the worker takes a long
time, I might be partially including the startup time in the duration.
- I should repeat the test 100 times or several hundred times and
calculate the average, and possibly use bigger payloads or a slower
computer, because the results I'm seeing are in the tens to hundreds of
ms range.
- I'm using JS Test Driver as my harness, so the text output sometimes
looks weird or doesn't make sense. I look at the times printed on the
"[LOG] took:" lines.
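
For the first two issues, something along these lines might work. This
is only a sketch and not what jsonPerfSpec.js does: the "ready" message
and the helper names (waitForReady, timeRoundTrip, averageRoundTrip,
mean) are made up for illustration, it assumes the worker answers each
request with exactly one message, and it relies on Promises and async
functions, so it needs a modern browser.

```javascript
// Hypothetical helper: resolves once the worker posts the string "ready",
// so worker startup cost stays out of the measured durations.
function waitForReady(worker) {
  return new Promise(function (resolve) {
    function onMessage(event) {
      if (event.data === 'ready') {
        worker.removeEventListener('message', onMessage);
        resolve();
      }
    }
    worker.addEventListener('message', onMessage);
  });
}

// Hypothetical helper: sends one payload and resolves with the elapsed
// milliseconds when the next message comes back from the worker.
function timeRoundTrip(worker, payload) {
  return new Promise(function (resolve) {
    var start = Date.now();
    function onMessage() {
      worker.removeEventListener('message', onMessage);
      resolve(Date.now() - start);
    }
    worker.addEventListener('message', onMessage);
    worker.postMessage(payload);
  });
}

// Arithmetic mean of an array of numbers.
function mean(samples) {
  var total = 0;
  for (var i = 0; i < samples.length; i++) {
    total += samples[i];
  }
  return total / samples.length;
}

// Waits for the ready signal, then repeats the round trip `runs` times
// and resolves with the average duration in milliseconds.
async function averageRoundTrip(worker, payload, runs) {
  await waitForReady(worker);
  var samples = [];
  for (var i = 0; i < runs; i++) {
    samples.push(await timeRoundTrip(worker, payload));
  }
  return mean(samples);
}
```

The worker script would have to call postMessage('ready') once it has
finished loading, and averageRoundTrip should be called right after the
worker is created so that the ready message is not missed.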


Out of all the tests, Test #1 and Test #5 are the most interesting.

#1 tests synchronous parsing and #5 tests async parsing, where the
web worker processes a payload that originates in the worker context
(simulating an XHR executed from within the worker, with the worker
parsing the response before handing it over to the main thread).
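
Roughly, the worker side of the Test #5 scenario has the following
shape. This is an illustrative sketch rather than the code in the
repository: makeParseHandler, loadText and post are hypothetical names,
with loadText standing in for whatever produces the JSON text inside
the worker and post standing in for the worker's postMessage.

```javascript
// Hypothetical helper: builds a message handler for the worker.
// `loadText` returns JSON text that originates inside the worker
// (for example the responseText of an XHR made by the worker).
// `post` hands the parsed value back to the main thread.
function makeParseHandler(loadText, post) {
  return function () {
    var text = loadText();
    var parsed = JSON.parse(text); // parsing happens off the main thread
    post(parsed);                  // the value is cloned when it is posted
  };
}

// Inside a real worker script this might be wired up as:
//   self.onmessage = makeParseHandler(loadText, function (value) {
//     self.postMessage(value);
//   });
```

Only the request and the parsed result cross the thread boundary here;
the JSON text itself never has to be sent to the worker.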


Currently I'm getting results like these:

Total 16 tests (Passed: 16; Fails: 0; Errors: 0) (11586.00 ms)

  Safari 533.18.5 Mac OS: Run 8 tests (Passed: 8; Fails: 0; Errors 0) (6447.00 ms)
    json.test that it Test #0: native json passed (5878.00 ms)
      [LOG] 58.74 ms per iteration
    Test #1: Synchronous Json parser.testParsing passed (42.00 ms)
      [LOG] took: 19
      [LOG] took: 27
    Test #2: WebWorker Json parser.test passed (85.00 ms)
      [LOG] took: 82
    Test #3: Preinitialized WebWorker Json parser.test passed (104.00 ms)
      [LOG] took: 101
    Test #4: WebWorker Json parser with inlined payload.test passed (110.00 ms)
      [LOG] took: 107
    Test #5: Preinitialized WebWorker Json parser with inlined payload.test passed (95.00 ms)
      [LOG] took: 92
    Test #6: WebWorker Json parser with inlined payload without return value.test passed (69.00 ms)
      [LOG] took: 66
    Test #7: Preinitialized WebWorker Json parser with inlined payload without return value.test passed (64.00 ms)
      [LOG] took: 61

  Chrome 9.0.572.0 Mac OS: Run 8 tests (Passed: 8; Fails: 0; Errors 0) (11586.00 ms)
    json.test that it Test #0: native json passed (9260.00 ms)
      [LOG] 92.59 ms per iteration
    Test #1: Synchronous Json parser.testParsing passed (198.00 ms)
      [LOG] took: 187
      [LOG] took: 193
    Test #2: WebWorker Json parser.test passed (554.00 ms)
      [LOG] took: 551
    Test #3: Preinitialized WebWorker Json parser.test passed (297.00 ms)
      [LOG] took: 294
    Test #4: WebWorker Json parser with inlined payload.test passed (459.00 ms)
      [LOG] took: 457
    Test #5: Preinitialized WebWorker Json parser with inlined payload.test passed (344.00 ms)
      [LOG] took: 341
    Test #6: WebWorker Json parser with inlined payload without return value.test passed (232.00 ms)
      [LOG] took: 230
    Test #7: Preinitialized WebWorker Json parser with inlined payload without return value.test passed (242.00 ms)
      [LOG] took: 240

cheers,
Igor



On Tue, Dec 28, 2010 at 5:06 PM, Drew Wilson <atwilson at chromium.org> wrote:
> Forgive what's probably a very naive suggestion, but I'm assuming you're
> measuring just the parse + messaging time, and not the thread startup time
> in your 3x measurement below (i.e. you're doing the measurements on an
> already-running worker)?
> -atw
>
> On Tue, Dec 28, 2010 at 11:21 AM, Igor Minar <iiminar at gmail.com> wrote:
>>
>> Drew,
>>
>> I tested Safari 5.0.2 (6533.18.5) and while it's one of the faster
>> browsers out there, my tests show that parsing a 650kb JSON string
>> takes 3x longer when I use a web worker than when I parse it in the
>> main thread.
>>
>> Parsing alone takes an equivalent amount of time; it's the async
>> messaging, and mainly the transfer of data from the worker, that adds
>> the 2x overhead.
>>
>> I use JSON.parse to do the parsing, and while this method is snappy,
>> with payloads bigger than 500kb, I can make the UI freeze just long
>> enough to make it noticeable.
>>
>> I think what I really want is for JSON.parse to be implemented as
>> async and executed in its own thread. I would then just pass in a
>> callback that would handle the parsed object when it's ready. Web
>> workers get pretty close to allowing me to do something similar, but
>> the messaging overhead is killing all the benefits I'm getting from
>> the async parsing in the worker thread.
>>
>> /i
>>
>>
>>
>> On Tue, Dec 28, 2010 at 10:51 AM, Drew Wilson <atwilson at chromium.org> wrote:
>> > Hi Igor,
>> > Objects passed via message ports (including the intrinsic port for
>> > dedicated workers) are cloned. I can't speak for other
>> > implementations, but in WebKit I believe cloned objects aren't JSON
>> > encoded/decoded, but instead there is another native mechanism for
>> > cloning these objects that will likely be faster than JSON encoding.
>> > That said, I'm not sure that "parsing large JSON files" is the best
>> > WebWorker use case, depending on how you're doing the parsing and
>> > how large the files are.
>> > -atw
>> >
>> > On Tue, Dec 28, 2010 at 10:35 AM, Igor Minar <iiminar at gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I'm exploring the possibilities of using web workers for parsing large
>> >> JSON files outside of the main UI thread.
>> >>
>> >> I found several references that this could be one of the use cases for
>> >> web workers (e.g. oreilly's intro to web workers [1]). However, the
>> >> more I read about webworkers, the less attractive they are for this
>> >> purpose, mainly because of how data is passed from worker to the main
>> >> thread.
>> >>
>> >> Please correct me if I'm wrong, but my understanding is that any data
>> >> that is returned in the message from the worker is copied rather than
>> >> shared, and it seems that this is often implemented by serializing the
>> >> data into a JSON string and then deserializing it in the main script.
>> >> Is this right? Because if it is, then what's the point of parsing the
>> >> JSON string in the worker thread, just to serialize it and then parse
>> >> it again in the main thread?
>> >>
>> >> I'd love to be wrong about this because the concept of workers looks
>> >> like a perfect match for my use case (parsing large json payloads
>> >> quickly without affecting the UI), but my trivial microbenchmarks show
>> >> that the overhead of passing the data to, as well as from the
>> >> webworker is just too big to use it for this purpose.
>> >>
>> >> thanks,
>> >> Igor
>> >>
>> >>
>> >> [1]
>> >> http://answers.oreilly.com/topic/1358-introducing-the-web-workers-api/
>> >> _______________________________________________
>> >> Help mailing list
>> >> Help at lists.whatwg.org
>> >> http://lists.whatwg.org/listinfo.cgi/help-whatwg.org
>> >
>> >
>
>


