[whatwg] Structured clone algorithm on LocalStorage

Darin Fisher darin at chromium.org
Thu Sep 24 10:52:16 PDT 2009


On Thu, Sep 24, 2009 at 10:40 AM, Jonas Sicking <jonas at sicking.cc> wrote:

> On Thu, Sep 24, 2009 at 1:17 AM, Darin Fisher <darin at chromium.org> wrote:
> > On Thu, Sep 24, 2009 at 12:20 AM, Jonas Sicking <jonas at sicking.cc>
> wrote:
> >>
> >> On Wed, Sep 23, 2009 at 10:19 PM, Darin Fisher <darin at chromium.org>
> wrote:
> >> >
> >> >
> >> > On Wed, Sep 23, 2009 at 8:10 PM, Jonas Sicking <jonas at sicking.cc>
> wrote:
> >> >>
> >> >> On Wed, Sep 23, 2009 at 3:29 PM, Jeremy Orlow <jorlow at chromium.org>
> >> >> wrote:
> >> >> > On Wed, Sep 23, 2009 at 3:15 PM, Jonas Sicking <jonas at sicking.cc>
> >> >> > wrote:
> >> >> >>
> >> >> >> On Wed, Sep 23, 2009 at 2:53 PM, Brett Cannon <brett at python.org>
> >> >> >> wrote:
> >> >> >> > On Wed, Sep 23, 2009 at 13:35, Jeremy Orlow <
> jorlow at chromium.org>
> >> >> >> > wrote:
> >> >> >> >> What are the use cases for wanting to store data beyond strings
> >> >> >> >> (and
> >> >> >> >> what
> >> >> >> >> can be serialized into strings) in LocalStorage?  I can't think
> >> >> >> >> of
> >> >> >> >> any
> >> >> >> >> that
> >> >> >> >> outweigh the negatives:
> >> >> >> >> 1)  From previous threads, I think it's fair to say that we can
> >> >> >> >> all
> >> >> >> >> agree
> >> >> >> >> that LocalStorage is a regrettable API (mainly due to its
> >> >> >> >> synchronous
> >> >> >> >> nature).  If so, it seems that making it more powerful and thus
> >> >> >> >> more
> >> >> >> >> attractive to developers is just asking for trouble.  After
> all,
> >> >> >> >> the
> >> >> >> >> more
> >> >> >> >> people use it, the more lock contention there'll be, and the
> more
> >> >> >> >> browser UI
> >> >> >> >> jank users will be sure to experience.  This will also be worse
> >> >> >> >> because
> >> >> >> >> it'll be easier for developers to store large objects in
> >> >> >> >> LocalStorage.
> >> >> >> >> 2)  As far as I can tell, there's nowhere else in the spec
> where
> >> >> >> >> you
> >> >> >> >> have
> >> >> >> >> to serialize structured clone(able) data to disk.  Given that
> >> >> >> >> LocalStorage
> >> >> >> >> is supposed to throw an exception if any ImageData is contained
> >> >> >> >> and
> >> >> >> >> since
> >> >> >> >> File and FileData objects are legal, it seems as though making
> >> >> >> >> LocalStorage
> >> >> >> >> handle structured clone data has a fairly high cost to
> >> >> >> >> implementors.
> >> >> >> >>  Not to
> >> >> >> >> mention that disallowing ImageData in only this one case is not
> >> >> >> >> intuitive.
> >> >> >> >> I think allowing structured clone(able) data in LocalStorage is
> a
> >> >> >> >> big
> >> >> >> >> mistake.  Enough so that, if SessionStorage and LocalStorage
> >> >> >> >> can't
> >> >> >> >> diverge
> >> >> >> >> on this issue, it'd be worth taking the power away from
> >> >> >> >> SessionStorage.
> >> >> >> >> J
> >> >> >> >
> >> >> >> > Speaking from experience, I have been using localStorage in my
> PhD
> >> >> >> > thesis work w/o any real need for structured clones (I would
> have
> >> >> >> > used
> >> >> >> > Web Database but it isn't widely used yet and I was not sure if
> it
> >> >> >> > was
> >> >> >> > going to make the cut in the end). All it took to come close to
> >> >> >> > simulating structured clones now was to develop my own
> >> >> >> > compatibility
> >> >> >> > wrapper for localStorage (http://realstorage.googlecode.com for
> >> >> >> > those
> >> >> >> > who care) and add setJSONObject() and getJSONObject() methods on
> >> >> >> > the
> >> >> >> > wrapper. Works w/o issue.
> >> >> >>
> >> >> >> Actually, this seems like a prime reason *to* add structured
> storage
> >> >> >> support. Obviously string data wasn't enough for you so you had to
> >> >> >> write extra code in order to work around that. If structured
> clones
> >> >> >> had been natively supported you both would have had to write less
> >> >> >> code, and the resulting algorithms would have been faster. Faster
> >> >> >> since the browser can serialize/parse to/from a binary internal
> >> >> >> format faster than to/from JSON through the JSON
> serializer/parser.
> >> >> >
> >> >> > Yes, but since LocalStorage is already widely deployed, authors are
> >> >> > stuck
> >> >> > with the structured clone-less version of LocalStorage for a
> very
> >> >> > long
> >> >> > time.  So the only way an app can store anything that can't be
> >> >> > JSONified
> >> >> > is
> >> >> > to break backwards compatibility.
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Wed, Sep 23, 2009 at 3:11 PM, Jonas
> >> >> > Sicking <jonas at sicking.cc> wrote:
> >> >> >>
> >> >> >> On Wed, Sep 23, 2009 at 1:35 PM, Jeremy Orlow <
> jorlow at chromium.org>
> >> >> >> wrote:
> >> >> >> > What are the use cases for wanting to store data beyond strings
> >> >> >> > (and
> >> >> >> > what
> >> >> >> > can be serialized into strings) in LocalStorage?  I can't think
> of
> >> >> >> > any
> >> >> >> > that
> >> >> >> > outweigh the negatives:
> >> >> >> > 1)  From previous threads, I think it's fair to say that we can
> >> >> >> > all
> >> >> >> > agree
> >> >> >> > that LocalStorage is a regrettable API (mainly due to its
> >> >> >> > synchronous
> >> >> >> > nature).  If so, it seems that making it more powerful and thus
> >> >> >> > more
> >> >> >> > attractive to developers is just asking for trouble.  After all,
> >> >> >> > the
> >> >> >> > more
> >> >> >> > people use it, the more lock contention there'll be, and the
> more
> >> >> >> > browser UI
> >> >> >> > jank users will be sure to experience.  This will also be worse
> >> >> >> > because
> >> >> >> > it'll be easier for developers to store large objects in
> >> >> >> > LocalStorage.
> >> >> >> > 2)  As far as I can tell, there's nowhere else in the spec
> where
> >> >> >> > you
> >> >> >> > have
> >> >> >> > to serialize structured clone(able) data to disk.  Given that
> >> >> >> > LocalStorage
> >> >> >> > is supposed to throw an exception if any ImageData is contained
> >> >> >> > and
> >> >> >> > since
> >> >> >> > File and FileData objects are legal, it seems as though making
> >> >> >> > LocalStorage
> >> >> >> > handle structured clone data has a fairly high cost to
> >> >> >> > implementors.
> >> >> >> >  Not to
> >> >> >> > mention that disallowing ImageData in only this one case is not
> >> >> >> > intuitive.
> >> >> >> > I think allowing structured clone(able) data in LocalStorage is
> a
> >> >> >> > big
> >> >> >> > mistake.  Enough so that, if SessionStorage and LocalStorage
> can't
> >> >> >> > diverge
> >> >> >> > on this issue, it'd be worth taking the power away from
> >> >> >> > SessionStorage.
> >> >> >>
> >> >> >> Despite localStorage's unfortunate locking contention problem, it's
> >> >> >> become quite a popular API. It's also very successful in terms of
> >> >> >> browser deployment since it's available in at least latest
> versions
> >> >> >> of
> >> >> >> IE, Safari, Firefox, and Chrome. Don't know about support in
> Opera?
> >> >> >
> >> >> > The more popular it becomes, the more it's going to hurt UA
> >> >> > developers,
> >> >> > web
> >> >> > developers, and users.  I don't see why this is an argument for
> >> >> > making
> >> >> > it
> >> >> > more powerful.
> >> >>
> >> >> How will it hurt UA developers? I think we're stuck forever to
> >> >> implement the locking mechanism. Adding more datatypes to the API
> >> >> doesn't mean that we'll have to implement it more.
> >> >
> >> >
> >> > multi-core is the future.  what's the opposite of fine-grained
> locking?
> >> >  it's not good ;-)
> >> > the implicit locking mechanism as spec'd is super lame.  implicitly
> >> > unlocking under
> >> > mysterious-to-the-developer circumstances!  how can that be a good
> >> > thing?
> >> > storage.setItem("y",
> >> > function_involving_implicit_unlocking(storage.getItem("x")));
> >>
> >> I totally agree on all points. The current API has big imperfections.
> >> However I haven't seen any workable counter proposals so far, and I
> >> honestly don't believe there are any as long as our goals are:
> >>
> >> * Don't break existing users of the current implementations.
> >> * Don't expose race conditions to the web.
> >> * Don't rely on authors getting explicit locking mechanisms right.
> >>
> >
> > The current API exposes race conditions to the web.  The implicit
> > dropping of the storage lock is that.  In Chrome, we'll have to drop
> > an existing lock whenever a new lock is acquired.  That can happen
> > due to a variety of really odd cases (usually related to nested loops
> > or nested JS execution), which will be difficult for developers to
> > predict, especially if they are relying on third-party JS libraries.
> > This issue seems to be discounted for reasons I do not understand.
>
> I don't believe we've heard about this before, so that would be the
> reason it hasn't been taken into account.
>
> So you're saying that chrome would be unable to implement the current
> storage mutex as specified in spec? I.e. one that is only released at
> the explicit points that the spec defines? That seems like a huge
> problem.
>

No, no... my point is that to the application developer, those "explicit"
points will appear quite implicit and mysterious.  This is why I called
out third-party JS libraries.  One day, a function that you are using
might transition to scripting a plugin, which might cause a nested
loop, which could then force the lock to be released.  As a programmer,
the unlocking is not explicit or predictable.

Moreover, there are other examples which have been discussed on the
list.  There are some DOM operations that can result in a frame receiving
a DOM event synchronously.  That can result in a nesting of storage locks,
which can force us to implicitly unlock the outermost lock to avoid
deadlocks.  Again, the programmer will have very poor visibility into when
these things can happen.
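
To make the failure mode concrete, here is a toy model of it.  This is
only a sketch under loud assumptions: the store, the lock bookkeeping, and
every function name below are made up for illustration, and a real UA
would involve separate processes rather than plain function calls.  It
shows a read-modify-write on one page losing another page's update
because a helper function quietly ran a nested loop.

```javascript
// Toy model (all names hypothetical): a shared store guarded by a mutex
// that the UA silently drops whenever a nested event loop runs.
const store = new Map([["x", "1"]]);
let lockHolder = null;

function acquire(who) {
  if (lockHolder === null) lockHolder = who;
}

// Stand-in for "scripting a plugin": spinning a nested loop forces the
// storage lock to be released, so another page gets to run.
function runNestedLoop(otherPageTask) {
  lockHolder = null; // implicit unlock, invisible to the caller
  otherPageTask();
}

// Another page adds 10 to "x" while it holds the lock.
function otherPage() {
  acquire("page B");
  store.set("x", String(Number(store.get("x")) + 10));
  lockHolder = null;
}

// Third-party helper that page A believes is a pure function.
function thirdPartyIncrement(value) {
  runNestedLoop(otherPage);
  return String(Number(value) + 1);
}

function pageA() {
  acquire("page A");
  const x = store.get("x");               // read happens under the lock
  store.set("x", thirdPartyIncrement(x)); // write happens after the lock was dropped
  lockHolder = null;
}

pageA();
console.log(store.get("x")); // page B's +10 is lost
```

Had the two updates been serialized, "x" would end up as "12"; in this
model page A overwrites page B's write instead, and nothing in page A's
code hints at why.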



>
> >> But, as imperfect as the current API is, I think the following is a
> >> decent way forward:
> >>
> >> * Allow pages that want the convenience of localStorage to use it. For
> >> multi-process browsers this will mean poor UI *for pages that use
> >> localStorage*. Especially when said pages hold on to localStorage for
> >> a long time.
> >> * Add alternative APIs that don't suffer from the same problems. More
> >> below.
> >>
> >> >> > In addition, this argument assumes that Microsoft (and other UAs)
> >> >> > will
> >> >> > implement the structured clone version of LocalStorage.  Has anyone
> >> >> > (or
> >> >> > can
> >> >> > anyone) from Microsoft comment on this?
> >> >>
> >> >> Given that I've never heard microsoft commit to a webstandard, ever,
> I
> >> >> doubt that we'll hear anything here. Or that the lack of hearing
> >> >> anything means we can draw any conclusions.
> >> >>
> >> >> > This is not a small feature to add.  Yes, it's smaller than
> creating
> >> >> > a
> >> >> > new
> >> >> > storage mechanism (that everyone is willing to adopt), but I still
> >> >> > think
> >> >> > that's what we should be looking at.  Rather than polishing a turd.
> >> >>
> >> >> I do think that localStorage is a decent API that developers will
> want
> >> >> to, and should, use. I think it's worth looking into adding an async accessor to
> >> >> get a storage object so that people can use a localStorage-like API
> >> >> while avoiding risks of blocking. This would also allow sharing data
> >> >> between worker threads and the main window.
> >> >
> >> > i think the async callback to get a storage object is an improvement,
> >> > but
> >> > i'm not sure that it addresses all of the problems.  for example, if a
> >> > worker
> >> > wants to read values from storage, compute, and then put a value into
> >> > storage, it would probably do all of this from the storage callback.
> >> >  that
> >> > would result in holding the lock for a long time, which would lock out
> >> > any
> >> > other threads, including non-worker threads.
> >> > the problem here is that localStorage is a pile of global variables.
>  we
> >> > are
> >> > trying to give people global variables without giving them tools to
> >> > synchronize
> >> > access to them.  the claim i've heard is that developers are not savvy
> >> > enough
> >> > to use those tools properly.  i agree that developers tend to use
> tools
> >> > without
> >> > fully understanding them.  ok, but then why are we giving them global
> >> > variables?
> >> > there has to be a better answer.
> >>
> >> I actually described a potential solution in the thread on worker
> >> storage.
> >>
> >> The problem you describe is a worker holding on to the storage for a
> >> very long (indefinite) time, thereby locking out other threads/windows
> >> from accessing the same storage area. This seems inevitable if we want
> >> to prevent race conditions while at the same time not forcing the
> >> complexities of locks onto web developers. The WebDatabase API suffers
> >> from exactly the same problem.
> >
> > Hmm... are you saying that from the SQLStatementCallback used to read
> > some data out of the database, you might compute on that data, and then
> > issue an executeSql call to write a computed result, and that in this
> > scenario,
> > the fact that it is the same transaction means that other threads are
> locked
> > out of accessing the same database?  I hadn't considered chaining
> executeSql
> > calls like this to keep the transaction alive.  Hmm...
>
> Indeed.
>
> >> However, we can lessen the problem. By adding multiple storage areas,
> >> we can allow a worker to use one storage area, while allowing other
> >> parties to simultaneously use other storage areas. This way, if a
> >> worker and a window aren't sharing data at all, they never get in the
> >> way of each other.
> >>
> >> So a very simplistic design would be something like the following:
> >>
> >> getStorageArea(name, callback)
> >>
> >> when called will asynchronously call the callback parameter once the
> >> storage area named by the first parameter becomes available. The
> >> callback receives the storage area as an argument. We would also have
> >> the function
> >>
> >> getMultipleStorageAreas(names, callback)
> >>
> >> Same as above, but names is an array of strings indicating multiple
> >> storage areas that need to be acquired before the callback is called.
> >> The callback receives all the areas in an array as an argument. This
> >> function allows transferring data between multiple storage areas
> >> without risking racing.
> >>
> >> There's several problems with this, such as the names are sort of
> >> crappy, and that getting storage areas as an array isn't very friendly.
> >> However you get the basic idea.
> >>
> >> We don't even need to use Storage objects for this. In fact, I hope
> >> mozilla will in a not too distant future come up with an alternative
> >> proposal to the WebDatabase SQL API. Something like this might fit
> >> into such a proposal as I think that'll have multiple separate storage
> >> areas anyway.
> >>
> >> / Jonas
> >
> >
> > Maybe we should just invent a similar transaction method for name/value
> > storage?  Wouldn't that be better than inventing a new idiom?  Ideally,
> > we'd also make reads and writes on storage be asynchronous.  The
> > transaction would then be usable to hold the lock across multiple
> > asynchronous reads and writes.  Since local storage is backed by disk,
> > it seems like a more ideal local storage API would not require synchronous
> > filesystem access.
>
> Not quite following what you're suggesting, but there's lots of ways
> to design this. The critical part is to allow grabbing (with
> associated locking) of just a subset of the available storage space.
>
> / Jonas
>


I was suggesting that we only provide asynchronous getItem / setItem calls,
where each call is parameterized by a transaction.  This is how the database
API works.
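
A rough sketch of what that could look like follows.  To be clear, this
is not a proposal for exact names or signatures: the AsyncStorage class,
its transaction() method, and the promise-based style are all assumptions
made up for illustration, and the backing store is just an in-memory Map.
The point is only that the lock is tied to a transaction object, and that
every read and write goes through that object asynchronously.

```javascript
// Hypothetical API sketch: none of these names exist in any browser.
class AsyncStorage {
  constructor() {
    this.data = new Map();
    this.tail = Promise.resolve(); // queue of pending transactions
  }

  // Runs body(tx) once all earlier transactions have finished. The lock is
  // held until the promise returned by body settles.
  transaction(body) {
    const tx = {
      getItem: async (key) => (this.data.has(key) ? this.data.get(key) : null),
      setItem: async (key, value) => {
        this.data.set(key, String(value));
      },
    };
    const result = this.tail.then(() => body(tx));
    // A failed transaction must not block the ones queued behind it.
    this.tail = result.catch(() => {});
    return result;
  }
}

// Read-modify-write that stays atomic across asynchronous calls.
function increment(storage, key) {
  return storage.transaction(async (tx) => {
    const current = Number(await tx.getItem(key)) || 0;
    await tx.setItem(key, current + 1);
  });
}

async function demo() {
  const storage = new AsyncStorage();
  // Two concurrent increments are serialized by the transaction queue.
  await Promise.all([increment(storage, "hits"), increment(storage, "hits")]);
  return storage.transaction((tx) => tx.getItem("hits"));
}
```

The sketch has no rollback and locks the whole store, so it says nothing
about locking only a subset of the storage space; it only shows the
lock's lifetime being explicit in the code rather than implied.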

-Darin