[whatwg] Persistent storage is critically flawed.

Ian Hickson ian at hixie.ch
Sun Aug 27 21:09:46 PDT 2006


On 8/27/06, Shannon Baker <shannon at arc.net.au> wrote:
>
> == 1: Authors' failure to handle the implications of "global" storage. ==
> First let's talk about the global store (globalStorage['']) which is
> accessible from ALL domains.

This is mentioned in the "Security and privacy" section; the third
bullet point here for example suggests blocking access to "public"
storage areas:

   http://whatwg.org/specs/web-apps/current-work/#user-tracking


> Did anyone stop to really consider the implications of this? I mean,
> sure the standard implies that UA's should deal with the security
> implications of this themselves, but what if they don't? Let's say a UA
> does allow access to this global storage, what would we expect to find
> in this storage space? Does the author really believe that this will be
> only used for sharing preferences between domains for the benefit of the
> user? Hell no! It's going to look like this:
>
> KEY                    VALUE
> adsense3wd4ghgtut9jhn  kjh234kj23u4y2j34234hkj234hkj23h4k234k234   <-- Advertiser user tracking
> johnyizcool            I Kickerz Azz!!!!!!                         <-- Attention freak
> USconspiracy           911 was an inside job. Tell everybody!      <-- Political activist
> UScitID                kh546jkh45856456h45iu6y46j45j6h54kj6h45k6   <-- Government spying
> GodsLove.com           Warning! This user supports abortion.       <-- Vigilante user tracking

Yes, there's an entire section of the spec discussing this in detail,
with suggested solutions.


> What possible use could this storage region ever have to a legitimate
> site? Especially when sensible UA's will just block it anyway? I for one
> do not want my browser becoming some sort of global 'graffiti wall'
> written on by every website I visit. Truthfully I cannot come up with a
> single legitimate use for the 'global' or 'com' regions that cannot be
> handled by per-domain storage or global storage with ACLs (see next point).

Indeed, the spec suggests blocking such access.


> == 2: Naive access controls which will result in guaranteed privacy
> violations. ==
> The standard advocates the two-way sharing of data between domains and
> subdomains - Namely that host.example.com should share data with the
> servers at 'www.host.example.com', 'example.com', and all servers rooted
> at '.com'. In its own words: "Each domain and each subdomain has its own
> separate storage area. Subdomains can access the storage areas of parent
> domains, and domains can access the storage areas of subdomains."
>
> My objection to this is similar to my objection to the 'global' storage
> space - It's totally naive. The whole scheme is based on the unfounded
> belief that there is a guaranteed trust relationship available between
> the parties controlling each of these domains.

There generally is; but for the two cases where there is not, see:

   http://whatwg.org/specs/web-apps/current-work/#storage

...and:

   http://whatwg.org/specs/web-apps/current-work/#storage0

Basically, for the few cases where an author doesn't control his
subdomain space, he should be careful. But this goes without saying.
The same requirement (that authors be responsible) applies to all Web
technologies, for example CGI script authors must be careful not to
allow SQL injection attacks, must check Referer headers, must ensure
POST/GET requests are handled appropriately, and so forth.
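
To make the sharing rule quoted above concrete ("Subdomains can access
the storage areas of parent domains, and domains can access the storage
areas of subdomains"), here is a minimal Python sketch. It models only
that quoted sentence, not the spec's actual algorithm; the function
names labels() and can_access() are hypothetical, and treating the
empty string as the global area is an assumption taken from the
globalStorage[''] notation in the quoted message.

```python
from __future__ import annotations


def labels(host: str) -> list[str]:
    """Split a host name into lowercase labels, ignoring a trailing dot."""
    return [part for part in host.lower().rstrip(".").split(".") if part]


def can_access(page_host: str, storage_domain: str) -> bool:
    """Return True if page_host equals the storage domain, is one of its
    subdomains, or is one of its parent domains (the rule as quoted).
    An empty storage_domain stands for the global area ''."""
    page, store = labels(page_host), labels(storage_domain)
    shorter, longer = sorted((page, store), key=len)
    # One name must be a label-wise suffix of the other. Comparing
    # labels rather than raw strings keeps "notexample.com" from
    # matching "example.com".
    return longer[len(longer) - len(shorter):] == shorter
```

Under this model, host.example.com reaches 'www.host.example.com',
'example.com', 'com' and '', but a.example.com does not reach
'b.example.com'. The breadth of that reach is exactly why an author
who does not control the whole subdomain space has to be careful.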


> Sure, one may be reliant
> on another for DNS redirection but that hardly implies that one wishes
> to share potentially confidential data with the other. As the author
> themselves stated there is no guarantee that users of geocities.com
> sub-domains wish their users data to be shared with GeoCities.

Indeed; users of geocities.com shouldn't be using this service, and
geocities themselves should put their data (if any) in a private
subdomain space.


> The
> author states that geocities could mitigate this risk with a fake
> sub-domain but how does that help the owner of mysite.geocities.com?

It doesn't. The solution for mysite.geocities.com is to get their own domain.


> The
> author implies that UA's should deal with this themselves and fails to
> provide any REALISTIC guidelines for them to do so (sure lets hardcode
> all the TLD's and free hosting providers).

The spec was written in conjunction with UA vendors. It is realistic
for UA vendors to provide a hardcoded list of TLDs; in fact, there is
significant work underway to create such a list (and have it be
regularly updated). That work was originally started for use in HTTP
Cookie implementations, which have similar problems, but would be very
useful for Storage API implementations (although, again as noted in
the draft, not imperative for a secure implementation if the author is
responsible).
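
As an illustration of how a UA might use such a list, here is a
minimal Python sketch. The list contents and the names
SHARED_SUFFIXES, is_shared_area() and allow_storage() are all
hypothetical; a real list would be far longer, and the blocking policy
shown is only one possible reading of the draft's suggestion.

```python
# Hypothetical, deliberately tiny list of domains whose storage areas
# would be shared by unrelated parties. Not a real or complete list.
SHARED_SUFFIXES = {"com", "org", "net", "uk", "co.uk", "geocities.com"}


def is_shared_area(storage_domain: str) -> bool:
    """True for the global area '' and for any listed shared suffix."""
    domain = storage_domain.lower().rstrip(".")
    return domain == "" or domain in SHARED_SUFFIXES


def allow_storage(storage_domain: str) -> bool:
    """Policy sketch: refuse storage areas shared by unrelated parties."""
    return not is_shared_area(storage_domain)
```

With this policy, 'example.com' and 'mysite.geocities.com' remain
usable while '', 'com', 'co.uk' and 'geocities.com' are refused.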


> What annoys me is that the
> author acknowledges the issue and then passes the buck to browser
> manufacturers as though it's their problem and they should solve it in
> any (incompatible or non-compliant) way they like.

Any solution must be compliant, by definition; regarding
compatibility, it isn't clear to me that the suggestion in the spec
would be incompatible.


> But why bother? This whole problem is easily solved by allowing data to
> be stored with an access control list (ACL). For example the site
> developer should be able to specify that a data object be available to
> '*.example.com' and 'fred.geocities.com' only. How this is done (as a
> string or array) is irrelevant to this post but it must be done rather
> than relying on implicit trust where none exists.

One could create much more complex APIs, naturally, but I do not see
that this would solve the problems. It wouldn't solve the issue of
authors who don't understand the security implications of their code,
for instance. It also wouldn't prevent the security issue you
mentioned -- why couldn't all *.geocities.com sites cooperate to
violate the user's privacy? Or *.co.uk sites, for that matter? (Note
that it is already possible today to do such tracking with cookies; in
fact it's already possible today even without cookies if you use
Referer tracking, and even without Referer tracking one can use IP and
User-Agent fingerprinting combined with log analysis to perform quite
thorough tracking.)
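
For concreteness, the ACL idea quoted above could be modelled roughly
as follows. This is a hypothetical Python sketch, not a proposed API:
the class AclItem and its method readable_by() are invented for
illustration, and shell-style wildcard matching via the standard
fnmatch module is an assumption about what '*.example.com' would mean.

```python
from fnmatch import fnmatchcase


class AclItem:
    """A stored value plus the host patterns allowed to read it."""

    def __init__(self, value, allowed_hosts):
        self.value = value
        # Host names are case-insensitive, so normalise the patterns.
        self.allowed_hosts = [pattern.lower() for pattern in allowed_hosts]

    def readable_by(self, host):
        """True if host matches at least one pattern in the ACL."""
        host = host.lower()
        return any(fnmatchcase(host, pattern) for pattern in self.allowed_hosts)
```

An item created with ['*.example.com', 'fred.geocities.com'] is
readable by shop.example.com and fred.geocities.com but not by
mallory.geocities.com. An item created with ['*.geocities.com'],
however, is readable by every site under geocities.com, so an ACL by
itself does not stop cooperating sites from sharing an identifier.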


> == 3: Lack of privilege separation. ==
> The proposal assumes that the shared data should be readable and
> writable by all sub and parent domains. I believe there is no reason why
> this shouldn't be extended to provide 'access control' similar to that
> implemented by standard file systems. For example if I want to publish
> an object called 'myKey' and make it accessible to other sites it does
> not automatically mean I want them to be able to modify or delete it. It
> is important that global storage allows read-only access to variables if
> it is to be widely adopted for information sharing between untrusting
> parties.

Certainly one could add a .readonly field or some such to storage data
items, or even fully fledged ACL APIs, but I don't think that should
be available in a first version, and I'm not sure it's really useful
in later versions either.
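
A read-only flag of the kind mentioned could behave roughly like the
following Python sketch. Everything here is hypothetical (the class
SharedStore, its methods, and the rule that only the host that first
wrote an item may change a read-only item); nothing like it is in the
draft.

```python
class SharedStore:
    """Toy shared storage area whose items can be marked read-only."""

    def __init__(self):
        self._items = {}  # key -> (value, owner_host, readonly)

    def set_item(self, host, key, value, readonly=False):
        entry = self._items.get(key)
        if entry is None:
            # The first writer becomes the owner of the item.
            self._items[key] = (value, host, readonly)
            return
        _, owner, locked = entry
        if host == owner:
            # The owner may change both the value and the flag.
            self._items[key] = (value, owner, readonly)
        elif locked:
            raise PermissionError(f"{key!r} is read-only for {host}")
        else:
            # Other hosts may overwrite unlocked values but not the flag.
            self._items[key] = (value, owner, locked)

    def get_item(self, key):
        """Return the stored value, or None if the key is absent."""
        entry = self._items.get(key)
        return None if entry is None else entry[0]
```

Even this small sketch has to settle questions of ownership and of who
may change the flag, which gives a sense of how much a fuller ACL API
would add to the interface.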


> == 4: Messy API requiring callbacks to handle concurrency. ==
> The author uses a complicated method of handling concurrency by using
> callbacks triggered by setItem() to interrupt processing in other open
> pages (ie, other tabs or frames) which could access the same data. Why
> can I not simply lock the item during updates or long reads and force
> other scripts to wait? While I'm unsure whether ECMAScript can handle
> proper database-style transactions it seems like it would be fairly easy
> for the developer to implement critical sections by using shared storage
> objects or metadata as mutexes and semaphores. I can't see what role the
> callback mechanism would fulfill that could not be handled more easily
> using traditional transactional logic.

I don't really understand what this is referring to. Could you show an
example of the transaction/callback system you refer to? The API is
intended to be really simple, just specify the item name and there you
go.


> == Conclusion ==
> In conclusion it appears to me that the proposal is based on several
> fundamentally flawed security assumptions and is overly complex. I see
> this becoming a hiding place for viruses, malware and tracking cookies.
> Any sensible browser manufacturer would turn this feature off or limit
> its scope - thus rendering it inoperable for the many beneficial uses it
> would otherwise have. Those browsers that support this proposal are
> likely to do so in incompatible ways - due largely to the faults and
> omissions in this proposal that it implies UA's will solve. It seems
> like a large amount of browser sniffing will be required to have any
> assurance that persistent storage will work as advertised. Therefore,
> the global storage proposal must be fixed or removed.

While I agree that there are valid concerns, I believe they are all
addressed explicitly in the spec, with suggested solutions.

I would be interested in seeing a concrete proposal for a better
solution; I don't really see what a better solution would be.

-- 
Ian Hickson


