[whatwg] Web-sockets + Web-workers to produce a P2P website or application

Tue Jan 19 08:59:32 PST 2010

I have an idea for a possible use case that as far as I can tell from
previous discussions on this list has not been considered or at least
not in the form I present below.

I have a friend whose company produces and licenses online games for
social networks such as Facebook, Orkut, etc.

One of the big problems with these games is the shear amount of static
content that must be delivered via HTTP once the application becomes
popular. In fact, if a game becomes popular overnight, the scaling
problems with this static content quickly becomes a technical and
financial problem.

To give you an idea of the magnitude and scope, more than 4 TB of
static content is streamed on a given day for one of the applications.
It's very likely that others with similarly popular applications have
encountered the same challenge.

When thinking about how to resolve this, I took my usual approach of
thinking how do we decentralize the content delivery and move towards
an agent-based message passing model so that we do not have a single
bottleneck technically and so we can dissipate the cost of delivering
this content.

My idea is to use web-sockets to allow the browser function more a
less like a bit-torrent client. Along with this, web-workers would
provide threads for handling the code that would function as a server,
serving the static content to peers also using the program.

If you have lots of users (thousands) accessing the same application,
you effectively have the equivalent of one torrent with a large swarm
of users, where the torrent is a package of the most frequently
requested static content. (I am assuming that the static content
requests follow a power law distribution, with only a few static files
being responsible for the overwhelming bulk of static data
transferred.).

As I have only superficial knowledge of the technologies involved and
the capabilities of HTML5, I passed this idea by a couple of
programmer friends to get their opinions. Generally they thought is
was a very interesting idea, but that as far as they know, the
specification as it stands now is incapable of accommodating such a
use case.

Together we arrived at a few criticisms of this idea that appear to be
resolvable:

-- Privacy issues
-- Security issues (man in the middle attack).
-- content labeling (i.e. how does the browser know what content is
truly static and therefore safe to share.)
-- content signing (i.e. is there some sort of hash that allows the
peers to confirm that the content has not been adulterated).
-- privacy issues

All in all, many of these issues have been solved by the many talented
programmers that have developed the current bit-torrent protocol,
algorithms and security features. The idea would simply to design the
HTML5 in such a way that it can permit the browser to function as a
full-fledged web-application bit-torrent client-server.

Privacy issues can be resolved by possibly defining something such as
"browser security zones" or "content label" whereby the content
provider (application developer) labels content (such as images and
CSS files) as safe to share (static content) and labels dynamic
content (such as personal photos, documents, etc.) as unsafe to share.

Also in discussing this, we come up with some potentially useful
extensions to this use case.

One would be the versioning of the "torrent file", such that the
torrent file could represent versions of the application. i.e. I
release an application that is version 1.02 and it becomes very
popular and there is a sizable swarm. At some point in the future I
release a new version with bug-fixes and additional features (such as
CSS sprites for the social network game). I should be able to
propagate this new version to all clients in the swarm so that over
some time window such as 2 to 4 hours all clients in the swarm
discover (via push or pull) the new version and end up downloading it
from the peers with the new version. The only security feature I could
see that would be required would be that once a client discovers that
their is a new version, it would hit up the original server to
download a signature/fingerprint file to verify that the new version
that it is downloading from its peers is legitimate.

The interesting thing about this idea is that it would permit large
portions of sites to exist in virtual form. Long-term I can imagine
large non-profit sites such as Wikipedia functioning on top of this
structure in such a way that it greatly reduces the amount of funding
necessary. It would be partially distributed with updates to wikipedia
being distributed via lots of tiny versions from super-nodes à la a
Skype type P2P model.

This would also take a lot of power out of the hands of those telcos
that are anti-net neutrality. This feature would basically permit a
form of net neutrality by moving content to the fringes of the
network.

Let me know your thoughts and if you think this would be possible
using Web-sockets and web-workers, and if not, what changes would be
necessary to allow this to evolve.

Sincerely,

Andrew J. L. de Andrade
São Paulo, Brazil

(P.S. I consider myself a pretty technical person, but I don't really
program. I only dabble in programming as a hobby and to better
understand my colleagues. Feel free to be as technical as you want in
your reply, but please forgive me if I make or made any bonehead
mistakes.)