[whatwg] Web-sockets + Web-workers to produce a P2P website or application
Andrew de Andrade
andrew at deandrade.com.br
Tue Jan 19 11:54:27 PST 2010
On Tue, Jan 19, 2010 at 5:31 PM, Melvin Carvalho
<melvincarvalho at gmail.com> wrote:
>
>
> On Tue, Jan 19, 2010 at 5:59 PM, Andrew de Andrade <andrew at deandrade.com.br>
> wrote:
>>
>> I have an idea for a possible use case that, as far as I can tell
>> from previous discussions on this list, has not been considered, or
>> at least not in the form I present below.
>>
>> I have a friend whose company produces and licenses online games for
>> social networks such as Facebook, Orkut, etc.
>>
>> One of the big problems with these games is the sheer amount of
>> static content that must be delivered via HTTP once the application
>> becomes popular. In fact, if a game becomes popular overnight, the
>> scaling problems with this static content quickly become a technical
>> and financial problem.
>>
>> To give you an idea of the magnitude and scope, more than 4 TB of
>> static content is streamed on a given day for one of the applications.
>> It's very likely that others with similarly popular applications have
>> encountered the same challenge.
>>
>> When thinking about how to resolve this, I took my usual approach of
>> asking how we could decentralize the content delivery and move
>> towards an agent-based message-passing model, so that there is no
>> single technical bottleneck and the cost of delivering this content
>> is dissipated.
>>
>> My idea is to use WebSockets to allow the browser to function more
>> or less like a BitTorrent client. Along with this, Web Workers would
>> provide threads for handling the code that functions as a server,
>> serving the static content to peers that are also using the
>> application.
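>>
>> To make this concrete, here is a rough sketch (in TypeScript) of what
>> the worker side might look like. Everything named here is
>> hypothetical, and since plain WebSockets cannot connect two browsers
>> directly, the sketch assumes some rendezvous/relay server sits
>> between peers:
>>
>>     // peer-worker.ts -- a sketch, not a working P2P stack.
>>     // Chunks of shareable static content, keyed by SHA-256 hash.
>>     const chunks = new Map<string, ArrayBuffer>();
>>
>>     const ws = new WebSocket("wss://relay.example.com/swarm"); // hypothetical relay
>>     ws.binaryType = "arraybuffer";
>>
>>     // Answer chunk requests that the relay forwards from other peers.
>>     ws.onmessage = (event: MessageEvent) => {
>>       const msg = JSON.parse(event.data as string);
>>       if (msg.type === "request" && chunks.has(msg.hash)) {
>>         // Send a small JSON header frame, then the raw chunk bytes.
>>         ws.send(JSON.stringify({ type: "chunk", hash: msg.hash, to: msg.from }));
>>         ws.send(chunks.get(msg.hash)!);
>>       }
>>     };
>>
>>     // The page thread hands over chunks it has fetched so we can reseed them.
>>     self.onmessage = (event: MessageEvent) => {
>>       const { hash, data } = event.data as { hash: string; data: ArrayBuffer };
>>       chunks.set(hash, data);
>>     };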
>>
>> If you have lots of users (thousands) accessing the same application,
>> you effectively have the equivalent of one torrent with a large swarm
>> of users, where the torrent is a package of the most frequently
>> requested static content. (I am assuming that requests for static
>> content follow a power-law distribution, with only a few static
>> files responsible for the overwhelming bulk of the data transferred.)
>>
>> As I have only superficial knowledge of the technologies involved
>> and the capabilities of HTML5, I ran this idea by a couple of
>> programmer friends to get their opinions. Generally they thought it
>> was a very interesting idea, but that as far as they know, the
>> specification as it stands now cannot accommodate such a use case.
>>
>> Together we arrived at a few criticisms of this idea that appear to be
>> resolvable:
>>
>> -- Privacy issues
>> -- Security issues (man-in-the-middle attacks)
>> -- Content labeling (i.e. how does the browser know which content is
>> truly static and therefore safe to share?)
>> -- Content signing (i.e. is there some sort of hash that allows
>> peers to confirm that the content has not been adulterated? A sketch
>> of such a check follows this list.)
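>>
>> For the content-signing point, a minimal sketch of the check I have
>> in mind, using the browser's Web Crypto API purely for illustration:
>> a chunk received from a peer is accepted only if its SHA-256 digest
>> matches the hash published in a manifest fetched from the origin.
>>
>>     // Verify a peer-supplied chunk against the origin's manifest entry.
>>     async function verifyChunk(data: ArrayBuffer,
>>                                expectedSha256Hex: string): Promise<boolean> {
>>       const digest = await crypto.subtle.digest("SHA-256", data);
>>       const hex = Array.from(new Uint8Array(digest))
>>         .map((b) => b.toString(16).padStart(2, "0"))
>>         .join("");
>>       return hex === expectedSha256Hex; // reject adulterated content
>>     }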
>
> Yes, I sort of see this kind of thing as the future of the web.
> There's an argument that it should have been done 10 or even 20 years
> ago, but we're still not there. I think WebSockets will be a huge
> step forward for this kind of thing. One issue that still remains is
> NAT traversal; perhaps this is what has held developers back, though
> notable exceptions such as Skype have provided a great UX here.
>
> Gaming is one obvious application for this, which in many ways is the
> pinnacle of software engineering.
>
> I see this kind of technique really bringing linked data into its own
> (including RDFa), where browsers become more data-aware and more
> socially aware and are able to interchange relevant information.
> Something like FOAF (as a means to mark up data) is well suited to
> provide a distributed network of peers, can certainly handle globally
> namespaced data naming, and is getting quite close to solving privacy
> and security challenges.
>
> I'm really looking forward to seeing what people start to build on
> top of this technology, and your idea certainly sounds exciting.
>
>>
>> All in all, many of these issues have been solved by the many
>> talented programmers who developed the current BitTorrent protocol,
>> algorithms and security features. The idea would simply be to design
>> HTML5 in such a way that it permits the browser to function as a
>> full-fledged web-application BitTorrent client/server.
>>
>> Privacy issues could be resolved by defining something such as
>> "browser security zones" or "content labels", whereby the content
>> provider (application developer) labels content such as images and
>> CSS files as safe to share (static content) and labels dynamic
>> content (personal photos, documents, etc.) as unsafe to share. A
>> hypothetical sketch of such a manifest follows.
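>>
>> Purely as an illustration (none of these field names are a
>> proposal), the label could live in a per-application manifest:
>>
>>     // manifest.ts -- hypothetical shape of a content-label manifest.
>>     interface ManifestEntry {
>>       path: string;       // URL of the resource
>>       sha256: string;     // fingerprint peers use to verify the bytes
>>       shareable: boolean; // true = static, safe to serve to peers
>>     }
>>
>>     const manifest: ManifestEntry[] = [
>>       { path: "/img/sprites.png", sha256: "9f2b...", shareable: true },
>>       { path: "/css/game.css",    sha256: "4c81...", shareable: true },
>>       // Dynamic, private content is never offered to the swarm.
>>       { path: "/me/photo.jpg",    sha256: "",        shareable: false },
>>     ];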
>>
>> Also, in discussing this, we came up with some potentially useful
>> extensions to this use case.
>>
>> One would be the versioning of the "torrent file", such that the
>> torrent file could represent versions of the application. For
>> example: I release version 1.02 of an application and it becomes very
>> popular and there is a sizable swarm. At some point in the future I
>> release a new version with bug-fixes and additional features (such as
>> CSS sprites for the social network game). I should be able to
>> propagate this new version to all clients in the swarm so that over
>> some time window such as 2 to 4 hours all clients in the swarm
>> discover (via push or pull) the new version and end up downloading it
>> from the peers that have the new version. The only security feature
>> I could see being required is that once a client discovers that
>> there is a new version, it hits the original server to download a
>> signature/fingerprint file to verify that the new version it is
>> downloading from its peers is legitimate. A sketch of this check
>> follows.
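>>
>> A sketch of that flow, again with everything hypothetical (the
>> endpoint name, the helper that pulls a manifest from peers, and the
>> choice of RSA signatures are all assumptions for illustration):
>>
>>     // Hypothetical helper implemented elsewhere: fetches the new
>>     // version's manifest from peers rather than from the origin.
>>     declare function fetchManifestFromPeers(version: string): Promise<ArrayBuffer>;
>>
>>     // Adopt a newly announced version only if the origin vouches for it.
>>     async function adoptNewVersion(version: string, originKey: CryptoKey) {
>>       // Only the small signature file is downloaded from the origin...
>>       const sig = await (await fetch(`/releases/${version}.sig`)).arrayBuffer();
>>
>>       // ...while the manifest itself comes from the swarm.
>>       const manifestBytes = await fetchManifestFromPeers(version);
>>
>>       const ok = await crypto.subtle.verify(
>>         "RSASSA-PKCS1-v1_5", originKey, sig, manifestBytes);
>>       if (!ok) throw new Error("peer-supplied manifest failed signature check");
>>       // Safe to start pulling the new version's chunks from peers.
>>     }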
>>
>> The interesting thing about this idea is that it would permit large
>> portions of sites to exist in virtual form. Long term, I can imagine
>> large non-profit sites such as Wikipedia running on top of this
>> structure in a way that greatly reduces the amount of funding
>> necessary. Wikipedia would be partially distributed, with updates
>> propagated as lots of tiny versions from super-nodes, à la the Skype
>> P2P model.
>>
>> This would also take a lot of power out of the hands of those telcos
>> that are anti-net neutrality. This feature would basically permit a
>> form of net neutrality by moving content to the fringes of the
>> network.
>>
>> Let me know your thoughts, and whether you think this would be
>> possible using WebSockets and Web Workers; if not, what changes would
>> be necessary to allow this to evolve?
>>
>> Sincerely,
>>
>> Andrew J. L. de Andrade
>> São Paulo, Brazil
>>
>> (P.S. I consider myself a pretty technical person, but I don't really
>> program. I only dabble in programming as a hobby and to better
>> understand my colleagues. Feel free to be as technical as you want in
>> your reply, but please forgive me if I have made any bonehead
>> mistakes.)
>
>

I think the most interesting part of this idea is that serving
capacity scales with demand for the content: every new user is also a
potential peer, while the site in question only needs to serve the
"torrent" file with distributed hash tables, signatures, fingerprints,
etc. In other words, the original site exists to guarantee
authenticity.
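
To put rough numbers on that (all invented for illustration): if the
shared static package is 20 MB and the signed manifest is 50 KB, the
4 TB/day figure above corresponds to about 200,000 downloads, and
serving only manifests would cost the origin roughly 200,000 x 50 KB
= 10 GB/day, a 400-fold reduction.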

Better still, distribution effectively becomes cheaper as content
becomes more valuable to the Internet collectively. A mechanism like
this would be massively useful for a YouTube or Vimeo, because video
distribution is one of the main sources of bandwidth demand today and
the content is static. Delivering 1080p HD content suddenly becomes
very feasible.

On the other hand, if this takes off, telcos and other providers of
"internet tubes" are going to have to worry about uplinks getting
clogged at the last mile. They should be thinking about this now,
because it represents a massive change in usage patterns for regular
internet users, not just the fringe that already uses BitTorrent.

Melvin, on your observation about NAT traversal, I think now is the
time to start thinking about implementing this with IPv6 as the
default. Given the timelines we are talking about for HTML5 adoption,
a feature like this would give a lot of those at the fringes of the
network a very desirable reason to adopt IPv6 sooner than they might
have otherwise.

The idea of this complementing RDFa is interesting and something I
hadn't considered yet. I'm going to think about that tonight. In fact
I need to read up on RDFa.

Andrew J L de Andrade
@andrewdeandrade