[whatwg] Web Sockets

Sun Jul 6 02:18:46 PDT 2008

Many years ago I wrote a draft for how to do full-duplex communication 
from a Web page. Over the years we've received much feedback on this 
TCPConnection API. I've now completely rewritten the relevant section and 
given it a new name, Web Sockets:

   http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html#network

If there are any security issues with this proposal, or if it fails to 
achieve its goals (discussed below), or fails to handle a case you care 
about, then please don't hesitate to send feedback to the list!

On Thu, 26 May 2005, Charles Iliya Krempeaux wrote:
> 
> Some might say that letting the client create general TCP connects would 
> bring up all sorts of security concerns.  However, I think these 
> security concerns can be dealt with by making it so the API would only 
> allow the client to create a TCP connection to the "host" that the 
> client -- the webpage (or web application) -- came from.

It's dangerous to allow even that, since if that host is a virtual host, 
for example, it would allow cross-host communication over HTTP. You 
couldn't do cookie-based authentication, but sometimes even that isn't 
needed to cause problems.

> Or, alternatively, we could allow the host that the webpage (or web 
> application) came from to specify a list of domains (or IP addresses) 
> that clients could connect to.  (Of course, there would be restrictions 
> on this.  The hosts in that list would need to "allow" the original host 
> to do this.  A mechanism for this would need to be created.)
>
> There could even be other restrictions.  For example, a host could 
> specify what ports it allows webpages (and web applications) to connect 
> to.

To some extent that's what we've done now.

On Thu, 26 May 2005, Kornel Lesinski wrote:
> 
> To have your own connections you'd have to use other port than 80 and 
> that may be disallowed on many restricted systems.

Using the Upgrade: header we can get around that for the less novice 
authors.

> If user navigates to the next page, browser will destroy your JS objects 
> and close their connections. That may result in worse performance than 
> with HTTP connection that is kept alive between pages.

The use case is really just for one-page applications, I think.

> Even if connections are limited to the same host, you couldn't safely 
> serve anything else on it. Spammers might use numerous HTML-injection 
> techniques to send spam using other people's computers, and this may get 
> much worse if host restriction fails. From past experience of hundreds 
> of cross-site scripting vulnerabilities, you can be sure that this will 
> happen sooner or later.

Indeed.

On Thu, 26 May 2005, Charles Iliya Krempeaux wrote:
> 
> I'm guessing what you are saying is that a "host" could potentially have 
> multiple "web sites" on it using "named virtual hosts".  And although 
> you can separate out multiple sites with HTTP (using "named virtual 
> hosts"), it is not always possible with other protocols.  Also, if you 
> can create TCP connections to the same "host" then you could fake the 
> HTTP "Host" field, and access another site.  Is this what you are 
> saying?  (Is there anything else?)

That's exactly right.

> If this is the case, then perhaps there needs to be another protocol 
> created that provides something like TCP connections but is "host" aware 
> (just like HTTP).  This would be analogous to UDP packets and IP 
> packets.  (UDP packets are a lot like IP packets, from the developer's 
> point of view.)

WebSocket is now Host-aware.

On Thu, 26 May 2005, Kornel Lesinski wrote:
> 
> On a second thought this may be prevented by forcing some special 
> handshake or transport protocol for custom connections... but then this 
> feature becomes just alternative HTTP + XML RPC that only offers smaller 
> lag for price of increased complexity and worse browser/server support. 
> Is it worth it?

Yes. :-)

On Thu, 26 May 2005, Charles Iliya Krempeaux wrote:
> 
> I think that we should allow TCP connections, even if it won't work in 
> some cases.

We can't really allow raw sockets, for the reasons shown above.

> > Let's say there's website
> > example.com/page.php?name=John
> > that prints
> > Hello "John"!
> > 
> > On your website, if you create iframe with URL:
> > example.com/page.php?name=<script>connectPort(25).send("HELO...SPAM...SPAM");</script>
> 
> It won't be a problem if the web developer is escaping whatever the user 
> supplies.  (This is developer error, ignorance, or stupidity.)

That's common. :-)

> I don't think developer error, ignorance, or stupidity should be an 
> argument to not allow TCP connections.

It's an argument that applies to everything on the Web. :-)

On Thu, 26 May 2005, Joshua RANDALL FTRD/DIH/BOS wrote:
> 
> Firstly, it exposes a very low level protocol to the programmer of the 
> web application, forcing them to code higher-level protocols themselves 
> from within ECMAScript.  While certainly this would allow a high degree 
> of freedom on web application programmers, it would also put an 
> unnecessary burden on them when a higher-level, simpler solution would 
> have sufficed.

There are cases where a relatively low-level API is useful because there 
isn't a higher-level one.

> Secondly, it probably wouldn't work in all cases -- for example clients 
> that require the use of a proxy server to access the network, or are 
> behind a firewall that allows only HTTP connections.

Web Sockets can be tunneled through proxies the same way TLS is, and a 
connection can start out as HTTP and upgrade to Web Socket dynamically.

> Finally, it assumes that TCP connections are the best way to get data 
> from the server to the client.  While this is almost certainly the case 
> for desktop computers, it may not be a good assumption for mobile 
> terminals.  Operator networks might in the future have built-in eventing 
> protocols that can more efficiently dispatch data asynchronously to 
> client devices without the need for the overhead of maintaining many 
> virtually unused TCP connections.

This is more about two-way full-duplex communication. For server-push, 
<event-source> and its APIs are more relevant, and are designed to be 
extendable to support SMS push or some such.

> However, since ideally the same web application code would work on all 
> platforms and networks, it would be better if there was a way to 
> negotiate the low-level transport between the client and server rather 
> than have it hard-coded into the script.  For example, a web application 
> showing real-time stock prices wants to get an event stream that updates 
> the stock prices from a server, let's say stockserver.org.  For most 
> desktop clients, using the event-source URI 
> "http://stockserver.org/stockprice" would be fine.  However, a mobile 
> client using the same page would waste a (relatively) lot of bandwidth 
> just by keeping that HTTP connection alive, and it so happens that the 
> particular (and fictitious) mobile network has low-level support for SIP 
> events (could just as easily be XMPP or perhaps even SMS or WAP-Push). 
> Therefore, for that client it would be advantageous to use a URI such as 
> "sip:stockserver.org;subscribe?event=stockprice" instead of HTTP since 
> there would be significantly less network overhead that way.  However, 
> it is undesirable for the web application developer to have to provide a 
> separate version of the page for the mobile user on a SIP-capable 
> network, so it would be advantageous to have an option for the server 
> and client to negotiate the low-level transport.  Perhaps this could be 
> done using an extension to the proposed baseline HTTP-based 
> implementation?

I don't really know how to solve this.

> Ian, perhaps we could add to section 9.1 a header that can be sent by 
> the user agent along with the initial request to the event-source URI 
> that specified a list of event protocols that the user agent supports?  
> Perhaps something like "capabilities"?  Then, the server, knowing what 
> the client support was, would have the option of returning a 3xx 
> redirect to the other protocol URI instead of opening the event stream?  
> If for some reason the user agent was unable to establish the event 
> stream using the new protocol, it could re-contact the server but remove 
> the failed protocol from its list of capabilities.  This seems to me to 
> be the least obtrusive way of adding basic protocol negotiation to the 
> server-sent DOM events -- do you see any reason why it shouldn't be in 
> there?

We'd still have to define what the protocols are, etc. I don't really know 
how to do this properly.

On Sun, 12 Jun 2005, Thomas Much wrote:
> 
> having read the WD (2005-06-10), IMHO the specification for "open(...)" 
> is missing the following statement:
> 
> "If the readyState attribute has a value other than 0 (Uninitialized), 
> raises an exception. [...]"
> 
> Or is there a special intention behind leaving this statement out?

I removed the open() method.

On Wed, 26 Oct 2005, Mike Dierken wrote:
> > >
> > > If the browser had an HTTP daemon built-in, would that work?
> > 
> > No, since you can't guarantee that incoming connections will connect 
> > (e.g. because you are behind NAT with no port forwarding, a very 
> > common case).
>
> If the client initiated the connection, then the roles were reversed, 
> that would mirror the TCPConnection approach.

It wouldn't be HTTP if the server was the one to connect. :-)

> > Also, requiring that UAs implement HTTP servers, as opposed to just 
> > implementing the simple TCPConnection protocol described at the 
> > moment, seems like a significantly more expensive way of solving this 
> > problem.
>
> A full (i.e. complex) server wouldn't be necessary, just the protocol 
> parsing. Implementing the protocol for simple use is straightforward, 
> and bringing in Apache as a library would be one approach.

Bringing in Apache as a library is a serious cost compared to WebSocket 
whose handshake can be implemented in a dozen lines of perl.
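
To give a feel for how little a server has to do, here is a minimal sketch 
of building the server's side of the handshake, written in Python rather 
than perl. The status line, the "WebSocket-Origin" header name and the 
"ws://" URL in the usage note are assumptions made for illustration; only 
"WebSocket-Location" is named in this message, so check the spec for the 
exact bytes before relying on any of this.

```python
def build_handshake_reply(origin: str, location: str) -> bytes:
    """Return the bytes a server sends to complete the handshake.

    Assumption: the status line and header names below are illustrative,
    not quoted from the spec.
    """
    # Refuse values that could smuggle extra header lines into the reply.
    for value in (origin, location):
        if "\r" in value or "\n" in value:
            raise ValueError("header values must not contain CR or LF")
    lines = [
        "HTTP/1.1 101 Web Socket Protocol Handshake",  # assumed status line
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        f"WebSocket-Origin: {origin}",      # assumed header name
        f"WebSocket-Location: {location}",  # header named in this thread
        "",
        "",  # the two empty entries produce the terminating blank line
    ]
    return "\r\n".join(lines).encode("ascii")
```

A server would write build_handshake_reply("http://example.com", 
"ws://example.com/chat") to the socket once it has read the client's 
request, and only then start exchanging application data.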

> > That would require that the Web author implement HTTP on his side (or 
> > at least a simple version of an HTTP server) which seems like undue 
> > work.
>
> My opinion is that 'implement HTTP' means 'reading and parse a text 
> stream' - not undue work, even for a 'Web author'.

A compliant HTTP server is MUCH, MUCH more than just parsing headers. 
(Though frankly, even parsing HTTP headers correctly is complicated.)

> > What would the advantage be? We're not connecting to an HTTP server. 
> > Upgrade makes sense if you are upgrading from HTTP to something, but 
> > here we're not expecting HTTP to ever be used over the connection.
>
> The point is that since you want to initiate a connection on port 80, 
> then you should use the protocol assigned to that port. It has specified 
> mechanisms to upgrade to a proprietary protocol - it doesn't cost a 
> whole heck of a lot & you wind up being standards compliant.

I've used this concept now.

> > > > We don't want to require that authors implement an entire HTTP 
> > > > server just to be able to switch to a proprietary protocol.
> > >
> > > Nobody has suggested requiring an entire server. Two messages is all 
> > > it takes. Not only does HTTP scale up well, it scales down too.
> > 
> > No, because you have to implement correct handling of everything 
> > _other_ than Upgrade: as well, even if it is to return "Not Supported" 
> > each time.
>
> Good point.

I've gotten around this by just saying it's not HTTP. If you connect to an 
HTTP server then it's HTTP, and you expect the HTTP server to Do The Right 
Thing, but if you connect straight to a Web Socket server, it doesn't have 
to do HTTP since it's not speaking HTTP. The Web Socket Protocol just 
happens to be compatible with HTTP at a minimal level.

On Thu, 27 Oct 2005, Ted Goddard wrote:
> > > 
> > > Rather than invent another protocol, this seems like an excellent 
> > > application for BEEP:
> > > 
> > > http://www.ietf.org/rfc/rfc3080.txt
> > 
> > Good lord, that protocol is FAR more complicated than it needs to be. 
> > And it doesn't address several of the security issues that are 
> > critical here, such as severely limiting what the initial packets can 
> > contain, and ensuring that the remote host is expecting a connection 
> > initiated by a Web page of the specified domain.
> 
> It may be a bit complicated, but BEEP is well suited to exchanging 
> messages over a variety of transports.  The flow control mechanism 
> proposed in 6.3.7.3 doesn't allow for pipelining, for instance (remember 
> kermit?).

We really don't need much of this complexity though.

> Mike Dierken proposed an HTTP server in the UA.  If BEEP is too 
> complex, at least a pair of HTTP connections could be effectively used 
> for messaging, and would use a well defined protocol with readily 
> available and mature implementations.  Providing a back-channel to the 
> browser will revolutionize web applications, so it's worth making fairly 
> robust.

I agree that it should be robust, but what is it that it needs to be 
robust against?

> 6.3.5.1. Broadcasting over TCP/IP
> 
> IP Multicast would allow multiple UAs on the same host to interact. In 
> particular, this would allow the technology to be demonstrated on a 
> standalone laptop ...  How about IP multicast rather than UDP to 
> 255.255.255.255?

WebSocket as designed now can work on the same IP. (Broadcast is gone, 
however, and there's no way to talk to another UA, not that that is a very 
common use case.)

On Thu, 2 Nov 2006, Dave Raggett wrote:
> 
> well how about an XMLBEEPRequest specification then?
> 
> Beep is kind of like a bidirectional version of HTTP and includes 
> multiplexing capabilities with stream prioritization, see:
> 
>      http://beepcore.org/index.html
>
> Beep isn't in widespread use as yet, but is well thought of by the IETF 
> folks.

BEEP is _way_ overengineered for our needs, whilst still not actually 
satisfying our needs.

On Tue, 17 Apr 2007, Nicholas Shanks wrote:
>
> May I suggest that you also allow the DOM "referrer" attribute to match 
> a HTTP "Referrer" header if one is present, and fall back to the 
> "Referer" header otherwise. This provides for HTML 5 compliant UAs to be 
> forwards compatible with a potential future HTTP spec that fixes the 
> typo.

It's far too late for that. But if the change happens, I'll update the 
spec. No point getting ahead of ourselves.

> Also, the DOM cookie attribute discussion should mention the HTTP 
> Set-Cookie2 header. Don't know what it should say though.

Done.

On Thu, 24 Apr 2008, Michael Carter wrote:
>
> Currently, the TCPConnection constructor implicitly opens a tcp 
> connection. One downside to this is that a user of the api couldn't 
> re-use the TCPConnection object for future connections.

Why is that a problem?

> XMLHttpRequest on the other hand has open() and abort() methods. The 
> same duality should exist for TCPConnection, thus allowing for re-use.

I'm not convinced this is a good feature of XHR.

> A secondary concern is that the usage of the API is tied to the 
> execution model of javascript with respect to concurrency. That is to 
> say, the only good time to attach an onopen, onclose, or onread callback 
> to the TCPConnection object is immediately following its creation. While 
> this may not be a problem and could certainly be worked around in most 
> cases, adding connect() would allow these callbacks to be attached at 
> any point after the creation of the object, but before the explicit call 
> to connect().

You can look at the readyState attribute to see what the state is, if you 
really must. However, why not just not create the connection until you 
need it?

On Tue, 17 Jun 2008, Michael Carter wrote (in a different order):
> 
> I propose that we
> PROPOSAL.1) change the initial handshake of the protocol to be HTTP 
> based to leverage existing solutions to these problems.
> PROPOSAL.2) modify the API to use URIs instead of port/host pairs

Done.

> There is a list of requirements for the protocol:
> 
> <Hixie> basically my main requirements are that:
> HIXIE.1) there be the ability for one process to have a full-duplex
> communication channel to the script running in the web page

Supported.

> HIXIE.2) the server-side be implementable in a fully conformant way in just
> a few lines of perl without support libraries

Supported.

> HIXIE.3) that it be safe from abuse (e.g. can't connect to smtp servers)

Supported.

> HIXIE.4) that it work from within fascist firewalls

Supported, assuming they don't do deep packet inspection (and even then, 
you could use TLS).

> <othermaciej> my two problems with it are: (1) it uses port/host 
> addressing instead of URI addressing, which is a poor fit for the Web 
> model

Fixed.

> <othermaciej> (2) it's bad to send non-http over the assigned ports for 
> http and https

Somewhat fixed, assuming we can get the relevant ports reserved; though 
even if we don't, at least now it looks like HTTP.

> <othermaciej> (3) I am worried that connection to arbitrary ports could 
> lead to security issues, although Hixie tried hard to avoid them

I'm worried about this too, but I think the current mechanism is safe with 
most if not all protocols. I'm very interested in hearing about any 
problems.

> ISSUE.3) inability to traverse forward proxies

Fixed.

> ISSUE.4) lack of cross-domain access control

Fixed.

> ISSUE.5) DNS rebinding security holes

Fixed (using Host:, like HTTP, but confirmed using WebSocket-Location, 
which should be even more secure than HTTP).

> ISSUE.6) lack of load balancing integration

Somewhat fixed.

> ISSUE.7) lack of authentication integration

Somewhat fixed, though this isn't perfect even now.

> ISSUE.8) virtual hosting with secure communication (no Host header, and 
> even if there was, there's no way to indicate this header *before* the 
> secure handshake)

Fixed.

> TLS Upgrade

I haven't supported this, for the same reason RFC2817 isn't supported by 
UAs.

On Wed, 18 Jun 2008, Shannon wrote:
>
> I understand the reasoning but I do not believe this should be limited 
> to ports 80 and 443. By doing so we render the protocol difficult to use 
> as many (if not most) custom services would need to run on another port 
> to avoid conflict with the primary webserver.

The spec allows any ports now, though maybe we should limit ports below 
1024 that aren't the HTTP or WSP ports.

> I propose that there be requirements that limit the amount and type of 
> data a client can send before receiving a valid server response.

Done, except for the length of the URL.

> The requirements should limit:
> * Number of retries per URI
> * Number of simultaneous connections
> * Total number of connection attempts per script domain  (to all URIs)

This isn't currently limited, but implementation feedback should help us 
decide what limits are sensible here.

> There should also be a recommendation that UAs display some form of 
> status feedback to indicate a background connection is occurring.

That's a UI issue, so I haven't done anything regarding this.

> It is always possible that non-http services are running on port 80. One 
> logical reason would be as a workaround for strict firewalls. So the 
> main defense against abuse is not the port number but the handshake. The 
> original TCP Connection spec required the client to send only "Hello\n" 
> and the server to send only "Welcome\n". The new proposal complicates 
> things since the server/proxy could send any valid HTTP headers and it 
> would be up to the UA to determine their validity. Since the script 
> author can also inject URIs into the handshake this becomes a potential 
> flaw.

The handshake now is much stricter than in mcarter's initial proposal.

> Consider the code:
> 
> tcp = TCPConnection('http://mail.domain.ext/\\r\\nHELO HTTP/1.1 101 Switching
> Protocols\\r\\n' )
> 
> client>>
> OPTIONS \r\n
> HELO HTTP/1.1 101 Switching Protocols\r\n
> HTTP/1.1\r\n
> 
> server>>
> 250 mail.domain.ext Hello \r\n
> HTTP/1.1 101 Switching Protocols\r\n
> [111.111.111.111], pleased to meet you

The URL will get escaped, so this isn't a big deal.
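
As a minimal sketch of why escaping defuses the example above, the snippet 
below percent-encodes the injected path. It assumes the user agent escapes 
the path roughly the way Python's urllib.parse.quote does; the real 
escaping rules are whatever the spec and URL parsing require, and 
escape_path is a hypothetical helper name.

```python
from urllib.parse import quote


def escape_path(path: str) -> str:
    """Percent-encode a URL path, leaving '/' and unreserved characters."""
    return quote(path, safe="/")


# The path from the quoted example, with real CR/LF characters in it.
injected = "/\r\nHELO HTTP/1.1 101 Switching Protocols\r\n"
escaped = escape_path(injected)
print(escaped)
# prints /%0D%0AHELO%20HTTP/1.1%20101%20Switching%20Protocols%0D%0A
```

Once escaped, the request line contains no raw CR, LF or space from the 
author-controlled part, so the mail server never sees a separate HELO line.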

Are there any protocols where the sender controls the first few characters 
of the response?

> One last thing. Does anybody know how async communication would affect 
> common proxies (forward and reverse)? I imagine they can handle large 
> amounts of POST data but how do they feel about a forcibly held-open 
> bidirectional communication that never calls POST or GET? How would 
> caches respond without expires or max-age headers? Would this hog 
> threads causing apache/squid to stop serving requests? Would this work 
> through Tor?

The requirements on proxies are the same as for TLS as far as I can tell.

On Wed, 18 Jun 2008, Shannon wrote:
> 
> I agree. Since the aim of the URI injection is to get an echo of a valid 
> header it is important that the server response include illegal URI 
> components that a server would not otherwise send. Newline could be part 
> of a legitimate response from a confused server or one that echoes 
> commands automatically, eg:
> 
> tcp = new 
> TCPConnection('http://mail.domain.ext/Upgrade:TCPConnection/1.0' )
> 
> server>>
> Upgrade:TCPConnection/1.0
> Error: Unrecognized command.
> 
> Unlike my previous example this is a perfectly valid URI. Whatever the 
> magic ends up being it should aim to include illegal URI characters, ie: 
> angle-brackets, white-space, control characters, etc.. in an arrangement 
> that couldn't happen accidentally or through clever tricks. ie:
> 
> Magic: <tcp allow>\r\n
> 
> This example magic includes at least three characters that cannot be 
> sent in a valid URI (space, left angle bracket, right angle-bracket) in 
> addition to the newline and carriage returns.

The handshake now consists of 75 characters including newlines and spaces 
that can't be sent in the initial request (which only includes a URL path 
under author control).

If the protocol accepts escapes and echoes those first, though, there could 
still be a problem. Do any protocols do that?

> > > One last thing. Does anybody know how async communication would 
> > > affect common proxies (forward and reverse)? I imagine they can 
> > > handle large amounts of POST data but how do they feel about a 
> > > forcibly held-open bidirectional communication that never calls 
> > > POST or GET?
> > 
> > That's basically what TLS is, right? The simple solution would be to 
> > just tunnel everything through TLS when you hit an uncooperative 
> > proxy.
> 
> Not with a few lines of perl you don't.

With the Web Socket protocol, you can tunnel through a proxy using the 
TLS-like CONNECT behaviour without using TLS itself.
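
For readers unfamiliar with CONNECT, the following is a minimal sketch of 
the request a client sends to a forward proxy to open a tunnel, and of 
the check on the proxy's answer. The function names are hypothetical, and 
what the client sends through the tunnel afterwards (the Web Socket 
handshake) is not shown.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Build an HTTP CONNECT request asking a proxy to open a raw tunnel."""
    authority = f"{host}:{port}"
    return (
        f"CONNECT {authority} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(status_line: bytes) -> bool:
    """Return True if the proxy's status line carries a 2xx status code."""
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code[:1] == b"2"
```

After a 2xx reply the proxy simply relays bytes in both directions, which 
is the same treatment it gives a TLS connection.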

On Wed, 18 Jun 2008, Frode Børli wrote:
>
> If a TCPConnection is supposed to be able to connect to other services, 
> then some sort of mechanism must be implemented so that the targeted web 
> server must perform some sort of approval. The method of approval must 
> be engineered in such a way that approval process itself cannot be the 
> target of the dos attack.

As far as I can tell, the WebSocket mechanism isn't susceptible to any DOS 
attack that isn't already possible with, say, <img>.

> If the client must send information through the TCPConnection initially, 
> then we effectively stop existing servers such as IRC-servers from being 
> able to accept connections without needing a rewrite.

Correct; that's intentional.

> The protocol should not require any data (not even hello) - it should 
> function as an ordinary TCPConnection similar to implementations in 
> java, c# or any other major programming language. If not, it should be 
> called something else - as it is not a TCP connection.

Agreed. I've called it Web Socket.

On Wed, 18 Jun 2008, Frode Børli wrote:
> 
> It should not be allowed to connect to any other host or ip-address than 
> the IP-address where the script was retrieved from. Exactly the same 
> security policy is enforced in Java applets and Flash. If the javascript 
> should be able to connect to other servers, then there should be some 
> sort of mechanism involving certificates and possibly also a TXT record 
> in the DNS for each server that can be accessed.

In this day of mashups and other cross-site cooperations, I think it's 
important that we design this to be capable of cross-site communication 
from the start.

> Still I do not believe it should have a specific protocol. If a protocol 
> is decided on, and it is allowed to connect to any IP-address - then 
> DDOS attacks can still be performed: If one million web browsers connect 
> to any port on a single server, it does not matter which protocol the 
> client tries to communicate with. The server will still have problems.

Given that this already exists as a problem for regular HTTP, HTTPS, FTP, 
and indeed almost any protocol accessible from Web browsers, I don't think 
it's a big deal.

On Thu, 19 Jun 2008, Shannon wrote:
> 
> I fail to see how virtual hosting will work for this anyway. I mean 
> we're not talking about Apache/IIS here, we're talking about custom 
> applications, scripts or devices - possibly implemented in firmware or 
> "a few lines of perl". Adding vhost control to the protocol is just 
> silly since the webserver won't ever see the request and the customer 
> application should be able to use any method it likes to differentiate 
> its services. Even URI addressing is silly since again the application 
> may have no concept of "paths" or "queries". It is simply a service 
> running on a port. The only valid use case for all this added complexity 
> is proxying but nobody has tested yet whether proxies will handle this 
> (short of enabling encryption, and even that is untested).

Just because we want it to be possible to implement a server in a few 
lines doesn't preclude making it possible to have a complex solution that 
_does_ do virtual hosting, multiple URIs, etc. Maybe Apache will grow to 
have a module to support this natively.

> I'm thinking here that this proposal is basically rewriting the CGI 
> protocol (web server handing off managed request to custom scripts) with 
> the ONLY difference being the asynchronous nature of the request. 
> Perhaps more consideration might be given to how the CGI/HTTP protocols 
> might be updated to allow async communication.

I'm certainly open to suggestions, but so far the Web Socket protocol 
based on mcarter's ideas has been the best as far as I can tell.

> Having said that I still see a very strong use case for low-level 
> client-side TCP and UDP. There are ways to manage the security risks 
> that require further investigation. Even if it must be kept same-domain 
> that is better than creating a new protocol that won't work with 
> existing services. Even if that sounds like a feature - it isn't. There 
> are better ways to handle access-control for non-WebConnection devices 
> than sending garbage to the port.

I don't really see what you mean here.

> [WebSocket DOS attacks are] more harmful because an img tag (to my 
> knowledge) cannot be used to brute-force access, whereas a socket 
> connection could. With the focus on DDOS it is important to remember 
> that these sockets will enable full read/write access to arbitrary 
> services whereas existing methods can only write once per connection and 
> generally not do anything useful with the response.

The WebSocket protocol can't be used for read/write until the handshake 
has been received.

On Thu, 19 Jun 2008, Frode Børli wrote:
> 
> Could you test keeping the same connection as the webpage was fetched 
> from, open? So that when the server script responds with its HTML-code - 
> the connection is not closed, but kept alive for two-way 
> communications?

Wouldn't that interfere with HTTP's pipelining?

On Fri, 20 Jun 2008, Frode Børli wrote:
>
>    <input id='test' type='button'>
>    <script type='text/javascript'>
>       // when the button is clicked, raise the test_click event
>       // handler on the server.
>       document.getElementById('test').addEventListener('click',
>          document.serverSocket.createEventHandler('test_click'));
>       // when the server raises the "message" event, alert the message
>       document.serverSocket.addEventListener('message', alert);
>    </script>
> <?php
> // magic PHP method that is called whenever a client side event is
> // sent to the server
> function __serverSocketEvent($name, $event)
> {
>     if($name == 'test_click')
>        server_socket_event("message", "You clicked the button");
> }
> ?>

You could do things like this using WebSocket, yes; it would take some 
library support, but that could be written easily enough.

> If a Session ID (or perhaps a Request ID) is added to the headers then 
> it is possible to create server side logic that makes it easier for web 
> developers. When session ids are sent through cookies, web servers and 
> proxy servers have no way to identify a session (since only the script 
> knows which cookie is the session id). The SessionID header could be 
> used by load balancers and more - and it could also be used by, for 
> example, IIS/Apache to connect a secondary socket to the script that 
> created the page (and ultimately achieving what I want).

If sent to the same host and port, the same cookies will be sent too.

> Doh, of course! :) So I am going back to my first suggestion - the server 
> with the script must have a certificate as well. The script must be 
> signed with a private key, and the DNS server must have the public key.

That sounds way more complicated than just asking the server directly the 
way that Web Socket does.

> I care more about how it works for the developer than how the protocol 
> itself is implemented. I think maybe the protocol should be discussed 
> with others or have its own WG.

This is simple enough that it really doesn't need its own WG. Indeed, 
having a new WG for this would likely just result in the protocol being 
overengineered.

> The script that generates the page should be able to communicate with 
> the page it generated. The page should also be able to connect to a 
> separate script, if the web developer thinks that it is important.

This is possible with WebSocket. Just fire up a listener on a unique port, 
then tell the remote end to use that port, and stop listening to it as 
soon as you have a connection (or keep listening to let the remote end 
connect again in the face of network errors). This can all be done from 
the same script. You could support tens of thousands of connections per IP 
that way, probably way more than you'd ever want to actually host on a 
single machine given that each connection would imply its own script in 
this kind of setup.
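
The port lifecycle described above can be sketched as follows. This only 
shows the listener side (pick a unique port, accept one connection, stop 
listening); the Web Socket handshake and the step of telling the page 
which port to use are left out, and both function names are hypothetical.

```python
import socket


def open_one_shot_listener(host: str = "127.0.0.1") -> socket.socket:
    """Listen on a port chosen by the OS, so every script gets its own."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 asks the OS for any unused port
    listener.listen(1)
    return listener


def accept_once(listener: socket.socket) -> socket.socket:
    """Accept a single connection, then stop listening on the port."""
    conn, _addr = listener.accept()
    listener.close()  # no further connections are accepted on this port
    return conn
```

The script would read the chosen port with listener.getsockname()[1], 
embed it in the page it generates, and then call accept_once(). To allow 
reconnects after network errors, skip the close() and keep accepting.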

On Fri, 20 Jun 2008, Shannon wrote:
> 
> I propose a new protocol called Asynchronous CGI that extends Fast CGI to 
> support asynchronous and persistent channels rather than the creation of 
> an entirely new WebSockets protocol from scratch.

This is out of scope for this WG, but I recommend going ahead with this 
work in a more appropriate venue, as it would help Web Socket gain 
adoption.

> I have already provided two examples in previous posts but to 
> reiterate quickly this protocol as currently described can be 
> manipulated to allow a full challenge-response process. This means I can 
> make every visitor's browser continually attempt username/password 
> combinations against a service, detect when access is granted, and 
> continue to send commands following the handshake.

This does not appear to be the case with the Web Socket API as defined.

> IMG and FORM allow at most a single request to be sent before 
> closing the connection and generally return the data in a form that 
> cannot be inspected inside javascript. I have shown that by injecting a 
> custom URI into the handshake I can theoretically force a valid server 
> response to trick the browser into keeping the connection open for the 
> purpose of DDOS or additional attacks.

If this is possible with the Web Socket protocol as defined, I would be 
very interested in hearing more about this.

On Fri, 20 Jun 2008, Philipp Serafin wrote:
>
> Idea: Add an additional HTML element. If it is present, the browser will 
> not close the connection after it downloaded the document, but instead 
> send an OPTIONS <uri> Upgrade: ... request and give the 
> page's scripts access to a default WebSocket object that represents this 
> connection.

I don't think that's necessary, really. It would also be far more 
complicated to set up than Web Sockets are.

> Consider the following scenario:
> 
> Bob and Eve have bought space on a run-of-the-mill XAMPP web hoster. 
> They have different domains but happen to be on the same IP. Now Eve 
> wants to brute-force Bob's password-protected web application. So she 
> adds a script to her relatively popular site that does the following:
> 
> 1) Open a TCP connection to her own domain on port 80. As far as the 
> browser is concerned, both origin and IP address match the site's, so 
> no cross domain checks are performed.

Actually the WebSocket protocol as defined does require a handshake even 
in this case, and requires it to both acknowledge its hostname as well as 
acknowledge the origin of the script that triggered the connection.
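
A minimal sketch of the client-side check this implies is below. It 
assumes the server acknowledges the script's origin in a 
"WebSocket-Origin" header and its own URL in "WebSocket-Location", and 
that the reply starts with an HTTP/1.1 101 status line; the first header 
name, the "ws://" URLs and the function names are assumptions for 
illustration, not quotes from the spec.

```python
def parse_headers(raw: bytes) -> dict:
    """Parse the header block of a handshake reply into a lowercase dict."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def handshake_acknowledges(raw: bytes, script_origin: str,
                           requested_url: str) -> bool:
    """Accept the reply only if it names our origin and the URL we asked for."""
    if not raw.startswith(b"HTTP/1.1 101 "):
        return False
    headers = parse_headers(raw)
    return (headers.get("websocket-origin") == script_origin      # assumed name
            and headers.get("websocket-location") == requested_url)
```

In the Bob and Eve scenario, a reply from Eve's server names Eve's 
hostname, so it cannot be used to vouch for a connection the script 
requested to Bob's hostname on the same IP.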

> We could strengthen the security requirements, so that even same-domain 
> requests need permission. However, then we had about the same hole as 
> soon as the web host updates its services and gives Eve permission to 
> access "her own" site.

WebSocket as defined appears to be immune to this, and as an added bonus, 
doesn't have any of the DNS complexity.

On Tue, 24 Jun 2008, Philipp Serafin wrote:
> 
> If this works, we could extend Michael's original algorithm as follows 
> (this would be in addition to the "new WebSocket()" interface and would 
> not replace it)
> 
> PROPOSAL: Turning an existing HTTP connection into a WebSocket 
> connection:
> 
> If the server sends a Connection: Upgrade header and an Upgrade header 
> with a "WebSocket" token as part of a normal response and if the 
> resource fetched established a browsing context, the client must not 
> issue any other requests on that connection and must initiate a protocol 
> switch. After the switch has finished, the client would expose the 
> connection to the application via a DefaultWebSocket property or 
> something similar.
> 
> An exchange could look like this:
> 
> C: GET /uri HTTP/1.1
> C: Host: example.com
> C: [ ... usual headers ... ]
> C:
> 
> S: HTTP/1.1 200 OK
> S: Content-Type: text/html
> S: [ ... usual headers ... ]
> S: Upgrade: WebSocket/1.0
> S: Connection: Upgrade
> S:
> S: [ ... body ... ]
> 
> C: OPTIONS /uri HTTP/1.1
> C: Host: example.com
> C: Upgrade: WebSocket/1.0
> C: Connection: Upgrade
> C:
> 
> S: HTTP/1.1 101 Switching Protocols
> S: Upgrade: WebSocket/1.0
> S: Connection: Upgrade
> S:
> 
> C/S: [ ... application specific data ... ]
> 
> Because the connection would be same-origin pretty much per definition, 
> no access checks would be needed in that situation.
> 
> Would something like this be doable and wanted?

This seems way more complex than necessary to address the use cases. Maybe 
in a future version, though.

[snip more e-mails on the same theme that don't really propose anything 
new that hasn't been mentioned above; please do let me know if I missed a 
WebSocket design problem or a proposal that deserves further thought]

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'