[whatwg] Issues with Web Sockets API

Tue Jul 28 16:40:49 PDT 2009

On Tue, 14 Jul 2009, Jeremy Orlow wrote:
> > >
> > > I think 'readyState' should just go away since an application will 
> > > have to keep track of state updates through the fired events and use 
> > > try/catch blocks around all API calls anyway.
> >
> > The attribute is mostly present for debugging purposes. I wouldn't 
> > expect anyone to actually use it for production work.
> 
> Is there precedent for other portions of the API that are mostly for 
> debugging purposes?  (I can't think of anything off the top of my head.)

readyState on Document and <video> aren't really useful for anything but 
debugging either, as far as I can tell.

> Also, maybe it should be noted as such in the spec?

I don't really see much benefit to including such a statement; if someone 
wants to use it for a non-debugging reason, why not do so?

> If it's only for debugging purposes, maybe a cleaner way to define it is 
> to simply be the last event fired on a given WebSocket?

I don't really understand what problem we would be trying to solve by 
changing that.

> One other random question:  in the IDL for WebSockets, the three 
> constants for ready state are all defined as shorts but the value of 
> ready state is a long.  Is this an oversight?

Fixed.

On Mon, 27 Jul 2009, Alexey Proskuryakov wrote:
> 
> I agree with Michael that send() should not silently drop data that 
> could not be sent. It is very easy to fill send buffers, and if bytes 
> get silently dropped, implementing app-level acks becomes quite 
> difficult.

I've made it clear that if bytes can't be sent, the connection must be 
closed.

> However, I do not think that raising an exception is an appropriate 
> answer. Often, the TCP implementation takes a part of data given to it, 
> and asks to resubmit the rest later. So, just returning an integer 
> result from send() would be best in my opinion.

I think we are best off abstracting away this level of complexity from 
authors, especially since we'd need to make sure that data was not sent 
half-way through a UTF-8 sequence, and since the framing is under the 
control of the UA, not the application. There's no way to retry a 
partially-successful send() from the API here.

> 1) Web Sockets is specified to send whatever authentication credentials 
> the client has for the resource. However, there is no challenge-response 
> sequence specified, which seems to prevent using common auth schemes. 
> HTTP Basic needs to know an authentication realm for the credentials, 
> and other schemes need a cryptographic challenge (e.g. nonce for Digest 
> auth).

I expect to address this in more detail in a future version. For now, use 
in-band authentication in the WebSocket once you are connected. We may 
find that that is actually enough.
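
As a rough sketch of what in-band authentication could look like: the 
helper name authenticateInBand and the JSON message shape below are 
assumptions made up for illustration. Nothing in the spec defines them, 
and the server would have to agree on the format.

```javascript
// Hypothetical helper: send credentials as the first message once the
// socket is open. The { type, token } message shape is an assumption;
// the server defines what it actually expects.
function authenticateInBand(socket, token) {
  socket.onopen = function () {
    socket.send(JSON.stringify({ type: 'auth', token: token }));
  };
}
```

A server using a scheme like this would presumably close the connection 
if the first message isn't a valid credential.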

> 2) It is not specified what the server does when credentials are 
> incorrect, so I assume that the intended behavior is to close the 
> connection. Unlike HTTP 401 response, this doesn't give the client a 
> chance to ask the user again. Also, if the server is on a different 
> host, especially one that's not shared with an HTTP server, there isn't 
> a way to obtain credentials, in the first place.

How we address this will likely depend on how we address the earlier 
point.

> 3) A Web Sockets server cannot respond with a redirect to another URL. 
> I'm not sure if the intention is to leave this to implementations, or to 
> add in Web Sockets v2, but it definitely looks like an important feature 
> to me, maybe something that needs to be in v1.

What's the use case? Why does this need to be at the connection layer 
rather than the application layer? (Why would we need this, when TCP 
doesn't have it? Would you also need "redirect"-like functionality in IRC, 
IMAP, SSH, and other such protocols?)

> 4) "If the user agent already has a Web Socket connection to the remote 
> host identified by /host/ (even if known by another name), wait until 
> that connection has been established or for that connection to have 
> failed."
>
> It doesn't look like "host identified by /host/" is defined anywhere. 
> Does this requirement say that IP addresses should be compared, instead 
> of host names?

Right. I've tried to clarify this.

> I'm not sure if this is significant for preventing DoS attacks, and 
> anyway, the IP address may not be known before a request is sent. This 
> puts an unusual burden on the implementation.

Without this requirement, you can just have a DNS server return the victim 
IP for a wildcard DNS entry, and then just have attackers open connections 
to thousands of "hosts".
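
To illustrate why the comparison has to be on addresses rather than 
names, here is a minimal sketch of the bookkeeping a UA might keep. The 
function createConnectGate and its method names are hypothetical, and 
the addresses in the usage note are documentation addresses, not real 
hosts.

```javascript
// Hypothetical bookkeeping for "one connection attempt per remote host
// at a time". Keyed by resolved IP address, so many host names pointing
// at one victim address share a single slot.
function createConnectGate() {
  var connecting = new Set(); // IPs with a handshake in progress
  return {
    // Returns true if the caller may start connecting to this IP now.
    tryBegin: function (ip) {
      if (connecting.has(ip)) {
        return false; // caller must wait for the earlier attempt
      }
      connecting.add(ip);
      return true;
    },
    // Call once the attempt has been established or has failed.
    finish: function (ip) {
      connecting.delete(ip);
    }
  };
}
```

With this, a.example and b.example both resolving to 192.0.2.1 would 
contend for the same slot, whereas a gate keyed on host names would let 
both attempts through at once.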

> 5) We probably need to specify a keep-alive feature to avoid proxy 
> connection timeout. I do not have factual data on whether common proxies 
> implement connection timeout, but I'd expect them to often do.

This seems like something that would be easy to deal with at the 
application layer, if desired.
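
For instance, an application-level keep-alive could be as small as the 
sketch below. The 'ping' payload and the names sendPing and 
startKeepAlive are assumptions for illustration; the server would need 
to know to ignore (or answer) such messages.

```javascript
// Hypothetical application-level keep-alive.
function sendPing(socket) {
  // readyState 1 is OPEN in the WebSocket API.
  if (socket.readyState === 1) {
    socket.send('ping');
    return true;
  }
  return false;
}

function startKeepAlive(socket, intervalMs) {
  var timer = setInterval(function () {
    sendPing(socket);
  }, intervalMs);
  // Return a function that stops the keep-alive.
  return function stop() {
    clearInterval(timer);
  };
}
```

A page might call startKeepAlive(socket, 30000) from its open handler 
and call the returned function from its close handler; the interval 
would have to be shorter than whatever timeout the proxy applies.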

> 6) The spec should probably explicitly permit blocking some ports from 
> use with Web Sockets at UA's discretion. In practice, the list would 
> likely be the same as for HTTP, see e.g. 
> <http://www.mozilla.org/projects/netlib/PortBanning.html>.

Done.

> 7) "use a SOCKS proxy for WebSocket connections, if available, or failing
> that, to prefer an HTTPS proxy over an HTTP proxy"
> 
> It is not clear what definition of proxy types is used here. To me, an HTTPS
> proxy is one that supports CONNECT to port 443, and an HTTP proxy (if we're
> making a distinction from HTTPS) is one that intercepts and forwards GET
> requests. However, this understanding contradicts an example in paragraph
> 3.1.3, and also, it's not clear how a GET proxy could be used for Web Sockets.

Clarified, I hope.

> 8) Many HTTPS proxies only allow connecting to port 443. Do you have the 
> data on whether relying on existing proxies to establish connections to 
> arbitrary ports is practical?

I do not. I expect most people to use direct connections over port 81 or 
TLS over port 443, as discussed in the introduction.

> 9) "There is no limit to the number of established Web Socket 
> connections a user agent can have with a single remote host".
> 
> Does this mean that Web Socket connections are exempt from the normal 
> 4-connection (or so) limit? Why is it OK?

That limit is an HTTP limit. WebSocket is not an HTTP protocol, so the 
limit has no bearing on WebSocket.

As I understand it, the limit in HTTP is intended to deal with the problem 
of multiple short-lived connections being needed to render a page, e.g. 
going to a Web page with thousands of <img>s. There would be no way for 
the author to ensure the page didn't DoS the server in such a case. This 
is not a concern with WebSocket, where the author controls when the 
connections are made.

> 10) Web Socket handshake uses CRLF line endings strictly. Does this add 
> much to security? It prevents using telnet/netcat for debugging, which 
> is something I personally use often when working on networking issues.
> 
> If there is no practical reason for this, I'd suggest relaxing this 
> aspect of parsing.

Do you mean client->server or server->client?

> 11) There is no way for the client to know that the connection has been
> closed. For example:
> - socket.close() is called from JavaScript;
> - onclose handler is invoked;
> - more data arrives from the server, and onmessage is dispatched (which I
> think is correct, and it matches what TCP does);
> - finally, a TCP FIN arrives, indicating that there will be no more data from
> the server (the underlying TCP connection is in TIME_WAIT state after that);
> - the client never learns that the server is done sending data.

The onclose only fires once the connection has closed, which is after the 
TCP FIN, so it happens after the last 'message' event.

> As Web Sockets are basically at the same level as TCP, and TCP provides 
> complete info about socket state, I don't think that delegating 
> connection closing to app-level protocols would be appropriate.

I'm not sure what you mean.

On Mon, 27 Jul 2009, Maciej Stachowiak wrote:
> 
> With WebSocket, another possibility is for the implementation to buffer 
> pending data that could not yet be sent to the TCP layer, so that the 
> client of WebSocket doesn't have to be exposed to system limitations. At 
> that point, an exception is only needed if the implementation runs out 
> of memory for buffering. With a system TCP implementation, the buffering 
> would be in kernel space, which is a scarce resource, but user space 
> memory inside the implementation is no more scarce than user space 
> memory held by the Web application waiting to send to the WebSocket.

Indeed.

On Mon, 27 Jul 2009, Alexey Proskuryakov wrote:
> 
> I agree that this will help if the application sends data in burst mode, 
> but what if it just constantly sends more than the network can transmit? 
> It will never learn that it's misbehaving, and will just take more and 
> more memory.

I've added an attribute that says how much data has been buffered, so an 
application can tell if this number is rising unexpectedly.
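
A minimal sketch of how an application might use the attribute, assuming 
updates that are safe to skip (e.g. position updates in a game). The 
threshold and the helper name sendIfNotBackedUp are made up for 
illustration.

```javascript
// Hypothetical helper: skip a non-essential update when too much data
// is already queued. The threshold is an arbitrary illustrative value.
var MAX_BUFFERED_BYTES = 64 * 1024;

function sendIfNotBackedUp(socket, message) {
  if (socket.bufferedAmount > MAX_BUFFERED_BYTES) {
    return false; // backlog is growing; caller may coalesce or skip
  }
  socket.send(message);
  return true;
}
```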

> An example where adapting to network bandwidth is needed is of course 
> file uploading, but even if we dismiss it as a special case that can be 
> served with custom code, there's also e.g. captured video or audio that 
> can be downgraded in quality for slow connections.

We may have to do more complex things when we introduce files and streams, 
but in practice I expect those to be a non-issue since the UA would take 
care of them completely with just one send() call.

   function upload(file) {
     websocket.send(file);
   }

   websocket.startSendingStream(camera.stream);
   ...
   websocket.stopSendingStream(camera.stream);

...or something. Those are in fact far easier to deal with than just 
continuous updates of the user's progress in a game or some such.

On Mon, 27 Jul 2009, Jeremy Orlow wrote:
> 
> Maybe the right behavior is to buffer in user-space (like Maciej 
> explained) up until a limit (left up to the UA) and then anything beyond 
> that results in an exception.  This seems like it'd handle bursty 
> communication and would keep the failure model simple.

Running out of space is hitting a hardware limitation, at which point you 
can do whatever you like (the spec doesn't require any particular 
behaviour in such scenarios, since what is possible depends on the UA).

I have, however, made the spec clear that if the send() fails somehow, the 
connection must be closed.

On Mon, 27 Jul 2009, Alexey Proskuryakov wrote:
> 
> Having a send() that doesn't return anything and doesn't raise 
> exceptions would be a clear signal that send() just blocks until it's 
> possible to send data to me, and I'm sure to many others, as well. There 
> is no reason to silently drop data sent over a TCP connection - after 
> all, we could as well base the protocol on UDP if we did, and lose 
> nothing.

I think returning a boolean is more or less the same as "silently 
dropping" in practice.

On Mon, 27 Jul 2009, Drew Wilson wrote:
> 
> There's another option besides blocking, raising an exception, and 
> dropping data: unlimited buffering in user space. So I'm saying we 
> should not put any limits on the amount of user-space buffering we're 
> willing to do, any more than we put any limits on the amount of other 
> types of user-space memory allocation a page can perform.

Agreed.

On Mon, 27 Jul 2009, Jeremy Orlow wrote:
> 
> I agree with Alexey that applications need feedback when they're 
> consistently exceeding what your net connection can handle.  I think 
> an application getting an exception rather than filling up its buffer 
> until it OOMs is a much better experience for the user and the web 
> developer.

True. The .bufferedAmount attribute will now allow this.

> If you have application level ACKs (which you probably 
> should--especially in high-throughput uses), you really shouldn't even 
> hit the buffer limits that a UA might have in place.  I don't really 
> think that having a limit on the buffer size is a problem and that, if 
> anything, it'll promote better application level flow control.

Probably true also.

On Mon, 27 Jul 2009, Drew Wilson wrote:
> 
> I'm assuming that no actual limits would be specified in the 
> specification, so it would be entirely up to a given UserAgent to decide 
> how much buffering it is willing to provide. Doesn't that imply that a 
> well-behaved web application would be forced to check for exceptions 
> from all send() invocations, since there's no way to know a priori 
> whether limits imposed by an application via its app-level protocol 
> would be sufficient to stay under a given user-agent's internal limits?

Without the recent changes, yes.

> Even worse, to be broadly deployable the app-level protocol would have 
> to enforce the lowest-common-denominator buffering limit, which would 
> inhibit throughput on platforms that support higher buffers. In 
> practice, I suspect most implementations would adopt a "just blast out 
> as much data as possible until the system throws an exception, then set 
> a timer to retry the send in 100ms" approach. But perhaps that's your 
> intention? If so, then I'd suggest changing the API to just have a 
> "canWrite" notification like other async socket APIs provide (or 
> something similar) to avoid the clunky catch-and-retry idiom.

The attribute now lets you just wait until the buffer is empty, which is 
more or less equivalent, I think.
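
Roughly, waiting for the buffer to empty could look like the sketch 
below. It assumes the application polls on a timer, since there is no 
"buffer drained" event here; createDrainingSender and its methods are 
hypothetical names.

```javascript
// Hypothetical sender that queues messages and hands the next one to
// send() only once bufferedAmount has dropped to zero.
function createDrainingSender(socket) {
  var queue = [];
  return {
    enqueue: function (message) {
      queue.push(message);
    },
    // Call periodically, e.g. from setInterval. Returns true if a
    // message was passed to send() on this call.
    pump: function () {
      if (queue.length === 0 || socket.bufferedAmount > 0) {
        return false;
      }
      socket.send(queue.shift());
      return true;
    },
    // Number of messages still waiting in the application's own queue.
    pending: function () {
      return queue.length;
    }
  };
}
```

This avoids the catch-and-retry idiom, at the cost of the application 
holding its own queue.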

On Mon, 27 Jul 2009, Maciej Stachowiak wrote:
> 
> I think even unlimited buffering needs to be combined with at least a 
> hint to the WebSocket client to back off the send rate, because it's 
> possible to send so much data that it exceeds the available address 
> space, for example when uploading a very large file piece by piece, or 
> when sending a live media stream that requires more bandwidth than the 
> connection can deliver. In the first case, it is possible, though highly 
> undesirable, to spool the data to be sent to disk; in the latter case, 
> doing that would just inevitably fill the disk. Obviously we need more 
> web platform capabilities to make such use cases a reality, but they are 
> foreseeable and we should deal with them in some reasonable way.

Both the live stream and the file case are actually far easier for us to 
deal with, as noted above, than just lots of generated text data.

On Tue, 28 Jul 2009, Robert O'Callahan wrote:
>
> Why not just allow unlimited buffering, but also provide an API to query 
> how much data is currently buffered (approximate only, so it would be OK 
> to just return the size of data buffered in user space)?
> 
> Then applications that care and can adapt can do so. But most 
> applications will not need to. The problem of partial writes being 
> incorrectly handled is pernicious and I definitely think partial writes 
> should not be exposed to applications.

That's what I've done.

On Mon, 27 Jul 2009, Michael Nordman wrote:
> 
> The proposed websocket interface is too dumbed down. The caller doesn't 
> know what the impl is doing, and the impl doesn't know what the caller 
> is trying to do. As a consequence, there is no "reasonable" action that 
> either can take when buffers start overflowing. Typically, the network 
> layer provides sufficient status info to its caller, allowing the 
> higher level code to do something reasonable in light of how the network 
> layer is performing. That kind of status info is simply missing from the 
> websocket interface. I think it's possible to add to the interface 
> features that would facilitate more demanding use cases without 
> complicating the simple use cases. I think that would be an excellent 
> goal for this API.

Do the minimal new additions address this to your satisfaction?

On Mon, 27 Jul 2009, Drew Wilson wrote:
> 
> I would suggest that the solution to this situation is an appropriate 
> application-level protocol (i.e. acks) to allow the application to have 
> no more than (say) 1MB of data outstanding.
> 
> I'm just afraid that we're burdening the API to handle degenerative 
> cases that the vast majority of users won't encounter. Specifying in the 
> API that any arbitrary send() invocation could throw some kind of "retry 
> exception" or return some kind of error code is really really 
> cumbersome.

I agree that we aren't talking about a particularly common case.
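
For the applications that do care, the ack-based approach Drew describes 
could be sketched as below. The ack protocol itself (the server 
reporting how much it has processed) is assumed, and 
createWindowedSender is a hypothetical name. Sizes are counted with 
string length, which is only an approximation of bytes on the wire.

```javascript
// Hypothetical ack-based window: keep at most `limit` units of data
// unacknowledged, queueing anything beyond that in the application.
function createWindowedSender(socket, limit) {
  var outstanding = 0;
  var queue = [];

  function flush() {
    // An oversized message is still sent once nothing is outstanding,
    // so it cannot block the queue forever.
    while (queue.length > 0 &&
           (outstanding === 0 || outstanding + queue[0].length <= limit)) {
      var message = queue.shift();
      outstanding += message.length; // UTF-16 code units, not bytes
      socket.send(message);
    }
  }

  return {
    send: function (message) {
      queue.push(message);
      flush();
    },
    // Call when the server acknowledges `count` units of data.
    acknowledge: function (count) {
      outstanding = Math.max(0, outstanding - count);
      flush();
    },
    outstanding: function () {
      return outstanding;
    }
  };
}
```

With a limit of 1024 * 1024 this would match the 1MB figure suggested 
above.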

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'