[whatwg] WebSockets feedback

Thu Jul 22 13:18:46 PDT 2010

On Wed, 21 Apr 2010, Simon Pieters wrote:
>
> WebSocket establish a WebSocket connection:
> 
> [[
> 30. If code is not three bytes long, or if any of the bytes in code are not in
> the range 0x30 to 0x39, then fail the WebSocket connection and abort these
> steps.
> ]]
> 
> This step seems entirely redundant. The next step checks for "101" or 
> "407".

Removed.

On Wed, 21 Apr 2010, Simon Pieters wrote:
>
> WebSocket establish a WebSocket connection:
> 
> [[
> 41. ...
> If the entry's name is "upgrade"
> If the value is not exactly equal to the string "WebSocket", then fail the
> WebSocket connection and abort these steps.
> ]]
> 
> Reading the client's opening handshake:
> 
> [[
> Upgrade
> Invariant part of the handshake. Will always have a value that is an ASCII
> case-insensitive match for the string "WebSocket".
> 
> Can be safely ignored, though the server should abort the WebSocket connection
> if this field is absent or has a different value, to avoid vulnerability to
> cross-protocol attacks.
> ]]
> 
> Why should the client compare case-sensitively but the server 
> case-insensitively?

Fixed.
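
For illustration, an ASCII case-insensitive match of the kind the server-side text describes can be sketched as follows. The function name is hypothetical, and the sketch assumes the field value arrives as raw bytes off the wire; it does not show which way the spec was changed.

```python
def upgrade_value_matches(value: bytes) -> bool:
    """Return True if value is an ASCII case-insensitive match for "WebSocket".

    bytes.lower() folds only the ASCII letters A-Z, so non-ASCII bytes are
    compared exactly; leading or trailing whitespace is not stripped.
    """
    return value.lower() == b"websocket"
```

With this check both "websocket" and "WEBSOCKET" are accepted, while a value with surrounding spaces or any other token is rejected.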

On Thu, 22 Apr 2010, Simon Pieters wrote:
>
> WebSocket data framing
> 
> [[
> 8. If the frame type is 0xFF and the length was 0, then run the following
> substeps:
> ]]
> 
> This will be true for 0xFF 0x80 0x00, or any number of leading 0x80 bytes in
> length. Presumably the frame should only be treated as a closing handshake if
> it was 0xFF 0x00.

Why?
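
To make the case under discussion concrete, here is a minimal sketch of length decoding for frame types with the high bit set. It assumes the draft's encoding is seven bits of length per byte, with the high bit marking "more length bytes follow"; the function names are hypothetical and not taken from the spec.

```python
from typing import Tuple


def read_frame_length(data: bytes, offset: int = 1) -> Tuple[int, int]:
    """Decode the length field that follows the frame type byte.

    Assumption: each length byte contributes its low 7 bits, and a set high
    bit means another length byte follows.
    Returns (length, offset of the first byte after the length field).
    """
    length = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated length field")
        b = data[offset]
        offset += 1
        length = length * 128 + (b & 0x7F)
        if not (b & 0x80):
            return length, offset


def is_closing_frame(data: bytes) -> bool:
    """True if the frame type is 0xFF and the decoded length is 0."""
    if not data or data[0] != 0xFF:
        return False
    length, _ = read_frame_length(data)
    return length == 0
```

Under that assumption, 0xFF 0x00, 0xFF 0x80 0x00 and 0xFF 0x80 0x80 0x00 all decode to type 0xFF with length 0, which is the behaviour the quoted comment describes.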

On Thu, 22 Apr 2010, Simon Pieters wrote:
>
> establishing a WebSocket connection:
> 
> [[
> 41. ... or if there are any entries in the fields list whose names are the
> empty string, then fail the WebSocket connection and abort these steps. ...
> ]]
> 
> I think it is better to check for this while parsing the fields, by checking
> if the name byte array is empty here:
> 
> [[
> 34. Read a byte from the server.
> 
> ...
> If the byte is 0x3A (ASCII :)
> Move on to the next step.
> ]]

Why?

On Fri, 23 Apr 2010, Simon Pieters wrote:
>
> The "establish a WebSocket connection" algorithm is very specific about 
> when to bail out. This is annoying. It means we have to reimplement 
> header parsing just for WebSockets, when we already have very well 
> tested header parsing in place. We'd like to be able to bail out or wait 
> for more data as we see fit for the opening handshake. It does not seem 
> to be interop-sensitive whether one browser bails out a few bytes 
> earlier than another browser for an invalid handshake. The difference is 
> only observable in carefully crafted cases where the server sleeps 
> half-way through the handshake.
> 
> For instance, the algorithm says to wait for an 0x0A byte and only then 
> check the status code. We want to check the status code earlier, 
> byte-for-byte.
> 
> For the fields, the algorithm says to do some processing while parsing 
> each field, and then do further processing when they're all received. 
> We'd like to defer processing them until we have them all.
> 
> (This feedback makes the earlier feedback about moving processing of 
> fields with empty name moot.)

I've made this implementation difference conforming, since it's harmless.

On Fri, 30 Apr 2010, Simon Pieters wrote:
>
> start the WebSocket closing handshake:
> 
> [[
> Note: The closing handshake finishes once the server returns the 0xFF packet,
> as described above.
> ]]
> 
> I assume it should say 0xFF frame, not packet.

Fixed.

> This note is only true when the client sends the closing handshake first.

Fixed.

> ...except that I can't find anywhere in the server part to send 0xFF 0x00 when
> it receives 0xFF 0x00 from the client. I just see:
> 
> [[
> 5. If type is 0xFF and length is 0, then set the client terminated flag 
> and abort these steps. All further data sent by the client should be 
> discarded.
> ]]
> 
> and:
> 
> [[
> At any time, the server may decide to terminate the WebSocket connection by
> running through the following steps:
> 
> Send a 0xFF byte and a 0x00 byte to the client to indicate the start of the
> closing handshake.
> 
> Wait until the client terminated flag has been set, or until a server-defined
> timeout expires.
> 
> Close the WebSocket connection.
> ]]
> 
> I'm confused at this point how the closing handshake is supposed to work.

I've added a sentence that links these things together a bit more.

(In general this is a bit loose because I don't really see any advantage 
to making a server non-conforming just because it hangs around doing 
nothing for a while after the client sends the 0xFF frame, so nothing 
forces the server to respond immediately.)
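
A minimal sketch of how the two quoted pieces might fit together on the server side. It assumes the server answers a client's 0xFF 0x00 with its own 0xFF 0x00 if it has not already sent one; that reply rule is an assumption for illustration, not quoted spec text. The class and attribute names are hypothetical, and the server-defined timeout is left out.

```python
class ServerCloseState:
    """Tracks the closing handshake from the server's point of view."""

    CLOSING_FRAME = b"\xff\x00"

    def __init__(self) -> None:
        self.client_terminated = False  # set when the client's 0xFF 0x00 arrives
        self.server_terminated = False  # set once we have sent our 0xFF 0x00
        self.outgoing = bytearray()     # bytes queued for the client

    def start_closing_handshake(self) -> None:
        # Send the closing frame at most once; no data may follow it.
        if not self.server_terminated:
            self.outgoing += self.CLOSING_FRAME
            self.server_terminated = True

    def on_client_closing_frame(self) -> None:
        # Step 5 of the quoted text: record that the client has terminated.
        self.client_terminated = True
        # Assumption: reply with our own closing frame if none was sent yet.
        self.start_closing_handshake()

    def may_close_connection(self) -> bool:
        # Both directions have been terminated, so the socket can be closed.
        return self.client_terminated and self.server_terminated
```

Whichever side starts, exactly one closing frame goes out from the server, and the connection is closed only after both flags are set.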

> [[
> Once these steps have started, the server must not send any further data to
> the server. The 0xFF 0x00 bytes indicate the end of the server's data, and
> further bytes will be discarded by the client.
> ]]
> 
> s/to the server/to the client/

Fixed.

On Tue, 4 May 2010, Simon Pieters wrote:
>
> establish a WebSocket connection:
> [[
> 15. If the client has any cookies that would be relevant to a resource
> accessed over HTTP, if secure is false, or HTTPS, if it is true, on host host,
> port port, with resource name as the path (and possibly query parameters),
> then add to fields any HTTP headers that would be appropriate for that
> information. [HTTP] [COOKIES]
> ]]
> 
> Adding an HTTP header seems to allow HTTP syntax that is incompatible with
> WebSocket fields syntax. For instance, whitespace before the colon, horizontal
> tab instead of space after the colon, continuation lines, comments, escapes...

Fixed.

> Also, does it say to add a single entry to fields with all headers or one
> entry per header?

Fixed.

On Thu, 6 May 2010, Simon Pieters wrote:
> On Tue, 20 Apr 2010 16:00:36 +0200, Simon Pieters <simonp at opera.com> wrote:
> 
> > [[
> > WebSocket object with an open connection must not be garbage collected if
> > there are any event listeners registered for message events.
> > ]]
> > 
> > Shouldn't it also not be garbage collected if there are listeners for open,
> > error and close? What about when the connection is not yet established?
> 
> I think the policy should be:
> 
> if readyState is CONNECTING:
>  has 'open' event listener: don't collect
>  has 'message' event listener: don't collect
>  has 'error' event listener: don't collect
>  has 'close' event listener: don't collect
> 
> if readyState is OPEN:
>  has 'open' event listener: OK to collect
>  has 'message' event listener: don't collect
>  has 'error' event listener: don't collect
>  has 'close' event listener: don't collect
> 
> if readyState is CLOSING:
>  has 'open' event listener: OK to collect
>  has 'message' event listener: OK to collect
>  has 'error' event listener: OK to collect
>  has 'close' event listener: don't collect
> 
> if readyState is CLOSED:
>  has 'open' event listener: OK to collect
>  has 'message' event listener: OK to collect
>  has 'error' event listener: OK to collect
>  has 'close' event listener: OK to collect

Agreed.
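
The policy above can be written as a small lookup table. This is only a sketch: the numeric readyState values and the function name are illustrative assumptions, and it covers event listeners only, not other reasons an object might be kept alive.

```python
from typing import Iterable

# Illustrative readyState values; only their distinctness matters here.
CONNECTING, OPEN, CLOSING, CLOSED = range(4)

# For each readyState, the event types whose listeners keep the object alive.
PROTECTING_EVENTS = {
    CONNECTING: {"open", "message", "error", "close"},
    OPEN: {"message", "error", "close"},
    CLOSING: {"close"},
    CLOSED: set(),
}


def may_collect(ready_state: int, listener_types: Iterable[str]) -> bool:
    """Return True if the WebSocket object is OK to garbage collect.

    listener_types is the set of event types that currently have at least
    one registered listener.
    """
    return not (PROTECTING_EVENTS[ready_state] & set(listener_types))
```

For example, an OPEN socket with only an 'open' listener is collectable, while a CLOSING socket with a 'close' listener is not.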

On Fri, 7 May 2010, Simon Pieters wrote:
>
> establish a WebSocket connection
> 
> [[
> 28. Read bytes from the server until either the connection closes, or a 0x0A
> byte is read. Let field be these bytes, including the 0x0A byte.
> 
> If field is not at least seven bytes long, or if the last two bytes aren't
> 0x0D and 0x0A respectively, or if it does not contain at least two 0x20 bytes,
> then fail the WebSocket connection and abort these steps.
> 
> User agents may apply a timeout to this step, failing the WebSocket connection
> if the server does not send back data in a suitable time period.
> 
> 29. Let code be the substring of field that starts from the byte after the
> first 0x20 byte, and ends with the byte before the second 0x20 byte.
> ]]
> 
> This makes it possible for servers to include 0x0D bytes before and after the
> status code, and potentially trick broken clients that aren't so fuzzy with
> new lines to misinterpret the handshake. Maybe we should read ahead to the
> first 0x0D byte and check if the next byte is 0x0A instead.

I presume you mean it's possible for the server to send something back 
like 0x0D 0x20 "101" 0x20 0x0D 0x0A.

I've added a step saying that the UA must bail if there are any other 0x0D 
bytes in the string.
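
A minimal sketch of steps 28 and 29 with the extra check, assuming "any other 0x0Ds" means any 0x0D byte other than the one in the final 0x0D 0x0A pair. The function name is hypothetical, and timeouts and reading from the socket are left out.

```python
from typing import Optional


def extract_status_code(field: bytes) -> Optional[bytes]:
    """Return the status code bytes, or None if the connection must fail.

    field is everything read from the server up to and including the first
    0x0A byte (or up to the point where the connection closed).
    """
    if len(field) < 7:
        return None
    if field[-2:] != b"\r\n":
        return None
    if field.count(b" ") < 2:
        return None
    # Extra check: no 0x0D anywhere except in the terminating pair.
    if b"\r" in field[:-2]:
        return None
    first = field.index(b" ")
    second = field.index(b" ", first + 1)
    # Bytes after the first 0x20 and before the second 0x20.
    return field[first + 1:second]
```

With the extra check, a line such as 0x0D 0x20 "101" 0x20 0x0D 0x0A is rejected instead of yielding "101".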

On Fri, 7 May 2010, Simon Pieters wrote:
>
> establish a WebSocket connection
> 
> [[
> 41. ...
> 
> If the entry's name is "set-cookie" or "set-cookie2" or another cookie-related
> field name
> If the relevant specification is supported by the user agent, handle the
> cookie as defined by the appropriate specification, with the resource being
> the one with the host host, the port port, the path (and possibly query
> parameters) resource name, and the scheme http if secure is false and https if
> secure is true. [COOKIES]
> 
> If the relevant specification is not supported by the user agent, then the
> field must be ignored.
> ]]
> 
> At this point, the handshake can still fail. It seems bad to set cookies 
> if the handshake fails. We want to process set-cookie when the handshake 
> has succeeded (but before changing readyState and firing 'open').

Fixed.

On Thu, 3 Jun 2010, Simon Pieters wrote:
> 
> ...but still in that same task that changes readyState and fires 'open'.

Done.

On Wed, 12 May 2010, Simon Pieters wrote:
>
> establishing a WebSocket connection:
> 
> [[
> Note: There is no limit to the number of established WebSocket connections a
> user agent can have with a single remote host. Servers can refuse to connect
> users with an excessive number of connections, or disconnect resource-hogging
> users when suffering high load.
> ]]
> 
> Still, it seems likely that user agents will want to have limits on the number
> of established WebSocket connections, whether to a single remote host or
> multiple remote hosts, in a single tab or overall. The question is what should
> be done when the user agent-defined limit of established connections has been
> reached and a page tries to open another WebSocket.
> 
> I think just waiting for other WebSockets to close is not good. It just means
> that newly loaded pages don't work.
> 
> If there are any WebSockets in CLOSING state, then I think we should wait
> until they have closed. Otherwise, I think we should force close the oldest
> WebSocket.

This falls under the hardware limitations clause.

On Thu, 13 May 2010, Boris Zbarsky wrote:
> 
> If a situation doesn't happen often, then historically speaking most 
> authors will have no provisions to handle it.  Try browsing the web with 
> non-default colors set in your browser, with a default font size that's 
> not 16px, or with a 13px minimum font size set.  These aren't exactly 
> hard things to deal with, but authors just don't deal with them.  I 
> sincerely doubt they'd deal with the possibility of a websocket not 
> actually opening unless it was _very_ common.
> 
> Maybe the spec should say that attempts to open a websocket should have 
> a 50% chance of failing even if there's no good reason for it, just so 
> it is in fact common for opening to fail?  ;)  (No, that's not a 
> completely serious proposal, but it's not completely facetious either; 
> it would take something like that for authors to handle failure 
> properly.)

Indeed.

On Fri, 21 May 2010, Simon Pieters wrote:
>
> WebSocket Sending the server's opening handshake, step 2:
> 
> [[
> Establish the following information:
> host
> The host name or IP address of the WebSocket server, as it is to be addressed
> by clients. The host name must be punycode-encoded if necessary. If the server
> can respond to requests to multiple hosts (e.g. in a virtual hosting
> environment), then the value should be derived from the client's handshake,
> specifically from the "Host" field.
> ]]
> 
> This should say that the host is expected to be lowercase for comparison 
> purposes with the value of the Host field.

Fixed.

On Tue, 25 May 2010, Henry Sinnreich wrote:
>
> The Web Socket Protocol is far more general and useful than modestly
> described in the latest I-D version
> 
> http://www.whatwg.org/specs/web-socket-protocol/
> 
> The sections 1. Introduction and 1.1. Background could also mention some of
> these facts:
> 
> Many new network application protocols were standardized in the IETF starting
> in the mid '90s.
> The concept then was that new applications require new network protocols.
> Examples include: The SIP/SDP family of protocols, RTSP, XMPP, MEGACO, MSRP,
> etc. for communications and media control applications, not to mention other
> non-IETF network application protocols such as H.323 or IAX.
> 
> Web development 15 years later has shown, however, that HTTP can 
> support literally countless Internet applications: in fact, any 
> application that requires only a data communication channel between the 
> UA and the Web feature server. This certainly includes IM as in the 
> section 1.1. Background, but also web conferencing, blogs, wikis, social 
> networks, etc.
> 
> The data communications for SIP signaling over symmetric HTTP are 
> described in "SIP APIs for Communications on the Web" 
> http://www.ietf.org/id/draft-sinnreich-sip-web-apis-00.txt
> 
> What do you think?

I'd rather leave that kind of history to advocates and historians. The 
history section is just intended to provide the immediate background to 
the spec to give it some minor context.

On Tue, 22 Jun 2010, Simon Pieters wrote:
>
> WebSocket data framing
> 
> [[
> Otherwise, let error be true.
> ]]
> 
> It's not completely clear that this applies to all binary frames.

Clarified.

On Mon, 28 Jun 2010, Wellington Fernando de Macedo wrote:
>
> The WebSockets API spec states:
> 
> "A WebSocket object with an open connection must not be garbage 
> collected if there are any event listeners registered for message 
> events."
>
> Mozilla's implementation, however, also keeps the object alive if it 
> has any event listeners registered for open events. We are calling them 
> (the message and open events) 'strong' events. You can read the 
> discussion about that in comments #5, #6 and #9 of: 
> https://bugzilla.mozilla.org/show_bug.cgi?id=572975
> 
> Now, two more possibilities have been raised in the discussion (from comment #48):
> 
>  * When there are unsent outgoing messages;
>
>  * When at least one open or message event has been received, and there 
> are close event listeners (the close event could be flagged as 'strong' 
> in this case);
>
> We at Mozilla would like to know what you think about that, and whether 
> it makes sense or not.

I've added the case of unsent outgoing messages to the protection, in 
addition to the protection discussed earlier in this e-mail.

On Tue, 20 Jul 2010, Simon Pieters wrote:
>
> The WebSocket spec restricts the value of the subprotocol that the 
> client is allowed to send to the server (see step 3 in the WebSocket() 
> constructor algorithm). However there's no restriction on the server 
> side (see step 2 of Sending the server's opening handshake). The client 
> might not have asked for any subprotocol, but the server can still 
> respond with a subprotocol, and the connection will be accepted.
> 
> Shouldn't the server have the same restriction as the client?

The idea is that the client can try to connect to a server without knowing 
which subprotocol it supports, and find out from the server's response. 
This also allows servers to be updated to list a subprotocol in 
preparation for supporting multiple subprotocols, without having to 
explicitly hide the subprotocol declaration in the case when the UA didn't 
specify one.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'