[whatwg] WebSocket.bufferedAmount

Wed Apr 14 17:23:12 PDT 2010

The changes discussed below are summarised in this diff:
   http://html5.org/tools/web-apps-tracker?from=5048&to=5049

On Thu, 25 Mar 2010, Olli Pettay wrote:
>
> IMO scripts should be able to check whether the data they have posted is 
> actually sent over the network.

If you mean a local send, that requirement is handled by basically any 
solution whose value eventually drops to zero. If you mean sent all the 
way over the network to the other end, it can only be handled by a remote 
ack.

On Thu, 25 Mar 2010, Anne van Kesteren wrote:
> 
> I think what also matters here is how the protocol will evolve. Is it 
> the expectation that send(DOMString) can eventually send very different 
> things over the wire depending on how the server reacts to the initial 
> handshake request? How do the various options we have evaluate against 
> the potential scenarios coming out of that?

Yeah, if we add opt-in compression then the number of bytes sent including 
overhead could be smaller than the number of bytes in UTF-8 excluding 
overhead. It doesn't seem to matter much exactly which number we use, so 
long as it is consistent across UAs.

On Fri, 26 Mar 2010, Olli Pettay wrote:
> 
> And if bufferedAmount includes the overhead, it needs to be specified 
> what bufferedAmount is during handshake.

I've clarified that the handshake doesn't affect it.

On Fri, 26 Mar 2010, Boris Zbarsky wrote:
> On 3/25/10 5:50 PM, Ian Hickson wrote:
> > What would the use case be for the second one? As far as I'm aware 
> > there's only one use case here: making it possible to saturate the 
> > network but not over-saturate it (i.e. to send data at the exact rate 
> > that the network can take it, without getting behind by sending more 
> > data than the network can handle, and without sending so little that 
> > the network is ever idle).
> 
> In practice, with real networks whose speed varies second-to-second, 
> this is not really feasible.

You can do a reasonable job, but I agree that perfection is impossible. 
The more latency you're willing to put up with, the better a job you can 
do, up to a limit.

> And given the various levels of buffering in a typical network stack, 
> I'm not quite sure how you see this working from the JS app's point of 
> view.  Or is bufferedAmount reporting the amount of data that the server 
> has not yet acknowledged receipt of or something?  The draft I'm looking 
> at doesn't say anything like that, but maybe it's out of date?

This is only the amount the UA hasn't sent; network stack buffering is not 
reflected here. Effectively, if this number is not zero, most of the other 
buffers are going to be full already. It's probably wise to try to keep 
this number as low as possible if you want good latency.
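
As a rough sketch of what "keeping this number low" could look like from 
script: the helper name pumpQueue, the queue array and the highWaterMark 
threshold below are all made up for illustration and are not in the spec. 
The only things assumed of the socket are a numeric bufferedAmount and a 
send() method.

```javascript
// Hypothetical helper (not part of the WebSocket API): send queued chunks
// only while the UA-side buffer is at or below a chosen threshold.
// `socket` is any object with a numeric `bufferedAmount` and `send(string)`.
function pumpQueue(socket, queue, highWaterMark) {
  let sentCount = 0;
  while (queue.length > 0 && socket.bufferedAmount <= highWaterMark) {
    socket.send(queue.shift()); // send() grows bufferedAmount
    sentCount++;
  }
  return sentCount; // chunks handed to the socket during this call
}
```

A script would call this again later (from a timer, say) to hand over 
whatever is still in the queue once bufferedAmount has dropped.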

> That's not even worrying about issues like the network becoming "idle" 
> while you're waiting for your process's time slice or whatnot.

Indeed, nothing we can do here really helps with the case of the UA not 
being able to fill the network fast enough to saturate it.

> > I don't see a problem with defining this. I agree that if we include 
> > overhead that it should be defined, but just saying that it's "the 
> > number of bytes to be sent that have not yet been sent to the network" 
> > does define it, as far as I can tell.
> 
> I'm still not quite sure why the "number of bytes" would include the 
> websocket framing bytes but not the SSL bytes, the IP bytes, the 
> ethernet frame, the Tor stuff when that's in use, etc.  What makes them 
> special in terms of the protocol consumer needing to know about them (I 
> realize they're special in that we're defining the web socket protocol)?  
> This isn't a rhetorical question, to be clear; I genuinely don't see a 
> difference....

That's a fair point.

> > I think viewing the API spec and the protocol spec as separate is a 
> > mistake. They are one document:
> 
> Hold on.  There doesn't have to be a tight coupling between API and 
> protocol here, as far as I can see.  The API just deals with messages. 
> It seems pretty protocol-agnostic to me (and in particular, it seems to 
> me like the protocol can evolve without changing the API).
> 
> Is there a good reason to actually couple them?

They're one feature. Why would we not couple them? If we decouple them 
it's much more likely that they'll evolve in suboptimal ways.

> Given that, do we in fact need byte-exact values here at all?  For 
> example, could we make reporting it whichever way conforming as long as 
> it satisfies certain criteria (monotonicity, etc)?
> 
> This is one of those cases (and I don't say this often!) when I actually 
> happen to think that overspecifying (in either direction) precludes or 
> over-complicates some perfectly valid and reasonable implementation 
> strategies, and since in practice the values returned don't matter much 
> I'd prefer to not thus overspecify.

I'm a little concerned that if one browser returns numbers that are a 
factor of 2 different from another browser (e.g. because one returns the 
actual WebSocket bytes sent to the network and the other returns the byte 
length of the UTF-16 strings passed to send()), we'll see scripts that 
were written for one have somewhat crazy behaviour on the other one.
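
To make the factor of 2 concrete, here is a small sketch comparing the 
two ways of counting the same string. The function names are invented for 
this example; TextEncoder is the standard encoding API in current browsers 
and Node, and always encodes to UTF-8.

```javascript
// Bytes the string occupies once encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Bytes occupied by the string's UTF-16 code units (2 bytes each).
function utf16ByteLength(text) {
  return text.length * 2;
}

console.log(utf8ByteLength("hello"), utf16ByteLength("hello"));           // 5 10
console.log(utf8ByteLength("h\u00e9llo"), utf16ByteLength("h\u00e9llo")); // 6 10
```

For ASCII text the two counts differ by exactly a factor of 2, so a 
threshold tuned against one would be badly off against the other.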

On Mon, 29 Mar 2010, Jonas Sicking wrote:
> 
> My understanding of the use case for bufferedAmount is to allow an 
> application to send large amounts of data without incurring large 
> amounts of latency. For example that you could send streams of mouse 
> movement coordinates without risking a high delay between a movement and 
> the moment the information about that movement hits the wire.
>
> It seems to me that bufferedAmount does fulfill that use case, whereas 
> simply a boolean 'hasBufferedData' wouldn't as well.

Agreed.
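
One way a script might use bufferedAmount for that mouse-coordinate case 
is sketched below. This is only an illustration: makePositionSender, 
update() and flush() are hypothetical names, and the "x,y" message format 
is an assumption. The idea is to keep just the newest position and send it 
only when nothing is buffered, so stale positions never pile up.

```javascript
// Hypothetical sketch: remember only the latest pointer position and send
// it only when the UA reports an empty buffer.
// `socket` is any object with a numeric `bufferedAmount` and `send(string)`.
function makePositionSender(socket) {
  let latest = null; // newest unsent position, or null if none
  return {
    // Record a new position, overwriting any older unsent one.
    update(x, y) {
      latest = { x: x, y: y };
    },
    // Try to send; returns true if a message was handed to the socket.
    flush() {
      if (latest === null || socket.bufferedAmount > 0) return false;
      socket.send(latest.x + "," + latest.y);
      latest = null;
      return true;
    },
  };
}
```

A boolean "is anything buffered" would be enough for this particular 
sketch; the numeric value matters once the script wants to allow a small 
non-zero backlog rather than none at all.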

On Wed, 31 Mar 2010, Boris Zbarsky wrote:
> On 3/30/10 10:22 AM, Jonas Sicking wrote:
> > Making it implementation dependent is likely to lead to website 
> > incompatibilities. Such as:
> > 
> > ws = new WebSocket(...);
> > ws.onopen = function() {
> >    ws.send(someString);
> >    if (ws.bufferedAmount > X) {
> >      doStuff();
> 
> Can bufferedAmount not change due to data actually hitting the network 
> during the execution of this code?  As in, will all the someString data 
> be buffered immediately after that send() call?

In the interests of sanity, I've changed the spec to make bufferedAmount 
only get updated between tasks in the event loop.

On Wed, 31 Mar 2010, Boris Zbarsky wrote:
> 
> More to the point, is send() allowed to actually send anything when 
> called, or does it have to buffer it all until the next time you get to 
> the event loop?

The implementation can send any time after the call to send(), but the 
bufferedAmount attribute is required to keep reporting that data as unsent 
until the script returns to the event loop.

On Thu, 1 Apr 2010, Boris Zbarsky wrote:
> 
> Let's say bufferedAmount were to reflect the number of UTF-8-encoded 
> bytes to be sent, for the sake of argument.
> 
> I wait until bufferedAmount is 0, then call send("My text").
> 
> What are possible values of bufferedAmount if I examine it right after the
> send() call?  Is 0 a valid possible value?  What about 1?  2? 3? 4? 5? 6? 7?

As of now, only 7 is valid.
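
A toy model of that behaviour, in case it helps. This is not a real 
WebSocket and not how any UA is implemented; the class name ModelSocket 
and the methods networkWrite() and endOfTask() are invented here to stand 
in for "the UA wrote some bytes" and "the event loop moved to the next 
task". It assumes the UTF-8 counting from the question above.

```javascript
// Toy model only: separates what script sees from what the UA still holds.
class ModelSocket {
  constructor() {
    this.bufferedAmount = 0; // value visible to script
    this.unsentBytes = 0;    // bytes the UA has not yet written out
  }

  // Queue text; the script-visible count grows immediately.
  send(text) {
    const bytes = new TextEncoder().encode(text).length;
    this.bufferedAmount += bytes;
    this.unsentBytes += bytes;
  }

  // The UA may write to the network at any time, even mid-task,
  // without changing what script sees.
  networkWrite(bytes) {
    this.unsentBytes = Math.max(0, this.unsentBytes - bytes);
  }

  // Between tasks, the script-visible value catches up.
  endOfTask() {
    this.bufferedAmount = this.unsentBytes;
  }
}

const s = new ModelSocket();
s.send("My text");
s.networkWrite(3);             // three bytes leave during the same task
console.log(s.bufferedAmount); // 7
s.endOfTask();
console.log(s.bufferedAmount); // 4
```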

On Thu, 25 Mar 2010, Ivan Kozik wrote:
> 
> I like the idea of firing an event when bufferedAmount becomes zero. It 
> might be good to have a method to configure the "target buffer size" that 
> the application wants, in case it's more than zero.

I haven't added this yet, but I could see us adding this in a future 
version, if it turns out to be something authors want.
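
In the meantime, authors can approximate this from script by polling. 
The following is a sketch only: makeDrainWatcher, targetBytes and 
onDrained are hypothetical names, nothing like this is in the API, and 
how often to poll is left to the caller.

```javascript
// Hypothetical helper: approximates a "buffer drained" notification.
// Returns a poll() function; onDrained fires once each time bufferedAmount
// goes from above targetBytes to at or below it.
// `socket` is any object with a numeric `bufferedAmount`.
function makeDrainWatcher(socket, targetBytes, onDrained) {
  let wasAbove = socket.bufferedAmount > targetBytes;
  return function poll() {
    const isAbove = socket.bufferedAmount > targetBytes;
    if (wasAbove && !isAbove) {
      onDrained(socket.bufferedAmount);
    }
    wasAbove = isAbove;
  };
}
```

A script would call the returned poll() from a timer, for example 
setInterval(poll, 50), with the interval chosen to suit the application.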

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'