[whatwg] WebSocket bufferedAmount includes overhead or not

Wed Mar 24 15:08:43 PDT 2010

On 3/24/10 11:33 PM, Ian Hickson wrote:
> On Sun, 21 Feb 2010, Olli Pettay wrote:
>>
>> I propose that bufferedAmount doesn't take into account the bits added by
>> the protocol. This way if the protocol is later changed, web developers
>> don't need to change their code because of the way they rely on
>> bufferedAmount.
>
> On Thu, 4 Mar 2010, Fumitoshi Ukai (鵜飼文敏) wrote:
>>
>> I noticed that the WebSocket spec was updated to not include framing
>> overhead in bufferedAmount.
>> http://lists.whatwg.org/pipermail/commit-watchers-whatwg.org/2010/003971.html
>> I tried to implement it in WebKit, but found it hard to implement
>> correctly. https://bugs.webkit.org/show_bug.cgi?id=35571
>> It's easy after the WebSocket is closed (just add the length of the
>> message), but while it's open, we manage a buffer that includes frame
>> bytes, and the underlying socket writes an arbitrary length of that
>> buffer (possibly not ending on a frame boundary).
>> To get bufferedAmount correctly without framing overhead, we need to parse
>> the buffer again.  It's not a light operation and it's a challenge to make
>> it efficient.
>> I think including frame overhead is much easier.
>
> On Thu, 4 Mar 2010, Olli Pettay wrote:
>>
>> Not hard at all in gecko's implementation (the patch is still waiting
>> for a review and will be possibly updated to include the latest changes
>> to the protocol before pushing to hg repo).
>
> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>
>> I was going to mention this as the primary reason why frame bytes should
>> be included. JavaScript code needs this information for flow control,
>> and it's raw bytes that are sent over the tubes, not original message
>> strings.
>>
>> Also, I think it's a layering violation. In WebKit, we'd have to queue
>> unsent messages separately just to implement this quirk (see
>> https://bugs.webkit.org/attachment.cgi?id=50093 for a proof of concept).
>> It becomes very difficult to implement if we decide to add the size of data
>> that an underlying network library buffers internally - which I think
>> would be a reasonable thing to do.
>>
>>> Also why to have framing bytes and not the bytes related to http
>>> handling?
>>
>> Nothing would change for engines or JS code if HTTP headers were counted
>> in bufferedAmount. Since they are only sent when establishing a
>> connection, adding a small constant at the beginning will make no
>> difference to flow control. And the constant is going to be zero in
>> practice, because the data will immediately go where we can't see it.
>
> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>
>> My recollection is that the feature was added as a result of discussions
>> about implementing flow control. How else are you supposed to know that
>> you're streaming too fast without modifying the server? Since WebSockets
>> is a match for TCP/IP, and the latter provides ways to adaptively change
>> data rate, it's natural that one expects the same from WebSockets.
>
> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>
>> Yes, that's lots of work for something no one should care about, as you
>> implied above. And that's work that makes the results slightly misleading,
>> even if that's so slightly that it's not important in practice.
>>
>> Remembering frame offsets even after data has been serialized to a stream is
>> an unusual requirement for networking code.
>
> On Fri, 5 Mar 2010, Olli Pettay wrote:
>>
>> From an API perspective I do care. Web developers shouldn't need to know
>> about the protocol, yet they should be able to understand what
>> bufferedAmount means.
>
> On Fri, 5 Mar 2010, Alexey Proskuryakov wrote:
>>
>> An explanation like "it's how much data is buffered to be sent over
>> network" seems adequate to me.
>
> On Wed, 17 Mar 2010, Alexey Proskuryakov wrote:
>>
>> We have a suggested patch that implements the proposed new behavior for
>> WebKit now, but I think that it adds unnecessary complexity, and puts
>> limits on how we can refactor the code in the future. We need to
>> remember frame boundaries for much longer, making it difficult to
>> interface with general purpose networking code.
>>
>> I'd prefer sticking to the previously specified behavior.
>
> On Tue, 23 Mar 2010, Olli Pettay wrote:
>>
>> And I certainly prefer the current behavior, where the API is not so
>> tightly bound to the protocol, and where the bufferedAmount is handled
>> closer to what progress events do with XMLHttpRequest.
>
> On Tue, 23 Mar 2010, Anne van Kesteren wrote:
>>
>> We (Opera) would prefer this too. I.e. to not impose details of the
>> protocol on the API.
>
> If we're exposing nothing from the protocol, does that mean we shouldn't
> be exposing that the string converts to UTF-8 either?

Yeah, I've been thinking about that too.

>
> I guess I'm unclear on whether bufferedAmount should return:
>
> 1. the sum of the count of characters sent?
>     (what would we do when we add binary?)
I believe this is actually what we want.
If a web developer sends a string of length X,
bufferedAmount should report X.

And when we add binary, if a buffer of size Y is
sent, that Y is added to bufferedAmount.

The reason why I'd like it to work this way is that
IMO scripts should be able to check whether the data
they have posted is actually sent over the network.

-Olli
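
A rough sketch of the kind of script-side check described above, assuming
option 1 (bufferedAmount grows by the length of each string passed to
send()). The BufferedSender interface, the FakeSocket class, and the
MAX_BUFFERED limit are hypothetical names introduced only for this
illustration; none of them come from the spec or from any implementation.

```typescript
// Hypothetical minimal view of a socket: only the two members used here.
interface BufferedSender {
  readonly bufferedAmount: number;
  send(data: string): void;
}

// Assumed backlog limit, chosen only for illustration.
const MAX_BUFFERED = 16;

// Sends the message only if the reported backlog is below the limit.
// Returns false when the caller should wait and retry later.
function sendIfRoom(socket: BufferedSender, message: string): boolean {
  if (socket.bufferedAmount >= MAX_BUFFERED) {
    return false; // earlier data has not been sent yet
  }
  socket.send(message);
  return true;
}

// Stand-in for a real socket. It counts string length (option 1) and only
// "sends" data when drain() is called, so the backlog is easy to observe.
class FakeSocket implements BufferedSender {
  bufferedAmount = 0;

  send(data: string): void {
    this.bufferedAmount += data.length;
  }

  drain(units: number): void {
    this.bufferedAmount = Math.max(0, this.bufferedAmount - units);
  }
}
```

With this accounting a script can tell that everything it posted has left
the buffer by waiting for bufferedAmount to return to zero, without knowing
anything about framing.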

>
> 2. the sum of bytes after conversion to UTF-8?
>
> 3. the sum of bytes yet to be sent on the wire?
>
> I'm not sure how to pick a solution here. It sounds like WebKit people
> want 3, and Opera and Mozilla are asking for 2. Is that right? I guess
> I'll go with 2 unless more people have opinions.
>
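
To make the three candidates above concrete, here is a small sketch of what
each would add to bufferedAmount for a single text message. The function
names are made up for this illustration. The two-byte framing overhead is an
assumption based on the text framing in the protocol draft of the time (a
0x00 byte, the UTF-8 payload, then a 0xFF byte); a different framing would
change only option 3. Note that in JavaScript a string's length counts
UTF-16 code units, so "count of characters" is read that way here.

```typescript
// Option 1: length of the string as the script sees it (UTF-16 code units).
function characterCount(message: string): number {
  return message.length;
}

// Option 2: number of bytes after conversion to UTF-8.
function utf8ByteCount(message: string): number {
  let bytes = 0;
  for (let i = 0; i < message.length; i++) {
    const unit = message.charCodeAt(i);
    if (unit < 0x80) {
      bytes += 1;
    } else if (unit < 0x800) {
      bytes += 2;
    } else if (
      unit >= 0xd800 && unit <= 0xdbff &&
      i + 1 < message.length &&
      message.charCodeAt(i + 1) >= 0xdc00 &&
      message.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // surrogate pair: one code point outside the BMP
      i++;        // skip the low surrogate
    } else {
      bytes += 3; // other BMP character (a lone surrogate is counted as 3)
    }
  }
  return bytes;
}

// ASSUMPTION: one leading 0x00 byte and one trailing 0xFF byte per message.
const ASSUMED_FRAME_OVERHEAD = 2;

// Option 3: bytes that would go on the wire for this message.
function wireByteCount(message: string): number {
  return utf8ByteCount(message) + ASSUMED_FRAME_OVERHEAD;
}
```

For the five-character string "héllo" these give 5, 6 and 8 respectively.
Only option 3 changes if the framing changes, and only option 1 is
independent of the choice of UTF-8.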