[whatwg] Comments on <event-source> and addEventSource()

Mon Feb 26 19:21:30 PST 2007

I believe this replies to all <event-source> comments I received since 
2004. If you have any comments I haven't replied to please let me know.

On Thu, 26 May 2005, Joshua RANDALL FTRD/DIH/BOS wrote:
> 
> I did notice that, while the server-sent DOM events described in the 
> current draft address my first two concerns, the third one has not been 
> specifically handled, although I suppose it is left open by allowing 
> other non-HTTP protocols.

The third concern cited was:

> Operator networks might in the future have built-in eventing protocols 
> that can more efficiently dispatch data asynchronously to client devices 
> without the need for the overhead of maintaining many virtually unused 
> TCP connections

As you say, the intent is that the aforementioned operators will develop 
their own protocols and/or data formats to do this. Then, based on 
implementation experience, we can document those that are most successful.

> However, since ideally the same web application code would work on all 
> platforms and networks, it would be better if there was a way to 
> negotiate the low-level transport between the client and server rather 
> than have it hard-coded into the script.  For example, a web application 
> showing real-time stock prices wants to get an event stream that updates 
> the stock prices from a server, let's say stockserver.org.  For most 
> desktop clients, using the event-source URI 
> "http://stockserver.org/stockprice" would be fine.  However, a mobile 
> client using the same page would waste a (relatively) lot of bandwidth 
> just by keeping that HTTP connection alive, and it so happens that the 
> particular (and fictitious) mobile network has low-level support for SIP 
> events (could just as easily be XMPP or perhaps even SMS or WAP-Push). 
> Therefore, for that client it would be advantageous to use a URI such as 
> "sip:stockserver.org;subscribe?event=stockprice" instead of HTTP since 
> there would be significantly less network overhead that way.  However, 
> it is undesirable for the web application developer to have to provide a 
> separate version of the page for the mobile user on a SIP-capable 
> network, so it would be advantageous to have an option for the server 
> and client to negotiate the low-level transport.  Perhaps this could be 
> done using an extension to the proposed baseline HTTP-based 
> implementation?

Well, if the server is to be involved in the negotiation, the easiest 
option is for the author to just say:

   <event-source src="stocks"/>

...where "stocks" gets resolved, the server is contacted, and the server 
then does a normal HTTP redirect to the actual source, having done 
whatever negotiation it needs to pick that source.

> Ian, perhaps we could add to section 9.1 a header that can be sent by 
> the user agent along with the initial request to the event-source URI 
> that specified a list of event protocols that the user agent supports? 
> Perhaps something like "capabilities"?

The 'Accept' header can already do this. I've added mention of this to the 
spec.

> Then, the server, knowing what the client support was, would have the 
> option of returning a 3xx redirect to the other protocol URI instead of 
> opening the event stream?

Yup, that's already possible.
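
As a rough illustration only, the server side of such a negotiation might 
look something like the sketch below. The "application/x-sip-event" type and 
the function names are made up for the example (nothing in the spec defines 
them); the two URIs are the ones from the stock price example above.

```javascript
// Sources in server preference order. The SIP media type is a hypothetical
// placeholder; only application/x-dom-event-stream appears in this thread.
const SOURCES = [
  { type: "application/x-sip-event",
    location: "sip:stockserver.org;subscribe?event=stockprice" },
  { type: "application/x-dom-event-stream",
    location: "http://stockserver.org/stockprice" },
];

// Split an Accept header into bare media types. q-values and other
// parameters are dropped to keep the sketch short.
function acceptedTypes(acceptHeader) {
  return (acceptHeader || "")
    .split(",")
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .filter((type) => type.length > 0);
}

// Redirect to the first source the client says it accepts, falling back
// to the plain HTTP stream when nothing matches.
function negotiate(acceptHeader) {
  const accepted = acceptedTypes(acceptHeader);
  const match = SOURCES.find((source) => accepted.includes(source.type));
  const chosen = match || SOURCES[SOURCES.length - 1];
  return { status: 302, location: chosen.location };
}
```

A user agent that failed to connect over SIP would simply retry without the 
SIP type in its Accept header and get the HTTP URI back.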

> If for some reason the user agent was unable to establish the event 
> stream using the new protocol, it could re-contact the server but remove 
> the failed protocol from its list of capabilities.  This seems to me to 
> be the least obtrusive way of adding basic protocol negotiation to the 
> server-sent DOM events -- do you see any reason why it shouldn't be in 
> there?

My main reason would be that we already have content and protocol 
negotiation methods, and we should just re-use them. The current design 
allows that.

In practice I doubt this will be seen much, I have to be honest.

On Sat, 11 Jun 2005, S. Mike Dierken wrote:
>
> I recently saw your draft specification and the sections related to the 
> event-source element. This is really fantastic stuff and I wish I had 
> been paying attention much sooner!

Excellent!

(And sorry again for the delay in dealing with your comments!)

> I'd like to find out how I can be an active contributor in this area, 
> both in terms of discussions as well as in terms of development.

Well, you can take part in this discussion, and you can implement the 
features, both as JS shims for legacy user agents (like the Google <canvas> 
implementation or Dean Edwards' Web Forms 2 implementation) and in your 
own Web browsers or in existing open source Web browsers. You can also 
write test cases, write content to see how the features work in the real 
world, help authors trying to use the features, you name it. :-) We 
organise a lot of the work in channel #whatwg on Freenode.

> I've also included a brain dump of my thoughts - also posted to my blog: 
> http://korrespondence.blogspot.com/2005/06/whatwg-and-event-source-design.html
> I realize this could easily be interpreted as a johnny-come-lately ivory 
> tower post, but I hope that's not how it's viewed - I'm really very 
> interested in participating and collaborating in any way I am able to.

Everyone's input is definitely welcome! Sorry about taking so long to 
respond, I've been mostly focussing on feature-completeness and am now 
starting to ramp up the comment-replying.

> From the section "9.1.1. The event-source element" the approach is to 
> introduce a new element. I'd like to consider whether more than one 
> event-source element is allowed

It is. I'll make that clearer.

> and also consider whether simply introducing new attributes on existing 
> elements is easier and feasible.

It would be easy to specify, for sure. However, I've received very vocal 
feedback from implementors that they'd rather as little as possible was 
done at the global element level. That's why I specified this with a 
separate element.

> If a document is allowed to have more than one event-source element then 
> the client app will have to deal with either multiple connections or 
> combining multiple event-sources on one connection while delivering 
> events to the corresponding event-source element event handler.

We have not specified any way to multiplex the event sources; I'm not sure 
it really makes sense to require that (it would require all kinds of magic 
on the server side). In practice I would expect authors not to use 
multiple <event-source> elements, or to use them to get data from separate 
sources.

> If multiple connections are supported, then the client application could 
> saturate the capability of the client machine and could even be 
> considered a 'poorly behaved' client on the shared network.

Well, HTTP already requires that clients not open more than a certain 
number of connections per server. We can't override this, lest someone try 
to DOS a client and server by opening a bazillion simultaneous connections.

> I suggest using onMessage rather than onEvent

Agreed. I've changed the spec appropriately.

> to distinguish between network based messages and application based 
> events (e.g. connection-opened, connection-closed-by-client, 
> connection-closed-by-server, etc). I realize that some people feel web 
> developers would want these to look identical, but many years of 
> experience across the software industry has shown that they are simply 
> different beasts and making that explicit actually helps the developer. 
> The onEvent handler could be used for connection events on the 
> event-source outside the actual messages within that stream.

Well, there are other events for that. 'load', 'error', 'abort', etc, no 
need for an 'event' event. :-)

Currently, though, the spec doesn't actually give authors access to this.

> 2) format and definition of event streams
>
> This section indicates that client should re-open connections after a 
> small delay if they were closed in a successful situation. This section 
> should consider persistent connections (i.e. keep-alive) and distinguish 
> reply-level response codes from connection-level operations.

The section seems, to me at least, to clearly indicate that it operates at 
the HTTP level (or equivalent in other protocol stacks). Could you 
elaborate on what you mean?

> For example, rather than saying "HTTP 200 OK responses with the right 
> MIME type, however, should, when closed, be reopened after a small 
> delay.", it may be better to say something like "The retrieval of an 
> event-source that completes successfully (it has the correct 
> content-type and an HTTP 200 OK response status code) should be tried 
> again after a short delay."

I don't understand the difference. Could you explain? Thanks.

> In addition, the client should continue to obey the appropriate 
> cache-control response headers - this allows the server to dynamically 
> influence the interval that the client retrieves future events (beyond a 
> static value placed in other attributes on the event-source element). 
> This would be useful in the HTTP 204 No Content situation described in 
> this section as well.

Well, we definitely don't want to honour any headers that say to cache the 
data for 200 OK responses with the MIME type used for event stuff -- in 
those cases, a cached response would just cause the events to all replay 
instantly. I agree that everything else (including 204s) should honour 
caching, though, and have fixed the spec appropriately.
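
In other words, the rule amounts to something like the following sketch. 
The function name is invented for illustration, and the MIME type is the 
one used in the example further down this mail.

```javascript
const EVENT_STREAM_TYPE = "application/x-dom-event-stream";

// Returns true when a user agent may honour cache headers for a response.
// Only 200 OK responses carrying the event stream MIME type are excluded,
// since replaying a cached stream would fire all its events at once.
function mayHonourCaching(status, contentType) {
  const type = (contentType || "").split(";")[0].trim().toLowerCase();
  const isEventStream = status === 200 && type === EVENT_STREAM_TYPE;
  return !isEventStream;
}
```

So a 204 No Content response can still use its cache headers to tell the 
client when to come back.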

> There are three aspects of this event stream that I'm concerned about 
>  - the meaning of the event stream resource itself,
>  - the framing of individual messages within that event stream
>  - how to handle re-processing the event stream in different situations 
>    (errors, page refresh by the user, etc)
> 
> The proposal I have is to consider the src= attribute on the 
> event-source element as referencing a 'collection of messages'. As to 
> the framing of individual messages, when retrieving this collection of 
> messages via HTTP, I suggest using the multipart/mixed or 
> multipart/digest MIME type. Individual parts can then have their own 
> content-type and developers can decide which suits their needs - a 
> simple name/value pair approach like form data (not my favorite), 
> javascript object definitions like JSON, etc. See 
> http://www.w3.org/Protocols/rfc1341/7_2_Multipart.html for details.

The problem with this approach is two-fold.

First, it is far, far more complicated for authors to understand, and far 
easier to break. It also requires much more comprehensive error handling 
rules, since the syntax is more involved and thus there are more potential 
errors to handle. (The MIME RFCs don't define how to handle errors.)

Second, as it is intended that this work by triggering DOM events, the 
framing will automatically have to be based around a DOM event structure, 
with any payload encoded as part of that. So allowing explicit support for 
other types would not stop the top-level data from having to include DOM 
event information.

Authors who want to include JSON or XML can still do so using the current 
mechanism, though, they just need to include it in the packet, as in:

Data: <message from='juliet@example.com' to='romeo@example.net' xml:lang='en'>
Data:  <body>Art thou not Romeo, and a Montague?</body>
Data: </message>

...then the client-side merely looks like:

   <event-source src="messageProxy.cgi" onmessage="process(event.data)">

...where process() handles the XML data. You can do similar things with 
JSON or other text formats.

> I realize that one use-case or scenario is for mobile devices and
> message size is a concern. I think it may be possible to follow the
> pattern of multipart/mixed but create a terse syntax that follows
> the same capabilities of multipart/mixed with respect to compression
> (transfer-encoding, gzip, etc), formats (content-type) and
> localization (character-encoding).

Could you describe the use cases that you had in mind for which
different compression mechanisms and encodings would be appropriate?

The current described mechanism isn't intended to be compressed due to
the way that it must be decoded immediately -- the typical expected
case is that one line of text is sent, and immediately acted
upon. It's not something you'd want to compress; compression systems
usually introduce all kinds of delays due to buffering.

The current system also uses UTF-8, so the whole Unicode repertoire is
available.

> For each message, the WhatWG event-source definition introduces
> specific names used for controlling routing to event handlers but it
> seems trivially easy to define an approach that isn't specific to
> this new syntax. Specifically, the Event and Target names could be
> replaced.

What would be the benefit of this?

> My proposal is to define each message in the event streams to be
> similar to a request, except that no possibility of a response from
> the client exists. This provides for each message to have a URI that
> indicates its target and a method that indicates the event type
> (post/put/delete).
> 
> For example, rather than an HTTP response of:
> 
> 200 OK
> Content-type: application/x-dom-event-stream
> Content-length: NNNN
> \n
> Event: stock change\n
> data: YHOO\n
> data: -2\n
> data: 10\n
> 
> 
> Use a more generic event stream like
> 
> 200 OK
> Content-type: multipart/mixed; boundary=msg_boundary
> \n
> \n
> --msg_boundary
> POST /event-handler/stocks HTTP/1.0
> 
> Content-type: text/javascript-object
> Content-length: NNNN
> \n
> {symbol: "YHOO", delta: -2, value: 10}
> --msg_boundary
> PUT /event-handler/stocks/YHOO HTTP/1.0
> Content-type: text/javascript-object
> Content-length: NNNN
> \n
> {delta: -2, value: 10}
> --msg_boundary

That seems way, way more complicated. What's the benefit of this?

> The last thing I'll mention has to do with re-processing the event
> stream. Re-processing an HTML page from start to end generally
> works. However, re-processing an event stream from start to end is
> generally not going to work well.

Agreed; event streams shouldn't be re-used.

> The situations to consider are when the user manually refreshes the
> page and when the retrieval of the event stream completes
> successfully. When the user manually refreshes the page, the src=
> attribute will be the same as when the page was previously
> retrieved, but should the events be the same as previously
> retrieved, or should they be messages from that moment forward?

That probably depends on the exact semantics of the Web app. If the
app is a ticker, then no, you just want the new news. If
it's... actually I can't come up with a time where you'd want to see
old events. I can see cases (e.g. games) where you'd want all events
since the page was generated, to make sure the game state since then
can be represented -- but that's easy enough: two techniques would be
first to include a generation number in the URI, and second to make
the first event sent back be a complete update of the current state.
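
Both techniques are cheap to do from script. A minimal sketch, in which the 
"generation" query parameter, the JSON message shapes and all the names are 
invented for the example:

```javascript
// Technique 1: put the page's generation number in the stream URI so the
// server knows which events the page has not seen yet.
function streamUri(base, generation) {
  return base + "?generation=" + encodeURIComponent(generation);
}

// Technique 2: treat the first event as a complete snapshot of the current
// state, and every later event as an incremental change to it.
function createTicker() {
  let prices = null; // stays null until the snapshot arrives
  return {
    handle(data) {
      const msg = JSON.parse(data);
      if (msg.kind === "snapshot") {
        prices = { ...msg.prices };
      } else if (msg.kind === "delta" && prices !== null) {
        prices[msg.symbol] = msg.value;
      }
      // Deltas that arrive before any snapshot are ignored in this sketch.
    },
    get prices() {
      return prices;
    },
  };
}
```

The handle() method here would be what the onmessage handler calls with 
event.data.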

> And when the retrieval of messages succeeds and the client waits
> a few moments before retrieving the next set from the very same URI,
> should that next set be the same messages again?

Clearly no. :-) I'll update the spec to make this clearer.

> This is a tricky area and I'm sure there are several ways to
> approach this. I'll write more later...

Do let me know if there's anything else you have to propose.

On Tue, 25 Oct 2005, Mike Dierken wrote:
>
> I've put together a quick prototype to experiment with the event-source
> design implications based on the WHAT-WG Web App spec here:
> http://www.searchalert.net/dierken/eventsource/

Nice!

I think I replied to most of the points on that page in the text above, 
but one point I didn't address was how to do continuations. I believe that 
can be done relatively easily by sending a message at the end of each 
stream giving the next URI to use, and having the author change the 'src' 
attribute on the fly -- no need for any magic or UA support.
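
For instance, assuming a made-up convention where the last message of each 
stream is the word "next" followed by a space and a URI, the author's 
handler could be as small as this sketch. The `source` argument stands in 
for the <event-source> element; only its src property is touched.

```javascript
const NEXT_PREFIX = "next "; // hypothetical marker, not defined by the spec

// Returns true if the message was a continuation pointer (and updates
// source.src to follow it), false if it was an ordinary message.
function followContinuation(source, data) {
  if (!data.startsWith(NEXT_PREFIX)) return false;
  source.src = data.slice(NEXT_PREFIX.length);
  return true;
}
```

Ordinary messages fall through to whatever processing the page normally 
does.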

On Wed, 23 Aug 2006, Anne van Kesteren wrote:
>
> * The event-source element has no DOM interface while it has some 
> attributes that probably warrant one.

Fixed.

> * The specification doesn't define when the onevent event handler is 
> invoked nor when the event event is dispatched. They are only defined. 
> It's also unclear which interface they implement, et cetera.

Fixed.

> * A problem with the event-source element is that the resource is loaded 
> before you can attach event listeners to the document. Perhaps the 
> loading should start after the load event is dispatched? Unless the 
> element was inserted into the document of course (that's actually also a 
> bit unclear).

The "onmessage" (previously "onevent") attribute is intended to address
this. Does it not?

> * Since event sources can be attached using other ways than using the 
> event-source element the Target field should be amended to take that 
> into account. (Some sentences there don't make sense for an event source 
> attached to an object that is not an event-source element.)

Fixed.

> * Regarding that, I'd be interested in hearing the use case for allowing 
> any EventTarget to be a source for server-sent events.

It was requested by your colleagues prior to your employment.

> * Event namespaces throughout should be changed to match DOM Level 3 
> Events. That basically means that http://www.w3.org/2001/xml-events is 
> gone.

Fixed.

> * Perhaps RemoteEvent should be replaced with a reference to CustomEvent
>  from DOM Level 3 Events which offers the same type of functionality?

I don't understand how CustomEvent is relevant here.

> * What happens when the event given in the Event field doesn't match the 
> NCName production as required by DOM Level 3 Events such as in the 
> example in section 7.1.7? (It uses the event "stock change".)

Presumably an exception is raised (and subsequently dropped) when the event 
is dispatched. I'll try to clarify that.

> * At the moment the BNF does need error handling because you could have 
> a ";" at the start of a line without any data following (or a new line 
> for that matter).

A semicolon followed by a newline is defined.

A semicolon followed by nothing is an incomplete line. I'll clarify how 
that should be handled.

> * It might be better to replace the BNF with something similar as the 
> HTML parsing specification currently has. That provides a much more 
> clear processing model.

Really? You find an explicit prose state machine easier to read than BNF? 
Wow. I really don't. :-) I can't see what a state machine describes at 
all. At least with BNF you can see at a glance what the syntax is.

> * What happens for other line feed characters? Are they treated as 
> fields? Won't that give lots of problems for authors coding in non-Unix 
> formats? HTTP for example allows both.

What other line feed characters? There's only one U+000A LINE FEED 
character.

> * "For each non-blank, non-comment line, the field name is first 
> taken[...]" doesn't cover what happens to command lines.

Well, it does. Nothing happens to command lines. I am wary of always 
saying "for xyz, nothing happens" because there's an infinite number of 
cases where nothing happens. It's all the cases where the spec doesn't say 
that something _does_ happen. :-)

> * "The ctrlKey field would be ignored[...]" should probably say 
> "keyIdentifier" as that's what's used in the example.

Fixed.

On Tue, 24 Oct 2006, Rohan Prabhu wrote:
>
> I am writing this mail because I've recently studied your Web 
> Applications 1.0 specifications, and have found a rather strange point 
> in it. As the 'event-source' element is embedded in the Markup itself, 
> it makes little sense. It is for the simple reason that, assuming there 
> be more than one event-source handlers, then one has to be defined first 
> and another one, next. In that case, there is no significance of the 
> order in which the 'event-source' elements appear.

This is true, in the same way that there is no significance in the order 
that the <meta> elements are listed.

> In HTML, or any markup for that purpose, the order of elements has a 
> special significance. As for example, if there are two <p> tags, then the 
> one coming first is rendered first in the inline display and the one 
> coming next is displayed next (the relative positioning can be changed 
> using CSS is a different matter, however).
> 
> HTML as such is a static language (without a scripting backend). Hence, 
> dynamic elements within the flow of an HTML element, isn't proper 
> semantics.

Indeed, the <event-source> element in this respect is like the <script> 
element or the onclick="" attribute -- without scripting, it is 
effectively meaningless.

> To keep the semantics in place, I'd recommend that the event-source 
> be rather specified as a JavaScript/DHTML object, just as the 
> XMLHttpRequest object is. A major part of event-source seems to fulfill 
> the shortcomings of XMLHttpRequest object.

This is actually already possible using addEventSource().

> There is no way to check [for support of <event-source>] using HTML 
> methods, and if we were to specify one, we still lose support of all the 
> browsers that exist as of now. At the same time, older browsers may not 
> recognize the event-source element and display something that would 
> either obstruct the flow of the document or it may abruptly quit.

You can test for the support of <event-source> by checking for whether the 
HTMLEventSourceElement interface is implemented on that element, or by 
having the first event sent back from the <event-source> disable the 
fallback code (this is the preferred solution since it then doesn't fail 
when the <event-source> object is supported, but events don't work, e.g. 
because the network itself blocks such requests).
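
The second approach could be sketched roughly as follows. The function and 
callback names are placeholders; startFallback and stopFallback would be 
whatever the page uses for legacy browsers, such as polling.

```javascript
// Starts the fallback immediately and returns a function to call from the
// message handler. The first call turns the fallback off; later calls do
// nothing.
function createFallbackGuard(startFallback, stopFallback) {
  let eventsWork = false;
  startFallback();
  return function onEventReceived() {
    if (!eventsWork) {
      eventsWork = true;
      stopFallback();
    }
  };
}
```

If no event ever arrives, for whatever reason, the fallback just keeps 
running.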

Legacy browsers ignore <event-source> elements.

> Also, if it is completely scripting based, it can even be validated 
> using current W3 validation services. (Since WHATWG currently doesn't 
> have a validation service, more developers will be encouraged if they 
> can use existing DTDs while at the same time using new technologies)

Henri is working on a conformance checker for the WHATWG spec.

I do not believe we should design the language around limitations in the 
existing conformance checkers (namely that it doesn't check what DOM APIs 
are being used).

I agree that we should have DOM support for this, that's why we have an 
addEventSource() method. The element itself is useful because it's easier 
to write code that uses it than to do everything from script. It's also 
possible, with this element, to keep event sources related to code that's 
doing stuff.

On 24 Oct 2006, then again on 6 Dec 2006, Alexey Feldgendler wrote:
> 
> Why do we need an <event-source> element in the markup? It only makes 
> sense in conjunction with scripting, so I think it would be better to 
> drop this element and have the event source objects only created by 
> scripts. Similar practices are already established for objects like 
> XMLHttpRequest, XSLTransform (to name a few).

Primarily for ease of scripting. This:

   <script>
     function process(msg) {
       // ...
     }
   </script>
   <event-source src="stocks.cgi" onmessage="process(event.data)">

...is easier than:

   <script>
     function process(msg) {
       // ...
     }
     document.addEventListener('message',
                               function (event) { process(event.data) },
                               false);
     document.addEventSource('stocks.cgi');
   </script>

However, looking forward, especially in conjunction with XBL2, one could 
imagine systems built around responding to events, such that we could end 
up having things like:

   <my:stockTicker>
     <html:event-source src="stocks.cgi"/>
   </my:stockTicker>

...where the stockTicker element is implemented by XBL, and expects a 
stream of messages to be targeted at, or to bubble past, the bound 
element; the event-source element can then provide these events without 
any scripting on the part of the author.

There were also some comments about the event-source format itself related 
to what happens with feeds with no trailing newline. I fixed the format 
description.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'