[whatwg] register*Handler and Web Intents

Mon Apr 2 16:23:40 PDT 2012

On Tue, 6 Dec 2011, James Hawkins wrote:
> 
> One of the critical pieces of the API is a declarative registration 
> which allows sites to declare which intents they may be registered for.  
> The current draft of the API calls for a new HTML tag, <intent>, the 
> attributes of which describe the service registration [...]

Rather than an element just for Web Intents, I think we'd be better off 
with an element that can be used for all the registration mechanisms: 
registerProtocolHandler(), registerContentHandler(), and Web Intents.

Similarly, rather than Web Intents only being declarative, we should 
follow the pattern set by registerProtocolHandler() and 
registerContentHandler() and also support an API. This would make these 
three mechanisms consistent with each other such that they can be 
considered a single feature, not three features.

Looking at the three features, it seems they break down as follows:

   a handler registered using registerContentHandler() triggers when a URL 
   with a particular type is opened, and results in the URL being passed 
   to another URL that is opened.

   a handler registered using registerProtocolHandler() triggers when a 
   URL with a particular scheme is opened, and results in the URL being 
   passed to another URL that is opened.

   a handler registered using Web Intents triggers when a method is 
   invoked on another page, and results in a URL being opened and its 
   JavaScript context being given the information passed to the method.

In the first two, the behaviour can be implemented server-side or 
client-side; in the last case the behaviour must be done in JS.

We can pretend that in the case of the first two, it's equivalent to an 
"open" action with the URL as the data. This would mean that the intent 
data, in the case of a URL open, would just be the URL string. We can 
similarly pretend that in the case of Web Intents, it's fetching the page 
like in the %s case, but since there's no %s, it doesn't put the data in 
the URL. Or alternatively, we can say that because the data is a 
structured clone rather than a URL, you replace the %s, if any, with the 
empty string.

Thus we reach a point where we can describe all three as a common set of 
registration features:

   1. select type of handler:
       - url
       - structured clone
   2. select action
       - "open"
       - "share"
       - "edit"
       - etc...
   3. optionally select one further filter
       - type
       - scheme (only allowed for url handlers)
   4. select URL to use as handler
   5. select user-facing title
   6. select disposition (replace page, new page, or overlay)

When a URL is opened that matches a registered handler for the "open" 
action and the appropriate scheme or type, imply an intent for that URL 
handler, with any returned result being discarded.
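
As a rough sketch of that matching step (not part of the proposal; the 
UrlHandler shape, the function name, and the example.com URLs are all 
made up for illustration):

```typescript
// Hypothetical shape for a URL-payload handler; field names are assumptions.
interface UrlHandler {
  action: string;
  scheme?: string; // set for protocol-style handlers
  type?: string;   // set for content-style handlers
  url: string;
}

// Return the handlers for which an ordinary URL open would imply an intent.
// contentType is the MIME type of the opened resource, if known.
function matchingOpenHandlers(
  handlers: UrlHandler[],
  openedUrl: string,
  contentType?: string
): UrlHandler[] {
  // URL.protocol includes the trailing colon, e.g. "mailto:".
  const scheme = new URL(openedUrl).protocol.slice(0, -1);
  return handlers.filter(
    (h) =>
      h.action === "open" &&
      ((h.scheme !== undefined && h.scheme === scheme) ||
        (h.type !== undefined && h.type === contentType))
  );
}

const handlers: UrlHandler[] = [
  { action: "open", scheme: "mailto", url: "https://mail.example.com/?to=%s" },
  { action: "open", type: "application/pdf", url: "https://pdf.example.com/?u=%s" },
  { action: "edit", type: "image/png", url: "https://paint.example.com/" },
];

const forMailto = matchingOpenHandlers(handlers, "mailto:a@example.com");
const forPdf = matchingOpenHandlers(
  handlers,
  "https://example.org/doc.pdf",
  "application/pdf"
);
```

Only "open" handlers are considered here; the "edit" handler never 
matches a plain URL open, whatever the type.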

The handling can then be a single mechanism for all of the above:

   1. If it's a URL handler, replace %s with the URL.
   2. Set up the browsing context per the disposition.
   3. Open the URL.
   4. Set up the Window.intent API.
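
The four steps above could be sketched roughly as follows. This is only 
an illustration: Handler, Launch, and planLaunch are hypothetical names, 
the percent-encoding of the substituted URL is an assumption, and 
dropping %s for cloned data follows the alternative mentioned earlier.

```typescript
type Disposition = "replace" | "new" | "overlay";

// Hypothetical handler record; only the fields this sketch needs.
interface Handler {
  payload: "url" | "clone";
  url: string; // may contain %s for URL handlers
  disposition: Disposition;
}

// What the UA would act on: which URL to open, how, and with what data.
interface Launch {
  url: string;              // step 3: the URL to open
  disposition: Disposition; // step 2: how to set up the browsing context
  intentData: unknown;      // step 4: what window.intent would carry
}

function planLaunch(handler: Handler, data: unknown): Launch {
  let url = handler.url;
  if (handler.payload === "url") {
    // Step 1: substitute the (assumed percent-encoded) URL for %s.
    url = url.replace("%s", encodeURIComponent(String(data)));
  } else {
    // Cloned data is never placed in the URL; any %s becomes empty.
    url = url.replace("%s", "");
  }
  return { url, disposition: handler.disposition, intentData: data };
}

const mailLaunch = planLaunch(
  {
    payload: "url",
    url: "https://mail.example.com/compose?to=%s",
    disposition: "replace",
  },
  "mailto:a@example.com"
);

const editLaunch = planLaunch(
  { payload: "clone", url: "https://edit.example.com/edit", disposition: "overlay" },
  { width: 10, height: 20 }
);
```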

So, what information would we need for registration?

   payload type: a url, or an object to clone
   action: a string
   filter: either a MIME type, or a scheme
   url: the url to call
   title: the user-visible name of the handler
   disposition: how to show the handler (replace, new tab, popup overlay)
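
For illustration, those six fields could be modelled as one record. The 
type and field names below are assumptions made for this sketch, not 
proposed API names, and mail.example.com is a placeholder.

```typescript
// Hypothetical model of one handler registration, following the six
// fields listed above. None of these names come from a spec.
type Payload = "url" | "clone";
type Disposition = "replace" | "new" | "overlay";

// The filter is either a MIME type or a scheme, never both.
type Filter =
  | { kind: "type"; value: string }
  | { kind: "scheme"; value: string };

interface IntentRegistration {
  payload: Payload;
  action: string;
  filter?: Filter;
  url: string;
  title: string;
  disposition: Disposition;
}

// Roughly what a registerProtocolHandler() call for mailto: would look
// like when expressed as such a record.
const mailto: IntentRegistration = {
  payload: "url",
  action: "open",
  filter: { kind: "scheme", value: "mailto" },
  url: "https://mail.example.com/compose?to=%s",
  title: "Example Mail",
  disposition: "replace",
};
```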

My suggestion then would be to add an element similar to what you suggest, 
as well as an API similar to the existing one.

The element could be something like:

   <intent
     action="edit"     intent action, e.g. open or edit, default "share"
     type="image/png"  MIME type filter [1], default "*/*"
     scheme="mailto"   Scheme filter [1] [2], default omitted
     href=""           Handler URL [2], default ""
     title="Foo"       Handler user-visible name, required attribute
     disposition=""    "replace", "new", or "overlay", default "overlay"
   ></intent>

[1] Only one of type="" and scheme="" is allowed.
[2] scheme="" is only allowed if href="" contains %s.
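
A conformance check for those attribute rules might look roughly like 
this. It is a sketch only: validateIntent and the error strings are 
invented here, and an absent attribute is represented as undefined.

```typescript
// Attribute values as a page author wrote them; undefined means the
// attribute is absent. Names mirror the <intent> sketch above.
interface IntentAttributes {
  action?: string;
  type?: string;
  scheme?: string;
  href?: string;
  title?: string;
  disposition?: string;
}

const DISPOSITIONS = ["replace", "new", "overlay"];

// Returns a list of problems; an empty list means the element is fine.
function validateIntent(attrs: IntentAttributes): string[] {
  const errors: string[] = [];
  const href = attrs.href ?? "";                      // default ""
  const disposition = attrs.disposition ?? "overlay"; // default "overlay"

  if (attrs.title === undefined) {
    errors.push("title is required");
  }
  // [1] Only one of type and scheme is allowed.
  if (attrs.type !== undefined && attrs.scheme !== undefined) {
    errors.push("only one of type and scheme is allowed");
  }
  // [2] scheme is only allowed if href contains %s.
  if (attrs.scheme !== undefined && href.indexOf("%s") === -1) {
    errors.push("scheme requires %s in href");
  }
  if (DISPOSITIONS.indexOf(disposition) === -1) {
    errors.push("unknown disposition");
  }
  return errors;
}

const okErrors = validateIntent({ action: "edit", type: "image/png", title: "Foo" });
const badErrors = validateIntent({ scheme: "mailto", href: "/compose", title: "Mail" });
```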

The API could be something like:

  void registerIntentHandler(DOMString action, DOMString type, DOMString url, DOMString title, DOMString disposition);
  DOMString isIntentHandlerRegistered(DOMString action, DOMString type, DOMString url);
  void unregisterIntentHandler(DOMString action, DOMString type, DOMString url);

The disposition of registerContentHandler() and registerProtocolHandler() 
would always be "replace". The /url/ argument of registerIntentHandler() 
would not be allowed to contain %s.
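
To make the shape of the three methods concrete, here is a toy in-memory 
model. It is an assumption-laden sketch, not UA behaviour: the class 
name and the askUser callback are hypothetical, and the "new" / 
"registered" / "declined" return strings are assumed to mirror 
isProtocolHandlerRegistered().

```typescript
type Disposition = "replace" | "new" | "overlay";
// Assumed to mirror the strings of isProtocolHandlerRegistered().
type RegistrationState = "new" | "registered" | "declined";

interface Entry {
  action: string;
  type: string;
  url: string;
  title: string;
  disposition: Disposition;
  state: RegistrationState;
}

class IntentHandlerRegistry {
  private entries: Entry[] = [];
  // Stands in for the UA's permission prompt (hypothetical).
  private askUser: (title: string) => boolean;

  constructor(askUser: (title: string) => boolean) {
    this.askUser = askUser;
  }

  private lookup(action: string, type: string, url: string): Entry | undefined {
    return this.entries.filter(
      (e) => e.action === action && e.type === type && e.url === url
    )[0];
  }

  registerIntentHandler(
    action: string, type: string, url: string,
    title: string, disposition: Disposition
  ): void {
    // The user has already answered for this handler; do not ask again.
    if (this.lookup(action, type, url) !== undefined) return;
    const state: RegistrationState = this.askUser(title) ? "registered" : "declined";
    this.entries.push({ action, type, url, title, disposition, state });
  }

  isIntentHandlerRegistered(action: string, type: string, url: string): RegistrationState {
    const entry = this.lookup(action, type, url);
    return entry === undefined ? "new" : entry.state;
  }

  unregisterIntentHandler(action: string, type: string, url: string): void {
    const entry = this.lookup(action, type, url);
    this.entries = this.entries.filter((e) => e !== entry);
  }
}
```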

A handler, once registered, would remain so until it was explicitly 
removed with unregisterIntentHandler() or removed by the user, as now for 
the other handler APIs; or, for registrations done with the declarative 
form, would remain until the user returns to the same page and the page 
returns a 200, 404, or 410 response (at which point it would be 
unregistered until such time as the <intent> element is seen again, which 
could happen that very same page load).
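
The lifetime rule for declarative registrations could be sketched like 
this. The record shape and the reconcile() helper are hypothetical, and 
the sketch skips the user-consent step entirely.

```typescript
// One declarative registration, remembered against the page that
// declared it. Field names are assumptions for this sketch.
interface DeclarativeRegistration {
  page: string;       // URL of the page whose markup declared the handler
  handlerUrl: string; // the handler URL from the <intent> element
}

// Responses that say something definite about the page's current state.
const RESET_STATUSES = [200, 404, 410];

// Called when the user returns to `page`. `seenHandlerUrls` lists the
// <intent> elements encountered during this load.
function reconcile(
  current: DeclarativeRegistration[],
  page: string,
  status: number,
  seenHandlerUrls: string[]
): DeclarativeRegistration[] {
  const reset = RESET_STATUSES.indexOf(status) !== -1;
  // On a 200, 404, or 410, forget what this page declared earlier.
  const result = reset ? current.filter((r) => r.page !== page) : current.slice();
  // Anything seen during this load is (re-)registered.
  for (const handlerUrl of seenHandlerUrls) {
    const known = result.some((r) => r.page === page && r.handlerUrl === handlerUrl);
    if (!known) result.push({ page, handlerUrl });
  }
  return result;
}

const before: DeclarativeRegistration[] = [
  { page: "https://a.example/", handlerUrl: "https://a.example/edit" },
];
const afterGone = reconcile(before, "https://a.example/", 410, []);
const afterError = reconcile(before, "https://a.example/", 500, []);
const afterReload = reconcile(before, "https://a.example/", 200, ["https://a.example/edit"]);
```

A 500 leaves the earlier registration alone, since the response says 
nothing about whether the page still declares the handler.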

Removing the element from the DOM would have no effect; adding it to the 
DOM (e.g. by the parser or script) is what would trigger a declarative 
registration.

I'm not hugely happy with the idea of introducing yet another 
empty-element-with-an-end-tag to HTML, especially here where I really 
don't see much of a need for fallback. Adding it to the <head> would make 
sense but would be very expensive. I don't really see another element we 
could reuse sanely, though (overloading <meta> and <link> doesn't really 
work here as they don't have the same semantics really).

From a purely spec-editorial perspective, it seems to make more sense to 
have all of this in one spec, rather than split across multiple specs. If 
you would like, I'd be willing to spec this all in the HTML spec (which 
would especially make sense if we do add another element); alternatively, 
we should really consider moving the existing register*Handler() stuff to 
the same spec as the intent stuff.

On Tue, 6 Dec 2011, Anne van Kesteren wrote:
> 
> You could also have
> 
> <meta name="intent" content="http://webintents.org/share image/*">
> 
> or some such. Splitting a string on spaces and using the result is not 
> that hard and a common pattern. And seems like a much better alternative 
> than changing the HTML parser.

Trying to fit the registration components listed above into <meta> really 
doesn't work all that well, IMHO.

On Tue, 6 Dec 2011, James Graham wrote:
> On Tue, 6 Dec 2011, Anne van Kesteren wrote:
> > 
> > Especially changing the way <head> is parsed is hairy. Every new 
> > element we introduce there will cause a <body> to be implied before it 
> > in down-level clients. That's very problematic.
> 
> Yes, I consider adding new elements to <head> to be very very bad for 
> this reason. Breaking DOM consistency between supporting and 
> non-supporting browsers can cause adding an intent to cause unrelated 
> breakage (e.g. by changing document.body.firstChild).

That is true, but adding this to the body seems weird too.

On Tue, 6 Dec 2011, James Hawkins wrote:
> 
> Originally we envisioned using a self-closing tag placed in head for the 
> intent tag; however, we're now leaning towards not using self-closing 
> and having the tag be placed in the body with fallback content, e.g., to 
> install an extension to provide similar functionality.
> 
> <intent action="webintents.org/share">
>   Click here to install our extension that implements sharing!
> </intent>
> 
> What are your thoughts on this route?

How common will fallback be in the short term and in the long term?

We have this mechanism for, e.g., <iframe>, and at the moment it's mostly 
just an ugly wart in the language.

On Fri, 16 Dec 2011, Julian Reschke wrote:
> Anne wrote:
> > 
> > We can just add additional attributes to <meta> you know. We have done 
> > the same for <link>. E.g. for <link rel=icon> you can specify a sizes 
> > attribute.
> 
> That makes it sound a lot easier than it is. After all, there's no 
> extension point here. Adding attributes to <meta> (or <link>) requires a 
> change to HTML5, or a delta spec adding these as conforming attributes.

Adding new attributes is reasonably trivial in practice. My concern with 
reusing <meta> or <link> is more that it doesn't really fit the semantics 
of the existing attributes or processing model.

On Mon, 12 Dec 2011, James Hawkins wrote:
> 
> For R*H, ?foo=%s normally requires server side processing.  With Web 
> Intents, this data is passed completely client-side on the intent 
> object.

Both have strong use cases. I think we should support both. In the case of 
data being cloned, it doesn't make much sense to upload it, so naturally 
that would just be provided client-side, as described above. For URLs, 
though, the opposite is the case -- you will usually want to fetch the URL 
somehow, which is almost always going to require work on the server side 
since the client typically won't have access rights to obtain the data 
(for content handlers) or open the connection (for protocol handlers).

> Wildcard matching.  R*H does not allow wildcard matching, where as Web 
> Intents would allow a service to register for image/* in one succinct 
> registration.

I don't think wildcard matching really makes sense. In particular, I'm not 
aware of any service that can honestly say it supports image/*, or indeed 
any other topleveltype/*.

On Tue, 13 Dec 2011, Simon Pieters wrote:
> 
> I'm not sure it's a good idea security-wise to have this feature as an 
> element. Many sites use black-list based HTML filtering of user input, 
> to filter out "bad" stuff like <script> elements. It's easy to argue 
> that they are already screwed, but we still have to think about it when 
> adding new features to the platform, because there are many such sites. 
> It would be easy to inject an <intent> tag to such sites, whereas it is 
> harder to call navigator.register*Handler.

On Wed, 14 Dec 2011, Greg Billock wrote:
> 
> Even with malicious content chosen by the attacker, the only impact on 
> the target page is injecting a "window.intent" object with some opaque 
> (but malicious!) content. Getting the page to execute that malicious 
> content is the big hurdle. Either you can inject code into the page 
> causing this execution, in which case, why bother, or the page is using 
> window.intent unsafely. This is a concern, but in that case, the exploit 
> is more easily accomplished directly, rather than a circuitous route 
> through an injected <intent> tag.

There's also the following somewhat artificial attack scenario:

 1. User is tricked into going to victim.example.com. An injection attack 
    is used to register a custom handler for victim.example.com. Not 
    knowing that the user is at victim.example.com, and given a misleading
    handler title and so forth, the user adds the handler, thinking it is
    something else.
 2. User is tricked into going to some other site that invokes the 
    handler.
 3. The user, thinking the handler is something else, picks it (it would
    probably be the only such handler in this scenario, with the action
    being a unique one for the attack).
 4. The user, confused as to which site he is visiting, performs an action
    on the victim site, thinking he is on some other site. Maybe the site 
    is made to appear like another via some injected CSS. (Hey, we're 
    assuming the site is susceptible to injection in the first place.)

A lot has to go wrong for this to really happen, though.

> Another threat model related to this is cross-origin registration. If an 
> <intent> tag can be injected with a cross-origin service, information 
> about the current page state could be leaked to the malicious host by 
> way of that cross-origin url. If the site is relying on a blacklist (so, 
> say, <img> tags couldn't be injected), and has a vulnerability allowing 
> the gathering of information on the page or the DOM context, then an 
> <intent> tag injection is a new vehicle to carry that data to an 
> attacker. Again, there's a couple more obstacles: the user would need to 
> approve the registration and then launch an intent, but those sound easy 
> to arrange. The real way to combat this is to not allow cross-origin 
> service registration.

Yes, I see no reason to allow cross-origin registration. The existing 
methods do not allow that.
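
A same-origin check of this kind is straightforward; as a sketch (the 
function name and the example hosts are made up, and the handler URL is 
resolved against the registering page):

```typescript
// True if the handler URL, resolved against the registering page, has
// the same origin as that page. A %s in the path or query does not stop
// the URL from parsing.
function isSameOriginRegistration(pageUrl: string, handlerUrl: string): boolean {
  const page = new URL(pageUrl);
  const handler = new URL(handlerUrl, page);
  return handler.origin === page.origin;
}

const sameOrigin = isSameOriginRegistration(
  "https://victim.example.com/page",
  "/handler?u=%s"
);
const crossOrigin = isSameOriginRegistration(
  "https://victim.example.com/page",
  "https://evil.example.net/collect?u=%s"
);
```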

> > Separately, I'm not so happy with having two APIs for the same thing. 
> > We don't enable anything new, but we double the attack surface, the 
> > cost to implement and test, authors need to not only learn both, but 
> > also need to learn (and argue) about which to use, and so forth. 
> > register*Handler has already been shipped in some browsers.
> 
> We've seen some down-sides of the imperative registration approach: 
> clients ask for ways to detect if they are registered, which breaks 
> opacity and is, I think, a bigger security concern than the above.

The API does expose this information.

I don't see why they wouldn't want this information with the declarative 
approach as well. GMail, for instance, asked to be able to tell if they 
were registered so that they could display more elaborate documentation.

On Fri, 16 Dec 2011, Paul Kinlan wrote:
> 
> We didn't want to add additional attributes to the meta tag or link tag 
> just for intents, this seems to open up the flood gates for future 
> platform features to also extend the meta syntax, the meta element then 
> just becomes a dumping ground.

That's not a big concern, so long as the semantics make sense.

With intent registration, I'm not sure they do.

e.g. with <link href=""> you'd want to be able to register a URL with a %s 
segment for content/protocol handlers, but that's no longer a valid URL, 
so it's weird to use href="" which currently requires a valid URL.

Similarly, using type="" to mean the filter rather than the type of the 
content at the href="" URL would mean the type="" attribute has a 
different meaning based on context.

Similarly, we have an action="" attribute on <link> that defines the 
disposition, but it has different values than what we're talking about 
here, so it would be weird to reuse it, and would be weird not to.

On Wed, 25 Jan 2012, Paul Kinlan wrote:
> 
> Yes we are ok with it being in the body.  Having the intent tag in the 
> body allows us to have a strong graceful degradation story for Web 
> Developers and Publishers.  The <intent> tag in the body allows us to do 
> several nice things such as:
> 
> 1.  Giving the user another way to handle the action and allowing for
> custom styling of the element:
> <intent action="http://webintents.org/share" ... style="background-color:red;">
>   <p>Add our bookmarklet <a href="javascript:.......">Drag to bookmark
> bar</a></p>
> </intent>

I guess that might be common, yes. Though most browsers don't allow you to 
add bookmarklets that way anymore, so I'm not sure it'd work exactly like 
that.

> 2. We can add the script polyfil in seamlessly - conforming UA's will
> ignore internal content, non-conforming UA's will treat it as an
> element they should descend into and thus load the required script.
> <intent ...>
>   <!-- Load the polyfill shim -->
>   <script src="http://webintents.org/webintents.min.js"></script>
> </intent>

That script would execute even in browsers with <intent> support.

> 3. It opens up the possibility for intent specific sub-tags - much like 
> <source> in <video> that we might need in the future.

That's more of an argument to avoid it...

On Wed, 25 Jan 2012, Paul Kinlan wrote:
> 
> I would prefer to treat it like a embedded content element [1] and have 
> the intent spec define how fallback content should be presented and 
> parsed - so we would define that <script> is ignored in a conforming UA. 
>  In our case we would want to work like the video element [2] with the 
> added script restriction.
> 
> Is this a completely abhorent solution?

I wouldn't say "completely abhorent", but having conditions that control 
whether script runs or not is not something that I would recommend. It has 
historically been fraught with problems, which has led us to <script> 
having an excessively complicated parsing model. I would recommend not 
adding any magic there.

On Sat, 17 Dec 2011, Anne van Kesteren wrote:
> 
> The answer is that when you want to add something new to the <head> 
> element, it makes sense to consider using <meta> and <link>, and that 
> adding attributes to them is not a big deal, because it rarely happens 
> that we do so. In the close to eight years the WHATWG has been working 
> on HTML, we have added one new attribute, to <link>.

We've added more than just one. Microdata adds a bunch that have specific 
semantics on those elements, we've invented charset="" based on legacy 
behaviour, and we added sizes="".

On Sat, 17 Dec 2011, Paul Kinlan wrote:
> 
> Intents is a new platform feature and we would add 4 or more on the meta 
> tag just for this first version of intents, and then more again when we 
> add more features to the intent declaration system to handle RPH and 
> RPC.  I don't think this is an acceptable solution just for intents and 
> why a new self contained tag is a better solution.

I don't think it's as big an issue as you suggest.

The only reason I think it doesn't make sense to reuse <meta> is that it 
would add a fourth orthogonal processing model for that element. If 
intents were more like one of the existing models, it would make perfect 
sense to reuse <meta>, even with a dozen new attributes. Attributes are 
cheap. Elements are ridiculously more expensive.

On Wed, 15 Feb 2012, James Hawkins wrote:
>
> We, the designers of the Web Intents draft API, have always seen Web 
> Intents as a superset of the functionality provided by 
> registerProtocolHandler (RPH) and registerContentHandler (RCH).  To 
> follow this to the logical conclusion, we should be able to provide 
> functionally equivalent counterparts to RPH/RCH in Web Intents.  This 
> proposal provides a means of deprecating RPH/RCH, replacing this 
> functionality with equivalent functionality from Web Intents.

I don't think it makes sense to deprecate them. We should design intents 
to incorporate them, not deprecate them.

> isProtocolHandlerRegistered / isContentHandlerRegistered
> ========================================================
> 
> There are serious fingerprinting issues with these methods, and when 
> contemplating analogous methods for Web Intents, we thought long and 
> hard about the fingerprinting issue.
> 
> As spec'ed a site could call registerProtocolHandler('web+uniqueID', 
> ...) where uniqueID is unique to a user.  The site could then call 
> isProtocolHandlerRegistered with that matching 'web+uniqueID' to verify 
> who the user is.

How would they know what scheme to check?

Also, the user would typically have to agree to registering the handler 
for this, so they would have an idea that the scheme was dodgy. A bigger 
threat, IMHO, is registering a URL with a session ID, much like a site 
could redirect the page to a URL with a session ID when registering an 
<intent> and then check the URL when invoking the handler.

> Instead of creating analogous functionality of these methods for Web 
> Intents, we decided to tackle the problem state of an empty picker.

An empty picker isn't really the use case being addressed by these 
methods. (I don't really see how they would help for that use case.)

The use case is for a site that wants to show more elaborate advocacy for 
users who have not yet registered the handler.

> unregisterProtocolHandler / unregisterContentHandler
> ===========================================
> 
> The analogous functionality for these methods in Web Intents already 
> exists and is the same as the removal of any type of service: remove the 
> declarative registration from the content, and the UA will unregister 
> the service as a handler.

That only works if the user visits the same page again. Why would the user 
visit the page if the handler no longer exists?

On Tue, 21 Feb 2012, Bjartur Thorlacius wrote:
>
> Windows Explorer (the file manager) does for example offer users to edit 
> images upon right-click. I worry that if URI scheme handlers need not 
> only take care of fetching but also of presentation, other actions than 
> view will be unnecessarily hard to implement. Thus I figure retrieval 
> and presentation must be separated.

I don't really see how you would tell the browser what the action is.

As part of replying to this e-mail, I also reviewed the existing Web 
Intents spec. Here are some comments on it. I hope they are helpful.

- Nothing seems to ever actually invoke the structured clone algorithm, so 
it's unclear when that should run. In particular, I don't understand when 
ports get cloned. Is it in the constructor? In startActivity?

- What does it mean for a member of Intent to only be present at certain 
times? (e.g. "Only present when the Intent object is delivered to the 
Service page")

- A lot of the spec seems to be lacking in formal requirements; it just 
describes what happens but doesn't actually require it.

- The spec requires that the interfaces that the Window object (called 
"DOMWindow" in the spec for some reason) implements depend on the markup 
in the page. This makes no sense, since the markup in the page isn't known 
at the time the interfaces are prepared, and even if it were, the page's 
content can change dynamically with elements being added and removed from 
script randomly.

- Using URLs as intents, especially for the default intents, is overly 
verbose. I highly recommend just having a wiki page be a registry of 
widely used intents, and saying that if people want specialised ones for 
their own communities, they can then use URLs, but otherwise it's fine to 
just use simple identifiers like "edit" or "share", so long as they are 
registered in the wiki. This is what we're doing with rel="" and it seems 
to work fine.
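
As a sketch of how such a rule might be checked (the registry contents 
and the function name here are invented for illustration; the real list 
would live on the wiki):

```typescript
// Hypothetical snapshot of the wiki registry of short action names.
const REGISTERED_ACTIONS = ["open", "share", "edit"];

// An action is acceptable if it is a registered short identifier, or if
// it parses as an absolute URL (for community-specific intents).
function isAcceptableAction(action: string): boolean {
  if (REGISTERED_ACTIONS.indexOf(action) !== -1) return true;
  try {
    new URL(action); // throws for anything that is not an absolute URL
    return true;
  } catch {
    return false;
  }
}

const shortName = isAcceptableAction("share");
const urlName = isAcceptableAction("http://webintents.org/share");
const unknownName = isAcceptableAction("frobnicate");
```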

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'