[whatwg] Fixing two security vulnerabilities in registerProtocolHandler

Fri Apr 6 14:19:50 PDT 2012

On Mon, Apr 2, 2012 at 4:39 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Mon, 26 Sep 2011, Tyler Close wrote:
>>
>> I was recently experimenting with the registerProtocolHandler (RPH) API
>> and came across a couple of security gotchas that make it hard to safely
>> use the API. One of these is already known, but AFAICT, hasn't been
>> fixed yet. I haven't seen the other discussed yet.
>>
>> The Mozilla blog post that introduces the registerProtocolHandler API
>> makes use of window.parent.postMessage to send a response from the RPH
>> handler back to the client page.
>
> I presume it uses this in conjunction with an <a href=""> link with a
> target="" attribute to load the handler in an iframe.

The client page loads the handler page using an iframe or a
window.open(). Either can work.

>> In the example code, the targetOrigin for this postMessage invocation is
>> '*', while also noting that this is not secure. AFAICT, there is no API
>> that the intent handler can reliably use to determine the correct
>> targetOrigin for this postMessage invocation.
>
> How can the origin be anything other than the origin of the page that
> triggered the link?

Exactly, but we need a way for the handler page to find out what that origin is.

A client page on origin A causes a navigation to an RPH URL (iframe or
window.open). The browser loads the user-chosen RPH handler, which is
another web page from origin B. After the handler page loads, it wants
to send a return value back to the client page. How does the handler
page know the client page's origin is A? It needs to know this origin
string so that it can securely use postMessage to send the return
value back. AFAICT, there is no existing API in the browser that lets
the handler page determine the client page's origin.

>> I suggest fixing this problem by adding a new readonly DOMString that
>> contains the correct origin for the postMessage invocation; perhaps
>> document.origin. So the response invocation would then be coded as:
>>
>>   window.parent.postMessage('my response data', document.origin);
>>
>> Perhaps a different name or location is better for this field, so I'll
>> defer to the editor's judgment.
>
> You can work out your own origin from window.location's members, but I
> don't see how this helps you determine the origin of your parent. There's
> a separate thread about adding a way to obtain your parent's origin, but
> again, I don't see why you would need it in this case. Can you elaborate
> on what the attack scenario you are envisaging is?

Currently, the handler page can only specify '*' as the targetOrigin in
the postMessage invocation that sends the return value. If an attacker
navigates the client page before the postMessage is done, the attacker
can intercept the return value. It's the same rationale used every
time we advise programmers against using '*' as the targetOrigin for a
postMessage() invocation.
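
To make the interception concrete, here is a rough sketch that models only
the targetOrigin check of postMessage, outside a browser. The deliver()
function, the window object, and both origins are hypothetical stand-ins
I made up for illustration; a real browser performs this check internally.

```javascript
// Toy model of the targetOrigin check performed by postMessage().
// `deliver` and `clientWindow` are hypothetical stand-ins, not browser APIs.
function deliver(targetWindow, message, targetOrigin) {
  // Deliver only if targetOrigin is '*' or equals the origin of the
  // document currently loaded in the target window.
  if (targetOrigin === '*' || targetOrigin === targetWindow.origin) {
    targetWindow.inbox.push(message);
    return true;
  }
  return false;
}

// The client window starts on origin A (assumed: https://client.example).
const clientWindow = { origin: 'https://client.example', inbox: [] };

// Before the handler replies, the attacker navigates the client window.
clientWindow.origin = 'https://attacker.example';

// With '*', the reply lands in the attacker's document.
const leaked = deliver(clientWindow, 'my response data', '*');

// With the client's real origin, the reply is dropped instead.
const blocked = !deliver(clientWindow, 'my response data', 'https://client.example');
```

The second call is the one the handler would like to make, but it can only
do so if it has some way to learn the string 'https://client.example'.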

>> The second problem with RPH is that the handler page doesn't have a
>> way of reliably getting the URL of the content to be handled from the
>> browser. In order to work in offline scenarios, the RPH handler must
>> put the %s placeholder in the fragment of its handler's URL.
>
> It's not clear to me that it makes sense to have an offline protocol
> handler. What kind of protocol do you have in mind?

For example, consider an offline web mail program. I click on a
mailto: link and want to compose a message in my web mail editor,
queuing it to be sent next time I'm online.

RPH is a way for a web page to send data to a user-determined
application. There will surely be many scenarios where offline
functionality is desirable.
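
As a sketch of what such an offline handler might do, assume the web mail
app registered a handler URL of the form "https://mail.example/compose#%s"
(a made-up URL) and that the browser substitutes a percent-encoded copy of
the mailto: URL for %s. Because the fragment is never sent to the server,
a cached copy of the page can read it without a network round trip. The
function names below are mine, not part of any API.

```javascript
// Hypothetical handler page registered as "https://mail.example/compose#%s".
// Assumes the browser replaces %s with a percent-encoded copy of the link URL.
function handledUrlFromHash(hash) {
  // An empty fragment means there is nothing to handle.
  if (!hash || hash === '#') return null;
  return decodeURIComponent(hash.slice(1));
}

// Extract the recipient from a mailto: URL, ignoring ?subject= and similar.
function recipientOf(mailtoUrl) {
  if (!mailtoUrl.startsWith('mailto:')) return null;
  const addressPart = mailtoUrl.slice('mailto:'.length).split('?')[0];
  return decodeURIComponent(addressPart);
}

// In a browser the argument would be window.location.hash.
const handled = handledUrlFromHash('#mailto%3Aalice%40example.org%3Fsubject%3DHi');
const recipient = recipientOf(handled);
```

The handler could then queue a draft addressed to `recipient` locally and
send it the next time the user is online.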

>> Unfortunately, this means that other content in the browser could
>> modify the content URL before the handler reads it.
>
> Well, any content can load any URL, so it doesn't matter whether the URL
> is in the fragment identifier or the path or anything else, surely.

It matters if the handler page assumes that the URL came from its
parent or opener. The parent (or opener) and the handler then engage in
a postMessage conversation where the parent knows it said one thing, but
the handler heard it saying something different, something chosen by the
attacker.
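
A toy model of that mismatch, with the frame reduced to a plain object
whose href anyone holding a reference can overwrite. The frame object,
the mail.example handler URL, and both mailto: addresses are invented for
this sketch.

```javascript
// Toy model: a frame whose location can be set by whoever holds a
// reference to it. All names and URLs here are hypothetical.
const handlerFrame = { href: '' };
const handlerBase = 'https://mail.example/compose#';

// The client (victim) page opens the handler on the URL it intends.
const clientIntended = 'mailto:alice@example.org';
handlerFrame.href = handlerBase + encodeURIComponent(clientIntended);

// Before the handler reads its URL, the attacker navigates the frame.
handlerFrame.href = handlerBase + encodeURIComponent('mailto:mallory@attacker.example');

// The handler reads the fragment and assumes its parent chose it.
const handlerHeard = decodeURIComponent(new URL(handlerFrame.href).hash.slice(1));

// The client and the handler now disagree about what was requested.
const mismatch = handlerHeard !== clientIntended;
```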

>> For example, an attacker could open a window on a victim web page. The
>> victim web page then opens an <iframe> on a content URL that triggers
>> RPH. The attacker then navigates the <iframe> so that its
>> window.location contains a different content URL.
>
> How can the attacker navigate that iframe? Surely it would not be allowed
> to navigate it, per the "allowed to navigate" definition in HTML.

Boris already answered this.

>> The intent handler sees a request coming from the victim page, but with
>> a content URL specified by the attacker. A related problem is that the
>> intent handler has no way to distinguish whether its URL was loaded via
>> the browser's RPH handling, or whether the client page directly
>> navigated to the intent handler's URL. Both of these problems could be
>> fixed by adding another readonly DOMString to the API that contains the
>> %s data for the RPH invocation.
>
> I don't understand why it matters how the URL was invoked.

If the URL was invoked via RPH, then the handler page knows that the
user selected it for this action. The handler page also knows that any
arguments in the handler's URL (not in the RPH URL) were set by the
handler's origin and were not tampered with by the client page.

For example, a web mail program might have two registered RPH handlers
for mailto: links: "https://example.org/?from=me@company&q=%s" and
"https://example.org/?from=me@personal&q=%s". The user has configured
their browser to send mailto links to their personal email editor. A
malicious client page could directly open the URL for the company
email editor. The web mail editor needs a way to detect when a client
page is trying to subvert the user's chosen preferences. So, an RPH
handler needs a way to know that it was loaded via the RPH dispatch.
Once it knows this, it can also trust that the arguments in the URL,
such as "from" in this case, were not tampered with by the client
page.
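
Here is a small sketch of why the handler cannot tell the two cases apart
today. It uses the two handler templates from the example and assumes the
browser percent-encodes the link URL when substituting %s. The function
names and the mailto: address are made up for illustration.

```javascript
// Handler templates from the example; the browser replaces %s on RPH dispatch.
const personalTemplate = 'https://example.org/?from=me@personal&q=%s';
const companyTemplate = 'https://example.org/?from=me@company&q=%s';

// What the browser would load for an RPH dispatch (assumed encoding of %s).
function dispatchUrl(template, linkUrl) {
  return template.replace('%s', encodeURIComponent(linkUrl));
}

// Everything the handler page can learn from its own URL.
function readArgs(href) {
  const params = new URL(href).searchParams;
  return { from: params.get('from'), q: params.get('q') };
}

const link = 'mailto:bob@example.org';

// The user's configured choice: the personal editor, reached via RPH.
const viaRph = readArgs(dispatchUrl(personalTemplate, link));

// A malicious client page builds the company URL itself and navigates to it.
const forgedHref = 'https://example.org/?from=me@company&q=' + encodeURIComponent(link);
const forged = readArgs(forgedHref);

// The forged URL is byte-for-byte what a real company dispatch would produce.
const indistinguishable = forgedHref === dispatchUrl(companyTemplate, link);
```

Nothing in `forged` marks it as a direct navigation, which is why a
browser-supplied signal that the page was loaded via RPH dispatch would
be needed before "from" can be trusted.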

--Tyler