[whatwg] Hashing Passwords Client-side

Thu Jun 16 14:08:36 PDT 2011

On Thu, Jun 16, 2011 at 12:59 PM, Sean Connelly <sean at pbwhere.com> wrote:
> I've just joined the mailing list, and this is my first time in such an
> environment, so I apologize ahead of time if I'm not using the list
> correctly.

Nope, you did pretty well.  You listed a problem, and then proposed a
solution to it.  Most people forget to do that first part when they
start posting.  ^_^

> ## Problem Attempting to Solve:
>
> Websites commonly need to store login information for users.  Web developers
> may naively store the password in non-secure ways (plain-text, md5 with no
> salt, etc).  It has become common for hacker groups to target websites to
> get a data-dump of all users/passwords, and using this information, try to
> compromise accounts on other websites.
>
> One example below:
>
> http://arstechnica.com/security/news/2011/06/lulzsec-rampage-continues-62k-e-mails-and-passwords-cia-under-attack.ars

Or, more concretely, you *never* actually need to store the password
that someone is using.  Like, ever.  You should *always* immediately
hash the password with a good cryptographic hash, and only store the
hashed value.  The only thing you should ever need to do with the
plaintext password is pass it to your hashing function, and then
immediately forget it.
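
A rough sketch of what "hash, store only the hash, forget the
plaintext" can look like on the server.  This is illustrative only:
the choice of PBKDF2-HMAC-SHA256, the iteration count, and the function
names are assumptions, not part of the proposal.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); these are the only values that get stored."""
    salt = os.urandom(16)  # fresh random salt for every stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-hash the candidate password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, digest)
```

The plaintext only ever exists as an argument to these two functions;
nothing derived from it other than the salt and digest is kept.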

However, a non-trivial number of servers don't do this, which is the
source of constant security headaches.

> ## Proposed Solution:
>
> Add an attribute to <input type="password"> called "hash".  For example:
> <input type="password" hash="sha1" salt="something">
>
> This will indicate to the browser that it needs to hash the value locally
> before sending it to the server.  This hash should include a site-specific
> salt, so that the same password typed on two different sites will hash to
> different values.  I propose the default salt to be the origin as an ASCII
> string (protocol + host + port, ex: "http://example.com:80"), and the
> default hash to be "none" (in order for backward compatibility).
>
> By hashing the password before transmitting to the host, the host is never
> actually aware of the password typed by the user.  The host can treat it as
> a normal password, and store it as it would normally store any other
> password.  Authentication can still be performed because the host would
> check to see if the hashes matched.
>
> In order to deal with migration correctly, the browser will also need to
> communicate to the server that it correctly performed the hash.  I propose a
> new header for the browser to send:
>
> X-Password-Hash: 1
>
> If the browser does not send this header, then the host should expect to
> receive an unhashed, plain-text password.
>
> Each available hash function (sha1, sha2, etc), will have to be identified
> in the spec, along with the format the hash should be transmitted in
> (lower-case hex dump?).

Personally, I'd prefer the information be transmitted via another
(browser-synthesized) form input, as it's usually much easier to read
form inputs than header values.
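
To illustrate why a form input is convenient, here is a minimal sketch
of a server reading such a field from an ordinary urlencoded form body.
The field name "password-hash" and the value "1" are hypothetical; no
name has been proposed for the synthesized input.

```python
from urllib.parse import parse_qs


def read_login_form(body: str) -> dict:
    """Parse an application/x-www-form-urlencoded login body."""
    fields = parse_qs(body, keep_blank_values=True)
    return {
        "username": fields.get("username", [""])[0],
        "password": fields.get("password", [""])[0],
        # Hypothetical browser-synthesized field; a legacy browser
        # would simply not send it.
        "client_hashed": fields.get("password-hash", [""])[0] == "1",
    }
```

The flag arrives through the same parsing path as the username and
password, so no separate header handling is needed.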

(Also, X-* headers are an antipattern.  The X- prefix serves
absolutely no purpose.  This is just a naming issue and irrelevant to
your proposal; I just wanted to inform you in case you're ever
directly responsible for naming a header in the future.)

I like your idea for the default salt.  We might be able to hook off
of slightly better concepts (use the origin directly?) but the idea is
sound.

For the @hash attribute, we should just specify a single hash for now,
the strongest we believe we can rely on.  Then we can make it the
default value, so utilizing this would be as simple as <input
type=password hash>.  (You don't need a "none" value, since the lack
of the attribute would indicate that.)  If this becomes inadequate in
the future, we can just add more values.
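
As a sketch of what the browser might compute for such an input, here
is the origin-as-salt idea in Python.  The proposal does not say how
the salt and password are combined or which hash would be chosen;
HMAC-SHA-256 keyed by the origin and the function name below are
assumptions made purely for illustration.

```python
import hashlib
import hmac


def hash_for_origin(password: str, origin: str) -> str:
    """Hash the password with the origin as salt; return lower-case hex."""
    salt = origin.encode("ascii")      # e.g. b"http://example.com:80"
    secret = password.encode("utf-8")
    # Assumed construction: HMAC keyed by the origin, so the same
    # password produces unrelated values on different sites.
    return hmac.new(salt, secret, hashlib.sha256).hexdigest()


same_site = hash_for_origin("hunter2", "http://example.com:80")
other_site = hash_for_origin("hunter2", "http://example.org:80")
print(same_site != other_site)  # True: a dump from one site is useless on the other
```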

> ## Benefits:
>
> 1. Host never has access to actual password (as long as user has a modern
> browser)
> 2. If the host is compromised, hackers may be able to takeover the account
> on the server, but will not be able to take over accounts on different
> servers even if the user uses the same password (because the hackers will
> only have access to the hashed password with site-specific salts)
> 3. Plain-text passwords cannot be sniffed over HTTP
> 4. Easy for webmasters to upgrade for additional security benefit

For #3, you can still sniff the hashed password over HTTP, and then
just submit that manually.  But point #2 mitigates the damage that
would do, unlike the current state of affairs.

> ## Disadvantages:
>
> 1. Host cannot validate password requirements (ex: 2 upper case, 2 lower
> case, 2 special characters, password length, etc)

This is a benefit, actually.  Password requirements are, nearly
uniformly, absolutely horrendous for security in practice.

> 2. Server-side code might be complicated for dealing with legacy,
> non-hashing browsers

Only for the transition period.  Afterwards, you can just ignore
legacy browsers and store whatever value is submitted as-is.  Users
of those older browsers will just have a security vulnerability.

Of course, server-side frameworks can hide that for you.
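
A sketch of what a framework could do during the transition: if the
browser did not hash, apply the same hash on the server so that every
stored value has the same form.  The origin constant, the function
name, and the HMAC-SHA-256 construction are all assumptions; the real
hash would be whatever the spec ends up requiring.

```python
import hashlib
import hmac

ORIGIN = "http://example.com:80"  # assumed site origin, used as the salt


def normalize_password(value: str, client_hashed: bool) -> str:
    """Return the value a hashing browser would have submitted."""
    if client_hashed:
        return value  # already hashed by the browser; use as-is
    # Legacy browser: hash on the server with the same (assumed)
    # construction, then let the plaintext go out of scope.
    return hmac.new(
        ORIGIN.encode("ascii"), value.encode("utf-8"), hashlib.sha256
    ).hexdigest()
```

Application code downstream of this function never needs to know which
kind of browser submitted the form.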

> ## Questions:
>
> 1. How to deal with the character encoding of the page correctly?  Should
> everything be converted to UTF-8 before the hash is calculated?

JavaScript strings are UTF-16 internally.  However, I'd recommend
doing the hash with the string encoded as UTF-8.
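
A short demonstration of why the encoding has to be pinned down: the
hash is computed over bytes, and the same string gives different bytes
in different encodings.  SHA-256 is used here only as a stand-in hash.

```python
import hashlib

password = "p\u00e4ssword"  # "pässword", one non-ASCII character

utf8_bytes = password.encode("utf-8")       # 9 bytes
utf16_bytes = password.encode("utf-16-le")  # 16 bytes

utf8_digest = hashlib.sha256(utf8_bytes).hexdigest()
utf16_digest = hashlib.sha256(utf16_bytes).hexdigest()

print(utf8_digest == utf16_digest)  # False: different bytes, different hash
```

If browsers disagreed on the encoding, the same password would fail to
match across browsers, so the spec would need to fix one.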

> 2. What level of access should JavaScript have?  Should it have access to
> read the plain password, or should it only be able to read the hashed value?

The .value property and the value actually submitted should be
identical.  This indicates that, unless we add something extra, JS
would only get the hashed value.

Overall, I like the idea.  It's a pretty clueful take on the topic,
and it directly addresses the problem that servers shouldn't ever
remember passwords, but a lot of them do.  Finally, it
puts the processor cost of good crypto-hashing on the client rather
than the server, which is nice.  We can do a nice, expensive hash on
the client without burdening the user, while an expensive hash *can*
be a minor issue for busy servers.

~TJ