[whatwg] Hashing Passwords Client-side

Thu Jun 16 16:28:23 PDT 2011

Hello All,

Thank you for your feedback.

> Personally, I'd prefer the information be transmitted via another
> (browser-synthesized) form input

This strikes me as abnormal; I'm not aware of the browser injecting form
values for any other functionality.  However, one benefit of this method is
that a developer can create a JavaScript file to drop in to pages that will
perform hashing for legacy browsers.  The JavaScript could check to see if
the browser performs hashing, and if not, add listeners on all form
submissions.  It could hash the password fields prior to submission, and
inject the synthesized form value.  This would provide a path for legacy
support.

> For the @hash attribute, we should just specify a single hash for now, the
> strongest we believe we can rely on.

The disadvantage to this approach is that, years from now, the default may
be compromised (like md5).  By always forcing the webmaster to choose a
value, it helps to make it a conscious choice, as opposed to "just add
`hash` to all input tags" behavior.  If there is a default hash, then it
will be the first target for hackers to break.

> Do you expect it to help because users will be able to see which sites are
> using unhashed passwords, and complain so that the site admins fix it?

I didn't at first, but that is a nice side effect.

> Salting with something that's not included in the database is not a good
> idea

I agree.  Ideally, I think all webmasters should include a @hash and @salt
value for every password, and store them in the database before serving out
the HTML.  However, for lazy webmasters who just add the @hash attribute, I
would want the @salt to default to something that mostly works.  If they
decide to later migrate and break the origin, they can set the @salt
manually to the old origin on the input field.  Then, in their "Change
Password" section of their site, they can use the old @salt in the "Old
Password" entry, and a new @salt in the "New Password" entries.

> Also, the site should really be using a per-user salt

Yes, the site should.  In fact, the webmaster should be doing additional
hashing on the server side.  However, we can't control that.  If the
webmaster is clueless and just stores the data directly (which webmasters do
in the real world, unfortunately), at the very least, this solution will
improve security.

> So I think salting can just be omitted here

If there isn't a salt, or the default salt is the empty string, then it
fails to solve the problem.  Without a site-specific salt, the hash sent to
the server will match the hash sent to any other server.

> Not supporting salts does leave us open to rainbow tables, unfortunately,
> but I don't see a good way to fix that from the client side.

Site-specific salts solve it from the client side.
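
As a rough illustration of this point (not part of the proposal itself), here is a small Python sketch. The proposal does not say how the salt and password are combined, so the sketch assumes the salt is simply prepended to the password, both encoded as UTF-8, with SHA-1 and a lower-case hex digest as the output format. The name `client_hash` is hypothetical.

```python
import hashlib


def client_hash(password: str, salt: str) -> str:
    """Hash a password the way a browser might under this proposal.

    Assumption: salt and password are concatenated (salt first) and
    encoded as UTF-8; the proposal does not define the combination.
    """
    data = (salt + password).encode("utf-8")
    return hashlib.sha1(data).hexdigest()  # lower-case hex string


password = "correct horse"

# With the origin as the default salt, two sites receive different values.
hash_a = client_hash(password, "http://example.com:80")
hash_b = client_hash(password, "http://example.org:80")

# With an empty salt, every site receives the same value for this password.
unsalted_a = client_hash(password, "")
unsalted_b = client_hash(password, "")
```

Under these assumptions, a dump of hashes from one site cannot be replayed directly against another site, while the unsalted values are interchangeable between sites.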

> Why?  The server can first try comparing the submitted password to the
> stored hash, then if that fails, hash the submitted password and compare
> that to the stored hash.

Imagine the use case where a user joins a site on a legacy browser.  The
legacy browser sends the un-hashed password.  They then attempt to log in
using a modern browser, which correctly hashes the password before sending
it.  The authentication will fail.  There needs to be a way for the server
to distinguish when the hash has been correctly applied.  As mentioned in a
previous e-mail, I would imagine this work being done by a server-side
framework automatically (eventually).
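
A minimal sketch of what such server-side handling might look like, in Python. It assumes the proposed `X-Password-Hash: 1` header is the signal, that headers arrive as a plain dict, and that the server knows the salt used on the form. The function names are hypothetical, and the salt-plus-password combination is an assumption because the proposal leaves it undefined.

```python
import hashlib
import hmac


def client_hash(password: str, salt: str) -> str:
    # Assumed combination: SHA-1 over the UTF-8 bytes of salt + password.
    return hashlib.sha1((salt + password).encode("utf-8")).hexdigest()


def normalize_submission(value: str, headers: dict, salt: str) -> str:
    """Return the hashed form of a submitted password field.

    If the browser signalled that it already hashed the value, use it
    as-is; otherwise apply the same hash on the server.
    """
    if headers.get("X-Password-Hash") == "1":
        return value
    return client_hash(value, salt)


def check_login(stored: str, value: str, headers: dict, salt: str) -> bool:
    submitted = normalize_submission(value, headers, salt)
    # Compare as bytes in constant time.
    return hmac.compare_digest(stored.encode("utf-8"), submitted.encode("utf-8"))


salt = "http://example.com:80"

# Account created from a legacy browser: plain text arrives, server hashes it.
stored = normalize_submission("hunter2", {}, salt)

# Later logins from a hashing browser and from a legacy browser both succeed.
modern_ok = check_login(
    stored, client_hash("hunter2", salt), {"X-Password-Hash": "1"}, salt
)
legacy_ok = check_login(stored, "hunter2", {}, salt)
```

The sketch stores the client-side hash directly to keep the example short; a site that also hashes on the server would apply that step to the normalized value.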

> I'd suggest a way to allow authors to iterate the hashing

Iterating the hashing is much better than a single pass.  However, it will
consume more power/time from the client computer.  We don't want to DoS
their slow device because a webmaster put an abnormally high value in the
iteration field.  If iterations are deemed important, they can be
implemented by simply creating a new hash function.  For example, the
@hash="sha1" could perform SHA1 for 1 iteration.  Or, @hash="sha1x100" could
be defined by the standards body to perform SHA1 for 100 iterations.  The
benefit of this method is that the iteration is not defined by the webmaster
- the iteration values are carefully vetted and decided on a hash-by-hash
basis by the standards body.
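
To make the idea concrete, here is a hedged Python sketch of a fixed registry of named hash functions. Only the names "sha1" and "sha1x100" come from this discussion; the registry layout, the function name `apply_named_hash`, and the choice to feed each round the raw digest of the previous round are assumptions for illustration.

```python
import hashlib

# Hypothetical registry: each name maps to (algorithm, iteration count).
# The page author picks a name; the iteration count is not theirs to set.
HASH_FUNCTIONS = {
    "sha1": ("sha1", 1),
    "sha1x100": ("sha1", 100),
}


def apply_named_hash(name: str, data: bytes) -> str:
    """Apply a vetted hash by name; unknown names are rejected."""
    if name not in HASH_FUNCTIONS:
        raise ValueError(f"unsupported hash: {name!r}")
    algorithm, iterations = HASH_FUNCTIONS[name]
    digest = data
    for _ in range(iterations):
        # Assumption: each round hashes the raw digest of the previous round.
        digest = hashlib.new(algorithm, digest).digest()
    return digest.hex()  # lower-case hex string


single = apply_named_hash("sha1", b"salt+password")
iterated = apply_named_hash("sha1x100", b"salt+password")
```

Because a value such as "sha1x1000000" is simply not in the registry, a page cannot ask a slow device for an arbitrary amount of work.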

> (on the topic of user-defined salts) Only until the site changes origins
> and all logins break for no apparent reason.  Or if the site is accessible
> from multiple origins.

If the site has multiple origins, then they should provide their own @salt
value, and override the default value.  If the two origins provide the same
salt, the hash will be the same.
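
A small Python sketch of that salt selection, assuming the default is the origin written as protocol + host + port (as in the original proposal) and that an explicit @salt always wins. The function name `effective_salt` is hypothetical, and only http and https are handled here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch handles.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_salt(page_url: str, salt_attr: Optional[str] = None) -> str:
    """Pick the salt for a password field.

    An explicit salt attribute wins; otherwise fall back to the origin
    written as protocol + host + port, e.g. "http://example.com:80".
    """
    if salt_attr is not None:
        return salt_attr
    parts = urlsplit(page_url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"{parts.scheme}://{parts.hostname}:{port}"


# Two origins for the same site get different default salts ...
default_a = effective_salt("http://example.com/login")
default_b = effective_salt("https://www.example.com/login")

# ... unless both pages supply the same explicit salt.
shared_a = effective_salt("http://example.com/login", "example-site")
shared_b = effective_salt("https://www.example.com/login", "example-site")
```

The same override covers the migration case: after an origin change, the old origin string can be passed as the explicit salt.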

> I think it's better for script to get the unhashed value.  ... with the
> unhashed value you can do things like check length and so on.

I think this is a good point; this can help to mitigate the disadvantage
that the host cannot validate password requirements.  It is true that
someone could subvert the requirements, but at least for the majority of
users, it will work.

> A variation of this idea has been proposed in the past but was largely
> seen as undesirable

I've read some of the thread... Please keep in mind that my proposal is not
a catch-all solution to password management.  It is intended to solve one
specific problem.  It is intended to be incremental progress.

Thanks again for taking the time to read the proposal.

~Sean

On Thu, Jun 16, 2011 at 5:39 PM, Daniel Cheng <dcheng at chromium.org> wrote:

> A variation of this idea has been proposed in the past but was largely seen
> as undesirable--see
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2010-May/026254.html.
> In general, I feel like the same objections are still true of this proposal.
>
> Daniel
>
> On Thu, Jun 16, 2011 at 14:08, Tab Atkins Jr. <jackalmage at gmail.com> wrote:
>
>> On Thu, Jun 16, 2011 at 12:59 PM, Sean Connelly <sean at pbwhere.com> wrote:
>> > I've just joined the mailing list, and this is my first time in such an
>> > environment, so I apologize ahead of time if I'm not using the list
>> > correctly.
>>
>> Nope, you did pretty good.  You listed a problem, and then proposed a
>> solution to it.  Most people forget to do that first part when they
>> start posting.  ^_^
>>
>>
>> > ## Problem Attempting to Solve:
>> >
>> > Websites commonly need to store login information for users.  Web
>> > developers may naively store the password in non-secure ways
>> > (plain-text, md5 with no salt, etc).  It has become common for hacker
>> > groups to target websites to get a data-dump of all users/passwords,
>> > and using this information, try to compromise accounts on other
>> > websites.
>> >
>> > One example below:
>> >
>> >
>> > http://arstechnica.com/security/news/2011/06/lulzsec-rampage-continues-62k-e-mails-and-passwords-cia-under-attack.ars
>>
>> Or, more concretely, you *never* actually need to store the password
>> that someone is using.  Like, ever.  You should *always* immediately
>> hash the password with a good cryptographic hash, and only store the
>> hashed value.  The only thing you should ever need to do with the
>> plaintext password is pass it to your hashing function, and then
>> immediately forget it.
>>
>> However, a non-trivial number of servers don't do this, which is the
>> source of constant security headaches.
>>
>>
>> > ## Proposed Solution:
>> >
>> > Add an attribute to <input type="password"> called "hash".  For example:
>> > <input type="password" hash="sha1" salt="something">
>> >
>> > This will indicate to the browser that it needs to hash the value
>> > locally before sending it to the server.  This hash should include a
>> > site-specific salt, so that the same password typed on two different
>> > sites will hash to different values.  I propose the default salt to be
>> > the origin as an ASCII string (protocol + host + port, ex:
>> > "http://example.com:80"), and the default hash to be "none" (in order
>> > for backward compatibility).
>> >
>> > By hashing the password before transmitting to the host, the host is
>> > never actually aware of the password typed by the user.  The host can
>> > treat it as a normal password, and store it as it would normally store
>> > any other password.  Authentication can still be performed because the
>> > host would check to see if the hashes matched.
>> >
>> > In order to deal with migration correctly, the browser will also need to
>> > communicate to the server that it correctly performed the hash.  I
>> > propose a new header for the browser to send:
>> >
>> > X-Password-Hash: 1
>> >
>> > If the browser does not send this header, then the host should expect to
>> > receive an unhashed, plain-text password.
>> >
>> > Each available hash function (sha1, sha2, etc), will have to be
>> > identified in the spec, along with the format the hash should be
>> > transmitted in (lower-case hex dump?).
>>
>> Personally, I'd prefer the information be transmitted via another
>> (browser-synthesized) form input, as it's usually much easier to read
>> form inputs than header values.
>>
>> (Also, X-* headers are an antipattern.  The X- prefix serves
>> absolutely no purpose.  This is just a naming issue and irrelevant to
>> your proposal; I just wanted to inform you in case you're ever
>> directly responsible for naming a header in the future.)
>>
>> I like your idea for the default salt.  We might be able to hook off
>> of slightly better concepts (use the origin directly?) but the idea is
>> sound.
>>
>> For the @hash attribute, we should just specify a single hash for now,
>> the strongest we believe we can rely on.  Then we can make it the
>> default value, so utilizing this would be as simple as <input
>> type=password hash>.  (You don't need a "none" value, since the lack
>> of the attribute would indicate that.)  If this becomes inadequate in
>> the future, we can just add more values.
>>
>>
>> > ## Benefits:
>> >
>> > 1. Host never has access to actual password (as long as user has a
>> > modern browser)
>> > 2. If the host is compromised, hackers may be able to takeover the
>> > account on the server, but will not be able to take over accounts on
>> > different servers even if the user uses the same password (because the
>> > hackers will only have access to the hashed password with
>> > site-specific salts)
>> > 3. Plain-text passwords cannot be sniffed over HTTP
>> > 4. Easy for webmasters to upgrade for additional security benefit
>>
>> For #3, you can still sniff the hashed password over HTTP, and then
>> just submit that manually.  But point #2 mitigates the damage that
>> would do, unlike the current state of affairs.
>>
>>
>> > ## Disadvantages:
>> >
>> > 1. Host cannot validate password requirements (ex: 2 upper case, 2 lower
>> > case, 2 special characters, password length, etc)
>>
>> This is a benefit, actually.  Password requirements are, nearly
>> uniformly, absolutely horrendous for security in practice.
>>
>>
>> > 2. Server-side code might be complicated for dealing with legacy,
>> > non-hashing browsers
>>
>> Only for the transition period.  Afterwards, you can just ignore
>> legacy browsers and store the passwords directly.  Those older
>> browsers will just have security vulnerabilities.
>>
>> Of course, server-side frameworks can hide that for you.
>>
>> > ## Questions:
>> >
>> > 1. How to deal with the character encoding of the page correctly?
>> > Should everything be converted to UTF-8 before the hash is calculated?
>>
>> Javascript is utf-16 internally.  However, I'd recommend doing the
>> hash with the string in utf-8.
>>
>> > 2. What level of access should JavaScript have?  Should it have access
>> > to read the plain password, or should it only be able to read the
>> > hashed value?
>>
>> The .value property and the value actually submitted should be
>> identical.  This indicates that, unless we add something extra, JS
>> would only get the hashed value.
>>
>>
>> Overall, I like the idea.  It seems like a pretty clueful addressing
>> of the topic, and it directly addresses the problem that servers
>> shouldn't ever remember passwords, but a lot of them do.  Finally, it
>> puts the processor cost of good crypto-hashing on the client rather
>> than the server, which is nice.  We can do a nice, expensive hash on
>> the client without burdening the user, while an expensive hash *can*
>> be a minor issue for busy servers.
>>
>> ~TJ
>>
>
>