fanf

DoS-resistant password hashing

6th Mar 2013 | 22:17

In the office this morning we had a discussion about password hashing, in particular what we could do to improve the security of web servers around the university. (You might remember the NullCrew attacks on Cambridge University a few months ago.)

One of the points of the standard password storage advice (which boils down to "use scrypt, or bcrypt, or PBKDF2, in decreasing order of preference") is that a decent iteration count should make a password hash take tens of milliseconds on a fast computer. This means your server is incapable of servicing more than a few dozen login attempts per second, and so your login form becomes quite a tempting denial of service target.
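
For a rough sense of the numbers, here is a quick measurement in Python, assuming PBKDF2-HMAC-SHA256 with an iteration count chosen purely for illustration:

    # Time one password hash and estimate how many login attempts one core
    # can service per second. The iteration count is an illustrative guess.
    import hashlib, os, time

    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"correct horse battery staple", salt, 200_000)
    elapsed = time.perf_counter() - start
    print(f"one hash: {elapsed * 1000:.1f} ms; "
          f"about {1 / elapsed:.0f} attempts/second per core")

At around 20 milliseconds per hash that works out to roughly 50 attempts per second per core.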

One way to mitigate this is to make the client perform a proof-of-work test that is more expensive than the password hash, which brings you a little closer to fairness in the arms race. On Twitter, Dan Sheppard tersely suggested some refinements:

Improve client-side puzzles: bypassable with uid-tied unexpiring cookie. If supplied & not recently & not blacklisted, relax puzzles. Under DOS require this cookie to go into 1st queue, otherwise into 2nd with DOS. 5s at first login on new machine w/ spinner is ok.

But for real DoS resistance you need the client to be doing a lot more work than the server (something like 3 to 10 binary orders of magnitude), not roughly the same amount. So I wondered: can you move some of the work of verifying a password from the server to the client, while still requiring that an attacker with a copy of the password file has to do both the client's and the server's work for each trial decrypt?

I thought you could use two hash functions, "hard" and "easy". ("Easy" should still be a slow-ish password hash.) The server sends some parameters (salt and work factor) to the client, which executes the "hard" function to crypt the password, and then sends the password and the crypt to the server. The server uses these as the input to the "easy" function, and checks that the result matches the stored crypt. The server stores only the first salt and the final output of the process, along with the work factors.
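
Here is a minimal sketch of that flow in Python, assuming PBKDF2-HMAC-SHA256 stands in for both the "hard" and the "easy" function, with illustrative work factors; exactly how the password and the hard crypt are combined in the "easy" step is one possible reading, not something the description above pins down:

    import hashlib, hmac, os

    HARD_ITERS = 1_000_000   # client-side work factor (illustrative)
    EASY_ITERS = 10_000      # server-side work factor (illustrative)

    def client_hard(password: bytes, salt: bytes) -> bytes:
        # run on the client, with the salt and work factor sent by the server
        return hashlib.pbkdf2_hmac("sha256", password, salt, HARD_ITERS)

    def server_easy(password: bytes, hard_crypt: bytes) -> bytes:
        # run on the server; both values from the client feed the "easy" hash
        # (which one plays the role of key and which the salt is an assumption)
        return hashlib.pbkdf2_hmac("sha256", hard_crypt, password, EASY_ITERS)

    def enroll(password: bytes):
        salt = os.urandom(16)
        stored = server_easy(password, client_hard(password, salt))
        return salt, stored   # the server keeps only these, plus the work factors

    def login(password: bytes, salt: bytes, stored: bytes) -> bool:
        hard_crypt = client_hard(password, salt)       # heavy lifting on the client
        candidate = server_easy(password, hard_crypt)  # cheap finish on the server
        return hmac.compare_digest(candidate, stored)  # constant-time comparison

An attacker who steals (salt, stored) still has to run both stages for every guess, while an online login only costs the server the cheap stage.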

What might be the problems with this idea? (I will not be surprised or upset if it turns out to be completely laughable to cryptographers!)

The output of the "hard" KDF is password-equivalent, so it needs to be well protected. I'm assuming it is treated like an ephemeral key. Don't be tempted to put it in a cookie, for example.

This protocol is not a proof of work like the one described in the Stack Exchange question I linked to above: in my idea the challenge from the server to the client is always the same. (Which is why the result of the work is password-equivalent.) But this is OK for a threat model that is satisfied by basic password authentication.

The "easy" function needs to be resistant to attacks that aim to recover the salt. As I understand it, this is not a normal requirement for password hashes (the salt is usually considered public) so an unusual construction might be needed.

Any other problems? Is this a completely foolish idea?

Of course this is a mostly useless idea, since there are much better authentication protocols. In particular, SRP (Secure Remote Password authentication and session key establishment) does all the work of hashing the password and salt on the client - observe in the summary where Hash(salt, password) happens.

What other password authentication protocols should be mentioned in this context?


Comments {17}

Simon Tatham

from: simont
date: 6th Mar 2013 23:31 (UTC)

If a client doesn't care about successful password verification and just wants to DoS you, what's to stop them sending you a load of random binary strings and claiming they were generated by the "hard" hashing function? For this to work as DoS prevention, the server surely needs to be able to verify cheaply that the "hard" hash was done correctly.


Tony Finch

from: fanf
date: 6th Mar 2013 23:43 (UTC)

That's why the "easy" function is "easy". The final verification is the string comparison of the stored crypt with the result of easy(hard(password)). A rule of thumb might be that the login form processing shouldn't be more heavyweight than other dynamic pages accessible to unauthenticated users.


Gerald the cuddly duck

from: gerald_duck
date: 7th Mar 2013 01:07 (UTC)

Say "hard" is 1,000 times more CPU-intensive than "easy" (the ten binary orders of magnitude you suggested). But now say "easy" is 1,000 times more CPU-intensive than producing a blob of garbage.

If a DoS attacker chooses to send you garbage instead of "hard(password_guess_n)", they've just reversed the asymmetry.

Also, by my understanding, the message passed from client to server is a constant for a given (salt, password) pair, and the salt for a given password is a constant until it's changed. So an attacker can brute-force the space of client→server messages to end up with something exactly as valuable as a user password. Making that space large is potentially at odds with making verification cheap.

What your scheme is competing with is the simpler mechanism of giving the client a nonce and requiring that its login attempt be accompanied by the corresponding proof of work. It's of the essence that verifying an alleged proof of work is ridiculously cheap, so I'm not sure you can win by making it perform double duty.
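
For comparison, a rough sketch of that simpler nonce-plus-proof-of-work mechanism, assuming a hashcash-style partial-preimage puzzle over SHA-256 (the difficulty is illustrative):

    # The client must find a counter such that SHA-256(nonce || counter) has
    # `bits` leading zero bits; the server verifies with a single hash.
    import hashlib, itertools, os

    def make_challenge(bits: int = 20):
        return os.urandom(16), bits

    def solve(nonce: bytes, bits: int) -> int:
        target = 1 << (256 - bits)
        for counter in itertools.count():
            digest = hashlib.sha256(nonce + counter.to_bytes(8, "big")).digest()
            if int.from_bytes(digest, "big") < target:
                return counter   # expected cost: about 2**bits hashes

    def verify(nonce: bytes, bits: int, counter: int) -> bool:
        digest = hashlib.sha256(nonce + counter.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big") < (1 << (256 - bits))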


Tony Finch

from: fanf
date: 7th Mar 2013 01:36 (UTC)

The aim is to make the login form less of an attractive nuisance, not to eliminate DoS attacks altogether, hence my comparison with dynamic web page generation in the previous comment. But at the same time, I want to retain the overall work factor required for a cracker who has obtained a copy of the password file.

The space of client→server messages is going to be a couple of hundred bits, so it can't be brute-forced, unless I have missed something.

The reason for not just tacking a proof of work on the side of a cheap password hash is because then your password file becomes easily crackable relative to an expensive password hash. So you want an expensive password hash, but you want the client to do most of the work.


Gerald the cuddly duck

from: gerald_duck
date: 7th Mar 2013 13:47 (UTC)

You could always require proof of work and first-stage hash from the client?

The two-stage hash idea itself keeps making me uncomfortable. But then I keep remembering that transmitting a password in the clear also makes me uncomfortable, so that's no grounds for objection. (-8

Um… is this intended to happen over TLS or similar? If so, isn't that already a meaty target for DoS attacks? If not, would a miscreant not just capture hard(password) from the wire?


Tony Finch

from: fanf
date: 7th Mar 2013 16:50 (UTC)

Yes, adding a proof-of-work might help a bit, probably applied selectively as Dan suggested.

Yes, it has to be used over TLS, just like normal password authentication. (Honoured in the breach etc.) And regarding TLS's vulnerability to DoS, I refer the gentleman to the answer I gave earlier :-)


from: anonymous
date: 7th Mar 2013 20:14 (UTC)

The asymmetry analogy is a bit of a red-herring, I think. What's being tested is your possession of a scarce and expended resource which is assumed to be distributed somewhat fairly among clients.

It doesn't matter if you protect CPU with a memory-oriented problem or vice versa. If network were incredibly cheap and disk very expensive (which they aren't), it wouldn't really matter if you handed over a gigabyte of random numbers, remembered its hash, and then requested data to fulfill the hash at a later date to prove use of storage, even if what you were protecting were CPU resource or something else. [For example, in this mythical bandwidth-cheap world, to send mail you could require a mail sender to hang onto random data at least the size of the message to be sent (or some multiple) for some period of time. Overall the cost of that would be a constant factor on the number of retries. The idea of an asymmetry between (disk-storage x time) on the one hand and spam reputation on the other is way too apples-and-oranges for me to accept the asymmetry metaphor as being useful in general].

If you're protecting CPU resource, all you need to ensure a net win is that the reject branch of a proof-of-work system is shorter than executing the unadulterated code for all plausible sequences of steps. If you have a very heavyweight secure N-hash login process then there's a good chance that the reject branch is 1/Nth (or so) of the size of executing the login, so you've reduced your exposure. If you have lightweight logins, such that reject is longer than the password verify, then this technique doesn't help (though you're in trouble in other areas anyway), and you'd be daft to require proofs of work at this stage, as you suggest.

In login-less sessions, the best thing to do would be to issue the proof of work but not to require or verify the response until some substantial work is requested (eg request the client put it into a cookie in a web context). This would even improve the experience for real clients because the proof-of-work could happen in parallel with the initial (lightweight) steps.

We have a system at work which can cause an unverified web user to execute a lot of CPU time (by constructing a carefully crafted DNA sequence). We don't have any trouble at the moment (because it's low profile and requires a lot of bioinformatics knowledge to create an attack string) but if we ever did, I'd probably suggest we put in place just such a background proof-of-work system.


from: anonymous
date: 7th Mar 2013 20:15 (UTC)

Sorry, forgot to sign. That was me, for all anyone cares, :-).

Unless it's wrong, in which case it was my wife, and I resign my seat.

-- Dan.


Res facta quae tamen fingi potuit

from: pauamma
date: 6th Mar 2013 23:31 (UTC)

My first worry would be sidechannel attacks. (The more work you have the client perform, the more likely observable effects become.) But I'm not a cryptographer either.


Ewen McNeill

Client-side crypto-offload

from: edm
date: 7th Mar 2013 00:14 (UTC)

The most obvious thing that comes to my mind is that there is a class of functions (NP :-) ) that are believed to be difficult to calculate but easy to verify once you have a candidate answer. So possibly that can be leveraged in some way: the client does the NP work, and the server does the verification.

The second most obvious thing is that one could potentially do N rounds of crypto on the client and M rounds of crypto on the server, for the same algorithm (with N >> M). But you'd still have to make M big enough that, eg, offline attacks weren't too easy, so maybe it doesn't help that much.

My "10 seconds thought" idea is that factoring might be a reasonable choice, and might allow passing a salt/nonce to the client in a form that it had to work to recover before it could prepare the password for submission. It could potentially send back, eg, the recovered nonce (possibly in some obscured form) as proof of work sufficient to justify the server doing the remaining M rounds.

FWIW, I think if you can reduce password use to "get authentication token to use for quite a while" (eg, kerberos-style), and "unlock keychain" (ssh agent, etc) then it becomes much more feasible to have it take a "long" time (eg, a second or more), at least at the client. (My main other regular password use is unlocking my local terminal, and the local CPU _is_ the client CPU in your scenario, so that too could take a substantial fraction of a second of CPU time.)

Ewen


Tony Finch

Re: Client-side crypto-offload

from: fanf
date: 7th Mar 2013 00:23 (UTC)

Thanks for the comment!

Your N>>M idea is essentially the same as mine: my "hard" and "easy" might as well be the same function with different numbers of rounds. But I wonder why you think a small M makes offline attacks easy: the salt for the "easy" function isn't available to an offline attacker. Is there something I'm missing?

Regarding your final paragraph, I think Browser ID (aka Mozilla Persona) is close to the Right Thing.


Ewen McNeill

Re: Client-side crypto-offload

from: edm
date: 7th Mar 2013 00:46 (UTC)

I had in mind attacks like the MD5-continuation attacks where, given an intermediate value and some extra text you'd like to add in, you can just keep on running iterations of MD5 and the result will be valid. (See generally hash length extension attacks; they also apply to the SHA family.) That's probably more of a risk if the hard/easy functions are just iterations of the same algorithm, but it seems like an attack you'd want to keep in mind when designing any combination of algorithms. (The recommended fix for hash length extension is "use HMAC": it was designed to avoid that, by doing two layers of hashing.)

I've not looked at BrowserID in detail. But I've long believed in allowing a user-agent (on a device I trust) to manage some of my authentication, including the web browser (my web browsers are almost always programmed to remember the credentials for all my "low value" accounts, where "low value" means "I'm willing to deal with someone stealing these credentials"; it lets me just give them random characters for the password, so I suspect the overall security is higher). In this context it's probably important to remember that there isn't a "one size fits all" solution: I need much more protection for, eg, online banking than I need for some throwaway forum that makes me register to post anything.

Ewen


Tony Finch

Re: Client-side crypto-offload

from: fanf
date: 7th Mar 2013 00:55 (UTC)

Password hashes are not vulnerable to that kind of attack: for example PBKDF2 is based on (typically thousands of) iterations of HMAC.
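
To make that concrete, here is a minimal sketch of PBKDF2's first output block, assuming HMAC-SHA-256 as the PRF; every iteration is a complete HMAC computation, which is why length-extension tricks get no purchase:

    # PBKDF2 first block, per RFC 2898: F = U1 xor U2 xor ... xor Uc, where
    # U1 = HMAC(password, salt || INT(1)) and Ui = HMAC(password, U(i-1)).
    import hashlib, hmac

    def pbkdf2_block1(password: bytes, salt: bytes, iterations: int) -> bytes:
        u = hmac.new(password, salt + b"\x00\x00\x00\x01", hashlib.sha256).digest()
        out = bytearray(u)
        for _ in range(iterations - 1):
            u = hmac.new(password, u, hashlib.sha256).digest()
            out = bytearray(a ^ b for a, b in zip(out, u))
        return bytes(out)

    # sanity check against the standard library implementation
    assert pbkdf2_block1(b"pw", b"salt", 1000) == hashlib.pbkdf2_hmac(
        "sha256", b"pw", b"salt", 1000)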


octal

from: octal
date: 8th Mar 2013 15:21 (UTC)

This is an interesting idea; the simple solution for right now is to .

The other solution is using an HSM or some other form of hardening to protect the server-side storage (maybe implemented by pinning a core with TRESOR++ and AES-NI? and Intel TXT), with rules to rate-limit how often passwords can be matched inside the secure code.

There's something inherently disgusting about systems which burn CPU in a wasteful manner just for security which could be achieved in other ways.


from: anonymous
date: 8th Mar 2013 22:09 (UTC)

You might know this, actually, Tony. Someone has suggested that memory bandwidth is a more evenly distributed resource than CPU time, and so suggests requiring a certain number of memory accesses. The algorithms I've seen rely on repeated application of some function involving a massive LUT, generated by an expensive pseudo-random process (eg repeated hashing), and then require "random" lookups.

I got to thinking that effectively this is memoization, and was wondering if you could think of a good algorithm which benefits greatly from memoizing (in O() terms) versus raw un-memoed computation, and which could be applied easily to ensure the client performs a certain number of memory accesses. Is there an archetypal memoized algorithm which benefits from this? I can think of some in various applied domains, but they're inconvenient for tedious detail reasons and complex to specify.

-- Dan


Tony Finch

from: fanf
date: 13th Mar 2013 09:47 (UTC)

You're thinking of scrypt.

It's sort of the opposite of memoization since that aims to go faster by spending memory to save CPU, whereas scrypt aims to go slower by spending memory because that requires lots of silicon, and crypto algos are too easily optimised down to FPGAs etc.

The aim here is not proof of work, but to offload work from the server to the client while keeping the same level of protection for crypted passwords. I want to mitigate one of the disincentives for deploying good password security: the added load on the server.
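
For reference, scrypt's memory cost shows up directly in its parameters; a minimal illustration using Python's hashlib (requires an OpenSSL build with scrypt support; the costs here are only examples):

    import hashlib, os

    salt = os.urandom(16)
    # N=2**14, r=8, p=1 needs about 16 MiB of RAM per hash (128 * r * N bytes),
    # which is what makes massively parallel cracking hardware expensive.
    key = hashlib.scrypt(b"correct horse battery staple", salt=salt,
                         n=2**14, r=8, p=1, maxmem=64 * 1024 * 1024, dklen=32)
    print(key.hex())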


Tony Finch

Some comments from Twitter

from: fanf
date: 13th Mar 2013 10:04 (UTC)

@solardiz To support JavaScript-less visitors as well, we'd have to include fallback to server-side only hashing, so it's not anti-DoS

@dchest most web apps won't work without JS anyway

@solardiz Client-side pre-hashing reduces server load from legitimate traffic, and it makes passwords in transit less vulnerable

@solardiz Drawbacks include added complexity and the need/preference to provide a (preferably separate) salt to the client

@dchest yeah, I'm not concerned about DoS _for now_. Just as a way to increase cost.

@dchest as for salt, [username||sitename] is good enough.

@solardiz Alternatively, the username || service name can be used as the client-side pre-hashing salt, but there may be dupes

@dchest dups and precomputation, but it reduces complexity of implementation, so a reasonable tradeoff

@solardiz Sending the main/only server-side salt to client would expose hash comparison to remote timing attacks (char-by-char)

@solardiz If constant-time string comparison is used, or timing attacks are otherwise defeated, revealing the only salt is OK

@solardiz Precomputation of the client-side portion is possible with server-provided salts too. Yes, may be reasonable tradeoff.

@dchest ah, indeed

@solardiz Oh, and revealing the server salt is also bad in terms of allowing for precomputation of full hashes. Use H(salt) at least.

@dchest yep. Or, simpler -- use two salts

@solardiz I'm not sure which is simpler, but either way another concern is (not) allowing for username probing (fake salts for invalid)

@solardiz To see if we can increase cost, we need to know whether the limiting factor is the DoS potential or/and normal server load

@solardiz If it's the DoS potential, then we can't increase cost by client-side pre-hashing (we might by client-side proof-of-work)

@dchest yeah, my next thought are how to combine proof-of-work with hashing in a single thing

@solardiz Combining pre-hashing with proof-of-work is a useful direction, and you may bring it to #pwdhc (submission or otherwise)

@solardiz You start with a "tens of milliseconds" assumption for server-side hashing, but it's not universally accepted/recommended

@solardiz Throughput of ~1ms means latency ~24ms on 24 logical CPU machine (e.g. 2xE5645), but for DoS risk throughput matters most

@solardiz And yes, scrypt is not great at these low durations (it uses too little memory then). We'll try to address this in #pwdhc.


