PASSWORDS IN A DATABASE – Data Architecture

WHAT IS THE RIGHT WAY TO STORE A PASSWORD IN A DATABASE

This blog post is for those who wish to store passwords in a database. Those willing to take a few minutes to understand the proper way will be pleasantly surprised that it’s really just as quick and easy as doing it the wrong way. The difference is that you’re here, spending a few minutes understanding the right way.

First, the wrong way is to store the password in clear-text. If you can run SELECT * FROM table_name; and see your users’ passwords, you’ve discovered the wrong way. The same goes for trying to scramble them with some ad-hoc code.

BACKGROUND/RATIONALE
Many people use the same username/password for all their web activity. If they create an account on your web site, there is a chance that you (as the web site administrator) could use the username/password they chose for your site to log in to their gmail.com, amazon.com, microsoft.com, twitter.com, facebook.com, etc., accounts.

By the same token, you want to protect your users’ personally identifiable information from hackers who may gain access to your site.

DISCUSSION
1) Hash the password
Hash the password. All will be good, right? (‘Hash’ is kind of like encrypting, see below.)

One benefit of hashing is that you can store the hash in, say, a CHAR(64) column. With SHA-256, a 1-character password hashes to a 64-character hexadecimal string, and so does a 128-character password. It does not matter. There should be no need at all for a maximum password length, because any password hashes to the same 64 characters. That’s also the reason for CHAR and not VARCHAR: there is no varying, it’s always 64 characters.
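
The fixed length can be checked with a minimal sketch. Python and its standard hashlib module are assumptions here (the post does not prescribe a language), and sha256_hex is a hypothetical helper name.

```python
import hashlib

def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of the password as a hexadecimal string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

short_hash = sha256_hex("a")        # 1-character password
long_hash = sha256_hex("x" * 128)   # 128-character password

# Both digests have the same length, so a CHAR(64) column fits either one.
print(len(short_hash), len(long_hash))  # 64 64
```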

Hashing is an important step, but a given password always hashes to exactly one hash-value.
Because of that, someone could build a table of common passwords and their corresponding hash-values, then check whether a stolen hash-value is listed in the table. If it is, they have your user’s password in clear-text. Such a precomputed table is commonly called a ‘rainbow table’.
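
To see why an unsalted hash is weak, here is a small sketch of the lookup idea in Python (an assumed language). The password list and the names COMMON_PASSWORDS, LOOKUP and crack are hypothetical, and a real rainbow table uses a more compact structure than this plain dictionary.

```python
import hashlib

def sha256_hex(password: str) -> str:
    """Return the SHA-256 digest of the password as a hexadecimal string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Hypothetical list of common passwords; an attacker would use far more.
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]

# Precompute hash-value -> clear-text once, then reuse it for every stolen hash.
LOOKUP = {sha256_hex(p): p for p in COMMON_PASSWORDS}

def crack(stolen_hash: str):
    """Return the clear-text password if its hash is in the table, else None."""
    return LOOKUP.get(stolen_hash)

stolen = sha256_hex("qwerty")   # what an unsalted database row would hold
print(crack(stolen))            # prints qwerty
```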

2) Salt
Adding salt involves hashing again, after hashing the password, with something specific to the user such as the username (i.e., hashing the first hash concatenated with the username). This makes the hash-value different for two users with the same password. The result can be called a ‘salted hash’ and is an effective protection against a rainbow table. But if someone steals a database backup, they obtain the salt along with the salted hash.

If they wanted to create a ‘rainbow table’, they would need one for each user. For example, they could build a rainbow table for ‘john’. John’s rainbow table would hash all candidate passwords with john’s salt, and would be valid only for john. Better still, rather than using ‘john’ as the salt, we generate a random number as the salt for each user as they are added to the database.

Because hashing algorithms are standardized (e.g., SHA256), hashing just a password (without salt) means each password always maps to the same hash. Building a rainbow table of, say, the passwords discovered in breaches over the last 10 years might be feasible with large computing power and months of time. It would contain many billions of entries. With salt, there would have to be one such rainbow table for each salt value. Hence, rainbow tables become impractical.
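
A minimal sketch of the salted hash, assuming Python and a random 32-byte salt per user. The helper names are hypothetical, and concatenating the raw digest bytes with the salt is one possible reading of ‘first hash concatenated with the salt’.

```python
import hashlib
import secrets

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def salted_hash(password: str, salt: bytes) -> str:
    """Hash the password, then hash that digest concatenated with the salt."""
    first = sha256(password.encode("utf-8"))
    return sha256(first + salt).hex()

# One random salt per user, generated when the user row is created.
salt_john = secrets.token_bytes(32)
salt_mary = secrets.token_bytes(32)

# Same password, different salts, different stored hash-values.
hash_john = salted_hash("correct horse", salt_john)
hash_mary = salted_hash("correct horse", salt_mary)
print(hash_john == hash_mary)  # False, because the salts differ
```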

3) Pepper
Pepper is like salt. It’s a third hash, with something else appended to the second hash. The difference is that the pepper is not stored in the database. It’s something unique to the database server, the application server, or the company. For example, maybe you hash again with your company’s phone number after hashing the password and then hashing with the salt.

With that, a simple attack like SQL injection, or stealing a database backup, will not be enough (the attacker won’t have the pepper). Your users’ passwords will be kept out of the hands of the bad guy.

FURTHER EXPLANATION
The above is a description of the rationale for hashing a password and mixing in salt and pepper. It’s all about making it hard and compute-intensive to reverse the process.

Quick definition: a ‘hash’ is a one-way cryptographic function. It’s not encryption. A hash function quickly converts a clear-text string to a hash-value (commonly referred to as a ‘hash’, just like the type of function). It is then nearly impossible to reverse the process, i.e., to go from the hash-value back to the clear-text string. Hashing implies one-way, or not reversible. Encryption is two-way: you can generate something unreadable, but with the key, that unreadable text can be reversed to the original clear-text string.

Because the goal is to make reversing the process slow (see Update below), rather than
hash(password+salt+pepper) // one call to hash
we opt for
hash(hash(hash(password)+salt)+pepper) // three calls to hash, three times slower
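
The three-call form could look like this in Python, as a sketch only. SHA-256 from the standard hashlib module is used, ‘+’ is read as byte concatenation of the raw digests, and the names sha256 and hash_password are hypothetical.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def hash_password(password: str, salt: bytes, pepper: bytes) -> str:
    """Compute hash(hash(hash(password)+salt)+pepper) as a 64-character hex string."""
    step1 = sha256(password.encode("utf-8"))  # hash the password
    step2 = sha256(step1 + salt)              # hash again with the salt
    step3 = sha256(step2 + pepper)            # hash again with the pepper
    return step3.hex()
```

The returned string is what would be stored in the CHAR(64) column.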

Use a modern hash for the above algorithm, something like SHA256. Many old hashes have been deprecated and should not be used. Note that many programming languages have libraries containing hash algorithms, e.g., Java: MessageDigest md = MessageDigest.getInstance("SHA-256");

Salt & pepper should each be a 32-byte random value. They need to be large enough that guessing them is impractical, e.g., discovering the pepper when the password and salt are known.

Salt should be a random value stored in the database in the same row as the password’s hash-value, and should be generated/created when the user row is created.
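
As a rough sketch of keeping the salt in the same row as the hash-value, the following uses Python with an in-memory SQLite database. The table layout, the column names and the create_user helper are assumptions for illustration, not a schema the post prescribes.

```python
import hashlib
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " username TEXT PRIMARY KEY,"
    " salt BLOB NOT NULL,"
    " password_hash CHAR(64) NOT NULL)"
)

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def create_user(username: str, password: str, pepper: bytes) -> None:
    """Generate the salt when the row is created and store it beside the hash."""
    salt = secrets.token_bytes(32)
    digest = sha256(sha256(sha256(password.encode("utf-8")) + salt) + pepper).hex()
    conn.execute(
        "INSERT INTO users (username, salt, password_hash) VALUES (?, ?, ?)",
        (username, salt, digest),
    )
```

Only the salt and the hash-value reach the database; the clear-text password and the pepper do not.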

Pepper should be a random value stored outside the database, preferably on the application server. It should be created at the time the web site is created. Pepper can be hard-coded, although a secure config parameter is preferred. Because every login depends on that exact pepper, it should be backed up by a special process.
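
One way to read the pepper from configuration rather than hard-coding it is sketched below in Python. The setting name APP_PEPPER, its hex encoding and the load_pepper helper are all hypothetical.

```python
import os
from typing import Mapping

def load_pepper(env: Mapping[str, str] = os.environ) -> bytes:
    """Read a hex-encoded pepper from the (hypothetical) APP_PEPPER setting."""
    value = env.get("APP_PEPPER")
    if not value:
        # Without the pepper no stored hash can be verified, so fail early.
        raise RuntimeError("APP_PEPPER is not set")
    pepper = bytes.fromhex(value)
    if len(pepper) < 32:
        raise RuntimeError("APP_PEPPER must be at least 32 bytes")
    return pepper
```

A suitable value could be generated once, when the site is set up, with secrets.token_hex(32) from the Python standard library.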

Ensure the password is transmitted securely from the user to the app server using transport-level encryption, i.e., up-to-date TLS (v1.2 or greater at the time this blog was created).

Password validation is accomplished by hashing the password the user entered, using the same salt and pepper, and comparing the resulting hash-value to the hash-value stored in the database.
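
Validation might be sketched as follows in Python. The function names are hypothetical, and the constant-time comparison with hmac.compare_digest is an addition the post does not mention, included as a common precaution.

```python
import hashlib
import hmac

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def hash_password(password: str, salt: bytes, pepper: bytes) -> str:
    """Compute hash(hash(hash(password)+salt)+pepper) as a hex string."""
    return sha256(sha256(sha256(password.encode("utf-8")) + salt) + pepper).hex()

def verify_password(entered: str, salt: bytes, pepper: bytes, stored_hash: str) -> bool:
    """Re-hash the entered password and compare it with the stored hash-value."""
    candidate = hash_password(entered, salt, pepper)
    # compare_digest avoids leaking, through timing, how many characters matched.
    return hmac.compare_digest(candidate, stored_hash)
```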

EXTRA SECURITY
Consider hashing like this to slow the process down and make generating rainbow tables more time consuming…
i = 500 // or some value stored with the user row
h = password
for (j = 1; j <= i; j++) // many repetitions of hash
{
h = hash(hash(hash(h)+salt)+pepper) // feed each result into the next round
}
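
In Python the repeated hashing could be sketched like this. Feeding each round’s output into the next round is an assumption about the intent of the loop, and stretched_hash is a hypothetical name.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()

def stretched_hash(password: str, salt: bytes, pepper: bytes, rounds: int = 500) -> str:
    """Apply hash(hash(hash(x)+salt)+pepper) repeatedly, starting from the password."""
    value = password.encode("utf-8")
    for _ in range(rounds):
        # Each round consumes the previous round's output.
        value = sha256(sha256(sha256(value) + salt) + pepper)
    return value.hex()
```

The rounds value would need to be stored (for example with the user row) so that validation repeats the same number of rounds.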

Choose salt and pepper as 64-byte values, and choose a large hash, something like SHA512.

[Updated 18-Oct-2017]
The ‘Extra Security’ may not scale in huge environments. In environments with just thousands of users, it’ll be fine. For huge environments, consider adding the complexity of a keyed-hash message authentication code (HMAC) rather than salt+pepper. The HMAC key serves the same function as the pepper. There is no need for the ‘Extra Security’ looping, and no need to slow the process. Because HMAC uses a key, it may bring additional complexities; for example, it may require a Hardware Security Module (HSM) and isolating the use of the key from the application (e.g., via a web service).
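
A minimal HMAC sketch using Python’s standard hmac module is shown here. Keeping a per-user salt in the message is an assumption (so that identical passwords still give different values), and in the HSM arrangement described above the key would not be passed into application code like this.

```python
import hashlib
import hmac

def hmac_password(password: str, salt: bytes, key: bytes) -> str:
    """Return HMAC-SHA256 over salt+password, keyed with the secret key."""
    message = salt + password.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_hmac_password(entered: str, salt: bytes, key: bytes, stored: str) -> bool:
    """Recompute the HMAC for the entered password and compare in constant time."""
    return hmac.compare_digest(hmac_password(entered, salt, key), stored)
```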

The above suggests the use of the SHA256 hashing function. Also consider PBKDF2 or scrypt, which are designed for password hashing and have a configurable work factor.
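
For PBKDF2, Python’s standard library offers hashlib.pbkdf2_hmac. The sketch below assumes SHA-256 as the underlying hash; the default iteration count is an illustrative figure that should be tuned to your hardware and current guidance, and pbkdf2_hash is a hypothetical name.

```python
import hashlib

def pbkdf2_hash(password: str, salt: bytes, iterations: int = 600_000) -> str:
    """Derive a 32-byte value with PBKDF2-HMAC-SHA256 and return it as hex."""
    derived = hashlib.pbkdf2_hmac(
        "sha256",                   # underlying hash function
        password.encode("utf-8"),   # the clear-text password as bytes
        salt,                       # per-user random salt
        iterations,                 # work factor: more iterations, slower guessing
    )
    return derived.hex()
```

The iteration count has to be stored alongside the salt so that validation can repeat the same derivation.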

BONUS

Maybe passwords are a thing of the past

Keywords: password storage, hash, salt, pepper

Troy Frericks.

blog 15-Aug-2016
