Key derivation function explained

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function (which typically uses a cryptographic hash function or block cipher).[1] [2] KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.[3]
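As an illustration of converting a key-exchange output into a fixed-length symmetric key, the following is a minimal sketch of the extract-then-expand construction of HKDF (RFC 5869), written with Python's standard library only. The helper name `derive_aes256_key`, the `info` label, and the 32-byte output length are assumptions chosen for the example; a real protocol specifies its own values, and a vetted library implementation is usually preferable.

```python
import hashlib
import hmac


def hkdf_extract(salt: bytes, ikm: bytes, hash_name: str = "sha256") -> bytes:
    """Concentrate the input keying material into a fixed-size pseudorandom key."""
    if not salt:
        # RFC 5869: a missing salt is replaced by HashLen zero bytes.
        salt = bytes(hashlib.new(hash_name).digest_size)
    return hmac.new(salt, ikm, hash_name).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Expand the pseudorandom key into `length` bytes of output keying material."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i), with T(0) empty.
        block = hmac.new(prk, block + info + bytes([counter]), hash_name).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_aes256_key(shared_secret: bytes, salt: bytes = b"",
                      info: bytes = b"example aes-256 key") -> bytes:
    """Hypothetical helper: map a Diffie-Hellman shared secret to a 32-byte key."""
    prk = hkdf_extract(salt, shared_secret)
    return hkdf_expand(prk, info, 32)
```

Changing the `info` label yields an independent key from the same shared secret, which is how one exchange can supply several keys (for example, one per direction).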

History

The first deliberately slow (key stretching) password-based key derivation function was called "crypt" (or "crypt(3)" after its man page), and was invented by Robert Morris in 1978. It would encrypt a constant (zero), using the first 8 characters of the user's password as the key, by performing 25 iterations of a modified DES encryption algorithm (in which a 12-bit number read from the real-time computer clock is used to perturb the calculations). The resulting 64-bit number is encoded as 11 printable characters and then stored in the Unix password file.[4] While it was a great advance at the time, increases in processor speeds since the PDP-11 era have made brute-force attacks against crypt feasible, and advances in storage have rendered the 12-bit salt inadequate. The crypt function's design also limits the user password to 8 characters, which limits the keyspace and makes strong passphrases impossible.

Although high throughput is a desirable property in general-purpose hash functions, the opposite is true in password security applications, in which defending against brute-force cracking is a primary concern. The growing use of massively parallel hardware such as GPUs, FPGAs, and even ASICs for brute-force cracking has made the selection of a suitable algorithm even more critical: a good algorithm should not only impose a certain computational cost on CPUs, but also resist the cost/performance advantages that modern massively parallel platforms have for such tasks. Various algorithms have been designed specifically for this purpose, including bcrypt, scrypt and, more recently, Lyra2 and Argon2 (the latter being the winner of the Password Hashing Competition). The large-scale Ashley Madison data breach, in which roughly 36 million password hashes were stolen by attackers, illustrated the importance of algorithm selection in securing passwords. Although bcrypt was employed to protect the hashes (making large-scale brute-force cracking expensive and time-consuming), a significant portion of the accounts in the compromised data also contained a password hash based on the fast general-purpose MD5 algorithm, which made it possible for over 11 million of the passwords to be cracked in a matter of weeks.[5]

In June 2017, the U.S. National Institute of Standards and Technology (NIST) issued a new revision of its digital authentication guidelines, NIST SP 800-63B-3, stating that: "Verifiers SHALL store memorized secrets [i.e. passwords] in a form that is resistant to offline attacks. Memorized secrets SHALL be salted and hashed using a suitable one-way key derivation function. Key derivation functions take a password, a salt, and a cost factor as inputs then generate a password hash. Their purpose is to make each password guessing trial by an attacker who has obtained a password hash file expensive and therefore the cost of a guessing attack high or prohibitive."

Modern password-based key derivation functions, such as PBKDF2, are based on a recognized cryptographic hash, such as SHA-2, and use a longer salt (at least 64 bits, chosen randomly) and a high iteration count. NIST recommends a minimum iteration count of 10,000.[6] "For especially critical keys, or for very powerful systems or systems where user-perceived performance is not critical, an iteration count of 10,000,000 may be appropriate."[7]
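The parameters described above (hash function, random salt, iteration count) map directly onto `hashlib.pbkdf2_hmac` in Python's standard library. The sketch below is illustrative only: the function name `derive_key`, the 16-byte salt, and the 32-byte output are assumptions for the example, and the default iteration count is simply the minimum quoted in the text, not a tuned value.

```python
import hashlib
import secrets


def derive_key(password: str, salt: bytes, iterations: int = 10_000,
               length: int = 32) -> bytes:
    """Derive `length` bytes from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256",                    # underlying hash (a member of SHA-2)
        password.encode("utf-8"),    # the secret input
        salt,                        # random, non-secret, unique per password
        iterations,                  # cost factor
        dklen=length,
    )


# 16 random bytes = 128 bits, above the 64-bit minimum mentioned above.
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)
```

The salt and iteration count have to be kept alongside whatever the key protects, since the same values are needed to re-derive the key later.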

Key derivation

The original use for a KDF is key derivation, the generation of keys from secret passwords or passphrases. Variations on this theme include:

Key stretching and key strengthening

See main article: Key stretching. Key derivation functions are also used in applications to derive keys from secret passwords or passphrases, which typically do not have the desired properties to be used directly as cryptographic keys. In such applications, it is generally recommended that the key derivation function be made deliberately slow so as to frustrate brute-force or dictionary attacks on the password or passphrase input value.

Such use may be expressed as DK = KDF(key, salt, iterations), where DK is the derived key, KDF is the key derivation function, key is the original key or password, salt is a random number which acts as cryptographic salt, and iterations refers to the number of iterations of a sub-function. The derived key is used instead of the original key or password as the key to the system. The values of the salt and the number of iterations (if it is not fixed) are stored with the hashed password or sent as cleartext (unencrypted) with an encrypted message.[9]
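The relationship between the derived key, the original key, the salt, and the iteration count can be illustrated with a toy stretching loop. This is a deliberately simplified sketch, assuming SHA-256 as the iterated sub-function; the function name `stretch` is hypothetical, and the loop is not a standardized KDF and should not replace PBKDF2, scrypt, or Argon2.

```python
import hashlib


def stretch(key: bytes, salt: bytes, iterations: int) -> bytes:
    """Toy key stretching: feed the running state back through a hash."""
    if iterations < 1:
        raise ValueError("iterations must be at least 1")
    state = b""
    for _ in range(iterations):
        # Each round depends on the previous one, so the rounds cannot be skipped.
        state = hashlib.sha256(state + key + salt).digest()
    return state


derived = stretch(b"hunter2", b"\x8f\x1c\x02\xaa", iterations=50_000)
```

An attacker testing a guess has to run the same number of rounds as the legitimate user, which is the source of the slowdown.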

The difficulty of a brute-force attack increases with the number of iterations. A practical limit on the iteration count is the unwillingness of users to tolerate a perceptible delay in logging into a computer or seeing a decrypted message. The use of salt prevents attackers from precomputing a dictionary of derived keys.
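One way to balance attack cost against user-perceived delay is to time the KDF on the target machine and scale the iteration count to a delay budget. The sketch below assumes PBKDF2-HMAC-SHA256 and a 100 ms budget; the function name `calibrate_iterations`, the probe size, and the floor of 10,000 iterations are illustrative choices, and the result varies between machines and runs.

```python
import hashlib
import time


def calibrate_iterations(target_seconds: float = 0.1,
                         probe_iterations: int = 20_000,
                         minimum: int = 10_000) -> int:
    """Estimate how many PBKDF2 iterations fit into `target_seconds`."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"benchmark password", b"benchmark salt",
                        probe_iterations)
    # Guard against a zero reading on a coarse timer.
    elapsed = max(time.perf_counter() - start, 1e-9)
    # PBKDF2 cost is roughly linear in the iteration count, so scale the probe.
    estimate = int(probe_iterations * target_seconds / elapsed)
    return max(minimum, estimate)


iterations = calibrate_iterations()
```

Because the chosen count is stored with each derived key, it can be raised later for new passwords without invalidating existing records.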

An alternative approach, called key strengthening, extends the key with a random salt, but then (unlike in key stretching) securely deletes the salt.[10] This forces both the attacker and legitimate users to perform a brute-force search for the salt value.[11] Although the paper that introduced key stretching[12] referred to this earlier technique and intentionally chose a different name, the term "key strengthening" is now often (arguably incorrectly) used to refer to key stretching.
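The difference from key stretching can be sketched as follows: a small random salt is mixed into the hash and then discarded, so verification has to try every possible salt value. The 12-bit salt size, the use of SHA-256, and the function names are assumptions made for the illustration, and Python cannot guarantee that the salt is securely erased from memory.

```python
import hashlib
import hmac
import secrets

SALT_BITS = 12  # assumption: 4096 candidates, cheap enough to search on each login


def _digest(password: str, salt_value: int) -> bytes:
    salt = salt_value.to_bytes(2, "big")
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def strengthen(password: str) -> bytes:
    """Hash with a random salt and return only the hash; the salt is not kept."""
    salt_value = secrets.randbelow(1 << SALT_BITS)
    return _digest(password, salt_value)


def verify(password: str, stored: bytes) -> bool:
    """Search the whole salt space, as a legitimate verifier must."""
    for candidate in range(1 << SALT_BITS):
        if hmac.compare_digest(_digest(password, candidate), stored):
            return True
    return False
```

A correct password costs the verifier up to 4096 hash evaluations here, and every wrong guess by an attacker costs the full 4096.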

Password hashing

Despite their original use for key derivation, KDFs are possibly better known for their use in password hashing (password verification by hash comparison), as used by the passwd file or shadow password file. Password hash functions should be relatively expensive to calculate in case of brute-force attacks, and the key stretching of KDFs happens to provide this characteristic. The non-secret parameters are called "salt" in this context.
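Password verification by hash comparison can be sketched with `hashlib.scrypt` from Python's standard library (available when Python is built against OpenSSL 1.1 or later). The `$`-separated record layout, the function names, and the cost parameters below are assumptions for the example, not the format used by any particular passwd or shadow implementation.

```python
import hashlib
import hmac
import secrets

# Illustrative scrypt cost parameters; real deployments should tune these.
N, R, P = 2**14, 8, 1
MAXMEM = 64 * 1024 * 1024  # memory ceiling passed to OpenSSL, in bytes


def hash_password(password: str) -> str:
    """Return a record holding the non-secret parameters next to the hash."""
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=N, r=R, p=P, maxmem=MAXMEM, dklen=32)
    return "$".join(["scrypt", str(N), str(R), str(P), salt.hex(), digest.hex()])


def verify_password(password: str, record: str) -> bool:
    """Recompute the hash from the stored parameters and compare."""
    name, n, r, p, salt_hex, digest_hex = record.split("$")
    if name != "scrypt":
        return False
    expected = bytes.fromhex(digest_hex)
    candidate = hashlib.scrypt(password.encode("utf-8"),
                               salt=bytes.fromhex(salt_hex),
                               n=int(n), r=int(r), p=int(p),
                               maxmem=MAXMEM, dklen=len(expected))
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)
```

Only the record is stored; the password itself is never written down, and two users with the same password get different records because each salt is random.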

In 2013 a Password Hashing Competition was announced to choose a new, standard algorithm for password hashing. On 20 July 2015 the competition ended and Argon2 was announced as the final winner. Four other algorithms received special recognition: Catena, Lyra2, Makwa and yescrypt.[13]

As of May 2023, OWASP recommends the following KDFs for password hashing, listed in order of priority:[14]

  1. Argon2id
  2. scrypt if Argon2id is unavailable
  3. bcrypt for legacy systems
  4. PBKDF2 if FIPS-140 compliance is required

Notes and References

  1. Book: Bezzi, Michele, et al. Data privacy. Camenisch, Jan, et al. Privacy and Identity Management for Life. Springer. 2011. 9783642203176. 185–186. https://books.google.com/books?id=vYxzh3C6OPUC&pg=PA185.
  2. Web site: Chen, Lily. NIST SP 800-108: Recommendation for Key Derivation Using Pseudorandom Functions. NIST. October 2009.
  3. Book: Zdziarski, Jonathan. Hacking and Securing IOS Applications: Stealing Data, Hijacking Software, and How to Prevent It. O'Reilly Media. 2012. 9781449318741. 252–253.
  4. Web site: Morris, Robert. Thompson, Ken. Password Security: A Case History. Bell Laboratories. 1978-04-03. Archived 2003-03-22 at https://web.archive.org/web/20030322053727/http://cm.bell-labs.com/cm/cs/who/dmr/passwd.ps (original link dead). Retrieved 2011-05-09.
  5. Web site: Once seen as bulletproof, 11 million+ Ashley Madison passwords already cracked. Ars Technica. Goodin. Dan. 10 September 2015. 10 September 2015.
  6. Book: Grassi, Paul A. SP 800-63B-3 – Digital Identity Guidelines, Authentication and Lifecycle Management. NIST. June 2017. 10.6028/NIST.SP.800-63b.
  7. Book: Meltem Sönmez Turan, Elaine Barker, William Burr, and Lily Chen. SP 800-132 – Recommendation for Password-Based Key Derivation, Part 1: Storage Applications. NIST. December 2010. 10.6028/NIST.SP.800-132. 56801929.
  8. Krawczyk, Hugo. Eronen, Pasi. The 'info' Input to HKDF. datatracker.ietf.org. RFC 5869. May 2010.
  9. Web site: Salted Password Hashing – Doing it Right. CrackStation.net. 29 January 2015.
  10. Abadi, Martín, T. Mark A. Lomas, and Roger Needham. "Strengthening passwords." Digital System Research Center, Tech. Rep. 33 (1997).
  11. U. Manber, "A Simple Scheme to Make Passwords Based on One-Way Functions Much Harder to Crack," Computers & Security, v.15, n.2, 1996, pp.171–176.
  12. http://www.schneier.com/paper-low-entropy.html Secure Applications of Low-Entropy Keys
  13. https://password-hashing.net/ "Password Hashing Competition"
  14. Web site: Password Storage Cheat Sheet . OWASP Cheat Sheet Series . OWASP . 2023-05-17.