In cryptography, collision resistance is a property of cryptographic hash functions: a hash function H is collision-resistant if it is hard to find two inputs that hash to the same output; that is, two inputs a and b where a ≠ b but H(a) = H(b).[1] The pigeonhole principle means that any hash function with more inputs than outputs will necessarily have such collisions; the harder they are to find, the more cryptographically secure the hash function is.
The "birthday paradox" places an upper bound on collision resistance: if a hash function produces N bits of output, an attacker who computes only 2N/2 (or
\scriptstyle\sqrt{2N}
Cryptographic hash functions are usually designed to be collision resistant. However, many hash functions that were once thought to be collision resistant were later broken. MD5 and SHA-1 in particular both have published techniques more efficient than brute force for finding collisions.[3] [4] However, some hash functions have a proof that finding collisions is at least as difficult as some hard mathematical problem (such as integer factorization or discrete logarithm). Those functions are called provably secure.
A family of functions generated by some algorithm G is a family of collision-resistant hash functions, if |m(k)| > |l(k)| for any k, i.e., hk compresses the input string, and every hk can be computed within polynomial time given k, but for any probabilistic polynomial algorithm A, we have
Pr [''k'' ← ''G''(1<sup>''n''</sup>), (''x''<sub>1</sub>, ''x''<sub>2</sub>) ← ''A''(''k'', 1<sup>''n''</sup>) s.t. ''x''<sub>1</sub> ≠ ''x''<sub>2</sub> but ''h''<sub>''k''</sub>(''x''<sub>1</sub>) = ''h''<sub>''k''</sub>(''x''<sub>2</sub>)] < negl(n),
where negl(·) denotes some negligible function, and n is the security parameter.[5]
There are two different types of collision resistance.
A hash function has weak collision resistance when, given a hashing function H and an x, no other x' can be found such that H(x)=H(x'). In words, when given an x, it is not possible to find another x' such that the hashing function would create a collision.
A hash function has strong collision resistance when, given a hashing function H, no arbitrary x and x' can be found where H(x)=H(x'). In words, no two x's can be found where the hashing function would create a collision.
Collision resistance is desirable for several reasons.