In cryptography, a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output.[1] The transformation is "one-way", meaning that it is difficult given a particular output to compute inputs which compress to that output. One-way compression functions are not related to conventional data compression algorithms, which instead can be inverted exactly (lossless compression) or approximately (lossy compression) to the original data.
One-way compression functions are for instance used in the Merkle–Damgård construction inside cryptographic hash functions.
One-way compression functions are often built from block ciphers.Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2/Meyer–Schilling, MDC-4, Hirose (double-block-length compression functions). These methods are described in detail further down. (MDC-2 is also the name of a hash function patented by IBM.)
Another method is 2BOW (or NBOW in general), which is a "high-rate multi-block-length hash function based on block ciphers" and typically achieves (asymptotic) rates between 1 and 2 independent of the hash size (only with small constant overhead). This method has not yet seen any serious security analysis, so should be handled with care.
A compression function mixes two fixed length inputs and produces a single fixed length output of the same size as one of the inputs. This can also be seen as that the compression function transforms one large fixed-length input into a shorter, fixed-length output.
For instance, input A might be 128 bits, input B 128 bits and they are compressed together to a single output of 128 bits. This is equivalent to having a single 256-bit input compressed to a single output of 128 bits.
Some compression functions do not compress by half, but instead by some other factor. For example, input A might be 256 bits, and input B 128 bits, which are compressed to a single output of 128 bits. That is, a total of 384 input bits are compressed together to 128 output bits.
The mixing is done in such a way that full avalanche effect is achieved. That is, every output bit depends on every input bit.
See main article: article and one-way function.
A one-way function is a function that is easy to compute but hard to invert. A one-way compression function (also called hash function) should have the following properties:
h
m
\operatorname{hash}(m)=h
m1
h
m2
h
\operatorname{hash}(m1)=\operatorname{hash}(m2)
m1\nem2
\operatorname{hash}(m1)=\operatorname{hash}(m2)
2n/2
n
2n/2
Ideally one would like the "infeasibility" in preimage-resistance and second preimage-resistance to mean a work of about
2n
n
See main article: article and Merkle–Damgård construction.
A common use of one-way compression functions is in the Merkle–Damgård construction inside cryptographic hash functions. Most widely used hash functions, including MD5, SHA-1 (which is deprecated[2]) and SHA-2 use this construction.
A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using a one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher. The last block processed should also be length padded, which is crucial to the security of this construction.
When length padding (also called MD-strengthening) is applied, attacks cannot find collisions faster than the birthday paradox (
2n/2
n
f
A second preimage attack (given a message
m1
m2
\operatorname{hash}(m1)=\operatorname{hash}(m2)
2k
k x 2n/2+1+2n-k+1
23n/4+2
k=2n/4
2n
One-way compression functions are often built from block ciphers.
Block ciphers take (like one-way compression functions) two fixed size inputs (the key and the plaintext) and return one single output (the ciphertext) which is the same size as the input plaintext.
However, modern block ciphers are only partially one-way. That is, given a plaintext and a ciphertext it is infeasible to find a key that encrypts the plaintext to the ciphertext. But, given a ciphertext and a key a matching plaintext can be found simply by using the block cipher's decryption function. Thus, to turn a block cipher into a one-way compression function some extra operations have to be added.
Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2, MDC-4, Hirose (double-block-length compressions functions).
Single-block-length compression functions output the same number of bits as processed by the underlying block cipher. Consequently, double-block-length compression functions output twice the number of bits.
If a block cipher has a block size of say 128 bits single-block-length methods create a hash function that has the block size of 128 bits and produces a hash of 128 bits. Double-block-length methods make hashes with double the hash size compared to the block size of the block cipher used. So a 128-bit block cipher can be turned into a 256-bit hash function.
These methods are then used inside the Merkle–Damgård construction to build the actual hash function. These methods are described in detail further down.
Using a block cipher to build the one-way compression function for a hash function is usually somewhat slower than using a specially designed one-way compression function in the hash function. This is because all known secure constructions do the key scheduling for each block of the message. Black, Cochran and Shrimpton have shown that it is impossible to construct a one-way compression function that makes only one call to a block cipher with a fixed key.[6] In practice reasonable speeds are achieved provided the key scheduling of the selected block cipher is not a too heavy operation.
But, in some cases it is easier because a single implementation of a block cipher can be used for both a block cipher and a hash function. It can also save code space in very tiny embedded systems like for instance smart cards or nodes in cars or other machines.
Therefore, the hash-rate or rate gives a glimpse of the efficiency of a hash function based on a certain compression function. The rate of an iterated hash function outlines the ratio between the number of block cipher operations and the output. More precisely, the rate represents the ratio between the number of processed bits of input
m
n
s
n
R | ||||
|
The hash function can only be considered secure if at least the following conditions are met:
The constructions presented below: Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel and Hirose have been shown to be secure under the black-box analysis.[7] [8] The goal is to show that any attack that can be found is at most as efficient as the birthday attack under certain assumptions. The black-box model assumes that a block cipher is used that is randomly chosen from a set containing all appropriate block ciphers. In this model an attacker may freely encrypt and decrypt any blocks, but does not have access to an implementation of the block cipher. The encryption and decryption function are represented by oracles that receive a pair of either a plaintext and a key or a ciphertext and a key. The oracles then respond with a randomly chosen plaintext or ciphertext, if the pair was asked for the first time. They both share a table for these triplets, a pair from the query and corresponding response, and return the record, if a query was received for the second time. For the proof there is a collision finding algorithm that makes randomly chosen queries to the oracles. The algorithm returns 1, if two responses result in a collision involving the hash function that is built from a compression function applying this block cipher (0 else). The probability that the algorithm returns 1 is dependent on the number of queries which determine the security level.
The Davies–Meyer single-block-length compression function feeds each block of the message (
mi
Hi-1
Hi-1
Hi
H0
In mathematical notation Davies–Meyer can be described as:
Hi=
E | |
mi |
{(Hi-1)} ⊕ {Hi-1
The scheme has the rate (k is the keysize):
RDM=
k | |
1 ⋅ n |
=
k | |
n |
If the block cipher uses for instance 256-bit keys then each message block (
mi
Variations of this method replace XOR with any other group operation, such as addition on 32-bit unsigned integers.
A notable property of the Davies–Meyer construction is that even if the underlying block cipher is totally secure, it is possible to compute fixed points for the construction: for any
m
h
Em(h) ⊕ h=h
h=
-1 | |
E | |
m |
(0)
m1
m2
\operatorname{hash}(m1)=\operatorname{hash}(m2)
2k
3 x 2n/2+1+2n-k+1
k x 2n/2+1+2n-k+1
2n/2
2n
2n
The security of the Davies–Meyer construction in the Ideal Cipher Model was first proven by R. Winternitz.[10]
The Matyas–Meyer–Oseas single-block-length one-way compression function can be considered the dual (the opposite) of Davies–Meyer.
It feeds each block of the message (
mi
mi
Hi
Hi-1
H0
If the block cipher has different block and key sizes the hash value (
Hi-1
g
In mathematical notation Matyas–Meyer–Oseas can be described as:
Hi=
E | |
g(Hi-1) |
(mi) ⊕ mi
The scheme has the rate:
RMMO=
n | |
1 ⋅ n |
=1
A second preimage attack (given a message
m1
m2
\operatorname{hash}(m1)=\operatorname{hash}(m2)
2k
k x 2n/2+1+2n-k+1
2n/2
2n
2n
The Miyaguchi–Preneel single-block-length one-way compression function is an extended variant of Matyas–Meyer–Oseas. It was independently proposed by Shoji Miyaguchi and Bart Preneel.
It feeds each block of the message (
mi
mi
Hi-1
Hi
Hi-1
H0
If the block cipher has different block and key sizes the hash value (
Hi-1
g
In mathematical notation Miyaguchi–Preneel can be described as:
Hi=
E | |
g(Hi-1) |
(mi) ⊕ Hi-1 ⊕ mi
The scheme has the rate:
RMP=
n | |
1 ⋅ n |
=1
The roles of
mi
Hi-1
Hi-1
mi
A second preimage attack (given a message
m1
m2
\operatorname{hash}(m1)=\operatorname{hash}(m2)
2k
k x 2n/2+1+2n-k+1
2n/2
2n
2n
The Hirose[8] double-block-length one-way compression function consists of a block cipher plus a permutation
p
It uses a block cipher whose key length
k
n
2n
Each round accepts a portion of the message
mi
k-n
n
G
H
First,
mi
Hi-1
Ki
Gi=
E | |
Ki |
(Gi-1) ⊕ Gi-1
Hi=
E | |
Ki |
(p(Gi-1)) ⊕ p(Gi-1)
p(Gi-1)
n
p(x)=x ⊕ c
c
Each encryption resembles the standard Davies–Meyer construction. The advantage of this scheme over other proposed double-block-length schemes is that both encryptions use the same key, and thus key scheduling effort may be shared.
The final output is
Ht||Gt
Hirose also provides a proof in the Ideal Cipher Model.
The sponge construction can be used to build one-way compression functions.