In cryptography, an S-box (substitution-box) is a basic component of symmetric key algorithms which performs substitution. In block ciphers, they are typically used to obscure the relationship between the key and the ciphertext, thus ensuring Shannon's property of confusion. Mathematically, an S-box is a nonlinear vectorial Boolean function.
In general, an S-box takes some number of input bits, m, and transforms them into some number of output bits, n, where n is not necessarily equal to m.[1] An m×n S-box can be implemented as a lookup table with 2m words of n bits each. Fixed tables are normally used, as in the Data Encryption Standard (DES), but in some ciphers the tables are generated dynamically from the key (e.g. the Blowfish and the Twofish encryption algorithms).
One good example of a fixed table is the S-box from DES (S5), mapping 6-bit input into a 4-bit output:
S5 | Middle 4 bits of input | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 | ||
Outer bits | 00 | 0010 | 1100 | 0100 | 0001 | 0111 | 1010 | 1011 | 0110 | 1000 | 0101 | 0011 | 1111 | 1101 | 0000 | 1110 | 1001 |
01 | 1110 | 1011 | 0010 | 1100 | 0100 | 0111 | 1101 | 0001 | 0101 | 0000 | 1111 | 1010 | 0011 | 1001 | 1000 | 0110 | |
10 | 0100 | 0010 | 0001 | 1011 | 1010 | 1101 | 0111 | 1000 | 1111 | 1001 | 1100 | 0101 | 0110 | 0011 | 0000 | 1110 | |
11 | 1011 | 1000 | 1100 | 0111 | 0001 | 1110 | 0010 | 1101 | 0110 | 1111 | 0000 | 1001 | 1010 | 0100 | 0101 | 0011 |
Given a 6-bit input, the 4-bit output is found by selecting the row using the outer two bits (the first and last bits), and the column using the inner four bits. For example, an input "011011" has outer bits "01" and inner bits "1101"; the corresponding output would be "1001".[2]
When DES was first published in 1977, the design criteria of its S-boxes were kept secret to avoid compromising the technique of differential cryptanalysis (which was not yet publicly known). As a result, research in what made good S-boxes was sparse at the time. Rather, the eight S-boxes of DES were the subject of intense study for many years out of a concern that a backdoor (a vulnerability known only to its designers) might have been planted in the cipher. As the S-boxes are the only nonlinear part of the cipher, compromising those would compromise the entire cipher.[3]
The S-box design criteria were eventually published (in) after the public rediscovery of differential cryptanalysis, showing that they had been carefully tuned to increase resistance against this specific attack such that it was no better than brute force. Biham and Shamir found that even small modifications to an S-box could significantly weaken DES.[4]
Any S-box where any linear combination of output bits is produced by a bent function of the input bits is termed a perfect S-box.[5]
S-boxes can be analyzed using linear cryptanalysis and differential cryptanalysis in the form of a Linear approximation table (LAT) or Walsh transform and Difference Distribution Table (DDT) or autocorrelation table and spectrum. Its strength may be summarized by the nonlinearity (bent, almost bent) and differential uniformity (perfectly nonlinear, almost perfectly nonlinear).[6] [7] [8]