Four-square cipher explained

The four-square cipher is a manual symmetric encryption technique.[1] It was invented by the French cryptographer Félix Delastelle.

The technique encrypts pairs of letters (digraphs) and thus belongs to the class of polygraphic substitution ciphers. This adds significant strength compared with monographic substitution ciphers, which operate on single characters. Working on digraphs makes four-square less susceptible to frequency analysis, because the analysis must cover 676 possible digraphs (26 x 26, or 625 with the 25-letter alphabet used here) rather than just 26 single letters. Frequency analysis of digraphs is still possible, but it is considerably more difficult and generally requires a much larger ciphertext to be useful.

Using four-square

The four-square cipher uses four 5 by 5 (5x5) matrices arranged in a square. Each matrix contains the letters of the alphabet (usually omitting "Q" or putting both "I" and "J" in the same location to reduce the alphabet to 25 letters). In general, the upper-left and lower-right matrices are the "plaintext squares" and each contains a standard alphabet. The upper-right and lower-left matrices are the "ciphertext squares" and contain a mixed alphabetic sequence.

To generate the ciphertext squares, one would first fill in the spaces in the matrix with the letters of a keyword or phrase (dropping any duplicate letters), then fill the remaining spaces with the rest of the letters of the alphabet in order (again omitting "Q" to reduce the alphabet to fit). The key can be written in the top rows of the table, from left to right, or in some other pattern, such as a spiral beginning in the upper-left-hand corner and ending in the center. The keyword together with the conventions for filling in the 5 by 5 table constitute the cipher key. The four-square algorithm allows for two separate keys, one for each of the two ciphertext matrices.
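The filling procedure described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name keyed_square is hypothetical, "Q" is the omitted letter, and only the simplest fill pattern (row by row, left to right) is covered.

```python
import string

# 25-letter alphabet with "q" omitted, following the convention in the text.
ALPHABET = [c for c in string.ascii_lowercase if c != "q"]


def keyed_square(keyword=""):
    """Return a 5x5 square: keyword letters first (duplicates dropped),
    then the remaining letters of the alphabet in order."""
    seen = []
    for ch in keyword.lower() + "".join(ALPHABET):
        if ch in ALPHABET and ch not in seen:
            seen.append(ch)
    # Assumption: the key is written row by row, left to right.
    return [seen[i:i + 5] for i in range(0, 25, 5)]


for row in keyed_square("example"):
    print(" ".join(row).upper())
# E X A M P
# L B C D F
# G H I J K
# N O R S T
# U V W Y Z
```

Calling keyed_square() with no keyword gives the standard alphabet used in the plaintext squares. A spiral or other fill pattern would need a different final step.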

As an example, here are the four-square matrices for the keywords "example" (upper-right) and "keyword" (lower-left). The plaintext matrices are in lowercase and the ciphertext matrices are in capitals to make the example easier to read:

a b c d e   E X A M P
f g h i j   L B C D F
k l m n o   G H I J K
p r s t u   N O R S T
v w x y z   U V W Y Z

K E Y W O   a b c d e
R D A B C   f g h i j
F G H I J   k l m n o
L M N P S   p r s t u
T U V X Z   v w x y z

Algorithm

To encrypt a message, one would follow these steps:

Split the plaintext message into digraphs (HELLO WORLD becomes HE LL OW OR LD).
Find the first letter in the digraph in the upper-left plaintext matrix.

Find the second letter in the digraph in the lower-right plaintext matrix.

The first letter of the encrypted digraph is in the same row as the first plaintext letter and the same column as the second plaintext letter. It is therefore in the upper-right ciphertext matrix.

The second letter of the encrypted digraph is in the same row as the second plaintext letter and the same column as the first plaintext letter. It is therefore in the lower-left ciphertext matrix.

Using the four-square example given above, we can encrypt the following plaintext:

Plaintext:  he lp me ob iw an ke no bi
Ciphertext: FY GM KY HO BX MF KK KI MD
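The encryption steps can be sketched in Python as follows. The names keyed_alphabet and encrypt are hypothetical, each square is stored as a flat 25-character string read row by row, "Q" is dropped from the input, and padding an odd-length message with "x" is an assumption that the description above does not specify.

```python
import string

# Standard 25-letter alphabet (no "q") used for both plaintext squares.
ALPHABET = "".join(c for c in string.ascii_lowercase if c != "q")


def keyed_alphabet(keyword=""):
    """Keyword letters first (duplicates dropped), then the remaining letters."""
    merged = (c for c in keyword.lower() + ALPHABET if c in ALPHABET)
    return "".join(dict.fromkeys(merged))


def encrypt(plaintext, key1, key2):
    """key1 fills the upper-right square, key2 the lower-left square."""
    upper_right, lower_left = keyed_alphabet(key1), keyed_alphabet(key2)
    letters = [c for c in plaintext.lower() if c in ALPHABET]
    if len(letters) % 2:
        letters.append("x")  # assumed padding letter, not taken from the text
    out = []
    for a, b in zip(letters[0::2], letters[1::2]):
        row_a, col_a = divmod(ALPHABET.index(a), 5)  # first letter, upper-left
        row_b, col_b = divmod(ALPHABET.index(b), 5)  # second letter, lower-right
        # Same row as the first letter, same column as the second, and vice versa.
        out.append(upper_right[row_a * 5 + col_b] + lower_left[row_b * 5 + col_a])
    return " ".join(out).upper()


print(encrypt("help me obi wan kenobi", "example", "keyword"))
# FY GM KY HO BX MF KK KI MD
```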

Here is the four-square written out again, with every value blanked except those used to encrypt the first digraph "he" into "FY":

- - - - -   - - - - -
- - h - -   - - - - F
- - - - -   - - - - -
- - - - -   - - - - -
- - - - -   - - - - -

- - Y - -   - - - - e
- - - - -   - - - - -
- - - - -   - - - - -
- - - - -   - - - - -
- - - - -   - - - - -

As the diagram shows, encryption amounts to finding the other two corners of the rectangle defined by the two letters of the plaintext digraph. The encrypted digraph consists of the letters at those two corners, with the upper-right letter coming first.

Decryption works the same way, but in reverse. The ciphertext digraph is split with the first character going into the upper-right matrix and the second character going into the lower-left matrix. The other corners of the rectangle are then located. These represent the plaintext digraph with the upper-left matrix component coming first.
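A matching decryption sketch, under the same assumptions as the description so far (hypothetical names, "Q" omitted, squares stored as flat strings read row by row, keys written left to right in the top rows):

```python
import string

# Standard 25-letter alphabet (no "q") used for both plaintext squares.
ALPHABET = "".join(c for c in string.ascii_lowercase if c != "q")


def keyed_alphabet(keyword=""):
    """Keyword letters first (duplicates dropped), then the remaining letters."""
    merged = (c for c in keyword.lower() + ALPHABET if c in ALPHABET)
    return "".join(dict.fromkeys(merged))


def decrypt(ciphertext, key1, key2):
    """key1 fills the upper-right square, key2 the lower-left square."""
    upper_right, lower_left = keyed_alphabet(key1), keyed_alphabet(key2)
    letters = [c for c in ciphertext.lower() if c in ALPHABET]
    out = []
    for a, b in zip(letters[0::2], letters[1::2]):
        row_a, col_a = divmod(upper_right.index(a), 5)  # first letter, upper-right
        row_b, col_b = divmod(lower_left.index(b), 5)   # second letter, lower-left
        # Remaining corners: upper-left at (row_a, col_b), lower-right at (row_b, col_a).
        out.append(ALPHABET[row_a * 5 + col_b] + ALPHABET[row_b * 5 + col_a])
    return " ".join(out)


print(decrypt("FY GM KY HO BX MF KK KI MD", "example", "keyword"))
# he lp me ob iw an ke no bi
```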

Four-square cryptanalysis

Like most pre-modern ciphers, the four-square cipher can be cracked easily if enough text is available. Obtaining the key is relatively straightforward if both plaintext and ciphertext are known. When only the ciphertext is known, brute-force cryptanalysis involves searching the key space for matches between the frequency of digraphs in the trial decryption and the known frequency of digraphs in the assumed language of the original message.
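The counting step behind such a comparison can be sketched as follows. This covers only the tallying of digraphs in a text, not the key search or the scoring against language statistics, and the function name digraph_counts is hypothetical.

```python
from collections import Counter


def digraph_counts(text):
    """Count non-overlapping letter pairs, in the order the cipher processes them."""
    letters = [c for c in text.upper() if c.isalpha()]
    pairs = [a + b for a, b in zip(letters[0::2], letters[1::2])]
    return Counter(pairs)


counts = digraph_counts("FY GM KY HO BX MF KK KI MD")
print(len(counts), counts["FY"])
# 9 1
```

The nine-digraph sample is far too short to show any useful statistics, which illustrates why digraph analysis needs a much larger ciphertext.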

Cryptanalysis of four-square generally involves pattern matching on repeated single letters. This works only when the two plaintext matrices are known. A four-square encipherment usually uses standard alphabets in these matrices, although this is not a requirement. When it does, certain words always produce single-letter ciphertext repeats. For instance, the word MILITARY (MI LI TA RY) always produces the same ciphertext letter in the first and third positions regardless of the keywords used, because M and L share a row in the upper-left matrix and both digraphs end in I. Patterns like these can be cataloged and matched against single-letter repeats in the ciphertext. Candidate plaintext can then be inserted in an attempt to uncover the ciphertext matrices.
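The MILITARY pattern can be checked with a short sketch. The helper names are hypothetical, the plaintext squares are assumed to hold the standard alphabet without "Q", and the key pairs other than "example" and "keyword" are arbitrary choices made for illustration.

```python
import string

# Standard 25-letter alphabet (no "q") used for both plaintext squares.
ALPHABET = "".join(c for c in string.ascii_lowercase if c != "q")


def keyed_alphabet(keyword=""):
    """Keyword letters first (duplicates dropped), then the remaining letters."""
    merged = (c for c in keyword.lower() + ALPHABET if c in ALPHABET)
    return "".join(dict.fromkeys(merged))


def encrypt(plaintext, key1, key2):
    """Encrypt an even-length message; returns ciphertext without spaces."""
    upper_right, lower_left = keyed_alphabet(key1), keyed_alphabet(key2)
    letters = [c for c in plaintext.lower() if c in ALPHABET]
    out = []
    for a, b in zip(letters[0::2], letters[1::2]):
        row_a, col_a = divmod(ALPHABET.index(a), 5)
        row_b, col_b = divmod(ALPHABET.index(b), 5)
        out.append(upper_right[row_a * 5 + col_b] + lower_left[row_b * 5 + col_a])
    return "".join(out).upper()


# Arbitrary illustrative key pairs; only the first comes from the example above.
for keys in [("example", "keyword"), ("zebra", "cipher"), ("square", "four")]:
    ciphertext = encrypt("military", *keys)
    # M and L sit in the same row, and both digraphs end in I,
    # so the first and third ciphertext letters coincide.
    print(keys, ciphertext, ciphertext[0] == ciphertext[2])
```

With the keywords "example" and "keyword" the ciphertext is JAJDNWSU, and the comparison is true for every key pair because the repeated letter depends only on the row of M and L and the column of I.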

Unlike the Playfair cipher, a four-square cipher will not show reversed ciphertext digraphs for reversed plaintext digraphs (e.g. the digraphs AB BA would encrypt to some pattern XY YX in Playfair, but not in four-square). This, of course, is only true if the two keywords are different. Another difference that makes four-square the stronger of the two is that double-letter digraphs can occur in four-square ciphertext, as with "KK" in the example above.
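Both properties can be observed with a small sketch, again using hypothetical helper names and the "Q"-omitting alphabet. It encrypts AB BA once with two different keywords and once with the same keyword in both ciphertext squares.

```python
import string

# Standard 25-letter alphabet (no "q") used for both plaintext squares.
ALPHABET = "".join(c for c in string.ascii_lowercase if c != "q")


def keyed_alphabet(keyword=""):
    """Keyword letters first (duplicates dropped), then the remaining letters."""
    merged = (c for c in keyword.lower() + ALPHABET if c in ALPHABET)
    return "".join(dict.fromkeys(merged))


def encrypt(plaintext, key1, key2):
    """key1 fills the upper-right square, key2 the lower-left square."""
    upper_right, lower_left = keyed_alphabet(key1), keyed_alphabet(key2)
    letters = [c for c in plaintext.lower() if c in ALPHABET]
    out = []
    for a, b in zip(letters[0::2], letters[1::2]):
        row_a, col_a = divmod(ALPHABET.index(a), 5)
        row_b, col_b = divmod(ALPHABET.index(b), 5)
        out.append(upper_right[row_a * 5 + col_b] + lower_left[row_b * 5 + col_a])
    return " ".join(out).upper()


# Different keywords: no reversal, and a double-letter digraph appears.
print(encrypt("ab ba", "example", "keyword"))
# XK EE

# Same keyword in both ciphertext squares: the reversal pattern returns.
print(encrypt("ab ba", "example", "example"))
# XE EX
```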

By all measures, four-square is a stronger system for encrypting information than Playfair. However, it is more cumbersome because it uses two keys, and preparing the encryption/decryption sheet can be time-consuming. Given that the increase in strength afforded by four-square over Playfair is marginal and that both schemes are easily defeated if sufficient ciphertext is available, Playfair has become much more common.

A good tutorial on reconstructing the key for a four-square cipher can be found in chapter 7, "Solution to Polygraphic Substitution Systems," of Field Manual 34-40-2, produced by the United States Army.

See also

Topics in cryptography

Notes and References

[1] William Maxwell Bowers (1959). Digraphic Substitution: The Playfair Cipher, the Four Square Cipher. American Cryptogram Association. p. 25.