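In set theory, Cantor's theorem states that, for any set $A$, the set of all subsets of $A$, known as the power set of $A$, has a strictly greater cardinality than $A$ itself.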
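For finite sets, Cantor's theorem can be seen to be true by simple enumeration of the number of subsets. Counting the empty set as a subset, a set with $n$ elements has a total of $2^n$ subsets, and the theorem holds because $2^n > n$ for all non-negative integers $n$.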
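The finite case can be checked directly by enumeration. The following is a minimal sketch, not a proof; the helper names `power_set` and `subset_counts` are illustrative choices and do not come from the article.

```python
from itertools import combinations

def power_set(elements):
    """Return every subset of `elements` as a frozenset."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def subset_counts(max_n):
    """Map each n in 0..max_n to the number of subsets of an n-element set."""
    return {n: len(power_set(range(n))) for n in range(max_n + 1)}

counts = subset_counts(6)
for n, count in counts.items():
    # Each row shows n, the number of subsets, and the two checks
    # count == 2**n and count > n.
    print(n, count, count == 2 ** n, count > n)
```

Enumeration of this kind only covers the finitely many sizes it is run on; the general argument is what handles arbitrary sets.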
Much more significant is Cantor's discovery of an argument that is applicable to any set, and shows that the theorem holds for infinite sets also. As a consequence, the cardinality of the real numbers, which is the same as that of the power set of the integers, is strictly larger than the cardinality of the integers; see Cardinality of the continuum for details.
The theorem is named for German mathematician Georg Cantor, who first stated and proved it at the end of the 19th century. Cantor's theorem had immediate and important consequences for the philosophy of mathematics. For instance, by iteratively taking the power set of an infinite set and applying Cantor's theorem, we obtain an endless hierarchy of infinite cardinals, each strictly larger than the one before it. Consequently, the theorem implies that there is no largest cardinal number (colloquially, "there's no largest infinity").
Cantor's argument is elegant and remarkably simple. The complete proof is presented below, with detailed explanations to follow.
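By definition of cardinality, we have $\operatorname{card}(X) < \operatorname{card}(Y)$ for any two sets $X$ and $Y$ if and only if there is an injective function but no bijective function from $X$ to $Y$. It suffices to show that there is no surjection from $X$ to $Y$. This is the heart of Cantor's theorem: there is no surjective function from any set $A$ to its power set. To establish this, it is enough to show that no function $f$ that maps elements of $A$ to subsets of $A$ can reach every possible subset; that is, we need only demonstrate the existence of a subset of $A$ that is not equal to $f(x)$ for any $x \in A$. Recalling that each $f(x)$ is a subset of $A$, such a subset is given by the following construction, sometimes called the Cantor diagonal set of $f$:
$$B = \{x \in A \mid x \notin f(x)\}.$$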
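This means, by definition, that for all $x \in A$, $x \in B$ if and only if $x \notin f(x)$. For all $x$ the sets $B$ and $f(x)$ cannot be equal, because $B$ was constructed from elements of $A$ whose images (under $f$) did not include themselves. For all $x \in A$, either $x \in f(x)$ or $x \notin f(x)$. If $x \in f(x)$, then $f(x)$ cannot equal $B$, because $x \in f(x)$ by assumption and $x \notin B$ by definition. If $x \notin f(x)$, then $f(x)$ cannot equal $B$, because $x \notin f(x)$ by assumption and $x \in B$ by the definition of $B$.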
Equivalently, and slightly more formally, we have just proved that the existence of $\xi \in A$ such that $f(\xi) = B$ implies the following contradiction:
$$\begin{aligned} \xi \in B &\iff \xi \notin f(\xi) && \text{(by definition of } B\text{)}; \\ \xi \in B &\iff \xi \in f(\xi) && \text{(by assumption that } f(\xi) = B\text{)}. \end{aligned}$$
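Therefore, by reductio ad absurdum, the assumption must be false. Thus there is no $\xi \in A$ such that $f(\xi) = B$; in other words, $B$ is not in the image of $f$, and $f$ does not map onto every element of the power set of $A$, i.e., $f$ is not surjective.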
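Finally, to complete the proof, we need to exhibit an injective function from $A$ to its power set. Finding such a function is trivial: just map $x$ to the singleton set $\{x\}$. The argument is now complete, and we have established the strict inequality, for any set $A$, that
$$\operatorname{card}(A) < \operatorname{card}(\mathcal{P}(A)).$$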
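For a small finite set, both halves of the argument (no function onto the power set, and the singleton injection) can be checked exhaustively. The sketch below assumes a three-element set and represents each function as a dictionary; the helper names are hypothetical and chosen only for this illustration.

```python
from itertools import combinations, product

def power_set(elements):
    """Return every subset of `elements` as a frozenset."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def diagonal_set(A, f):
    """Cantor diagonal set B = {x in A : x not in f(x)}; f is a dict A -> P(A)."""
    return frozenset(x for x in A if x not in f[x])

def all_functions(A):
    """Yield every function from A to its power set, each as a dict."""
    subsets = power_set(A)
    for images in product(subsets, repeat=len(A)):
        yield dict(zip(A, images))

def count_functions_missing_diagonal(A):
    """Count the functions f whose image does not contain their diagonal set."""
    return sum(1 for f in all_functions(A)
               if diagonal_set(A, f) not in f.values())

A = [0, 1, 2]
total = len(power_set(A)) ** len(A)            # 8**3 = 512 functions in all
missed = count_functions_missing_diagonal(A)   # number that miss their own B
print(total, missed)

# The singleton map x -> {x} is injective: distinct x give distinct images.
singletons = {x: frozenset([x]) for x in A}
print(len(set(singletons.values())) == len(A))
```

Every one of the 512 functions misses its own diagonal set, which is what the proof predicts; the exhaustive check is only feasible because the set is tiny.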
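Another way to think of the proof is that $B$, empty or non-empty, is always in the power set of $A$. For $f$ to be onto, some element of $A$ must map to $B$. But that leads to a contradiction: no element of $B$ can map to $B$, because that would contradict the criterion of membership in $B$; thus the element mapping to $B$ must not be an element of $B$, meaning that it satisfies the criterion for membership in $B$, another contradiction. So the assumption that an element of $A$ maps to $B$ must be false, and $f$ cannot be onto.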
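Because of the double occurrence of $x$ in the expression "$x \in f(x)$", this is a diagonal argument. For a countable (or finite) set, the argument of the proof given above can be illustrated by constructing a table in which each row is labelled by a unique $x$ from $A = \{x_1, x_2, \ldots\}$, in this order ($A$ is assumed to admit a linear order so that such a table can be constructed); each column is labelled by a unique $y$ from the power set of $A$, with the columns ordered by the argument to $f$, so that the column labels are $f(x_1), f(x_2), \ldots$, in this order; and the intersection of each row $x$ and column $y$ records a true/false bit indicating whether $x \in y$. Given the order chosen for the row and column labels, the main diagonal $D$ of this table thus records whether $x \in f(x)$ for each $x \in A$.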
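The set $B$ constructed in the previous paragraphs coincides with the row labels for the subset of entries on this main diagonal $D$ where the table records that $x \in f(x)$ is false. Each column records the values of the indicator function of the set that labels it. The indicator function of $B$ coincides with the logically negated (swap "true" and "false") entries of the main diagonal. Thus the indicator function of $B$ disagrees with every column in at least one entry. Consequently, no column represents $B$.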
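The table view can be made concrete for a small example. In the sketch below, the four-element set and the particular function `f` are made-up illustrative data, not taken from the article; the code builds the membership table, reads off the main diagonal, and negates it to obtain the indicator function of $B$.

```python
def membership_table(A, f):
    """table[i][j] is True exactly when A[i] is an element of f(A[j])."""
    return [[x in f[xj] for xj in A] for x in A]

# Hypothetical example data: row labels and an arbitrary f : A -> P(A).
A = ["x1", "x2", "x3", "x4"]
f = {
    "x1": {"x2", "x3"},
    "x2": {"x2"},
    "x3": set(),
    "x4": {"x1", "x4"},
}

table = membership_table(A, f)
n = len(A)

# Main diagonal D: does x belong to f(x)?
diagonal = [table[i][i] for i in range(n)]

# Indicator function of B: the diagonal with True and False swapped.
b_indicator = [not bit for bit in diagonal]
B = {x for x, bit in zip(A, b_indicator) if bit}

# Column j, read top to bottom, is the indicator function of f(A[j]).
columns = [[table[i][j] for i in range(n)] for j in range(n)]

print(diagonal)
print(sorted(B))
print(all(b_indicator[j] != columns[j][j] for j in range(n)))
```

By construction the indicator of $B$ differs from column $j$ in row $j$, so it cannot coincide with any column, whatever `f` is chosen.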
Despite the simplicity of the above proof, it is rather difficult for an automated theorem prover to produce it. The main difficulty lies in an automated discovery of the Cantor diagonal set. Lawrence Paulson noted in 1992 that Otter could not do it, whereas Isabelle could, albeit with a certain amount of direction in terms of tactics that might perhaps be considered cheating.[2]
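Let us examine the proof for the specific case when $A$ is countably infinite. Without loss of generality, we may take $A = \mathbb{N} = \{1, 2, 3, \ldots\}$, the set of natural numbers.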
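Suppose that $\mathbb{N}$ is equinumerous with its power set $\mathcal{P}(\mathbb{N})$. Let us see a sample of what $\mathcal{P}(\mathbb{N})$ looks like:
$$\mathcal{P}(\mathbb{N}) = \{\varnothing, \{1, 2\}, \{1, 2, 3\}, \{4\}, \{1, 5\}, \{3, 4, 6\}, \{2, 4, 6, \ldots\}, \ldots\}.$$
$\mathcal{P}(\mathbb{N})$ contains infinite subsets of $\mathbb{N}$, e.g. the set of all positive even numbers $\{2, 4, 6, \ldots\} = \{2k : k \in \mathbb{N}\}$, along with the empty set $\varnothing$.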
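Now that we have an idea of what the elements of $\mathcal{P}(\mathbb{N})$ are, let us attempt to pair off each element of $\mathbb{N}$ with each element of $\mathcal{P}(\mathbb{N})$ to show that these infinite sets are equinumerous. In other words, we will attempt to pair off each element of $\mathbb{N}$ with an element from the infinite set $\mathcal{P}(\mathbb{N})$, so that no element from either infinite set remains unpaired. Such an attempt to pair elements would look like this:
$$\mathbb{N} \begin{Bmatrix} 1 & \longleftrightarrow & \{4, 5\} \\ 2 & \longleftrightarrow & \{1, 2, 3\} \\ 3 & \longleftrightarrow & \{4, 5, 6\} \\ 4 & \longleftrightarrow & \{1, 3, 5\} \\ \vdots & \vdots & \vdots \end{Bmatrix} \mathcal{P}(\mathbb{N}).$$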
Given such a pairing, some natural numbers are paired with subsets that contain the very same number. For instance, in our example the number 2 is paired with the subset $\{1, 2, 3\}$, which contains 2 as a member. Let us call such numbers selfish. Other natural numbers are paired with subsets that do not contain them. For instance, in our example the number 1 is paired with the subset $\{4, 5\}$, which does not contain the number 1. Call these numbers non-selfish. Likewise, 3 and 4 are non-selfish.
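Using this idea, let us build a special set of natural numbers. This set will provide the contradiction we seek. Let $B$ be the set of all non-selfish natural numbers. By definition, the power set $\mathcal{P}(\mathbb{N})$ contains all sets of natural numbers, and so it contains this set $B$ as an element. If the mapping is bijective, $B$ must be paired off with some natural number, say $b$. However, this causes a problem. If $b$ is in $B$, then $b$ is selfish because it is in the corresponding set, which contradicts the definition of $B$. If $b$ is not in $B$, then it is non-selfish and it should instead be a member of $B$. Therefore, no such element $b$ which maps to $B$ can exist.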
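Since there is no natural number which can be paired with $B$, we have contradicted our original supposition, that there is a bijection between $\mathbb{N}$ and $\mathcal{P}(\mathbb{N})$.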
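The selfish and non-selfish numbers can be computed for the four rows of the sample pairing shown earlier. This sketch is restricted to those four rows, since the rest of the pairing is left unspecified; the variable names are illustrative.

```python
# The first four rows of the sample pairing between N and P(N).
pairing = {
    1: {4, 5},
    2: {1, 2, 3},
    3: {4, 5, 6},
    4: {1, 3, 5},
}

# A number is selfish when it belongs to the subset it is paired with.
selfish = {n for n, subset in pairing.items() if n in subset}
non_selfish = {n for n, subset in pairing.items() if n not in subset}

print(sorted(selfish))
print(sorted(non_selfish))

# The set of non-selfish numbers among these rows is not one of the
# subsets listed in these rows.
print(non_selfish not in pairing.values())
```

For any number $n$ among these rows, the set of non-selfish numbers and the subset paired with $n$ disagree about whether $n$ is a member, which is the same disagreement the full argument exploits for every natural number.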
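Note that the set $B$ may be empty. This would mean that every natural number $x$ maps to a subset of natural numbers that contains $x$. Then every number maps to a nonempty set and no number maps to the empty set. But the empty set is a member of $\mathcal{P}(\mathbb{N})$, so the mapping still does not cover $\mathcal{P}(\mathbb{N})$.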
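Through this proof by contradiction we have proven that the cardinality of $\mathbb{N}$ and $\mathcal{P}(\mathbb{N})$ cannot be equal. We also know that the cardinality of $\mathcal{P}(\mathbb{N})$ cannot be less than the cardinality of $\mathbb{N}$, because $\mathcal{P}(\mathbb{N})$ contains all singletons, by definition, and these singletons form a "copy" of $\mathbb{N}$ inside of $\mathcal{P}(\mathbb{N})$. Therefore, only one possibility remains, and that is that the cardinality of $\mathcal{P}(\mathbb{N})$ is strictly greater than the cardinality of $\mathbb{N}$, proving Cantor's theorem.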
Cantor's theorem and its proof are closely related to two paradoxes of set theory.
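Cantor's paradox is the name given to a contradiction following from Cantor's theorem together with the assumption that there is a set containing all sets, the universal set $V$. In order to distinguish this paradox from the next one discussed below, it is important to note what this contradiction is. By Cantor's theorem, $|\mathcal{P}(X)| > |X|$ for any set $X$. On the other hand, all elements of $\mathcal{P}(V)$ are sets, and thus contained in $V$; therefore $|\mathcal{P}(V)| \leq |V|$.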
Another paradox can be derived from the proof of Cantor's theorem by instantiating the function f with the identity function; this turns Cantor's diagonal set into what is sometimes called the Russell set of a given set A:[1]
$$R_A = \left\{x \in A : x \notin x\right\}.$$
The proof of Cantor's theorem is straightforwardly adapted to show that, if a set of all sets $U$ exists, then considering its Russell set $R_U$ leads to the contradiction:
$$R_U \in R_U \iff R_U \notin R_U.$$
This argument is known as Russell's paradox.[1] As a point of subtlety, the version of Russell's paradox we have presented here is actually a theorem of Zermelo; we can conclude from the contradiction obtained that we must reject the hypothesis that $R_U \in U$, thus disproving the existence of a set containing all sets. This was possible because we have used restricted comprehension (as featured in ZFC) in the definition of $R_A$ above, which in turn entailed that
$$R_U \in R_U \iff (R_U \in U \wedge R_U \notin R_U).$$
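Had we used unrestricted comprehension (as in Frege's system, for instance) by defining the Russell set simply as
$$R = \left\{x : x \notin x\right\},$$
then the axiom system itself would have entailed the contradiction, with no further hypotheses needed.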
Despite the syntactical similarities between the Russell set (in either variant) and the Cantor diagonal set, Alonzo Church emphasized that Russell's paradox is independent of considerations of cardinality and its underlying notions like one-to-one correspondence.[5]
Cantor gave essentially this proof in a paper published in 1891 "Über eine elementare Frage der Mannigfaltigkeitslehre",[6] where the diagonal argument for the uncountability of the reals also first appears (he had earlier proved the uncountability of the reals by other methods). The version of this argument he gave in that paper was phrased in terms of indicator functions on a set rather than subsets of a set.[7] He showed that if f is a function defined on X whose values are 2-valued functions on X, then the 2-valued function G(x) = 1 - f(x)(x) is not in the range of f.
Bertrand Russell has a very similar proof in Principles of Mathematics (1903, section 348), where he shows that there are more propositional functions than objects. "For suppose a correlation of all objects and some propositional functions to have been affected, and let phi-x be the correlate of x. Then "not-phi-x(x)," i.e. "phi-x does not hold of x" is a propositional function not contained in this correlation; for it is true or false of x according as phi-x is false or true of x, and therefore it differs from phi-x for every value of x." He attributes the idea behind the proof to Cantor.
Ernst Zermelo has a theorem (which he calls "Cantor's Theorem") that is identical to the form above in the paper that became the foundation of modern set theory ("Untersuchungen über die Grundlagen der Mengenlehre I"), published in 1908. See Zermelo set theory.
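Lawvere's fixed-point theorem provides for a broad generalization of Cantor's theorem to any category with finite products in the following way:[8] let $\mathcal{C}$ be such a category, and let $1$ be a terminal object in $\mathcal{C}$. Suppose that $Y$ is an object in $\mathcal{C}$ and that there exists an endomorphism $\alpha : Y \to Y$ that does not have any fixed points; that is, there is no morphism $y : 1 \to Y$ that satisfies $\alpha \circ y = y$. Then there is no object $T$ of $\mathcal{C}$ such that a morphism $f : T \times T \to Y$ can parameterize all morphisms $T \to Y$. In other words, for every object $T$ and every morphism $f : T \times T \to Y$, an attempt to write maps $T \to Y$ as maps of the form $f(-, x) : T \to Y$ must leave out at least one map $T \to Y$.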
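In the category of sets, the statement can be checked by brute force on small finite examples. The sketch below is only an illustration under several assumptions: objects are finite Python lists, morphisms are dictionaries, a morphism $1 \to Y$ is identified with an element of $Y$, and the choices $T = \{0, 1\}$, $Y = \{0, 1, 2\}$ and $\alpha(y) = (y + 1) \bmod 3$ are made up for the example. The map left out is the diagonal one, $g(t) = \alpha(f(t, t))$.

```python
from itertools import product

def missed_map(T, alpha, f):
    """Diagonal map g(t) = alpha(f(t, t)); f is a dict keyed by pairs (t, x)."""
    return {t: alpha(f[(t, t)]) for t in T}

def parameterized_maps(T, f):
    """The maps f(-, x) : T -> Y, one for each x in T, each as a dict."""
    return [{t: f[(t, x)] for t in T} for x in T]

def every_f_misses_diagonal(T, Y, alpha):
    """True if, for every f : T x T -> Y, the diagonal map is not some f(-, x)."""
    pairs = list(product(T, T))
    for values in product(Y, repeat=len(pairs)):
        f = dict(zip(pairs, values))
        if missed_map(T, alpha, f) in parameterized_maps(T, f):
            return False
    return True

T = [0, 1]
Y = [0, 1, 2]

def shift(y):
    """An endomap of Y with no fixed points: y -> (y + 1) mod 3."""
    return (y + 1) % 3

def identity(y):
    """An endomap of Y in which every point is fixed."""
    return y

print(every_f_misses_diagonal(T, Y, shift))     # all 3**4 = 81 maps f are checked
print(every_f_misses_diagonal(T, Y, identity))
```

With the fixed-point-free `shift`, the diagonal map differs from $f(-, x)$ at the argument $x$ for every $f$. With `identity` the diagonal construction no longer yields a missed map (a constant $f$ is a counterexample), which shows why the fixed-point-free hypothesis is used. Taking $Y = \{0, 1\}$ with negation as $\alpha$ gives the indicator-function form of Cantor's argument.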