In combinatorics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union of two finite sets; symbolically expressed as
|A \cup B| = |A| + |B| - |A \cap B|.
The inclusion-exclusion principle, being a generalization of the two-set case, is perhaps more clearly seen in the case of three sets, which for the sets A, B and C is given by
|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.
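Generalizing the results of these examples gives the principle of inclusion–exclusion. To find the cardinality of the union of n sets: include the cardinalities of the sets, exclude the cardinalities of the pairwise intersections, include the cardinalities of the triple-wise intersections, and continue alternating in this way until the n-fold intersection is included (if n is odd) or excluded (if n is even).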
The name comes from the idea that the principle is based on over-generous inclusion, followed by compensating exclusion. This concept is attributed to Abraham de Moivre (1718), although it first appears in a paper of Daniel da Silva (1854) and later in a paper by J. J. Sylvester (1883). Because of these publications, the principle is sometimes referred to as the formula of Da Silva or Sylvester. The principle can be viewed as an example of the sieve method extensively used in number theory and is sometimes referred to as the sieve formula.
As finite probabilities are computed as counts relative to the cardinality of the probability space, the formulas for the principle of inclusion–exclusion remain valid when the cardinalities of the sets are replaced by finite probabilities. More generally, both versions of the principle can be put under the common umbrella of measure theory.
In a very abstract setting, the principle of inclusion–exclusion can be expressed as the calculation of the inverse of a certain matrix. This inverse has a special structure, making the principle an extremely valuable technique in combinatorics and related areas of mathematics. As Gian-Carlo Rota put it:
"One of the most useful principles of enumeration in discrete probability and combinatorial theory is the celebrated principle of inclusion–exclusion. When skillfully applied, this principle has yielded the solution to many a combinatorial problem."
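In its general form, the principle of inclusion–exclusion states that for finite sets A1, ..., An, one has the identity
\left|\bigcup_{i=1}^n A_i\right| = \sum_{i=1}^n |A_i| - \sum_{1 \leqslant i < j \leqslant n} |A_i \cap A_j| + \sum_{1 \leqslant i < j < k \leqslant n} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap \cdots \cap A_n|.
This can be compactly written as
\left|\bigcup_{i=1}^n A_i\right| = \sum_{k=1}^n (-1)^{k+1} \left( \sum_{1 \leqslant i_1 < \cdots < i_k \leqslant n} |A_{i_1} \cap \cdots \cap A_{i_k}| \right)
or
\left|\bigcup_{i=1}^n A_i\right| = \sum_{\emptyset \neq J \subseteq \{1,\ldots,n\}} (-1)^{|J|+1} \left|\bigcap_{j \in J} A_j\right|.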
In words, to count the number of elements in a finite union of finite sets, first sum the cardinalities of the individual sets, then subtract the sum of the cardinalities of all pairwise intersections, then add back the sum of the cardinalities of all triple intersections, then subtract the sum of the cardinalities of all fourfold intersections, and so on. This process always ends, since no intersection can involve more sets than there are in the union. (For example, if n = 4, there can be no elements that appear in more than 4 sets; equivalently, there can be no elements that appear in at least 5 sets.)
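The alternating sum described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `union_size_by_inclusion_exclusion` and the sample sets `A`, `B`, `C` are invented for this example, and the routine enumerates all 2^n − 1 non-empty intersections, so it is practical only for a handful of sets.

```python
from itertools import combinations


def union_size_by_inclusion_exclusion(sets):
    """Return |A_1 ∪ ... ∪ A_n| as the alternating sum over all k-fold intersections."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = 1 if k % 2 == 1 else -1  # (-1)^(k+1)
        for combo in combinations(sets, k):
            # Size of the intersection of the k chosen sets.
            common = set(combo[0]).intersection(*combo[1:])
            total += sign * len(common)
    return total


# Hypothetical sample data, chosen only for illustration.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6, 7}

# The alternating sum agrees with the size of the union computed directly.
assert union_size_by_inclusion_exclusion([A, B, C]) == len(A | B | C)
```

For these three sets the sum is 11 − 5 + 1 = 7, which matches the seven distinct elements 1 through 7.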
In applications it is common to see the principle expressed in its complementary form. That is, letting S be a finite universal set containing all of the A_i and letting
\bar{A_i}
denote the complement of A_i in S, by De Morgan's laws we have
\left|\bigcap_{i=1}^n \bar{A_i}\right| = \left|S - \bigcup_{i=1}^n A_i\right| = |S| - \sum_{i=1}^n |A_i| + \sum_{1 \leqslant i < j \leqslant n} |A_i \cap A_j| - \cdots + (-1)^n |A_1 \cap \cdots \cap A_n|.
As another variant of the statement, let P1, ..., Pn be a list of properties that elements of a set S may or may not have. The principle of inclusion–exclusion then provides a way to calculate the number of elements of S that have none of the properties: let A_i be the subset of elements of S which have the property P_i and use the principle in its complementary form. This variant is due to J. J. Sylvester.
Notice that if you take into account only the first m < n sums on the right (in the general form of the principle), then you will get an overestimate if m is odd and an underestimate if m is even.
A more complex example is the following.
Suppose there is a deck of n cards numbered from 1 to n. Suppose a card numbered m is in the correct position if it is the mth card in the deck. How many ways, W, can the cards be shuffled with at least 1 card being in the correct position?
Begin by defining set Am, which is all of the orderings of cards with the mth card correct. Then the number of orders, W, with at least one card being in the correct position, m, is
W = \left|\bigcup_{m=1}^n A_m\right|.
Apply the principle of inclusion–exclusion,
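W = \sum_{m_1=1}^n |A_{m_1}| - \sum_{1 \leqslant m_1 < m_2 \leqslant n} |A_{m_1} \cap A_{m_2}| + \cdots + (-1)^{p-1} \sum_{1 \leqslant m_1 < \cdots < m_p \leqslant n} |A_{m_1} \cap \cdots \cap A_{m_p}| + \cdots
Each intersection
A_{m_1} \cap \cdots \cap A_{m_p}
is the set of shuffles in which the p cards m_1, ..., m_p are all in the correct position. The size of this set depends only on p, not on the particular values of m, so the pth sum consists of \binom{n}{p} equal terms:
W = \binom{n}{1}|A_1| - \binom{n}{2}|A_1 \cap A_2| + \cdots + (-1)^{p-1}\binom{n}{p}|A_1 \cap \cdots \cap A_p| + \cdots
Here
|A_1 \cap \cdots \cap A_p|
is the number of orderings with p given cards in the correct position, which equals the number of ways of ordering the remaining n − p cards, namely (n − p)!. Thus
\begin{align} W &= \binom{n}{1}(n-1)! - \binom{n}{2}(n-2)! + \cdots + (-1)^{p-1}\binom{n}{p}(n-p)! + \cdots \\ &= \sum_{p=1}^n (-1)^{p-1} \binom{n}{p}(n-p)! \\ &= \sum_{p=1}^n (-1)^{p-1} \frac{n!}{p!(n-p)!}(n-p)! \\ &= \sum_{p=1}^n (-1)^{p-1} \frac{n!}{p!} \end{align}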
A permutation where no card is in the correct position is called a derangement. Taking n! to be the total number of permutations, the probability Q that a random shuffle produces a derangement is given by
Q = 1 - \frac{W}{n!} = \sum_{p=0}^n \frac{(-1)^p}{p!},
a truncation to n + 1 terms of the Taylor expansion of e^{−1}. Thus the probability of guessing an order for a shuffled deck of cards and being incorrect about every card is approximately e^{−1}, or 37%.
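The card-shuffling count can be checked numerically. The sketch below is illustrative only; the function names are invented for this example. It evaluates the sum for W, compares it with a brute-force enumeration for small decks, and then compares Q with e^{−1}.

```python
from itertools import permutations
from math import comb, exp, factorial


def at_least_one_fixed(n):
    """W = sum_{p=1}^{n} (-1)^(p-1) * C(n, p) * (n - p)!"""
    return sum((-1) ** (p - 1) * comb(n, p) * factorial(n - p)
               for p in range(1, n + 1))


def at_least_one_fixed_brute(n):
    """Count orderings of n cards with at least one card in its correct position."""
    return sum(1 for perm in permutations(range(n))
               if any(perm[i] == i for i in range(n)))


def derangement_probability(n):
    """Q = 1 - W / n!, the chance that a random shuffle fixes no card."""
    return 1 - at_least_one_fixed(n) / factorial(n)


# The formula and the exhaustive count agree for small decks.
for n in range(1, 8):
    assert at_least_one_fixed(n) == at_least_one_fixed_brute(n)

# Already for n = 10 the probability is within 1e-7 of 1/e.
assert abs(derangement_probability(10) - exp(-1)) < 1e-7
```

The brute-force check is limited to small n because it enumerates all n! orderings; the formula itself has no such restriction.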
The situation that appears in the derangement example above occurs often enough to merit special attention: namely, the case in which the size of the intersection sets appearing in the formulas for the principle of inclusion–exclusion depends only on the number of sets in the intersections and not on which sets appear. More formally, if the intersection
A_J := \bigcap_{j \in J} A_j
has the same cardinality, say α_k = |A_J|, for every k-element subset J of {1, ..., n}, then
\left|\bigcup_{i=1}^n A_i\right| = \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} \alpha_k.
Or, in the complementary form, where the universal set S has cardinality α0,
\left|S \smallsetminus \bigcup_{i=1}^n A_i\right| = \alpha_0 - \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} \alpha_k = \sum_{k=0}^n (-1)^k \binom{n}{k} \alpha_k.
Given a family (repeats allowed) of subsets A1, A2, ..., An of a universal set S, the principle of inclusion–exclusion calculates the number of elements of S in none of these subsets. A generalization of this concept would calculate the number of elements of S which appear in exactly some fixed m of these sets.
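Let N = [n] = {1, 2, ..., n}. If we define
A_\emptyset = S,
then, in the notation of the previous section, the number of elements of S contained in none of the A_i is
\sum_{J \subseteq [n]} (-1)^{|J|} |A_J|.
If I is a fixed subset of the index set N, then the number of elements which belong to A_i for all i in I and for no other values is:
\sum_{I \subseteq J} (-1)^{|J| - |I|} |A_J|.
To see this, define the sets
B_k = A_{I \cup \{k\}}
for k in N \ I. We seek the number of elements in none of the B_k which, by the principle of inclusion–exclusion (with
B_\emptyset = A_I
), is
\sum_{K \subseteq N \setminus I} (-1)^{|K|} |B_K|.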
The correspondence K ↔ J = I ∪ K between subsets of N \ I and subsets of N containing I is a bijection and if J and K correspond under this map then BK = AJ, showing that the result is valid.
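The "exactly these sets" count can be checked on a small example. The following is a sketch under stated assumptions: `count_exactly`, the 0-based indexing of the sets, and the sample data are all choices made for this illustration. It evaluates the signed sum over all J containing I and compares it with a direct count.

```python
from itertools import combinations


def count_exactly(universe, sets, I):
    """Number of elements lying in sets[i] for every i in I and in no other set.

    Evaluates the sum over all J containing I of (-1)^(|J| - |I|) * |A_J|,
    where A_J is the intersection of the sets indexed by J and A_∅ is the universe.
    """
    others = [i for i in range(len(sets)) if i not in I]
    total = 0
    for r in range(len(others) + 1):
        for K in combinations(others, r):  # J = I ∪ K, so |J| - |I| = r
            A_J = set(universe)
            for j in list(I) + list(K):
                A_J &= sets[j]
            total += (-1) ** r * len(A_J)
    return total


# Hypothetical sample data (sets are indexed 0, 1, 2 here).
universe = set(range(1, 11))
sets = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]


def direct_count(I):
    """Count the same elements by testing membership in every set."""
    return sum(1 for x in universe
               if all((x in sets[i]) == (i in I) for i in range(len(sets))))


for size in range(len(sets) + 1):
    for I in combinations(range(len(sets)), size):
        assert count_exactly(universe, sets, set(I)) == direct_count(set(I))
```

With I empty this reduces to the complementary form: the elements 8, 9 and 10 belong to none of the three sets.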
In probability, for events A1, ..., An in a probability space
(\Omega, \mathcal{F}, P),
the inclusion–exclusion principle becomes, for n = 2,
P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2),
for n = 3
P(A_1 \cup A_2 \cup A_3) = P(A_1) + P(A_2) + P(A_3) - P(A_1 \cap A_2) - P(A_1 \cap A_3) - P(A_2 \cap A_3) + P(A_1 \cap A_2 \cap A_3)
and in general
P\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n-1} P\left(\bigcap_{i=1}^n A_i\right),
which can be written in closed form as
P\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n \left( (-1)^{k-1} \sum_{I \subseteq \{1,\ldots,n\} \atop |I| = k} P(A_I) \right),
where the last sum runs over all subsets I of the indices 1, ..., n which contain exactly k elements, and
A_I := \bigcap_{i \in I} A_i
denotes the intersection of all those A_i with index in I.
According to the Bonferroni inequalities, keeping only the terms with k ≤ m in the outer sum of the formula gives alternately an upper bound and a lower bound for the left-hand side: an upper bound when m is odd and a lower bound when m is even. This can be used in cases where the full formula is too cumbersome.
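The alternating bounds can be observed on a small example. The sketch below makes a simplifying assumption: it works with counts rather than probabilities, which corresponds to the uniform probability measure on a finite set after dividing by its size. The function name `partial_sums` and the sample sets are invented for this illustration.

```python
from itertools import combinations


def partial_sums(sets):
    """Return the list [S_1, ..., S_n], where S_m keeps only the first m
    alternating terms of the inclusion-exclusion sum (in counting form)."""
    n = len(sets)
    sums, running = [], 0
    for k in range(1, n + 1):
        # Sum of the sizes of all k-fold intersections.
        term = sum(len(set(c[0]).intersection(*c[1:]))
                   for c in combinations(sets, k))
        running += (-1) ** (k - 1) * term
        sums.append(running)
    return sums


# Hypothetical sample data.
sets = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
exact = len(set().union(*sets))

for m, s in enumerate(partial_sums(sets), start=1):
    if m % 2 == 1:
        assert s >= exact  # odd truncation: upper bound
    else:
        assert s <= exact  # even truncation: lower bound
```

For these sets the partial sums are 11, 6 and 7, bracketing the exact value 7 from above and below before reaching it.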
For a general measure space (S,Σ,μ) and measurable subsets A1, ..., An of finite measure, the above identities also hold when the probability measure P is replaced by the measure μ.
If, in the probabilistic version of the inclusion–exclusion principle, the probability of the intersection A_I only depends on the cardinality of I, meaning that for every k in {1, ..., n} there is an a_k such that
a_k = P(A_I) \text{ for every } I \subset \{1,\ldots,n\} \text{ with } |I| = k,
then the above formula simplifies to
P\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n (-1)^{k-1} \binom{n}{k} a_k
due to the combinatorial interpretation of the binomial coefficient \binom{n}{k}. For example, if the events A_i are independent and identically distributed, then P(A_i) = p for all i and a_k = p^k, in which case the expression above simplifies to
P\left(\bigcup_{i=1}^n A_i\right) = 1 - (1-p)^n.
(This result can also be derived more simply by considering the intersection of the complements of the events A_i.)
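The simplification for independent events with common probability p can be verified with exact rational arithmetic. This is a minimal sketch; the function name `union_probability` and the value p = 1/6 (for instance, rolling a six on each of n fair dice) are assumptions made for the illustration.

```python
from fractions import Fraction
from math import comb


def union_probability(n, p):
    """Sum_{k=1}^{n} (-1)^(k-1) * C(n, k) * p^k, i.e. the formula with a_k = p^k."""
    return sum((-1) ** (k - 1) * comb(n, k) * p ** k for k in range(1, n + 1))


p = Fraction(1, 6)  # hypothetical common probability
for n in range(1, 10):
    # The alternating sum collapses to 1 - (1 - p)^n, as the binomial theorem predicts.
    assert union_probability(n, p) == 1 - (1 - p) ** n
```

Using `Fraction` avoids floating-point rounding, so the two sides can be compared for exact equality.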
An analogous simplification is possible in the case of a general measure space (S,Σ,μ) and measurable subsets A1, ..., An of finite measure.
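There is another formula used in point processes. Let S be a finite set and P be a random subset of S. Let A be any subset of S. Then, writing \mathbb{P} for the probability measure,
\mathbb{P}(P = A) = \sum_{A \subseteq J \subseteq S} (-1)^{|J| - |A|} \mathbb{P}(P \supseteq J).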
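The principle is sometimes stated in the form that says that if
g(A) = \sum_{S \subseteq A} f(S)
then
f(A) = \sum_{S \subseteq A} (-1)^{|A| - |S|} g(S).
The combinatorial and the probabilistic version of the inclusion–exclusion principle are instances of this identity.
If one sees a number n as the set of its prime factors, then this identity is a generalization of the Möbius inversion formula for square-free natural numbers.
For a generalization of the full version of the Möbius inversion formula, the identity must be generalized to multisets. For multisets instead of sets, it becomes
f(A) = \sum_{S \subseteq A} \mu(A - S) g(S),
where A − S is the multiset for which
(A - S) \uplus S = A,
and μ(S) = 1 if S is a set (a multiset without repeated elements) of even cardinality, μ(S) = −1 if S is a set of odd cardinality, and μ(S) = 0 if S is a proper multiset (one with repeated elements). Notice that μ(A − S) is just the factor (−1)^{|A| − |S|} of the set version whenever A − S is a set.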
The inclusion–exclusion principle is widely used and only a few of its applications can be mentioned here.
See main article: Derangement.
A well-known application of the inclusion–exclusion principle is to the combinatorial problem of counting all derangements of a finite set. A derangement of a set A is a bijection from A into itself that has no fixed points. Via the inclusion–exclusion principle one can show that if the cardinality of A is n ≥ 1, then the number of derangements is [n!/e], where [x] denotes the nearest integer to x; see also the examples section above.
The first occurrence of the problem of counting the number of derangements is in an early book on games of chance: Essai d'analyse sur les jeux de hazard by P. R. de Montmort (1678 – 1719) and was known as either "Montmort's problem" or by the name he gave it, "problème des rencontres." The problem is also known as the hatcheck problem.
The number of derangements is also known as the subfactorial of n, written !n. It follows that if all bijections are assigned the same probability then the probability that a random bijection is a derangement quickly approaches 1/e as n grows.
The principle of inclusion–exclusion, combined with De Morgan's law, can be used to count the cardinality of the intersection of sets as well. Let
\overline{A_k}
represent the complement of A_k with respect to some universal set A such that
A_k \subseteq A
for each k. Then we have
\bigcap_{i=1}^n A_i = \overline{\bigcup_{i=1}^n \overline{A_i}}
thereby turning the problem of finding an intersection into the problem of finding a union.
The inclusion–exclusion principle forms the basis of algorithms for a number of NP-hard graph partitioning problems, such as graph coloring.
A well-known application of the principle is the construction of the chromatic polynomial of a graph.
The number of perfect matchings of a bipartite graph can be calculated using the principle.
Given finite sets A and B, how many surjective functions (onto functions) are there from A to B? Without any loss of generality we may take A = {1, ..., k} and B = {1, ..., n}, since only the cardinalities of the sets matter. By using S as the set of all functions from A to B, and defining, for each i in B, the property P_i as "the function misses the element i in B" (i is not in the image of the function), the principle of inclusion–exclusion gives the number of onto functions between A and B as:
\sum_{j=0}^n \binom{n}{j} (-1)^j (n-j)^k.
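As a sanity check, the surjection count can be compared with an exhaustive enumeration for small sets. The sketch below is illustrative; the function names are invented, and k and n denote the sizes of the domain and codomain, as in the formula.

```python
from itertools import product
from math import comb


def count_surjections(k, n):
    """Number of onto functions from a k-element set to an n-element set,
    computed as sum_{j=0}^{n} (-1)^j * C(n, j) * (n - j)^k."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** k for j in range(n + 1))


def count_surjections_brute(k, n):
    """Enumerate all n^k functions as tuples of images and keep the onto ones."""
    targets = set(range(n))
    return sum(1 for f in product(range(n), repeat=k) if set(f) == targets)


for k in range(1, 6):
    for n in range(1, 5):
        assert count_surjections(k, n) == count_surjections_brute(k, n)
```

When n > k the alternating sum evaluates to 0, matching the fact that no function from a smaller set onto a larger one exists.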
A permutation of the set S = {1, ..., n} where each element of S is restricted to not being in certain positions (here the permutation is considered as an ordering of the elements of S) is called a permutation with forbidden positions. For example, with S = {1, 2, 3, 4}, the permutations with the restriction that the element 1 cannot be in positions 1 or 3, and the element 2 cannot be in position 4 are: 2134, 2143, 3124, 4123, 2341, 2431, 3241, 3421, 4231 and 4321. By letting A_i be the set of positions that the element i is not allowed to be in, and the property P_i to be the property that a permutation puts element i into a position in A_i, the principle of inclusion–exclusion can be used to count the number of permutations which satisfy all the restrictions.
In the given example, there are 12 = 2(3!) permutations with property P1, 6 = 3! permutations with property P2 and no permutations have properties P3 or P4 as there are no restrictions for these two elements. The number of permutations satisfying the restrictions is thus:
4! − (12 + 6 + 0 + 0) + (4) = 24 − 18 + 4 = 10.
The final 4 in this computation is the number of permutations having both properties P1 and P2. There are no other non-zero contributions to the formula.
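The worked example can be reproduced programmatically. The following sketch is one possible way to organise the computation, not a canonical algorithm: the dictionary `forbidden` encodes the sets A_i from the example, and the helper names are invented here. For each set of chosen elements it counts the ways to place them all in forbidden positions and multiplies by the number of arrangements of the remaining elements.

```python
from itertools import combinations, permutations
from math import factorial

# forbidden[i] is the set A_i of positions that element i may not occupy.
forbidden = {1: {1, 3}, 2: {4}, 3: set(), 4: set()}


def count_placements(chosen, forbidden, used=frozenset()):
    """Ways to put every chosen element into one of its forbidden positions,
    with no two elements sharing a position."""
    if not chosen:
        return 1
    first, rest = chosen[0], chosen[1:]
    return sum(count_placements(rest, forbidden, used | {pos})
               for pos in forbidden[first] if pos not in used)


def count_allowed(forbidden):
    """Permutations violating no restriction, by inclusion-exclusion."""
    n = len(forbidden)
    total = 0
    for r in range(n + 1):
        for chosen in combinations(sorted(forbidden), r):
            # Permutations having properties P_i for all chosen i.
            total += (-1) ** r * count_placements(chosen, forbidden) * factorial(n - r)
    return total


def allowed_permutations(forbidden):
    """Brute-force list of the permutations that respect every restriction."""
    return ["".join(map(str, perm)) for perm in permutations(sorted(forbidden))
            if all(pos not in forbidden[x] for pos, x in enumerate(perm, start=1))]


assert count_allowed(forbidden) == len(allowed_permutations(forbidden)) == 10
```

The brute-force list contains exactly the ten permutations named in the text, and the signed sum reproduces 24 − 18 + 4.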
See main article: Stirling numbers of the second kind.
The Stirling numbers of the second kind, S(n,k) count the number of partitions of a set of n elements into k non-empty subsets (indistinguishable boxes). An explicit formula for them can be obtained by applying the principle of inclusion–exclusion to a very closely related problem, namely, counting the number of partitions of an n-set into k non-empty but distinguishable boxes (ordered non-empty subsets). Using the universal set consisting of all partitions of the n-set into k (possibly empty) distinguishable boxes, A1, A2, ..., Ak, and the properties Pi meaning that the partition has box Ai empty, the principle of inclusion–exclusion gives an answer for the related result. Dividing by k! to remove the artificial ordering gives the Stirling number of the second kind:
S(n,k) = \frac{1}{k!} \sum_{t=0}^k (-1)^t \binom{k}{t} (k-t)^n.
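The explicit formula is easy to evaluate and can be cross-checked against the standard recurrence S(n, k) = k·S(n − 1, k) + S(n − 1, k − 1). The sketch below is illustrative only; the function name `stirling2` is an assumption of this example.

```python
from math import comb, factorial


def stirling2(n, k):
    """Stirling number of the second kind via the inclusion-exclusion formula.

    The alternating sum counts partitions into k distinguishable non-empty
    boxes; dividing by k! removes the ordering of the boxes.
    """
    ordered = sum((-1) ** t * comb(k, t) * (k - t) ** n for t in range(k + 1))
    return ordered // factorial(k)


# Cross-check with the recurrence S(n, k) = k*S(n-1, k) + S(n-1, k-1).
for n in range(1, 9):
    for k in range(1, n + 1):
        assert stirling2(n, k) == k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)
```

Integer division is exact here because the alternating sum is always a multiple of k!.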
See main article: Rook polynomial.
A rook polynomial is the generating function of the number of ways to place non-attacking rooks on a board B that looks like a subset of the squares of a checkerboard; that is, no two rooks may be in the same row or column. The board B is any subset of the squares of a rectangular board with n rows and m columns; we think of it as the squares in which one is allowed to put a rook. The coefficient r_k(B) of x^k in the rook polynomial R_B(x) is the number of ways k rooks, none of which attacks another, can be arranged in the squares of B. For any board B, there is a complementary board B' consisting of the squares of the rectangular board that are not in B. This complementary board also has a rook polynomial R_{B'}(x) with coefficients r_k(B').
It is sometimes convenient to be able to calculate the highest coefficient of a rook polynomial in terms of the coefficients of the rook polynomial of the complementary board. Without loss of generality we can assume that n ≤ m, so this coefficient is r_n(B). The number of ways to place n non-attacking rooks on the complete n × m "checkerboard" (without regard as to whether the rooks are placed in the squares of the board B) is given by the falling factorial:
(m)_n = m(m-1)(m-2) \cdots (m-n+1).
Letting Pi be the property that an assignment of n non-attacking rooks on the complete board has a rook in column i which is not in a square of the board B, then by the principle of inclusion–exclusion we have:
r_n(B) = \sum_{t=0}^n (-1)^t (m-t)_{n-t} \, r_t(B').
See main article: Euler's totient function.
Euler's totient or phi function, φ(n), is an arithmetic function that counts the number of positive integers less than or equal to n that are relatively prime to n. That is, if n is a positive integer, then φ(n) is the number of integers k in the range 1 ≤ k ≤ n which have no common factor with n other than 1. The principle of inclusion–exclusion is used to obtain a formula for φ(n). Let S be the set {1, ..., n} and define the property P_i to be that a number in S is divisible by the prime number p_i, for 1 ≤ i ≤ r, where the prime factorization of n is
n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}.
Then,
\varphi(n) = n - \sum_{i=1}^r \frac{n}{p_i} + \sum_{1 \leqslant i < j \leqslant r} \frac{n}{p_i p_j} - \cdots = n \prod_{i=1}^r \left(1 - \frac{1}{p_i}\right).
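The alternating sum for φ(n) translates directly into code. This is a minimal sketch rather than an efficient implementation: `prime_factors` uses plain trial division, and both function names are invented for this illustration. The result is compared with a direct count using the greatest common divisor.

```python
from itertools import combinations
from math import gcd, prod


def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def totient_inclusion_exclusion(n):
    """phi(n) = n - sum n/p_i + sum n/(p_i p_j) - ..., over the distinct primes of n."""
    primes = prime_factors(n)
    total = 0
    for k in range(len(primes) + 1):
        for subset in combinations(primes, k):
            # n // prod(subset) counts the integers in 1..n divisible by every prime in subset.
            total += (-1) ** k * (n // prod(subset))
    return total


for n in range(1, 200):
    direct = sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
    assert totient_inclusion_exclusion(n) == direct
```

For n = 36 the sum is 36 − 18 − 12 + 6 = 12, in agreement with the product form 36 · (1 − 1/2) · (1 − 1/3).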
See main article: Dirichlet hyperbola method.
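For an arithmetic function f(n) written as a Dirichlet convolution
f = g \ast h,
the sum
F(n) = \sum_{k=1}^n f(k) = \sum_{k=1}^n \sum_{xy=k} g(x) h(y)
can be recast as a sum over the lattice points in the region bounded by x ≥ 1, y ≥ 1 and xy ≤ n. Splitting this region into two overlapping subregions and applying the inclusion–exclusion principle gives, for positive a and b with ab = n,
F(n) = \sum_{k=1}^n f(k) = \sum_{k=1}^n \sum_{xy=k} g(x) h(y) = \sum_{x=1}^{a} \sum_{y=1}^{n/x} g(x) h(y) + \sum_{y=1}^{b} \sum_{x=1}^{n/y} g(x) h(y) - \sum_{x=1}^{a} \sum_{y=1}^{b} g(x) h(y).
Here each sum runs over the positive integers not exceeding its upper limit.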
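The splitting can be illustrated with the symmetric choice a = b = √n. The sketch below is written for clarity, not speed (a practical implementation would use prefix sums of g and h); the function names and the test functions are assumptions of this example. It compares the three-term expression with a direct evaluation of the double sum.

```python
from math import isqrt


def summatory_hyperbola(n, g, h):
    """Sum of g(x)*h(y) over positive integers x, y with x*y <= n,
    split at a = b = sqrt(n) and recombined by inclusion-exclusion."""
    s = isqrt(n)  # integers x <= sqrt(n) are exactly 1..s
    # Lattice points with x <= sqrt(n).
    first = sum(g(x) * sum(h(y) for y in range(1, n // x + 1))
                for x in range(1, s + 1))
    # Lattice points with y <= sqrt(n).
    second = sum(h(y) * sum(g(x) for x in range(1, n // y + 1))
                 for y in range(1, s + 1))
    # Points counted twice: both coordinates at most sqrt(n).
    overlap = (sum(g(x) for x in range(1, s + 1))
               * sum(h(y) for y in range(1, s + 1)))
    return first + second - overlap


def summatory_direct(n, g, h):
    """Direct evaluation of sum_{k<=n} sum_{x*y=k} g(x)*h(y)."""
    return sum(g(x) * h(k // x)
               for k in range(1, n + 1)
               for x in range(1, k + 1) if k % x == 0)


def one(_):
    return 1


def identity(x):
    return x


for n in range(1, 60):
    # g = h = 1 gives the divisor-counting function; g = identity gives the divisor sum.
    assert summatory_hyperbola(n, one, one) == summatory_direct(n, one, one)
    assert summatory_hyperbola(n, identity, one) == summatory_direct(n, identity, one)
```

Every lattice point under the hyperbola has x ≤ √n or y ≤ √n, which is why the two subregions cover the whole region and only their overlap needs to be subtracted.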
In many cases where the principle could give an exact formula (in particular, counting prime numbers using the sieve of Eratosthenes), the formula arising does not offer useful content because the number of terms in it is excessive. Even if each term individually can be estimated accurately, the accumulation of errors may mean that the inclusion–exclusion formula is not directly applicable. In number theory, this difficulty was addressed by Viggo Brun. After a slow start, his ideas were taken up by others, and a large variety of sieve methods developed. These may, for example, try to find upper bounds for the "sieved" sets, rather than an exact formula.
Let A1, ..., An be arbitrary sets and p1, ..., pn real numbers in the closed unit interval [0, 1]. Then, for every even number k in {0, ..., n}, the indicator functions satisfy the inequality:
1_{A_1 \cup \cdots \cup A_n} \ge \sum_{j=1}^k (-1)^{j-1} \sum_{1 \le i_1 < \cdots < i_j \le n} p_{i_1} \cdots p_{i_j} \, 1_{A_{i_1} \cap \cdots \cap A_{i_j}}.
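Choose an element contained in the union of all sets and let
A_1, A_2, \ldots, A_t
be the individual sets containing it (note that t > 0). Since the element is counted precisely once by the left-hand side of the general identity, we need to show that it is counted precisely once by the right-hand side. On the right-hand side, the only non-zero contributions occur when all the sets in a particular term contain the chosen element, that is, when all the sets are selected from
A_1, A_2, \ldots, A_t.
The contribution is one for each such selection (plus or minus depending on the term) and is therefore just the signed number of these selections used in the term. We then have:
|\{A_i \mid 1 \leqslant i \leqslant t\}| - |\{A_i \cap A_j \mid 1 \leqslant i < j \leqslant t\}| + \cdots + (-1)^{t+1} |\{A_1 \cap A_2 \cap \cdots \cap A_t\}| = \binom{t}{1} - \binom{t}{2} + \cdots + (-1)^{t+1}\binom{t}{t}.
By the binomial theorem,
0 = (1-1)^t = \binom{t}{0} - \binom{t}{1} + \binom{t}{2} - \cdots + (-1)^t \binom{t}{t}.
Using the fact that
\binom{t}{0} = 1
and rearranging terms, we have
1 = \binom{t}{1} - \binom{t}{2} + \cdots + (-1)^{t+1}\binom{t}{t},
and so the chosen element is counted only once by the right-hand side of the general identity.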
An algebraic proof can be obtained using indicator functions (also known as characteristic functions). The indicator function of a subset S of a set X is the function
\begin{align} &1_S : X \to \{0,1\} \\ &1_S(x) = \begin{cases} 1 & x \in S \\ 0 & x \notin S \end{cases} \end{align}
If A and B are two subsets of X, then
1_A \cdot 1_B = 1_{A \cap B}.
Let A denote the union of the sets A1, ..., An. To prove the inclusion–exclusion principle in general, we first verify the identity
1_A = \sum_{k=1}^n (-1)^{k-1} \sum_{I \subseteq \{1,\ldots,n\} \atop |I| = k} 1_{A_I}
for indicator functions, where:
A_I = \bigcap_{i \in I} A_i.
The following function
\left(1_A - 1_{A_1}\right)\left(1_A - 1_{A_2}\right) \cdots \left(1_A - 1_{A_n}\right) = 0,
is identically zero because: if x is not in A, then all factors are 0 − 0 = 0; and otherwise, if x does belong to some A_m, then the corresponding mth factor is 1 − 1 = 0. Expanding the product on the left-hand side, and using 1_A · 1_{A_I} = 1_{A_I} for every non-empty I, the identity for indicator functions follows.
To prove the inclusion–exclusion principle for the cardinality of sets, sum this identity over all x in the union of A1, ..., An. To derive the version used in probability, take the expectation of both sides. In general, integrate the identity with respect to μ. Each of these derivations relies on linearity.