In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. This inequality provides an upper bound on the probability of occurrence of at least one of a countable number of events in terms of the individual probabilities of the events. Boole's inequality is named for its discoverer, George Boole.[1]
Formally, for a countable set of events $A_1, A_2, A_3, \ldots$, we have

$$\mathbb{P}\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} \mathbb{P}(A_i).$$
In measure-theoretic terms, Boole's inequality follows from the fact that a measure (and certainly any probability measure) is σ-sub-additive.
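As a concrete sanity check, the following Python sketch (the sample space and events are made up for illustration) compares both sides of the inequality on a small finite probability space:

```python
from fractions import Fraction

# A small uniform probability space (illustrative only).
omega = set(range(12))
events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}]  # overlapping events A_1, A_2, A_3

def prob(event):
    """Probability of an event under the uniform measure on omega."""
    return Fraction(len(event), len(omega))

lhs = prob(set().union(*events))    # P(A_1 ∪ A_2 ∪ A_3) = 2/3
rhs = sum(prob(a) for a in events)  # P(A_1) + P(A_2) + P(A_3) = 11/12

assert lhs <= rhs  # Boole's inequality / union bound
```

The overlap between the events is exactly what makes the bound loose: each point counted more than once widens the gap between the two sides.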
Boole's inequality may be proved for finite collections of $n$ events using the method of induction.

For the $n = 1$ case, it follows trivially that $P(A_1) \le P(A_1)$.

For the inductive step, assume that the claim holds for $n$ events:

$$\mathbb{P}\left(\bigcup_{i=1}^{n} A_i\right) \le \sum_{i=1}^{n} \mathbb{P}(A_i).$$
Since $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, we have

$$P\left(\bigcup_{i=1}^{n+1} A_i\right) = P\left(\bigcup_{i=1}^{n} A_i\right) + P(A_{n+1}) - P\left(\left(\bigcup_{i=1}^{n} A_i\right) \cap A_{n+1}\right).$$
Since

$$\mathbb{P}\left(\left(\bigcup_{i=1}^{n} A_i\right) \cap A_{n+1}\right) \ge 0,$$

by the first axiom of probability, we have
$$P\left(\bigcup_{i=1}^{n+1} A_i\right) \le P\left(\bigcup_{i=1}^{n} A_i\right) + P(A_{n+1}),$$
and therefore
$$P\left(\bigcup_{i=1}^{n+1} A_i\right) \le \sum_{i=1}^{n} P(A_i) + P(A_{n+1}) = \sum_{i=1}^{n+1} P(A_i).$$
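The inductive step rests entirely on the two-event identity $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ and on dropping the non-negative intersection term; a small numeric check (with arbitrary example events) makes that bookkeeping concrete:

```python
from fractions import Fraction

OMEGA_SIZE = 10
A = {0, 1, 2, 3}   # P(A) = 4/10
B = {3, 4, 5}      # P(B) = 3/10, overlapping A in {3}

def prob(event):
    return Fraction(len(event), OMEGA_SIZE)

# Inclusion-exclusion for two events: the exact identity used in the step.
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
# Dropping the non-negative term P(A ∩ B) yields the n = 2 union bound.
assert prob(A | B) <= prob(A) + prob(B)
```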
Alternatively, the inequality can be proved without induction. For any events $A_1, A_2, A_3, \ldots$ in our probability space we have

$$P\left(\bigcup_i A_i\right) \le \sum_i P(A_i).$$
One of the axioms of a probability space is that if $B_1, B_2, B_3, \ldots$ are disjoint events, then

$$P\left(\bigcup_i B_i\right) = \sum_i P(B_i);$$

this is called countable additivity.
If we modify the sets $A_i$ so that they become disjoint, defining

$$B_i = A_i - \bigcup_{j=1}^{i-1} A_j,$$

we can show that

$$\bigcup_{i=1}^{\infty} B_i = \bigcup_{i=1}^{\infty} A_i$$

by proving both directions of inclusion.
Suppose $x \in \bigcup_{i=1}^{\infty} A_i$. Then $x \in A_k$ for some minimal $k$, so that $i < k \implies x \notin A_i$. But then $x \in B_k = A_k - \bigcup_{j=1}^{k-1} A_j$, which shows the first inclusion:

$$\bigcup_{i=1}^{\infty} A_i \subset \bigcup_{i=1}^{\infty} B_i.$$
Next suppose that $x \in \bigcup_{i=1}^{\infty} B_i$. Then $x \in B_k$ for some $k$, and since $B_k = A_k - \bigcup_{j=1}^{k-1} A_j \subset A_k$, we also have $x \in A_k$, which shows the reverse inclusion:

$$\bigcup_{i=1}^{\infty} B_i \subset \bigcup_{i=1}^{\infty} A_i.$$
By construction of each $B_i$, we have $B_i \subset A_i$, and for any events with $B \subset A$ it holds that $P(B) \le P(A)$.
So, we can conclude that the desired inequality is true:

$$P\left(\bigcup_i A_i\right) = P\left(\bigcup_i B_i\right) = \sum_i P(B_i) \le \sum_i P(A_i).$$
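A short Python sketch of the disjointification used above, with illustrative sets (the function and variable names are my own):

```python
def disjointify(events):
    """Return B_i = A_i - (A_1 ∪ ... ∪ A_{i-1}): pairwise disjoint, same union."""
    covered = set()
    disjoint = []
    for a in events:
        disjoint.append(a - covered)  # B_i = A_i minus everything seen so far
        covered |= a
    return disjoint

A = [{0, 1, 2}, {1, 2, 3}, {3, 4}]
B = disjointify(A)  # [{0, 1, 2}, {3}, {4}]

assert set().union(*B) == set().union(*A)  # same union
assert all(not (B[i] & B[j])               # pairwise disjoint
           for i in range(len(B)) for j in range(i + 1, len(B)))
```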
Boole's inequality may be generalized to find upper and lower bounds on the probability of finite unions of events.[2] These bounds are known as Bonferroni inequalities, after Carlo Emilio Bonferroni.
Let

$$S_1 := \sum_{i=1}^{n} \mathbb{P}(A_i), \quad S_2 := \sum_{1 \le i_1 < i_2 \le n} \mathbb{P}(A_{i_1} \cap A_{i_2}), \quad \ldots, \quad S_k := \sum_{1 \le i_1 < \cdots < i_k \le n} \mathbb{P}(A_{i_1} \cap \cdots \cap A_{i_k})$$

for all integers $k$ in $\{1, \ldots, n\}$.
Then, when $K \le n$ is odd,

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \ge P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j-1} S_j$$

holds, and when $K \le n$ is even,

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \le P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{j=1}^{n} (-1)^{j-1} S_j$$

holds.
The equalities follow from the inclusion–exclusion principle, and Boole's inequality is the special case of $K = 1$.
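The alternating upper/lower pattern is easy to verify by brute force. The sketch below (events and sample space are arbitrary choices for the demonstration) computes each $S_k$ directly and checks the bounds at every truncation point $K$:

```python
from fractions import Fraction
from itertools import combinations

OMEGA_SIZE = 12
A = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}, {0, 7, 8}]
n = len(A)

def prob(event):
    return Fraction(len(event), OMEGA_SIZE)

# S_k = sum of P(A_{i_1} ∩ ... ∩ A_{i_k}) over all k-element index sets.
S = [sum(prob(set.intersection(*combo)) for combo in combinations(A, k))
     for k in range(1, n + 1)]

p_union = prob(set().union(*A))
for K in range(1, n + 1):
    partial = sum((-1) ** (j - 1) * S[j - 1] for j in range(1, K + 1))
    if K % 2 == 1:
        assert partial >= p_union  # odd K: upper bound (K = 1 is Boole)
    else:
        assert partial <= p_union  # even K: lower bound

# The full alternating sum recovers inclusion-exclusion exactly.
assert sum((-1) ** (j - 1) * S[j - 1] for j in range(1, n + 1)) == p_union
```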
To prove the bound for odd $K$, let

$$E = \bigcap_{i=1}^{n} B_i,$$

where $B_i \in \{A_i, A_i^c\}$ for each $i = 1, \ldots, n$. Such events $E$ partition the sample space, and for every $i$, each $E$ is either contained in $A_i$ or disjoint from it.
If $E = \bigcap_{i=1}^{n} A_i^c$, then $E$ is contained in none of the $A_i$ and contributes 0 to both sides of the inequality.
Otherwise, assume $E$ is contained in exactly $L$ of the $A_i$. Then $E$ contributes exactly $P(E)$ to the right side of the inequality, while it contributes

$$\sum_{j=1}^{K} (-1)^{j-1} \binom{L}{j} P(E)$$
to the left side of the inequality. However, by Pascal's rule, this is equal to
$$\sum_{j=1}^{K} (-1)^{j-1} \left(\binom{L-1}{j-1} + \binom{L-1}{j}\right) P(E),$$
which telescopes to
$$\left(1 + \binom{L-1}{K}\right) P(E) \ge P(E).$$
Thus, the inequality holds for every such event $E$, and summing over all events $E$ yields the desired inequality:

$$\sum_{j=1}^{K} (-1)^{j-1} S_j \ge P\left(\bigcup_{i=1}^{n} A_i\right).$$
The proof for even $K$ is nearly identical.
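The telescoping step can also be checked numerically; the snippet below verifies the identity $\sum_{j=1}^{K} (-1)^{j-1} \binom{L}{j} = 1 + (-1)^{K-1} \binom{L-1}{K}$ over a range of small values (the ranges are chosen arbitrarily):

```python
from math import comb

for L in range(1, 12):
    for K in range(1, 12):
        lhs = sum((-1) ** (j - 1) * comb(L, j) for j in range(1, K + 1))
        # Pascal's rule makes the sum telescope to 1 + (-1)^(K-1) * C(L-1, K).
        assert lhs == 1 + (-1) ** (K - 1) * comb(L - 1, K)
        if K % 2 == 1:
            # For odd K the atom's contribution is at least P(E), as claimed.
            assert lhs >= 1
```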
Suppose that you are estimating 5 parameters based on a random sample, and you can control the confidence level of each estimate separately. If you want all five estimates to be good simultaneously with probability 95%, what confidence level should you demand of each one?
Making each estimate individually good with probability 95% is not enough, because "all five are good" is a subset of each event "estimate $i$ is good", so the joint probability can fall well below 95%. Boole's inequality solves this problem: taking the complement of the event "all five are good" turns the requirement into a condition on the error probabilities:
P(at least one estimate is bad) ≤ P(A₁ is bad) + P(A₂ is bad) + P(A₃ is bad) + P(A₄ is bad) + P(A₅ is bad), so it suffices to make this sum at most 0.05.
One way is to make each term equal to 0.05/5 = 0.01, that is, 1%. In other words, you have to guarantee that each estimate is good with probability 99% (for example, by constructing a 99% confidence interval) to ensure that all five estimates are simultaneously good with probability at least 95%. This is called the Bonferroni method of simultaneous inference.
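A minimal sketch of the arithmetic behind this correction (variable names are my own):

```python
n_estimates = 5
familywise_alpha = 0.05  # allowed probability that any estimate is bad

# Split the error budget evenly across the five estimates.
per_estimate_alpha = familywise_alpha / n_estimates  # 0.01
confidence_level = 1 - per_estimate_alpha            # 0.99

# Union bound: P(at least one bad) <= 5 * 0.01 = 0.05.
assert n_estimates * per_estimate_alpha <= familywise_alpha
print(f"Use a {confidence_level:.0%} confidence interval for each parameter.")
```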