Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent[1] if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.
When dealing with collections of more than two events, two notions of independence need to be distinguished. The events are called pairwise independent if any two events in the collection are independent of each other, while mutual independence (or collective independence) of events means, informally speaking, that each event is independent of any combination of other events in the collection. A similar notion exists for collections of random variables. Mutual independence implies pairwise independence, but not the other way around. In the standard literature of probability theory, statistics, and stochastic processes, independence without further qualification usually refers to mutual independence.
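Two events $A$ and $B$ are independent (often written as $A \perp B$ or $A \perp\!\!\!\perp B$, where the latter symbol is often also used for conditional independence) if and only if their joint probability equals the product of their probabilities:

$$P(A \cap B) = P(A)P(B)$$

Independent events that both have positive probability satisfy $A \cap B \neq \emptyset$, so $A$ and $B$ are not mutually exclusive (mutually exclusive events have $A \cap B = \emptyset$). Why this defines independence is made clear by rewriting with the conditional probability $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, the probability that $A$ occurs given that $B$ has occurred or is assumed to have occurred:

$$P(A \cap B) = P(A)P(B) \iff P(A \mid B) = \frac{P(A \cap B)}{P(B)} = P(A),$$

and similarly

$$P(A \cap B) = P(A)P(B) \iff P(B \mid A) = \frac{P(A \cap B)}{P(A)} = P(B).$$

Thus, the occurrence of $B$ does not affect the probability of $A$, and vice versa. In other words, $A$ and $B$ are independent of each other. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities may be undefined if $P(A)$ or $P(B)$ is 0. Furthermore, the preferred definition makes clear by symmetry that when $A$ is independent of $B$, $B$ is also independent of $A$.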
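The product rule can be checked mechanically on a finite sample space. The following is a minimal sketch, assuming a uniform probability measure; the helper names `prob` and `independent` and the die events are illustrative choices, not part of the definition.

```python
from fractions import Fraction

def prob(event, space):
    """P(event) under the uniform measure on the finite set `space`."""
    return Fraction(len(event & space), len(space))

def independent(a, b, space):
    """Product-rule test in exact arithmetic: P(A and B) == P(A) * P(B)."""
    return prob(a & b, space) == prob(a, space) * prob(b, space)

# Hypothetical example: one roll of a fair six-sided die.
space = frozenset(range(1, 7))
even = frozenset({2, 4, 6})              # P = 1/2
at_most_four = frozenset({1, 2, 3, 4})   # P = 2/3
at_most_three = frozenset({1, 2, 3})     # P = 1/2

print(independent(even, at_most_four, space))   # True:  P({2, 4}) = 1/3 = 1/2 * 2/3
print(independent(even, at_most_three, space))  # False: P({2}) = 1/6, but 1/2 * 1/2 = 1/4
print(independent(frozenset(), even, space))    # True:  the rule applies even to a null event
```

The last call shows why the product form is convenient as a definition: it still gives an answer when one event has probability 0 and the conditional probability is undefined.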
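Stated in terms of odds, two events are independent if and only if the odds ratio of $A$ and $B$ is unity (1). Analogously with probability, this is equivalent to the conditional odds being equal to the unconditional odds:

$$O(A \mid B) = O(A) \text{ and } O(B \mid A) = O(B),$$

or to the odds of one event, given the other event, being the same as the odds of the event, given the other event not occurring:

$$O(A \mid B) = O(A \mid \neg B) \text{ and } O(B \mid A) = O(B \mid \neg A).$$

The odds ratio can be defined as

$$O(A \mid B) : O(A \mid \neg B),$$

or symmetrically for the odds of $B$ given $A$, and thus is 1 if and only if the events are independent.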
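As a small numerical illustration of the odds formulation, the sketch below computes the ratio of the odds of $A$ given $B$ to the odds of $A$ given not-$B$ from three probabilities. The function names and the input values are hypothetical, and the code assumes all conditional probabilities involved lie strictly between 0 and 1.

```python
from fractions import Fraction

def odds(p):
    """Odds p : (1 - p) as an exact fraction; requires p < 1."""
    return p / (1 - p)

def odds_ratio(p_ab, p_a, p_b):
    """O(A | B) : O(A | not B), computed from P(A and B), P(A) and P(B).

    Assumes 0 < P(B) < 1 and that both conditional probabilities
    are strictly between 0 and 1 (otherwise a division by zero occurs).
    """
    p_a_given_b = p_ab / p_b
    p_a_given_not_b = (p_a - p_ab) / (1 - p_b)
    return odds(p_a_given_b) / odds(p_a_given_not_b)

# Independent case: P(A and B) = P(A) P(B) = 1/2 * 1/3 = 1/6.
print(odds_ratio(Fraction(1, 6), Fraction(1, 2), Fraction(1, 3)))  # 1

# Dependent case: same marginals, but a larger overlap P(A and B) = 1/4.
print(odds_ratio(Fraction(1, 4), Fraction(1, 2), Fraction(1, 3)))  # 5
```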
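A finite set of events $\{A_i\}_{i=1}^{n}$ is pairwise independent if every pair of events is independent, that is, if and only if for all distinct pairs of indices $m, k$,

$$P(A_m \cap A_k) = P(A_m)P(A_k).$$

A finite set of events is mutually independent if every event is independent of any intersection of the other events, that is, if and only if for every $k \leq n$ and for every $k$ indices $1 \leq i_1 < \dots < i_k \leq n$,

$$P\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} P(A_{i_j}).$$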
This is called the multiplication rule for independent events. It is not a single condition involving only the product of all the probabilities of all single events; it must hold true for all subsets of events.
For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true.[2]
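The gap between the two notions can be seen by checking the multiplication rule over every subset of a small collection. The sketch below assumes a uniform measure on a finite space; the helper names and the choice of two fair coin tosses as the sample space are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def prob(event, space):
    """P(event) under the uniform measure; `event` must be a subset of `space`."""
    return Fraction(len(event), len(space))

def product_rule_holds(events, space):
    """True if P(intersection of events) equals the product of their probabilities."""
    joint = frozenset(space)
    product = Fraction(1)
    for event in events:
        joint &= event
        product *= prob(event, space)
    return prob(joint, space) == product

def pairwise_independent(events, space):
    """Multiplication rule for every pair only."""
    return all(product_rule_holds(pair, space) for pair in combinations(events, 2))

def mutually_independent(events, space):
    """Multiplication rule for every subset of size 2 up to len(events)."""
    return all(
        product_rule_holds(subset, space)
        for k in range(2, len(events) + 1)
        for subset in combinations(events, k)
    )

# Two fair coin tosses; each of the three events has probability 1/2.
space = frozenset({"HH", "HT", "TH", "TT"})
first_heads = frozenset({"HH", "HT"})
second_heads = frozenset({"HH", "TH"})
tosses_agree = frozenset({"HH", "TT"})
events = [first_heads, second_heads, tosses_agree]

print(pairwise_independent(events, space))  # True:  every pair meets in {"HH"}, P = 1/4
print(mutually_independent(events, space))  # False: the triple also meets in {"HH"}, 1/4 != 1/8
```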
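Stated in terms of log probability, two events are independent if and only if the log probability of the joint event is the sum of the log probabilities of the individual events:

$$\log P(A \cap B) = \log P(A) + \log P(B)$$

In information theory, negative log probability is interpreted as information content, and thus two events are independent if and only if the information content of the combined event equals the sum of the information content of the individual events:

$$I(A \cap B) = I(A) + I(B)$$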
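A short numerical check of the additivity of information content follows. It is only a sketch: the base-2 logarithm (bits), the function name `information_content`, and the probabilities used are assumptions made for illustration.

```python
import math

def information_content(p):
    """Information content in bits, I = -log2(p), for 0 < p <= 1."""
    return -math.log2(p)

p_a, p_b = 1 / 2, 1 / 8

# Independent case: the joint probability is the product 1/16.
p_joint_independent = p_a * p_b
# Hypothetical dependent case: B is contained in A, so P(A and B) = P(B) = 1/8.
p_joint_dependent = p_b

total = information_content(p_a) + information_content(p_b)
print(math.isclose(information_content(p_joint_independent), total))  # True
print(math.isclose(information_content(p_joint_dependent), total))    # False
```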
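Two random variables $X$ and $Y$ are independent if and only if, for every $x$ and $y$, the events $\{X \leq x\}$ and $\{Y \leq y\}$ are independent events (as defined above). That is, $X$ and $Y$ with cumulative distribution functions $F_X(x)$ and $F_Y(y)$ are independent if and only if the combined random variable $(X, Y)$ has a joint cumulative distribution function

$$F_{X,Y}(x,y) = F_X(x) F_Y(y) \quad \text{for all } x, y,$$

or equivalently, if the probability densities $f_X(x)$ and $f_Y(y)$ and the joint probability density $f_{X,Y}(x,y)$ exist,

$$f_{X,Y}(x,y) = f_X(x) f_Y(y) \quad \text{for all } x, y.$$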
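For discrete random variables, the analogous condition is that the joint probability mass function factorizes into the marginals. The sketch below tests that condition on a joint pmf stored as a dictionary; the representation and the function names are assumptions made for this example.

```python
from collections import defaultdict
from fractions import Fraction

def marginals(joint):
    """Marginal pmfs of X and Y from a joint pmf given as {(x, y): probability}."""
    pmf_x, pmf_y = defaultdict(Fraction), defaultdict(Fraction)
    for (x, y), p in joint.items():
        pmf_x[x] += p
        pmf_y[y] += p
    return dict(pmf_x), dict(pmf_y)

def variables_independent(joint):
    """True if joint(x, y) == pmf_x(x) * pmf_y(y) for all x and y in the marginal supports."""
    pmf_x, pmf_y = marginals(joint)
    return all(
        joint.get((x, y), Fraction(0)) == pmf_x[x] * pmf_y[y]
        for x in pmf_x
        for y in pmf_y
    )

# Two independent fair bits: every pair has probability 1/4.
independent_bits = {(x, y): Fraction(1, 4) for x in (0, 1) for y in (0, 1)}
# Y is a copy of X: the pairs (0, 1) and (1, 0) have probability 0 instead of 1/4.
copied_bit = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(variables_independent(independent_bits))  # True
print(variables_independent(copied_bit))        # False
```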
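A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.

A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if for any sequence of numbers $\{x_1, \ldots, x_n\}$, the events $\{X_1 \leq x_1\}, \ldots, \{X_n \leq x_n\}$ are mutually independent events (as defined above). This is equivalent to the following condition on the joint cumulative distribution function: a finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if

$$F_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \text{for all } x_1, \ldots, x_n.$$

It is not necessary here to require that the probability distribution factorizes for all possible subsets as in the case for $n$ events. This is not required because, for example,

$$F_{X_1,X_2,X_3}(x_1,x_2,x_3) = F_{X_1}(x_1) \cdot F_{X_2}(x_2) \cdot F_{X_3}(x_3)$$

implies, on letting $x_2 \to \infty$ so that $F_{X_2}(x_2) \to 1$,

$$F_{X_1,X_3}(x_1,x_3) = F_{X_1}(x_1) \cdot F_{X_3}(x_3).$$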
The measure-theoretically inclined may prefer to substitute events $\{X \in A\}$ for events $\{X \leq x\}$ in the above definition, where $A$ is any Borel set.
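Two random vectors $\mathbf{X} = (X_1, \ldots, X_m)^{\mathrm{T}}$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\mathrm{T}}$ are called independent if

$$F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x}, \mathbf{y},$$

where $F_{\mathbf{X}}(\mathbf{x})$ and $F_{\mathbf{Y}}(\mathbf{y})$ denote the cumulative distribution functions of $\mathbf{X}$ and $\mathbf{Y}$, and $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x},\mathbf{y})$ denotes their joint cumulative distribution function. Independence of $\mathbf{X}$ and $\mathbf{Y}$ is often denoted by $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$. Written component-wise, $\mathbf{X}$ and $\mathbf{Y}$ are called independent if

$$F_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(x_1,\ldots,x_m,y_1,\ldots,y_n) = F_{X_1,\ldots,X_m}(x_1,\ldots,x_m) \cdot F_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_m,y_1,\ldots,y_n.$$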
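The definition of independence may be extended from random vectors to a stochastic process. For an independent stochastic process, it is required that the random variables obtained by sampling the process at any $n$ times $t_1, \ldots, t_n$ are independent random variables, for any $n$.

Formally, a stochastic process $\{X_t\}_{t \in \mathcal{T}}$ is called independent if and only if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$,

$$F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = F_{X_{t_1}}(x_1) \cdot \ldots \cdot F_{X_{t_n}}(x_n) \quad \text{for all } x_1,\ldots,x_n,$$

where $F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) = P(X(t_1) \leq x_1, \ldots, X(t_n) \leq x_n)$. Independence of a stochastic process is a property within a stochastic process, not between two stochastic processes.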
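Independence of two stochastic processes is a property between two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ that are defined on the same probability space $(\Omega, \mathcal{F}, P)$. Formally, two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ are said to be independent if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$, the random vectors $(X(t_1), \ldots, X(t_n))$ and $(Y(t_1), \ldots, Y(t_n))$ are independent, that is, if

$$F_{X_{t_1},\ldots,X_{t_n},Y_{t_1},\ldots,Y_{t_n}}(x_1,\ldots,x_n,y_1,\ldots,y_n) = F_{X_{t_1},\ldots,X_{t_n}}(x_1,\ldots,x_n) \cdot F_{Y_{t_1},\ldots,Y_{t_n}}(y_1,\ldots,y_n) \quad \text{for all } x_1,\ldots,x_n,y_1,\ldots,y_n.$$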
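The definitions above (for events and for random variables) are both generalized by the following definition of independence for σ-algebras. Let $(\Omega, \Sigma, P)$ be a probability space and let $\mathcal{A}$ and $\mathcal{B}$ be two sub-σ-algebras of $\Sigma$. $\mathcal{A}$ and $\mathcal{B}$ are said to be independent if, whenever $A \in \mathcal{A}$ and $B \in \mathcal{B}$,

$$P(A \cap B) = P(A)P(B).$$

Likewise, a finite family of σ-algebras $(\tau_i)_{i \in I}$, where $I$ is an index set, is said to be independent if and only if

$$\forall (A_i)_{i \in I} \in \prod\nolimits_{i \in I} \tau_i : \; P\left(\bigcap\nolimits_{i \in I} A_i\right) = \prod\nolimits_{i \in I} P(A_i)$$

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.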
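The new definition relates to the previous ones very directly:

Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event $E \in \Sigma$ is, by definition,

$$\sigma(\{E\}) = \{\emptyset, E, \Omega \setminus E, \Omega\}.$$

Two random variables $X$ and $Y$ defined over $\Omega$ are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable $X$ taking values in some measurable space $S$ consists, by definition, of all subsets of $\Omega$ of the form $X^{-1}(U)$, where $U$ is any measurable subset of $S$.

Using this definition, it is easy to show that if $X$ and $Y$ are random variables and $Y$ is constant, then $X$ and $Y$ are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra $\{\varnothing, \Omega\}$. Probability zero events cannot affect independence, so independence also holds if $Y$ is only almost surely constant.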
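The link between independent events and the σ-algebras they generate can be checked by brute force on a small finite space. The sketch below assumes a uniform measure on two fair coin tosses; `generated_sigma_algebra` and `families_independent` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

space = frozenset(product("HT", repeat=2))  # two fair coin tosses, uniform measure

def prob(event):
    """P(event) under the uniform measure on `space`."""
    return Fraction(len(event), len(space))

def generated_sigma_algebra(event):
    """sigma({E}) = {empty set, E, complement of E, whole space}."""
    return [frozenset(), event, space - event, space]

def families_independent(family_a, family_b):
    """True if P(A and B) == P(A) P(B) for every A in family_a and B in family_b."""
    return all(prob(a & b) == prob(a) * prob(b) for a in family_a for b in family_b)

first_heads = frozenset(o for o in space if o[0] == "H")
second_heads = frozenset(o for o in space if o[1] == "H")

# Independent events generate independent sigma-algebras (all 16 pairs pass).
print(families_independent(generated_sigma_algebra(first_heads),
                           generated_sigma_algebra(second_heads)))  # True
# An event of probability 1/2 is not independent of itself.
print(families_independent(generated_sigma_algebra(first_heads),
                           generated_sigma_algebra(first_heads)))   # False
```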
Note that an event is independent of itself if and only if
$$P(A) = P(A \cap A) = P(A) \cdot P(A) \iff P(A) = 0 \text{ or } P(A) = 1.$$
Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.[8]
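See main article: Correlation and dependence.

If $X$ and $Y$ are statistically independent random variables, then the expectation operator $\operatorname{E}$ has the property

$$\operatorname{E}[X^n Y^m] = \operatorname{E}[X^n] \operatorname{E}[Y^m],$$

and the covariance $\operatorname{cov}[X, Y]$ is zero, as follows from

$$\operatorname{cov}[X, Y] = \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y].$$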
The converse does not hold: two random variables with a covariance of 0 may still fail to be independent.
See also: Uncorrelatedness (probability theory).
Similarly for two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$: if they are independent, then they are uncorrelated.
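That zero covariance does not imply independence of random variables can be verified on a small discrete case. In the sketch below, the choice of $X$ uniform on $\{-1, 0, 1\}$ with $Y = X^2$ is an illustrative assumption, as are the helper names.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1} and Y = X**2, so Y is a function of X.
pmf_x = {x: Fraction(1, 3) for x in (-1, 0, 1)}
joint = {(x, x * x): p for x, p in pmf_x.items()}

def expect(f):
    """E[f(X, Y)] under the joint pmf."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def prob(predicate):
    """P(predicate(X, Y)) under the joint pmf."""
    return sum(p for (x, y), p in joint.items() if predicate(x, y))

covariance = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)
print(covariance)  # 0

# The product rule fails for the events {X = 0} and {Y = 0}: 1/3 versus 1/3 * 1/3.
p_joint = prob(lambda x, y: x == 0 and y == 0)
p_product = prob(lambda x, y: x == 0) * prob(lambda x, y: y == 0)
print(p_joint == p_product)  # False
```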
Two random variables $X$ and $Y$ are independent if and only if the characteristic function of the random vector $(X, Y)$ satisfies

$$\varphi_{(X,Y)}(t,s) = \varphi_X(t) \cdot \varphi_Y(s).$$

In particular the characteristic function of their sum is the product of their marginal characteristic functions:

$$\varphi_{X+Y}(t) = \varphi_X(t) \cdot \varphi_Y(t),$$

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.
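The product formula for the characteristic function of a sum can be checked numerically for independent discrete variables. The sketch below uses the standard definition $\varphi(t) = \operatorname{E}[e^{itX}]$; the die and coin distributions, the helper names, and the tolerance are assumptions. It illustrates only the forward implication, not the converse.

```python
import cmath
from itertools import product

def char_fn(pmf, t):
    """phi(t) = E[exp(i t X)] for a discrete pmf given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def sum_pmf_independent(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent (discrete convolution)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

die = {k: 1 / 6 for k in range(1, 7)}   # fair six-sided die
coin = {0: 0.5, 1: 0.5}                 # fair coin, independent of the die
total = sum_pmf_independent(die, coin)

def product_rule_gap(t):
    """|phi_{X+Y}(t) - phi_X(t) phi_Y(t)|, expected to be zero up to rounding."""
    return abs(char_fn(total, t) - char_fn(die, t) * char_fn(coin, t))

for t in (0.0, 0.7, 2.5):
    print(t, product_rule_gap(t) < 1e-12)
```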
The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are not independent.
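The dice example can be confirmed by enumerating the 36 equally likely outcomes of two rolls. This is a minimal sketch; the predicate and helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) outcomes

def prob(predicate):
    """P(predicate) under the uniform measure on two fair dice."""
    return Fraction(sum(1 for r in rolls if predicate(r)), len(rolls))

def independent(pred_a, pred_b):
    """Product-rule test: P(A and B) == P(A) * P(B)."""
    return prob(lambda r: pred_a(r) and pred_b(r)) == prob(pred_a) * prob(pred_b)

def first_is_six(r):
    return r[0] == 6

def second_is_six(r):
    return r[1] == 6

def sum_is_eight(r):
    return r[0] + r[1] == 8

print(independent(first_is_six, second_is_six))  # True:  1/36 == 1/6 * 1/6
print(independent(first_is_six, sum_is_eight))   # False: 1/36 != 1/6 * 5/36
```

Swapping the target sum from 8 to 7 makes the pair independent, since every value of the first die leaves exactly one way to reach 7.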
If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.
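The card example can be reproduced by enumerating ordered pairs of draws from a 52-card deck, assuming 26 red and 26 black cards. Only colours are tracked; the function name `red_probabilities` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import permutations, product

deck = ["red"] * 26 + ["black"] * 26  # only colours matter here

def red_probabilities(ordered_draws):
    """Return (P(first red), P(second red), P(both red)) over equally likely ordered draws."""
    draws = list(ordered_draws)
    total = len(draws)
    first = sum(1 for a, b in draws if a == "red")
    second = sum(1 for a, b in draws if b == "red")
    both = sum(1 for a, b in draws if a == "red" and b == "red")
    return Fraction(first, total), Fraction(second, total), Fraction(both, total)

with_replacement = red_probabilities(product(deck, repeat=2))    # 52 * 52 ordered pairs
without_replacement = red_probabilities(permutations(deck, 2))   # 52 * 51 ordered pairs

for label, (p1, p2, p12) in [("with", with_replacement), ("without", without_replacement)]:
    print(label, p12, p1 * p2, p12 == p1 * p2)
# with 1/4 1/4 True
# without 25/102 1/4 False
```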
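Consider the two probability spaces shown. In both cases, $P(A) = P(B) = 1/2$ and $P(C) = 1/4$. The events in the first space are pairwise independent because $P(A \mid B) = P(A \mid C) = 1/2 = P(A)$, $P(B \mid A) = P(B \mid C) = 1/2 = P(B)$, and $P(C \mid A) = P(C \mid B) = 1/4 = P(C)$; but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

$$P(A \mid BC) = \frac{P(ABC)}{P(BC)} = \tfrac{4}{5} \ne P(A)$$

$$P(B \mid AC) = \frac{P(ABC)}{P(AC)} = \tfrac{4}{5} \ne P(B)$$

$$P(C \mid AB) = \frac{P(ABC)}{P(AB)} = \tfrac{2}{5} \ne P(C)$$

In the mutually independent case, however,

$$P(A \mid BC) = \frac{P(ABC)}{P(BC)} = \tfrac{1}{2} = P(A)$$

$$P(B \mid AC) = \frac{P(ABC)}{P(AC)} = \tfrac{1}{2} = P(B)$$

$$P(C \mid AB) = \frac{P(ABC)}{P(AB)} = \tfrac{1}{4} = P(C)$$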
It is possible to create a three-event example in which

$$P(A \cap B \cap C) = P(A)P(B)P(C),$$

and yet no two of the three events are pairwise independent (and hence the set of events is not mutually independent).[11] This example shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just the single product over all the events as in this example.
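One way such a situation can arise is sketched below. The eight-point uniform space and the sets chosen for the three events are constructed here for illustration and are not taken from the cited reference.

```python
from fractions import Fraction
from itertools import combinations

space = frozenset(range(1, 9))  # eight equally likely outcomes (illustrative)
events = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3, 5}),
    "C": frozenset({1, 6, 7, 8}),
}

def prob(event):
    """P(event) under the uniform measure on `space`."""
    return Fraction(len(event), len(space))

# The triple product rule holds: the three events meet only in {1}.
triple = events["A"] & events["B"] & events["C"]
triple_rule = prob(triple) == prob(events["A"]) * prob(events["B"]) * prob(events["C"])
print(triple_rule)  # True: 1/8 == 1/2 * 1/2 * 1/2

# No pair satisfies the product rule.
independent_pairs = [
    (m, k)
    for m, k in combinations(events, 2)
    if prob(events[m] & events[k]) == prob(events[m]) * prob(events[k])
]
print(independent_pairs)  # []: P(AB) = 3/8 and P(AC) = P(BC) = 1/8, none equals 1/4
```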
See main article: Conditional independence.
The events $A$ and $B$ are conditionally independent given an event $C$ when

$$P(A \cap B \mid C) = P(A \mid C) \cdot P(B \mid C).$$
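Intuitively, two random variables $X$ and $Y$ are conditionally independent given $Z$ if, once $Z$ is known, the value of $Y$ does not add any additional information about $X$. For instance, two measurements $X$ and $Y$ of the same underlying quantity $Z$ are not independent, but they are conditionally independent given $Z$ (unless the errors in the two measurements are somehow connected).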
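The formal definition of conditional independence is based on the idea of conditional distributions. If $X$, $Y$, and $Z$ are discrete random variables, then we define $X$ and $Y$ to be conditionally independent given $Z$ if

$$P(X \leq x, Y \leq y \mid Z = z) = P(X \leq x \mid Z = z) \cdot P(Y \leq y \mid Z = z)$$

for all $x$, $y$ and $z$ such that $P(Z = z) > 0$. On the other hand, if the random variables are continuous and have a joint probability density function $f_{XYZ}(x,y,z)$, then $X$ and $Y$ are conditionally independent given $Z$ if

$$f_{XY \mid Z}(x, y \mid z) = f_{X \mid Z}(x \mid z) \cdot f_{Y \mid Z}(y \mid z)$$

for all real numbers $x$, $y$ and $z$ such that $f_Z(z) > 0$.

If discrete $X$ and $Y$ are conditionally independent given $Z$, then

$$P(X = x \mid Y = y, Z = z) = P(X = x \mid Z = z)$$

for any $x$, $y$ and $z$ with $P(Z = z) > 0$. That is, the conditional distribution for $X$ given $Y$ and $Z$ is the same as that given $Z$ alone. A similar equation holds for the conditional probability density functions in the continuous case.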
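Conditional independence without unconditional independence can be illustrated with a small discrete model. In the sketch below, $Z$ selects one of two biased coins with equal probability and $X$, $Y$ are two flips of the selected coin; the biases 1/4 and 3/4 and the helper names are assumptions. For discrete variables the factorization is checked on point probabilities, which is equivalent to the condition on cumulative probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical biases: coin Z=0 shows heads with probability 1/4, coin Z=1 with 3/4.
heads_prob = {0: Fraction(1, 4), 1: Fraction(3, 4)}

def flip(z, outcome):
    """P(one flip of coin z equals outcome), with 1 = heads and 0 = tails."""
    return heads_prob[z] if outcome == 1 else 1 - heads_prob[z]

# Joint pmf of (X, Y, Z): pick a coin uniformly, then flip it twice.
joint = {
    (x, y, z): Fraction(1, 2) * flip(z, x) * flip(z, y)
    for x, y, z in product((0, 1), repeat=3)
}

def p(**fixed):
    """Probability that the named coordinates take the given values, e.g. p(x=1, z=0)."""
    names = ("x", "y", "z")
    return sum(
        pr for outcome, pr in joint.items()
        if all(outcome[names.index(k)] == v for k, v in fixed.items())
    )

def conditionally_independent():
    """Check P(X=x, Y=y | Z=z) == P(X=x | Z=z) * P(Y=y | Z=z) for all x, y, z."""
    return all(
        p(x=x, y=y, z=z) / p(z=z) == (p(x=x, z=z) / p(z=z)) * (p(y=y, z=z) / p(z=z))
        for x, y, z in product((0, 1), repeat=3)
    )

print(conditionally_independent())      # True
print(p(x=1, y=1) == p(x=1) * p(y=1))   # False: 5/16 != 1/2 * 1/2
```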
Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.
Before 1933, independence, in probability theory, was defined in a verbal manner. For example, de Moivre gave the following definition: “Two events are independent, when they have no connexion one with the other, and that the happening of one neither forwards nor obstructs the happening of the other”.[12] If there are n independent events, the probability of the event that all of them happen was computed as the product of the probabilities of these n events. Apparently, there was the conviction that this formula was a consequence of the above definition. (Sometimes this was called the Multiplication Theorem.) Of course, a proof of this assertion cannot work without further, more formal, tacit assumptions.
The definition of independence, given in this article, became the standard definition (now used in all books) after it appeared in 1933 as part of Kolmogorov's axiomatization of probability.[13] Kolmogorov credited it to S.N. Bernstein, and quoted a publication which had appeared in Russian in 1927.[14]
Unfortunately, neither Bernstein nor Kolmogorov was aware of the work of Georg Bohlmann. Bohlmann had given the same definition for two events in 1901[15] and for n events in 1908.[16] In the latter paper, he studied his notion in detail. For example, he gave the first example showing that pairwise independence does not imply mutual independence. Even today, Bohlmann is rarely quoted. More about his work can be found in On the contributions of Georg Bohlmann to probability theory.[17]