The standard probability axioms are the foundations of probability theory introduced by Russian mathematician Andrey Kolmogorov in 1933.[1] These axioms remain central to the theory and bear directly on mathematics, the physical sciences, and real-world applications of probability.[2]
There are several other (equivalent) approaches to formalising probability. Bayesians will often motivate the Kolmogorov axioms by invoking Cox's theorem or the Dutch book arguments instead.[3][4]
The assumptions as to setting up the axioms can be summarised as follows: let (\Omega, F, P) be a measure space such that P(E) is the probability of some event E and P(\Omega) = 1. Then (\Omega, F, P) is a probability space, with sample space \Omega, event space F and probability measure P.
The probability of an event is a non-negative real number:
P(E) \in \mathbb{R}, \quad P(E) \geq 0 \quad \forall E \in F
where F is the event space. It follows that P(E) is always finite, in contrast with more general measure theory.
This is the assumption of unit measure: that the probability that at least one of the elementary events in the entire sample space will occur is 1.
P(\Omega)=1
This is the assumption of σ-additivity:
Any countable sequence of disjoint sets (synonymous with mutually exclusive events) E_1, E_2, \ldots satisfies
P\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} P(E_i).
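To see the axioms in action, the following is a minimal Python sketch, assuming a hypothetical fair six-sided die; the helper names prob and events are illustrative, and on a finite sample space the countable additivity of the third axiom reduces to the finite additivity checked here.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical finite example: a fair six-sided die.
omega = frozenset(range(1, 7))
p_outcome = {w: Fraction(1, 6) for w in omega}  # elementary probabilities

def prob(event):
    """Probability measure P: sum of the elementary probabilities in the event."""
    return sum(p_outcome[w] for w in event)

def events(space):
    """Event space F: the power set of a finite sample space."""
    s = list(space)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

F = events(omega)

# First axiom: every event has a non-negative real probability.
assert all(prob(E) >= 0 for E in F)

# Second axiom: unit measure, P(Omega) = 1.
assert prob(omega) == 1

# Third axiom, reduced to the finite case: disjoint events add.
A, B = frozenset({1, 2}), frozenset({5, 6})
assert A.isdisjoint(B) and prob(A | B) == prob(A) + prob(B)
```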
From the Kolmogorov axioms, one can deduce other useful rules for studying probabilities. The proofs[6][7][8] of these rules illustrate the power of the third axiom and its interaction with the first two. Four of the immediate corollaries and their proofs are shown below:
If A \subseteq B, then P(A) \leq P(B).
That is, if A is a subset of, or equal to, B, then the probability of A is less than or equal to the probability of B.
In order to verify the monotonicity property, we set E_1 = A and E_2 = B \setminus A, where A \subseteq B, and E_i = \varnothing for i \geq 3. It is easy to see that the sets E_i are pairwise disjoint and E_1 \cup E_2 \cup \cdots = B. Hence, we obtain from the third axiom that
P(A) + P(B \setminus A) + \sum_{i=3}^{\infty} P(E_i) = P(B).
Since, by the first axiom, the left-hand side of this equation is a series of non-negative numbers, and since it converges to P(B), which is finite, we obtain both P(A) \leq P(B) and P(\varnothing) = 0.
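As a hypothetical instance of the argument above, take a fair six-sided die with A = \{1\} and B = \{1, 2\}, so that B \setminus A = \{2\}:
P(A) + P(B \setminus A) = \tfrac{1}{6} + \tfrac{1}{6} = \tfrac{2}{6} = P(B), and indeed P(A) = \tfrac{1}{6} \leq \tfrac{2}{6} = P(B).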
P(\varnothing)=0.
In many cases, \varnothing is not the only event with probability zero. The statement itself follows from the third axiom: P(\varnothing \cup \varnothing) = P(\varnothing) since \varnothing \cup \varnothing = \varnothing, so P(\varnothing) + P(\varnothing) = P(\varnothing) since \varnothing is disjoint from itself; subtracting P(\varnothing) from each side then gives P(\varnothing) = 0.
P\left(A^c\right) = P(\Omega \setminus A) = 1 - P(A)
Given that A and its complement A^c are mutually exclusive and A \cup A^c = \Omega, the third axiom gives
P(A \cup A^c) = P(A) + P(A^c)
and, by the second axiom,
P(A \cup A^c) = P(\Omega) = 1
⇒ P(A) + P(A^c) = 1
\therefore P(A^c) = 1 - P(A)
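For a concrete illustration, consider a hypothetical fair six-sided die with A the event of rolling an even number; then
P(A^c) = 1 - P(A) = 1 - \tfrac{3}{6} = \tfrac{1}{2}.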
It immediately follows from the monotonicity property that
0 \leq P(E) \leq 1 \quad \forall E \in F.
Given the complement rule P(E^c) = 1 - P(E) and the first axiom applied to E^c, so that P(E^c) \geq 0:
1 - P(E) \geq 0
⇒ 1 \geq P(E)
\therefore 0 \leq P(E) \leq 1
Another important property is:
P(A \cup B) = P(A) + P(B) - P(A \cap B).
This is called the addition law of probability, or the sum rule. That is, the probability that an event in A or B will happen is the sum of the probability of an event in A and the probability of an event in B, minus the probability of an event that is in both A and B. The proof of this is as follows:
Firstly,
P(A \cup B) = P(A) + P(B \setminus A)
by the third axiom, since A and B \setminus A are disjoint. So,
P(A \cup B) = P(A) + P(B \setminus (A \cap B))
because B \setminus A = B \setminus (A \cap B). Also,
P(B) = P(B \setminus (A \cap B)) + P(A \cap B),
and eliminating P(B \setminus (A \cap B)) from both equations gives the desired result.
An extension of the addition law to any number of sets is the inclusion–exclusion principle.
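The following is a minimal Python sketch of the addition law, assuming a hypothetical fair six-sided die and the overlapping events A (even rolls) and B (rolls greater than 3); Fraction keeps the arithmetic exact.

```python
from fractions import Fraction

# Hypothetical check of the addition law on a fair six-sided die.
omega = set(range(1, 7))

def prob(event):
    """Uniform probability measure on the die's outcomes."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}   # even rolls
B = {4, 5, 6}   # rolls greater than 3

# P(A ∪ B) = P(A) + P(B) - P(A ∩ B); both sides equal 2/3 here.
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
```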
Setting B to the complement A^c of A in the addition law gives
P\left(A^c\right) = P(\Omega \setminus A) = 1 - P(A).
That is, the probability that any event will not happen (or the event's complement) is 1 minus the probability that it will.
Consider a single coin-toss, and assume that the coin will either land heads (H) or tails (T) (but not both). No assumption is made as to whether the coin is fair or as to whether or not any bias depends on how the coin is tossed.[9]
We may define:
\Omega=\{H,T\}
F=\{\varnothing,\{H\},\{T\},\{H,T\}\}
Kolmogorov's axioms imply that:
P(\varnothing) = 0, i.e. the probability of neither heads nor tails occurring is 0;
P(\{H,T\}^c) = 0, which restates the same fact, since the complement of \{H,T\} is \varnothing;
P(\{H\}) + P(\{T\}) = 1, i.e. the probability of either heads or tails is 1.
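A short Python sketch (with a hypothetical bias p = 3/5; any value in [0, 1] would do) shows how the axioms pin down every remaining value of P once P(\{H\}) is chosen:

```python
from fractions import Fraction

# Hypothetical biased coin: P({H}) is a free parameter p in [0, 1];
# the axioms then force the rest of P on the event space F.
p = Fraction(3, 5)                       # assumed bias

P = {
    frozenset(): Fraction(0),            # P(∅) = 0
    frozenset({'H'}): p,
    frozenset({'T'}): 1 - p,             # forced by P({H}) + P({T}) = 1
    frozenset({'H', 'T'}): Fraction(1),  # P(Ω) = 1, unit measure
}

assert P[frozenset()] == 0                             # empty set has probability 0
assert P[frozenset({'H'})] + P[frozenset({'T'})] == 1  # complementary events sum to 1
assert P[frozenset({'H', 'T'})] == 1                   # unit measure
```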