Cox's theorem, named after the physicist Richard Threlkeld Cox, is a derivation of the laws of probability theory from a certain set of postulates.[1] [2] This derivation justifies the so-called "logical" interpretation of probability, as the laws of probability derived by Cox's theorem are applicable to any proposition. Logical (also known as objective Bayesian) probability is a type of Bayesian probability. Other forms of Bayesianism, such as the subjective interpretation, are given other justifications.
Cox wanted his system to satisfy the following conditions:

1. Divisibility and comparability – the plausibility of a proposition is a real number and is dependent on information we have related to the proposition.
2. Common sense – plausibilities should vary sensibly with the assessment of plausibilities in the model.
3. Consistency – if the plausibility of a proposition can be derived in many ways, all the results must be equal.
The postulates as stated here are taken from Arnborg and Sjödin.[3] [4] [5] "Common sense" includes consistency with Aristotelian logic in the sense that logically equivalent propositions shall have the same plausibility.
The postulates as originally stated by Cox were not mathematically rigorous (although more so than the informal description above), as noted by Halpern.[6] [7] However, it appears to be possible to augment them with various mathematical assumptions made either implicitly or explicitly by Cox to produce a valid proof.
Cox's notation: the plausibility of a proposition $A$ given some related information $X$ is denoted by $A \mid X$.
Cox's postulates and functional equations are:

The plausibility of the conjunction $AB$ of two propositions $A$, $B$, given some related information $X$, depends only on the plausibility of $A$ given $X$ and that of $B$ given $AX$. In the form of a functional equation:

$$AB \mid X = g(A \mid X,\; B \mid AX)$$
Because of the associative nature of the conjunction in propositional logic, consistency with logic gives a functional equation saying that the function $g$ is an associative binary operation.

All strictly increasing associative binary operations on the real numbers are isomorphic to multiplication of numbers in a subinterval of $[0, +\infty]$, which means that there is a monotonic function $w$ mapping plausibilities to $[0, +\infty]$ such that

$$w(AB \mid X) = w(A \mid X)\, w(B \mid AX)$$
In case $A$ given $X$ is certain, we have $AB \mid X = B \mid X$ and $B \mid AX = B \mid X$ due to the requirement of consistency. The general equation then leads to

$$w(B \mid X) = w(A \mid X)\, w(B \mid X).$$
This shall hold for any proposition $B$, which leads to $w(A \mid X) = 1$.

In case $A$ given $X$ is impossible, we have $AB \mid X = A \mid X$ and $A \mid BX = A \mid X$ due to the requirement of consistency. The general equation then leads to

$$w(A \mid X) = w(B \mid X)\, w(A \mid X).$$

This shall hold for any proposition $B$, which leads to $w(A \mid X) = 0$.
Due to the requirement of monotonicity, this means that $w$ maps plausibilities to the interval $[0, 1]$.
The plausibility of a proposition determines the plausibility of the proposition's negation. This postulates the existence of a function $f$ such that

$$w(\lnot A \mid X) = f(w(A \mid X))$$

Because "a double negative is an affirmative", consistency with logic gives a functional equation

$$f(f(x)) = x,$$

saying that the function $f$ is an involution, i.e., it is its own inverse.
The above functional equations and consistency with logic imply that

$$w(AB \mid X) = w(A \mid X)\, f(w(\lnot B \mid AX)) = w(A \mid X)\, f\!\left(\frac{w(A \lnot B \mid X)}{w(A \mid X)}\right)$$
Since $AB$ is logically equivalent to $BA$, consistency with logic gives

$$w(A \mid X)\, f\!\left(\frac{w(A \lnot B \mid X)}{w(A \mid X)}\right) = w(B \mid X)\, f\!\left(\frac{w(B \lnot A \mid X)}{w(B \mid X)}\right)$$
If, in particular, $B = \lnot(AD)$ for some proposition $D$, then $A \lnot B = \lnot B$ and $B \lnot A = \lnot A$, hence

$$w(A \lnot B \mid X) = w(\lnot B \mid X) = f(w(B \mid X))$$

and

$$w(B \lnot A \mid X) = w(\lnot A \mid X) = f(w(A \mid X))$$
Abbreviating $w(A \mid X) = x$ and $w(B \mid X) = y$, we get the functional equation

$$x\, f\!\left(\frac{f(y)}{x}\right) = y\, f\!\left(\frac{f(x)}{y}\right)$$
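As an illustration (not part of Cox's derivation), the involution $f(x) = 1 - x$, which corresponds to the familiar rule $\Pr(\lnot A) = 1 - \Pr(A)$, satisfies both functional equations; a quick numerical check:

```python
# Illustration: numerically verify that f(x) = 1 - x satisfies both
# the involution equation f(f(x)) = x and the derived functional
# equation  x * f(f(y)/x) == y * f(f(x)/y).
# (Algebraically, both sides of the latter reduce to x + y - 1.)
import itertools

def f(x):
    return 1.0 - x

for x, y in itertools.product([0.1, 0.25, 0.5, 0.7, 0.9], repeat=2):
    assert abs(f(f(x)) - x) < 1e-9          # involution: f is its own inverse
    lhs = x * f(f(y) / x)
    rhs = y * f(f(x) / y)
    assert abs(lhs - rhs) < 1e-9            # derived functional equation
print("all checks passed")
```

This does not show that $f(x) = 1 - x$ is the only solution; Cox's argument characterizes the full family of solutions, of which this is the representative recovered after the regraduation $w^m$ below.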
The laws of probability derivable from these postulates are the following.[8] Let $A \mid B$ be the plausibility of the proposition $A$ given $B$ satisfying Cox's postulates. Then there is a function $w$ mapping plausibilities to the interval $[0, 1]$ and a positive number $m$ such that:

1. Certainty is represented by $w(A \mid B) = 1$.
2. $w^m(A \mid B) + w^m(\lnot A \mid B) = 1$.
3. $w(AB \mid C) = w(A \mid C)\, w(B \mid AC) = w(B \mid C)\, w(A \mid BC)$.
It is important to note that the postulates imply only these general properties. We may recover the usual laws of probability by setting a new function, conventionally denoted $P$ or $\Pr$, equal to $w^m$. Then we obtain the laws of probability in a more familiar form:

1. Certain truth is represented by $\Pr(A \mid B) = 1$, and certain falsehood by $\Pr(A \mid B) = 0$.
2. $\Pr(A \mid B) + \Pr(\lnot A \mid B) = 1$.
3. $\Pr(AB \mid C) = \Pr(A \mid C)\, \Pr(B \mid AC) = \Pr(B \mid C)\, \Pr(A \mid BC)$.
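To make the recovered laws concrete, here is a small numerical check (an illustration, not from the original derivation) using ordinary conditional probabilities on an arbitrary joint distribution over two binary propositions:

```python
# Illustration: verify the negation rule (2) and the product rule (3)
# on a concrete joint distribution. Events are sets of outcomes (a, b)
# with a, b in {0, 1}; the probabilities are arbitrary but sum to 1.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def pr(event, given=None):
    """Conditional probability Pr(event | given); `given` defaults to the whole space."""
    if given is None:
        given = set(joint)
    num = sum(p for o, p in joint.items() if o in event and o in given)
    den = sum(p for o, p in joint.items() if o in given)
    return num / den

A = {(1, 0), (1, 1)}            # proposition A: first bit is 1
B = {(0, 1), (1, 1)}            # proposition B: second bit is 1
C = set(joint)                  # background information C: the whole space
notA = C - A

# Rule 2: Pr(A|B) + Pr(not A|B) = 1
assert abs(pr(A, B) + pr(notA, B) - 1.0) < 1e-9
# Rule 3: Pr(AB|C) = Pr(A|C) Pr(B|AC) = Pr(B|C) Pr(A|BC)
assert abs(pr(A & B, C) - pr(A, C) * pr(B, A & C)) < 1e-9
assert abs(pr(A & B, C) - pr(B, C) * pr(A, B & C)) < 1e-9
print("probability laws verified")
```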
Rule 2 is a rule for negation, and rule 3 is a rule for conjunction. Given that any proposition containing conjunction, disjunction, and negation can be equivalently rephrased using conjunction and negation alone (by De Morgan's laws, $A \lor B \equiv \lnot(\lnot A \land \lnot B)$), we can now handle any compound proposition.
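For instance, no separate rule for disjunction is needed: it follows from rules 2 and 3 via De Morgan's law. A short numerical illustration (my own, on a hypothetical joint distribution):

```python
# Illustration: Pr(A or B) computed directly equals the De Morgan
# rephrasing 1 - Pr(not-A and not-B), which uses only negation and
# conjunction -- the two operations governed by rules 2 and 3.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
space = set(joint)

def pr(event):
    return sum(p for o, p in joint.items() if o in event)

A = {(1, 0), (1, 1)}            # proposition A: first bit is 1
B = {(0, 1), (1, 1)}            # proposition B: second bit is 1

direct = pr(A | B)                                 # Pr(A or B) directly
via_de_morgan = 1.0 - pr((space - A) & (space - B))  # via negation + conjunction
assert abs(direct - via_de_morgan) < 1e-9
print("De Morgan rephrasing agrees")
```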
The laws thus derived yield finite additivity of probability, but not countable additivity. The measure-theoretic formulation of Kolmogorov assumes that a probability measure is countably additive. This slightly stronger condition is necessary for certain results. An elementary example (in which this assumption merely simplifies the calculation rather than being necessary for it) is that the probability of seeing heads for the first time after an even number of flips in a sequence of fair coin flips is $\tfrac13$.
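The value $\tfrac13$ follows from a standard geometric-series calculation, sketched here for concreteness: the first head occurs on flip $2k$ exactly when the first $2k-1$ flips are tails and flip $2k$ is heads, an event of probability $(1/2)^{2k}$, so

$$\Pr(\text{first head on an even flip}) = \sum_{k=1}^{\infty} \left(\frac{1}{2}\right)^{2k} = \sum_{k=1}^{\infty} \frac{1}{4^k} = \frac{1/4}{1 - 1/4} = \frac{1}{3}.$$

Summing over this countably infinite collection of disjoint events is where countable additivity simplifies the argument, though a finitely additive limiting argument gives the same answer.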
Cox's theorem has come to be used as one of the justifications for the use of Bayesian probability theory. For example, in Jaynes it is discussed in detail in chapters 1 and 2 and is a cornerstone for the rest of the book. Probability is interpreted as a formal system of logic, the natural extension of Aristotelian logic (in which every statement is either true or false) into the realm of reasoning in the presence of uncertainty.
It has been debated to what degree the theorem excludes alternative models for reasoning about uncertainty. For example, if certain "unintuitive" mathematical assumptions were dropped then alternatives could be devised, e.g., an example provided by Halpern. However, Arnborg and Sjödin suggest additional "common sense" postulates, which would allow the assumptions to be relaxed in some cases while still ruling out the Halpern example. Other approaches were devised by Hardy[9] or Dupré and Tipler.[10]
The original formulation of Cox's theorem is in Cox (1946), which is extended with additional results and more discussion in Cox (1961). Jaynes cites Abel[11] for the first known use of the associativity functional equation. János Aczél[12] provides a long proof of the "associativity equation" (pp. 256–267). Jaynes reproduces the shorter proof by Cox in which differentiability is assumed. A guide to Cox's theorem by Van Horn aims at comprehensively introducing the reader to all these references.[13]
In Aristotelian logic, the truth value of a conjunction $P \land Q$ is a function $f$ of the truth values of $P$ and $Q$:

$$T(P \land Q) = f(T(P), T(Q)),$$

where $f(x, y) = x \land y$ for truth values $x$ and $y$; Cox's postulates extend this functional dependence from twofold truth values to graded plausibilities.