Statistical proof explained

Statistical proof is the rational demonstration of the degree of certainty for a proposition, hypothesis, or theory. It is used to convince others following a statistical test of the supporting evidence and of the types of inferences that can be drawn from the test scores. Statistical methods are used to increase the understanding of the facts, and the proof demonstrates the validity and logic of inference with explicit reference to a hypothesis, the experimental data, the facts, the test, and the odds. Proof has two essential aims: the first is to convince and the second is to explain the proposition through peer and public review.^[1]

The burden of proof rests on the demonstrable application of the statistical method, the disclosure of the assumptions, and the relevance of the test to a genuine understanding of the data relative to the external world. There are adherents to several different statistical philosophies of inference, such as Bayes' theorem versus the likelihood function, or positivism versus critical rationalism. These methods of reasoning have a direct bearing on statistical proof and its interpretations in the broader philosophy of science.^[2]

A common demarcation between science and non-science is the hypothetico-deductive proof of falsification developed by Karl Popper, which is a well-established practice in the tradition of statistics. Other modes of inference, however, may include the inductive and abductive modes of proof.^[3] Scientists do not use statistical proof as a means to attain certainty, but to falsify claims and explain theory. Science cannot achieve absolute certainty, nor is it a continuous march toward an objective truth, as the vernacular (as opposed to the scientific) meaning of the term "proof" might imply. Statistical proof offers a kind of proof of a theory's falsity and the means to learn heuristically through repeated statistical trials and experimental error. Statistical proof also has applications in legal matters, with implications for the legal burden of proof.^[4]

Axioms

There are two kinds of axioms: 1) conventions that are taken as true, which should be avoided because they cannot be tested, and 2) hypotheses.^[5] Proof in the theory of probability was built on four axioms developed in the late 17th century:

The probability of a hypothesis is a non-negative real number: $\Pr(h) \geq 0$;

The probability of necessary truth equals one: $\Pr(t) = 1$;

If two hypotheses h₁ and h₂ are mutually exclusive, then the sum of their probabilities is equal to the probability of their disjunction: $\Pr(h_1) + \Pr(h_2) = \Pr(h_1 \lor h_2)$;

The conditional probability of h₁ given h₂, $\Pr(h_1 \mid h_2)$, is equal to the unconditional probability $\Pr(h_1 \land h_2)$ of the conjunction h₁ and h₂, divided by the unconditional probability $\Pr(h_2)$ of h₂ where that probability is positive:

$$\Pr(h_1 \mid h_2) = \frac{\Pr(h_1 \land h_2)}{\Pr(h_2)}, \quad \text{where } \Pr(h_2) > 0.$$
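As an illustration only, the four axioms can be checked on a small finite sample space. The sketch below assumes a single roll of a fair six-sided die, with hypotheses represented as sets of outcomes; the names `OUTCOMES`, `pr`, and `pr_given` are hypothetical and do not come from the cited sources.

```python
from fractions import Fraction

# Assumed sample space (illustrative): one roll of a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))

def pr(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def pr_given(h1, h2):
    """Conditional probability Pr(h1 | h2), defined only where Pr(h2) > 0."""
    if pr(h2) == 0:
        raise ValueError("Pr(h2) must be positive")
    return pr(h1 & h2) / pr(h2)

even = frozenset({2, 4, 6})
low = frozenset({1, 2, 3})
one = frozenset({1})

axiom_1 = pr(even) >= 0                           # non-negativity
axiom_2 = pr(OUTCOMES) == 1                       # the certain event has probability one
axiom_3 = pr(even) + pr(one) == pr(even | one)    # even and one are mutually exclusive
cond = pr_given(even, low)                        # Pr(even and low) / Pr(low)

print(axiom_1, axiom_2, axiom_3, cond)
```

Exact fractions are used instead of floating-point numbers so that the equalities in the axioms can be tested without rounding error.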

The preceding axioms provide the statistical proof and basis for the laws of randomness, or objective chance, from which modern statistical theory has advanced. Experimental data, however, can never prove that a hypothesis (h) is true; the proof instead relies on an inductive inference that measures the probability of the hypothesis relative to the empirical data. The proof is in the rational demonstration of using the logic of inference, math, testing, and deductive reasoning of significance.^[6]

Test and proof

See main article: Statistical tests.

The term "proof" descends from the Latin probare, meaning "to test", which is also the root of "provable" and "probable".^[7] ^[8] Hence, proof is a form of inference by means of a statistical test. Statistical tests are formulated on models that generate probability distributions. Examples of probability distributions include the binomial, normal, and Poisson distributions, which give exact descriptions of variables that behave according to natural laws of random chance. When a statistical test is applied to samples of a population, the test determines whether the sample statistics are significantly different from the assumed null model. True values of a population, which are unknowable in practice, are called parameters of the population. Researchers draw samples from populations and calculate statistics such as the mean or standard deviation, which provide estimates of the parameters. If the entire population is sampled, then the sample mean and distribution will converge with the parametric distribution.^[9]
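The distinction between parameters and sample statistics can be sketched with a simulated finite population. The population below (1,000 values drawn from a normal distribution with mean 50 and standard deviation 10) and the helper `sample_estimates` are assumptions made purely for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumed finite population (illustrative only).
population = [rng.gauss(50.0, 10.0) for _ in range(1000)]

# Parameters: true values of the population, normally unknowable in practice.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

def sample_estimates(n):
    """Draw n members without replacement and return (sample mean, sample SD)."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample), statistics.stdev(sample)

small_mean, small_sd = sample_estimates(30)            # estimates of mu and sigma
full_mean, full_sd = sample_estimates(len(population)) # the whole population is sampled

print(f"parameters:      mean={mu:.3f}, sd={sigma:.3f}")
print(f"sample of 30:    mean={small_mean:.3f}, sd={small_sd:.3f}")
print(f"full population: mean={full_mean:.3f}, sd={full_sd:.3f}")
```

With a sample of 30 the statistics only approximate the parameters, whereas sampling every member reproduces the population mean.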

Using the scientific method of falsification, the significance level is set prior to the test. It is the probability threshold below which a difference between the sample statistic and the null model is judged too large to be explained by chance alone. Most statisticians set this threshold at 0.05 or 0.1, which means that if a discrepancy at least as large as the one observed would arise by chance fewer than 5 (or 10) times out of 100 under the null model, then the discrepancy is unlikely to be explained by chance alone and the null hypothesis is rejected. Statistical models provide exact outcomes of the parametric and estimates of the sample statistics. Hence, the burden of proof rests in the sample statistics that provide estimates of a statistical model. Statistical models contain the mathematical proof of the parametric values and their probability distributions.^[10] ^[11]
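A minimal sketch of this decision rule is an exact binomial test. The data (61 heads in 100 tosses), the null model of a fair coin, and the function names are hypothetical, and the two-sided p-value is computed by one common convention: summing the null probabilities of every outcome no more likely than the observed one.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials under a binomial model."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_sided_p_value(k_obs, n, p_null):
    """Sum the null probabilities of all outcomes no more likely than the observed one."""
    p_obs = binom_pmf(k_obs, n, p_null)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        binom_pmf(k, n, p_null)
        for k in range(n + 1)
        if binom_pmf(k, n, p_null) <= p_obs * tolerance
    )

alpha = 0.05                                 # significance level, fixed before the test
p_value = two_sided_p_value(61, 100, 0.5)    # hypothetical data: 61 heads in 100 tosses
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null hypothesis: {reject_null}")
```

The threshold `alpha` is chosen before the data are examined; the test then asks only whether the computed p-value falls below it.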

Bayes' theorem

In legal proceedings

See main article: Legal burden of proof.

Statistical proof in a legal proceeding can be sorted into three categories of evidence:

The occurrence of an event, act, or type of conduct,
The identity of the individual(s) responsible
The intent or psychological responsibility^[16]

Statistical proof was not regularly applied in decisions concerning United States legal proceedings until the mid-1970s, following a landmark jury discrimination case, Castaneda v. Partida. The US Supreme Court ruled that gross statistical disparities constitute "prima facie proof" of discrimination, resulting in a shift of the burden of proof from plaintiff to defendant. Since that ruling, statistical proof has been used in many other cases on inequality, discrimination, and DNA evidence.^[17] ^[18] However, there is not a one-to-one correspondence between statistical proof and the legal burden of proof. "The Supreme Court has stated that the degrees of rigor required in the fact finding processes of law and science do not necessarily correspond."

In an example of a death row sentence concerning racial discrimination (McCleskey v. Kemp), the petitioner, a black man named McCleskey, was charged with the murder of a white police officer during a robbery. Expert testimony for McCleskey introduced a statistical proof showing that "defendants charged with killing white victims were 4.3 times as likely to receive a death sentence as [defendants] charged with killing blacks."^[19] Nonetheless, the statistics were insufficient "to prove that the decisionmakers in his case acted with discriminatory purpose." It was further argued that there were "inherent limitations of the statistical proof", because it did not refer to the specifics of the individual. Despite the statistical demonstration of an increased probability of discrimination, the legal burden of proof (it was argued) had to be examined on a case-by-case basis.

Notes and References

1. Gold, B.; Simons, R. A. (2008). Proof and Other Dilemmas: Mathematics and Philosophy. Mathematical Association of America. ISBN 978-0-88385-567-6.
2. Gattei, S. (2008). Thomas Kuhn's "Linguistic Turn" and the Legacy of Logical Empiricism: Incommensurability, Rationality and the Search for Truth. Ashgate Pub Co. p. 277. ISBN 978-0-7546-6160-3.
3. Pedemont, B. (2007). "How can the relationship between argumentation and proof be analysed?". Educational Studies in Mathematics. 66 (1): 23–41. doi:10.1007/s10649-006-9057-x. S2CID 121547580.
4. Meier, P. (1986). "Damned Liars and Expert Witnesses". Journal of the American Statistical Association. 81 (394): 269–276. doi:10.1080/01621459.1986.10478270.
5. Wiley, E. O. (1975). "Karl R. Popper, Systematics, and Classification: A Reply to Walter Bock and Other Evolutionary Taxonomists". 24 (2): 233–43. ISSN 0039-7989. doi:10.2307/2412764. JSTOR 2412764.
6. Howson, Colin; Urbach, Peter (1991). "Bayesian reasoning in science". 350 (6317): 371–4. ISSN 1476-4687. doi:10.1038/350371a0. Bibcode:1991Natur.350..371H. S2CID 5419177.
7. Sundholm, G. (1994). "Proof-Theoretical Semantics and Fregean Identity Criteria for Propositions". The Monist. 77 (3): 294–314. doi:10.5840/monist199477315. hdl:1887/11990.
8. Bissell, D. (1996). "Statisticians have a Word for it". Teaching Statistics. 18 (3): 87–89. doi:10.1111/j.1467-9639.1996.tb00300.x. CiteSeerX 10.1.1.385.5823.
9. Sokal, R. R.; Rohlf, F. J. (1995). Biometry (3rd ed.). W.H. Freeman & Company. p. 887. ISBN 978-0-7167-2411-7.
10. Heath, David (1995). An introduction to experimental design and statistics for biology. CRC Press. ISBN 978-1-85728-132-3.
11. Hald, Anders (2006). A History of Parametric Statistical Inference from Bernoulli to Fisher, 1713-1935. Springer. p. 260. ISBN 978-0-387-46408-4.
12. Huelsenbeck, J. P.; Ronquist, F.; Bollback, J. P. (2001). "Bayesian Inference of Phylogeny and Its Impact on Evolutionary Biology". Science. 294 (5550): 2310–2314. doi:10.1126/science.1065889. PMID 11743192. Bibcode:2001Sci...294.2310H. S2CID 2138288.
13. Wade, P. R. (2000). "Bayesian methods in conservation biology". Conservation Biology. 14 (5): 1308–1316. doi:10.1046/j.1523-1739.2000.99415.x. S2CID 55853118.
14. Sober, E. (1991). Reconstructing the Past: Parsimony, Evolution, and Inference. A Bradford Book. p. 284. ISBN 978-0-262-69144-4.
15. Helfenbein, K. G.; DeSalle, R. (2005). "Falsifications and corroborations: Karl Popper's influence on systematics". Molecular Phylogenetics and Evolution. 35 (1): 271–280. doi:10.1016/j.ympev.2005.01.003. PMID 15737596.
16. Fienberg, S. E.; Kadane, J. B. (1983). "The presentation of Bayesian statistical analyses in legal proceedings". Journal of the Royal Statistical Society, Series D. 32 (1/2): 88–98. doi:10.2307/2987595. JSTOR 2987595.
17. Garaud, M. C. (1990). "Legal Standards and Statistical Proof in Title VII Litigation: In Search of a Coherent Disparate Impact Model". University of Pennsylvania Law Review. 139 (2): 455–503. doi:10.2307/3312286. JSTOR 3312286.
18. The Harvard Law Review Association (1995). "Developments in the Law: Confronting the New Challenges of Scientific Evidence". Harvard Law Review. 108 (7): 1481–1605. doi:10.2307/1341808. JSTOR 1341808.
19. Faigman, D. L. (1991). ""Normative Constitutional Fact-Finding": Exploring the Empirical Component of Constitutional Interpretation". University of Pennsylvania Law Review. 139 (3): 541–613. doi:10.2307/3312337. JSTOR 3312337.

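Bayes' theorem:

$$\Pr[\text{Parameter} \mid \text{Data}] = \frac{\Pr[\text{Data} \mid \text{Parameter}] \times \Pr[\text{Parameter}]}{\Pr[\text{Data}]}$$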