Conditionality principle explained

The conditionality principle is a Fisherian principle of statistical inference that Allan Birnbaum formally defined and studied in an article in the Journal of the American Statistical Association.

Informally, the conditionality principle can be taken as the claim that

Experiments which were not actually performed are not relevant to any statistical analysis

and the implicit admonition that unrealized experiments should be ignored: they should not be included as part of any calculation or discussion of results.

Together with the sufficiency principle, Birnbaum's version of the principle implies the famous likelihood principle. Although the relevance of the proof to data analysis remains controversial among statisticians, many Bayesians and likelihoodists consider the likelihood principle foundational for statistical inference.

Historical background

Some statisticians in the mid-20th century had proposed that a valid statistical analysis must include all of the possible experiments which might have been conducted, for example a series of desired experiments that each require some uncertain opportunity in order to be carried out. The uncertain factor could be something such as good weather for a timely astronomical observation (the "experiment" being the search of the telescopic image for traces of some type of object), or the availability of more data resources, such as the chance discovery of some new fossil that would provide more evidence to answer a question covered by another paleontological study. Another resource issue might be the need for special access to private data (patients' medical records, for example) from one of several possible institutions, most of which would be expected to refuse permission. The nature of the data that could be provided, and the correct statistical model for its analysis, would depend on which institution granted access and on how it had collected and curated the private data that might become available for a study. (Technically, in this case the "experiment" has already been conducted by the medical facility, and some other party is analyzing the collected data to answer its own research question.)

All these examples illustrate normal issues of how uncontrolled chance determines the nature of the experiment that can actually be conducted. Some analyses of the statistical significance of the outcomes of particular experiments incorporated the consequences such chance events had on the data that was obtained. Many statisticians were uncomfortable with the idea, and tended to tacitly skip seemingly extraneous random effects in their analyses; many scientists and researchers were baffled by a few statisticians' elaborate efforts to bring into the statistical analysis of their experiments circumstantial effects that the researchers considered irrelevant.

A few statisticians in the 1960s and 1970s took the idea even further, and proposed that an experiment could deliberately design in a random factor, usually by introducing the use of some ancillary statistic, $h$, like the roll of a die or the flip of a coin, and that the contrived random event could later be included in the data analysis and somehow improve the inferred significance of the observed outcome. Most statisticians were uncomfortable with the idea, and the overwhelming majority of scientists and researchers considered it preposterous; they continue to the present to dispute the idea and to reject any analyses based on it.

The conditionality principle is a formal rejection of the idea that "the road not taken" can possibly be relevant: In effect, it banishes from statistical analysis any consideration of effects from details of designs for experiments that were not conducted, even if they might have been planned or prepared for. The conditionality principle throws out all speculative considerations about what might have happened, and only allows the statistical analysis of the data obtained to include the procedures, circumstances, and details of the particular experiment actually conducted that produced the data actually collected. Experiments merely contemplated and not conducted, or missed opportunities for plans to obtain data, are all irrelevant and statistical calculations that include them are presumptively wrong.

Formulation

The conditionality principle makes an assertion about a composite experiment, $E$, that can be described as a suite or assemblage of several constituent experiments $E_h$; the index $h$ is some ancillary statistic, i.e. a statistic whose probability distribution does not depend on any unknown parameter values. This means that obtaining an observation of some specific outcome $x$ of the whole experiment $E$ requires first observing a value for $h$, and then taking an observation $x_h$ from the indicated component experiment $E_h$.
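The two-step structure of a mixture experiment can be sketched in Python. This is only an illustrative sketch under stated assumptions: the component experiments are taken to be binomial counts with different numbers of trials, and the names `make_binomial_component`, `run_mixture`, `weights`, and `theta` are hypothetical and do not come from the text.

```python
import random

def make_binomial_component(n_trials):
    """Hypothetical component experiment E_h: count successes in
    n_trials independent Bernoulli(theta) trials."""
    def run(theta, rng):
        return sum(rng.random() < theta for _ in range(n_trials))
    return run

def run_mixture(components, weights, theta, rng):
    """Run the composite experiment E once and return (h, x_h)."""
    # Step 1: observe the ancillary index h. Its distribution (weights)
    # is fixed in advance and does not involve the unknown theta.
    h = rng.choices(range(len(components)), weights=weights, k=1)[0]
    # Step 2: conduct only the selected component experiment E_h.
    x_h = components[h](theta, rng)
    return h, x_h

# Assumed design: two components with 5 and 20 trials, chosen by a fair coin.
trial_counts = (5, 20)
components = [make_binomial_component(n) for n in trial_counts]
rng = random.Random(0)
h, x_h = run_mixture(components, weights=[0.5, 0.5], theta=0.3, rng=rng)
```

Only one component is ever run per call, which mirrors the point that the outcome of $E$ consists of the index $h$ together with the single observation $x_h$.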

The conditionality principle can be formally stated thus:

Conditionality Principle: If $E$ is any experiment having the form of a mixture of component experiments $E_h$, then for each outcome $(E_h, x_h)$ of $E$, the evidential meaning of any outcome $x$ of any mixture experiment $E$ is the same as that of the corresponding outcome $x_h$ of the corresponding component experiment $E_h$ actually conducted, ignoring the overall structure of the mixed experiment.
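The statement can be written compactly in symbols. The notation here is an assumption that does not appear in the text above: $\mathrm{Ev}(E, x)$ denotes the evidential meaning of outcome $x$ of experiment $E$ (the symbol Birnbaum used), $\pi(h)$ is the known probability of selecting component $h$, and $f_h(x_h; \theta)$ is the probability (or density) of $x_h$ under $E_h$ with unknown parameter $\theta$. The second line is not the principle itself, only a related observation: because $h$ is ancillary, the likelihood function of the mixture outcome differs from that of the component outcome by a factor that is free of $\theta$.

```latex
\mathrm{Ev}\bigl(E,\,(E_h, x_h)\bigr) = \mathrm{Ev}(E_h,\, x_h)

L_E\bigl(\theta;\,(h, x_h)\bigr) = \pi(h)\, f_h(x_h;\theta) \;\propto\; f_h(x_h;\theta) = L_{E_h}(\theta;\, x_h)
```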

Examples

An illustration of the conditionality principle, in a bioinformatics context, is given by .

Example scenario: The ancillary statistic $h$ could be the roll of a die, whose value will be one of $h = 1, \ldots, 6$. This random selection of an experiment is actually a wise precaution to curb the influence of a researcher's biases, if there is reason to suspect that the researcher might consciously or unconsciously select an experiment that seems likely to produce data that supports a favored hypothesis. The result of the die roll then determines which of six possible experiments, $E_1, \ldots, E_6$, is the one actually conducted to obtain the study's data.

Say that the die rolls a '3'. In that case, the result observed for $x$ is actually $x_3$, the outcome of experiment $E_3$. None of the other five experiments $E_1, E_2, E_4, E_5,$ or $E_6$ is ever conducted, and none of the other possible results $x_1, x_2, x_4, x_5,$ or $x_6$, which might have been observed if some number other than '3' had come up, is ever seen. The actual observed outcome, $x_3$, is unaffected by any aspect of the other five sub-experiments that were not carried out, and only the procedures and experimental design of $E_3$, the sub-experiment that was conducted to collect the data $x_3$, have any bearing on the statistical analysis of the outcome. This holds regardless of the fact that the designs for the experiments which might have been conducted had been prepared at the time of the actual experiment $E_3$, and that any of them might just as likely have been performed.

The conditionality principle says that all of the details of $E_1, E_2, E_4, E_5,$ and $E_6$ must be excluded from the statistical analysis of the actual observation $x_3$, as must even the fact that experiment 3 was chosen by the roll of a die. Further, none of the possible randomness brought into the outcome by the statistic $h$ (the die roll) can be included in the analysis either. The only thing that determines the correct statistics to be used for the data analysis is experiment $E_3$, and the only data to consider is $x_3$, not $h = 3$.
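A small numerical sketch of the die example may help. It rests on assumptions not in the text: each $E_h$ is taken to be a binomial experiment with a hypothetical number of trials `N_TRIALS[h]`, the die is fair, and the observed count `x3 = 4` is made up for illustration. The sketch checks only one thing that is consistent with the principle: comparing two parameter values by likelihood gives the same answer whether or not the die roll is included, because the factor 1/6 cancels.

```python
import math

# Hypothetical designs: experiment E_h performs N_TRIALS[h] Bernoulli(theta) trials.
N_TRIALS = {1: 5, 2: 10, 3: 15, 4: 20, 5: 25, 6: 30}
P_DIE = 1 / 6  # fair die; this probability does not depend on theta

def component_likelihood(theta, h, x):
    """Likelihood of x successes in the component experiment E_h alone."""
    n = N_TRIALS[h]
    return math.comb(n, x) * theta**x * (1 - theta) ** (n - x)

def mixture_likelihood(theta, h, x):
    """Likelihood of the outcome (h, x) of the whole mixture experiment E."""
    return P_DIE * component_likelihood(theta, h, x)

# The die shows 3, and E_3 yields an assumed count of 4 successes in 15 trials.
h, x3 = 3, 4
theta_a, theta_b = 0.2, 0.5

ratio_component = component_likelihood(theta_a, h, x3) / component_likelihood(theta_b, h, x3)
ratio_mixture = mixture_likelihood(theta_a, h, x3) / mixture_likelihood(theta_b, h, x3)

# The two ratios agree up to floating-point rounding.
print(math.isclose(ratio_component, ratio_mixture))
```

Note that the designs of the other five experiments (the other entries of `N_TRIALS`) are never read when evaluating the outcome for `h = 3`.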

References