Evidential decision theory explained

Evidential decision theory (EDT) is a school of thought within decision theory according to which, when a rational agent is confronted with a set of possible actions, one should select the action with the highest news value, that is, the action that would be indicative of the best outcome in expectation if one received the "news" that it had been taken. In other words, it recommends that one "do what you most want to learn that you will do."[1]

EDT contrasts with causal decision theory (CDT), which prescribes taking the action that will causally produce the best outcome. While these two theories agree in many cases, they give different verdicts in certain philosophical thought experiments. For example, EDT prescribes taking only one box in Newcomb's paradox, while CDT recommends taking both boxes.[1]

Formal description

In a 1976 paper, Allan Gibbard and William Harper distinguished between two kinds of expected utility maximization. EDT proposes to maximize the expected utility of actions computed using conditional probabilities, namely

V(A) = \sum_j P(O_j \mid A)\, D(O_j),

where D(O_j) is the desirability of outcome O_j and P(O_j \mid A) is the conditional probability of O_j given that action A occurs. This is in contrast to the counterfactual formulation of expected utility used by causal decision theory,

U(A) = \sum_j P(A \,\Box\!\!\to\, O_j)\, D(O_j),

where the expression P(A \,\Box\!\!\to\, O_j) indicates the probability of outcome O_j in the counterfactual situation in which action A is performed. Since P(A \,\Box\!\!\to\, O_j) and P(O_j \mid A) are not always equal, these formulations of expected utility are not equivalent, leading to differences in the actions prescribed by EDT and CDT.
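The evidential formula above can be sketched numerically. The function below computes V(A) from a table of conditional probabilities P(O_j | A) and desirabilities D(O_j); the outcome names, probabilities, and desirability values are hypothetical, chosen only for illustration.

```python
# Sketch of the evidential expected-utility formula:
#   V(A) = sum_j P(O_j | A) * D(O_j)
# All outcome names and numbers below are hypothetical.

def evidential_utility(cond_probs, desirability):
    """Expected utility of an action, given P(outcome | action) and D(outcome)."""
    return sum(p * desirability[outcome] for outcome, p in cond_probs.items())

# A hypothetical action A with two possible outcomes.
P_given_A = {"good": 0.75, "bad": 0.25}  # P(O_j | A)
D = {"good": 100.0, "bad": 20.0}         # D(O_j)

print(evidential_utility(P_given_A, D))  # 80.0
```

The causal formula U(A) has the same shape; the difference lies entirely in which probabilities are fed in (counterfactual rather than conditional), not in the summation itself.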

Thought experiments

Decision theories are often compared by examining the actions they recommend in various thought experiments.

Newcomb's paradox

See main article: Newcomb's paradox. In Newcomb's paradox, there is a predictor, a player, and two boxes designated A and B. The predictor is able to reliably predict the player's choices, say with 99% accuracy. The player is given a choice between taking only box B, or taking both boxes A and B. The player knows the following:[2]

* Box A is transparent and always contains $1,000.
* Box B is opaque, and its contents have already been set by the predictor: if the predictor predicted that the player will take both boxes, box B contains nothing; if the predictor predicted that the player will take only box B, box B contains $1,000,000.

The player does not know what the predictor predicted or what box B contains while making the choice. Should the player take both boxes, or only box B?

Evidential decision theory recommends taking only box B in this scenario, because taking only box B is strong evidence that the predictor anticipated that the player would only take box B, and therefore it is very likely that box B contains $1,000,000. Conversely, choosing to take both boxes is strong evidence that the predictor knew that the player would take both boxes; therefore we should expect that box B contains nothing.[1]

By contrast, causal decision theory (CDT) recommends that the player take both boxes, because by the time the player chooses, the predictor has already made its prediction; the player's action therefore cannot causally affect the contents of box B, and taking both boxes yields $1,000 more than one-boxing whatever box B contains.

Formally, the expected utilities in EDT are

\begin{align}
V(\text{take only B}) &= P(\text{\$1M in box B} \mid \text{take only B}) \times \$1{,}000{,}000 + P(\text{nothing in box B} \mid \text{take only B}) \times \$0 \\
&= 0.99 \times \$1{,}000{,}000 + 0.01 \times \$0 = \$990{,}000 \\
V(\text{take both boxes}) &= P(\text{\$1M in box B} \mid \text{take both boxes}) \times \$1{,}001{,}000 + P(\text{nothing in box B} \mid \text{take both boxes}) \times \$1{,}000 \\
&= 0.01 \times \$1{,}001{,}000 + 0.99 \times \$1{,}000 = \$11{,}000
\end{align}

Since V(\text{take only B}) > V(\text{take both boxes}), EDT recommends taking only box B.
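As a sanity check, the calculation above can be reproduced in a few lines of Python (a sketch; the variable names are illustrative, and the 99% figure is the predictor accuracy from the setup above):

```python
# EDT expected values for Newcomb's paradox with a 99%-accurate predictor.
accuracy = 0.99

# Taking only box B is strong evidence that box B holds $1,000,000.
v_one_box = accuracy * 1_000_000 + (1 - accuracy) * 0

# Taking both boxes is strong evidence that box B is empty.
v_two_box = (1 - accuracy) * 1_001_000 + accuracy * 1_000

print(round(v_one_box), round(v_two_box))  # 990000 11000
assert v_one_box > v_two_box  # EDT prefers one-boxing
```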

Twin prisoner's dilemma

See also: Prisoner's dilemma. In this variation on the Prisoner's Dilemma thought experiment, an agent must choose whether to cooperate or defect against her psychological twin, whose reasoning processes are exactly analogous to her own.

Aomame and her psychological twin are put in separate rooms and cannot communicate. If they both cooperate, they each get $5. If they both defect, they each get $1. If one cooperates and the other defects, then one gets $10, and the other gets $0. Assuming Aomame only cares about her individual payout, what should she do?[3]

Evidential decision theory recommends cooperating in this situation, because Aomame's decision to cooperate is strong evidence that her psychological twin will also cooperate, meaning that her expected payoff is $5. On the other hand, if Aomame defects, this would be strong evidence that her twin will also defect, resulting in an expected payoff of $1. Formally, the expected utilities are

\begin{align}
V(\text{Aomame cooperates}) &= P(\text{twin cooperates} \mid \text{Aomame cooperates}) \times \$5 + P(\text{twin defects} \mid \text{Aomame cooperates}) \times \$0 \\
&= 1 \times \$5 + 0 \times \$0 = \$5 \\
V(\text{Aomame defects}) &= P(\text{twin cooperates} \mid \text{Aomame defects}) \times \$10 + P(\text{twin defects} \mid \text{Aomame defects}) \times \$1 \\
&= 0 \times \$10 + 1 \times \$1 = \$1.
\end{align}

Since V(\text{Aomame cooperates}) > V(\text{Aomame defects}), EDT recommends cooperating.
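The same calculation in Python (a sketch, assuming, as in the text, that the twin's choice perfectly mirrors Aomame's):

```python
# EDT expected values in the twin prisoner's dilemma, assuming the twin's
# choice is perfectly correlated with Aomame's, as in the setup above.
p_twin_matches = 1.0

v_cooperate = p_twin_matches * 5 + (1 - p_twin_matches) * 0   # twin cooperates too
v_defect = (1 - p_twin_matches) * 10 + p_twin_matches * 1     # twin defects too

print(v_cooperate, v_defect)  # 5.0 1.0
```

Lowering `p_twin_matches` below 1.0 weakens the evidential correlation; with a sufficiently weak correlation, EDT's recommendation flips to defection, matching the ordinary prisoner's dilemma analysis.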

Other supporting arguments

Even if one assigns low credence to evidential decision theory, it may be reasonable to act as if EDT were true. Because EDT's verdicts can turn on the actions of many correlated decision-makers, the stakes under EDT may be much higher than under causal decision theory, so EDT considerations can dominate in expectation.

Criticism

David Lewis has characterized evidential decision theory as promoting "an irrational policy of managing the news". James M. Joyce asserted, "Rational agents choose acts on the basis of their causal efficacy, not their auspiciousness; they act to bring about good results even when doing so might betoken bad news."[4]

Notes and References

  1. Ahmed, Arif (2021). Evidential Decision Theory. Cambridge University Press. ISBN 9781108607865.
  2. Wolpert, D. H.; Benford, G. (June 2013). "The lesson of Newcomb's paradox". Synthese. 190 (9): 1637–1646. doi:10.1007/s11229-011-9899-3.
  3. Greene, P.; Levinstein, B. (2020). "Act Consequentialism without Free Rides". Philosophical Perspectives. 34 (1): 100–101. doi:10.1111/phpe.12138.
  4. , p. 146