Win–stay, lose–switch explained

In psychology, game theory, statistics, and machine learning, win–stay, lose–switch (also win–stay, lose–shift) is a heuristic learning strategy used to model learning in decision situations. It was first proposed as an improvement over randomization in bandit problems.^[1] It was later applied to the prisoner's dilemma to model the evolution of altruism.^[2]

The learning rule bases its decision only on the outcome of the previous play. Outcomes are divided into successes (wins) and failures (losses). If the previous play resulted in a success, the agent plays the same action on the next round. If it resulted in a failure, the agent switches to a different action.
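The rule can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the function names `win_stay_lose_switch` and `simulate` are hypothetical, the environment is assumed to be a bandit whose actions each succeed with a fixed probability, and on a loss the agent is assumed to pick uniformly at random among the other actions (the rule itself only says to switch; with two actions the choice is forced).

```python
import random


def win_stay_lose_switch(previous_action, previous_won, actions, rng=random):
    """Return the next action given only the outcome of the previous play."""
    if previous_won:
        # Win: stay with the action that just succeeded.
        return previous_action
    # Loss: switch to a different action. Choosing uniformly among the
    # remaining actions is an assumption made for this sketch.
    others = [a for a in actions if a != previous_action]
    return rng.choice(others)


def simulate(success_probs, rounds, seed=0):
    """Play an assumed Bernoulli bandit; return a list of (action, won) pairs."""
    rng = random.Random(seed)
    actions = list(range(len(success_probs)))
    action = rng.choice(actions)  # the first play has no history, so pick at random
    history = []
    for _ in range(rounds):
        won = rng.random() < success_probs[action]
        history.append((action, won))
        action = win_stay_lose_switch(action, won, actions, rng)
    return history


if __name__ == "__main__":
    # Two actions with assumed success probabilities 0.2 and 0.8.
    plays = simulate([0.2, 0.8], rounds=1000)
    share_of_better_arm = sum(1 for a, _ in plays if a == 1) / len(plays)
    print(f"share of plays on the better action: {share_of_better_arm:.2f}")
```

Because the agent keeps no memory beyond the last outcome, a single unlucky loss on the better action is enough to make it switch away, which is why the rule is described as a heuristic rather than an optimal policy.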

A large-scale empirical study of rock, paper, scissors players found that real-world players adopt a variation of this strategy rather than the Nash equilibrium strategy of choosing uniformly at random among the three options.^[3]^[4]

Notes and References

1. Robbins, H. (1952). "Some aspects of the sequential design of experiments". Bulletin of the American Mathematical Society. 58 (5): 527–535. doi:10.1090/s0002-9904-1952-09620-8.
2. Nowak, M.; Sigmund, K. (July 1, 1993). "A strategy of win-stay, lose-shift that outperforms tit-for-tat in the Prisoner's Dilemma game". Nature. 364 (6432): 56–58. doi:10.1038/364056a0. PMID 8316296. Bibcode 1993Natur.364...56N. S2CID 4238908.
3. Morgan, James (2 May 2014). "How to win at rock-paper-scissors". BBC News.
4. Wang, Zhijian; Xu, Bin; Zhou, Hai-Jun (July 25, 2014). "Social cycling and conditional responses in the Rock-Paper-Scissors game". Scientific Reports. 4: 5830. doi:10.1038/srep05830. PMID 25060115. PMC 5376050.
