Calibrated probability assessment explained

Calibrated probability assessments are subjective probabilities assigned by individuals who have been trained to assess probabilities in a way that historically represents their uncertainty.^[1] ^[2] For example, when a calibrated person says they are "80% confident" in each of 100 predictions, about 80 of those predictions will turn out to be correct. Likewise, they will be right about 90% of the time they say they are 90% certain, and so on.

Calibration training improves subjective probabilities because most people are either "overconfident" or "underconfident" (usually the former).^[3] By practicing with a series of trivia questions, subjects can fine-tune their ability to assess probabilities. For example, a subject may be asked:

True or False: "A hockey puck fits in a golf hole"

Confidence: Choose the probability that best represents your chance of getting this question right...

50% 60% 70% 80% 90% 100%

If a person has no idea whatsoever, they will say they are only 50% confident. If they are absolutely certain they are correct, they will say 100%. But most people will answer somewhere in between. If a calibrated person is asked a large number of such questions, they will get about as many correct as they expected. An uncalibrated person who is systematically overconfident may say they are 90% confident in a large number of questions where they only get 70% of them correct. On the other hand, an uncalibrated person who is systematically underconfident may say they are 50% confident in a large number of questions where they actually get 70% of them correct.
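The comparison between stated confidence and observed accuracy can be tabulated directly. The following is a minimal Python sketch rather than a standard tool: the function name `calibration_table` and the sample answers are hypothetical and were chosen only to mirror the 90%/70% and 50%/70% cases described above.

```python
from collections import defaultdict

def calibration_table(responses):
    """Group (stated_confidence, was_correct) pairs by stated confidence.

    Returns {confidence: (n_answers, observed_accuracy, gap)}, where
    gap = stated confidence - observed accuracy. A positive gap suggests
    overconfidence, a negative gap suggests underconfidence.
    """
    counts = defaultdict(lambda: [0, 0])  # confidence -> [n_correct, n_total]
    for confidence, was_correct in responses:
        counts[confidence][0] += int(was_correct)
        counts[confidence][1] += 1
    table = {}
    for confidence in sorted(counts):
        n_correct, n_total = counts[confidence]
        accuracy = n_correct / n_total
        table[confidence] = (n_total, accuracy, confidence - accuracy)
    return table

# Hypothetical answers: ten given at 90% confidence (seven correct)
# and ten given at 50% confidence (seven correct).
responses = (
    [(0.9, True)] * 7 + [(0.9, False)] * 3
    + [(0.5, True)] * 7 + [(0.5, False)] * 3
)

for confidence, (n, accuracy, gap) in calibration_table(responses).items():
    print(f"stated {confidence:.0%}  n={n}  correct {accuracy:.0%}  gap {gap:+.0%}")
```

With this made-up data, the answers given at 90% confidence show a gap of about +20 percentage points (overconfident) and the answers given at 50% show a gap of about -20 points (underconfident). A real assessment would need far more than ten answers per confidence level for the gaps to be meaningful.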

Alternatively, the trainee will be asked to provide a numeric range for a question like, "In what year did Napoleon invade Russia?", with the instruction that the provided range is to represent a 90% confidence interval. That is, the test-taker should be 90% confident that the range contains the correct answer.
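Range questions can be scored in a similar way: if the ranges really are 90% confidence intervals, about 90% of them should contain the correct answer. The sketch below assumes a hypothetical helper `interval_hit_rate`; apart from the Napoleon question (1812), the ranges and answers are invented placeholders.

```python
def interval_hit_rate(intervals, true_values):
    """Fraction of (low, high) ranges that contain the matching true value."""
    if len(intervals) != len(true_values):
        raise ValueError("need exactly one true value per interval")
    hits = sum(
        low <= truth <= high
        for (low, high), truth in zip(intervals, true_values)
    )
    return hits / len(intervals)

# First entry: "In what year did Napoleon invade Russia?" (answer: 1812).
# The remaining ranges and answers are hypothetical placeholders.
intervals = [(1790, 1820), (50, 80), (200, 400), (5, 15), (1000, 3000)]
true_values = [1812, 95, 310, 12, 4200]

rate = interval_hit_rate(intervals, true_values)
print(f"{rate:.0%} of the ranges contain the answer (target: 90%)")
```

A hit rate well below 90% suggests the ranges are too narrow, which is the interval form of overconfidence; a hit rate near 100% over many questions suggests they are too wide. With only a handful of questions the hit rate is noisy, which is one reason a battery of tests is used.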

Calibration training generally involves taking a battery of such tests. Feedback is provided between tests and the subjects refine their probabilities. Calibration training may also involve learning other techniques that help to compensate for consistent over- or under-confidence. Since subjects are better at placing odds when they pretend to bet money,^[1] subjects are taught how to convert calibration questions into a type of betting game, which has been shown to improve their subjective probabilities.^[4] Various collaborative methods, such as prediction markets, have been developed so that subjective estimates from multiple individuals can be taken into account.

Stochastic modeling methods such as the Monte Carlo method often use subjective estimates from "subject matter experts". Research shows that such experts are very likely to be statistically overconfident, so a model built on their estimates will tend to underestimate uncertainty and risk. Calibration training is used to improve a person’s ability to provide accurate estimates for stochastic methods. Research found that most people could be calibrated if they took the time, and that a person’s calibration, i.e. their performance in providing accurate estimates, carries over to estimates outside the content of the calibration training, such as estimates in the person’s field of work.^[5] However, such calibration was found to improve accuracy only to an extent, and the use of corrective technologies in addition to calibration of experts was suggested.^[6]
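The effect of overconfident inputs on a simulation can be illustrated with a small Monte Carlo sketch. Several things here are assumptions made for illustration and do not come from the cited research: each expert range is treated as the 5th to 95th percentile of a normal distribution, the model is just a sum of two uncertain costs, and all numbers and names (`sample_from_90ci`, `simulate_total`) are hypothetical.

```python
import random
import statistics

Z_90 = 1.645  # standard normal quantile at the 95th percentile

def sample_from_90ci(rng, low, high):
    """Draw from a normal distribution whose 5th and 95th percentiles
    are approximately low and high (a modeling assumption)."""
    mean = (low + high) / 2
    sigma = (high - low) / (2 * Z_90)
    return rng.gauss(mean, sigma)

def simulate_total(ranges, n_trials=20_000, seed=0):
    """Monte Carlo samples of the sum of several uncertain quantities."""
    rng = random.Random(seed)
    return [
        sum(sample_from_90ci(rng, low, high) for low, high in ranges)
        for _ in range(n_trials)
    ]

# Hypothetical 90% ranges for two cost items, with the same midpoints.
overconfident_ranges = [(90, 110), (45, 55)]  # assumed too narrow
calibrated_ranges = [(70, 130), (35, 65)]     # assumed what a calibrated expert gives

for label, ranges in [("overconfident", overconfident_ranges),
                      ("calibrated", calibrated_ranges)]:
    totals = simulate_total(ranges)
    p95 = statistics.quantiles(totals, n=20)[-1]  # last cut point = 95th percentile
    print(f"{label}: stdev={statistics.stdev(totals):.1f}, "
          f"95th percentile={p95:.1f}")
```

Both runs center on the same expected total, but the run fed with narrower ranges reports a smaller spread and a lower 95th percentile. If the wider ranges are the ones that match reality, the model built on overconfident inputs understates the risk of a high total.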

The Applied Information Economics method systematically uses calibration training as part of a decision modeling process.

Criticisms of calibration

One of the findings in "Calibration of Probabilities: The State of the Art to 1980" was that training can only improve the calibration to a limited extent.^[1]

External links

credencecalibration.com, an online game for calibrating probability assessment

Notes and References

1. S. Lichtenstein, B. Fischhoff, and L. D. Phillips, "Calibration of Probabilities: The State of the Art to 1980", in Judgment under Uncertainty: Heuristics and Biases, ed. D. Kahneman and A. Tversky (Cambridge University Press, 1982)
2. J. Edward Russo and Paul J. H. Schoemaker, Decision Traps, Simon & Schuster, 1989
3. Regina Kwon, "The Probability Problem", Baseline Magazine, Dec 10 2001
4. Douglas Hubbard, How to Measure Anything: Finding the Value of Intangibles in Business, John Wiley & Sons, 2007
5. Kynn, M. (2008). The ‘heuristics and biases’ bias in expert elicitation. Journal of the Royal Statistical Society, Series A (Statistics in Society), 171: 239–264. doi:10.1111/j.1467-985X.2007.00499.x
6. Lichtenstein, S., & Fischhoff, B. (1980). Training for calibration. Organizational Behavior and Human Performance, 26(2), 149–171. doi:10.1016/0030-5073(80)90052-5