John K. Kruschke
Workplaces: Indiana University Bloomington
Alma Mater: University of California at Berkeley
Thesis Title: A connectionist model of category learning
Thesis Url: https://search.library.berkeley.edu/permalink/01UCS_BER/1thfj9n/alma991077511199706532
Thesis Year: 1990
Doctoral Advisors: Stephen E. Palmer, Robert Nosofsky
John Kendall Kruschke is an American psychologist and statistician known for his work in connectionist models of human learning,[1] and in Bayesian statistical analysis.[2] He is Provost Professor Emeritus[3][4] in the Department of Psychological and Brain Sciences at Indiana University Bloomington. He won the Troland Research Award from the National Academy of Sciences in 2002.[5]
Kruschke's popular textbook, Doing Bayesian Data Analysis,[2] was notable for its accessibility and unique scaffolding of concepts. The first half of the book used the simplest type of data (i.e., dichotomous values) for presenting all the fundamental concepts of Bayesian analysis, including generalized Bayesian power analysis and sample-size planning. The second half of the book used the generalized linear model as a framework for explaining applications to a spectrum of other types of data.
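For dichotomous data of the kind the book's first half uses, the fundamental Bayesian update has a simple conjugate form: a Beta prior combined with Bernoulli observations yields a Beta posterior. A minimal illustrative sketch (not code from the textbook; names are hypothetical):

```python
# Conjugate Beta-Bernoulli updating for dichotomous data (illustrative sketch).
# A Beta(a, b) prior plus z successes in n trials yields Beta(a + z, b + n - z).
def beta_posterior(a, b, z, n):
    """Return posterior Beta parameters after observing z successes in n trials."""
    return a + z, b + (n - z)

a_post, b_post = beta_posterior(1, 1, 7, 10)  # uniform prior, 7 successes in 10 trials
posterior_mean = a_post / (a_post + b_post)   # (1 + 7) / (2 + 10) = 2/3
```

The same conjugate machinery underlies the book's introductions to posterior estimation, power analysis, and sample-size planning before the generalized linear model is introduced.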
Kruschke has written many tutorial articles about Bayesian data analysis, including an open-access article that explains Bayesian and frequentist concepts side-by-side.[6] There is an accompanying online app that interactively does frequentist and Bayesian analyses simultaneously. Kruschke gave a video-recorded plenary talk on this topic at the United States Conference on Teaching Statistics (USCOTS).
Bayesian data analyses are increasing in popularity but are still relatively novel in many fields, and guidelines for reporting Bayesian analyses are useful for researchers, reviewers, and students. Kruschke's open-access Bayesian analysis reporting guidelines (BARG)[7] provide a step-by-step list with explanation. For instance, the BARG recommend that if the analyst uses Bayesian hypothesis testing, then the report should include not only the Bayes factor but also the minimum prior model probability for the posterior model probability to exceed a decision criterion.
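That recommendation rests on the identity that posterior model odds equal the Bayes factor times the prior odds, so the minimum prior model probability can be solved in closed form. An illustrative sketch of the arithmetic (function names are hypothetical, not from the BARG):

```python
# Posterior model probability from a Bayes factor BF_10 and a prior model
# probability p: posterior odds = BF * prior odds (illustrative sketch).
def posterior_model_prob(bf, prior):
    """Posterior probability of model 1 given Bayes factor BF_10 and prior probability."""
    return bf * prior / (bf * prior + (1 - prior))

def min_prior_for(bf, criterion):
    """Smallest prior model probability for which the posterior reaches the criterion."""
    return criterion / (criterion + bf * (1 - criterion))

# Example: a Bayes factor of 10 needs a prior probability of about 0.655
# for the posterior model probability to reach 0.95.
p_min = min_prior_for(10, 0.95)
```

Reporting this minimum prior alongside the Bayes factor makes explicit how strong a prior commitment the decision criterion presupposes.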
Kruschke proposed a decision procedure for assessing null values of parameters, based on the uncertainty of the posterior estimate of the parameter.[8] This approach contrasts with Bayesian hypothesis testing as model comparison.[9]
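The procedure compares a posterior highest-density interval (HDI) to a region of practical equivalence (ROPE) around the null value: the null is rejected when the HDI falls entirely outside the ROPE and accepted when it falls entirely inside. A simplified sketch of such a decision rule (illustrative, not Kruschke's published code):

```python
# HDI + ROPE decision rule (simplified illustrative sketch).
def hdi_rope_decision(hdi_low, hdi_high, rope_low, rope_high):
    """Classify a null value by comparing the posterior HDI to a ROPE."""
    if hdi_high < rope_low or hdi_low > rope_high:
        return "reject null"      # HDI entirely outside the ROPE
    if rope_low <= hdi_low and hdi_high <= rope_high:
        return "accept null"      # HDI entirely inside the ROPE
    return "undecided"            # HDI and ROPE overlap partially
```

Unlike a Bayes-factor test, the decision here depends only on the posterior estimate of the parameter and a stated region of practical equivalence.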
Liddell and Kruschke[10] showed that the common practice of treating ordinal data (such as subjective ratings) as if they were metric values can systematically lead to errors of interpretation, even inversions of means. The problems were addressed by treating ordinal data with ordinal models, in particular an ordered-probit model. Frequentist techniques can also use ordered-probit models, but the authors favored Bayesian techniques for their robustness.
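In an ordered-probit model, a latent normally distributed variable is partitioned by thresholds, and each ordinal category's probability is the normal probability mass between adjacent thresholds. A minimal sketch of that likelihood (illustrative of the model class, not the authors' code):

```python
import math

def ordered_probit_probs(mu, sigma, thresholds):
    """Category probabilities under an ordered-probit model: a latent
    Normal(mu, sigma) variable is cut at the given ordered thresholds."""
    Phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))  # standard normal CDF
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    return [Phi((cuts[k + 1] - mu) / sigma) - Phi((cuts[k] - mu) / sigma)
            for k in range(len(cuts) - 1)]

# Example: 4 rating categories from 3 thresholds, latent mean 0, SD 1.
probs = ordered_probit_probs(0.0, 1.0, [-1.0, 0.0, 1.0])
```

Treating the ratings this way, rather than averaging the category labels as metric values, is what avoids the interpretive errors the authors demonstrated.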
An overview of Kruschke's models of attentional learning through 2010 is provided in a review article.[11] That review summarizes numerous findings from human learning that implicate attentional learning, and situates a series of Kruschke's models within a general framework.
Back-propagation networks are a type of connectionist model, at the core of deep-learning neural networks. Kruschke's early work with back-propagation networks created algorithms for expanding or contracting the dimensionality of hidden layers in the network, thereby affecting how the network generalized from training cases to testing cases.[12] The algorithms also improved the speed of learning.[13]
The ALCOVE model of associative learning used gradient descent on error, as in back-propagation networks, to learn which stimulus dimensions to attend to or ignore. The ALCOVE model was derived from the generalized context model[14] of R. M. Nosofsky. These models mathematically represent stimuli in a multi-dimensional space based on human perceived dimensions (such as color and size), and assume that training examples are stored in memory as complete exemplars (that is, as combinations of values on the dimensions). The ALCOVE model is trained with input-output pairs and gradually associates exemplars with trained outputs while simultaneously shifting attention toward relevant dimensions and away from irrelevant dimensions.
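The core similarity computation in such exemplar models can be sketched as follows: an exemplar node's activation decays exponentially with the attention-weighted distance between the presented stimulus and the stored exemplar, so shrinking a dimension's attention weight makes differences on that dimension matter less. A simplified rendering of the published ALCOVE similarity rule (parameter names are illustrative):

```python
import math

def alcove_activation(stimulus, exemplar, attention, c=1.0):
    """Exemplar-node activation in an ALCOVE-style model: exponential decay
    with attention-weighted city-block distance (simplified sketch)."""
    dist = sum(a * abs(s - e) for a, s, e in zip(attention, stimulus, exemplar))
    return math.exp(-c * dist)

# A stimulus identical to the stored exemplar activates it maximally (1.0);
# activation falls off as the stimulus moves away on attended dimensions.
same = alcove_activation([1.0, 0.0], [1.0, 0.0], [0.5, 0.5])
diff = alcove_activation([1.0, 0.0], [0.0, 1.0], [0.5, 0.5])
```

Learning in the full model adjusts both the exemplar-to-output association weights and the attention weights by gradient descent on prediction error.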
An enhancement of the ALCOVE model, called RASHNL, provided a mathematically coherent mechanism for gradient descent with limited-capacity attention.[15] The RASHNL model assumed that attention is shifted rapidly when a stimulus is presented, while learning of attention across trials is more gradual.
These models were fitted to empirical data from numerous human learning experiments, and provided good accounts of the relative difficulty of learning different types of associations, and of the accuracy of responses to individual stimuli during training and generalization. The models cannot explain all aspects of learning; for example, an additional mechanism was needed to account for the rapidity with which humans learn a reversal shift (i.e., what was "A" is now "B" and vice versa).[16]
When people learn to categorize combinations of discrete features successively across a training session, they tend to learn about the distinctive features of the later-learned items instead of their complete combinations of features. This attention to distinctive features of later-learned items is called the "highlighting effect", and derives from an earlier finding known as the "inverse base-rate effect".[17]
Kruschke conducted an extensive series of novel learning experiments with human participants, and developed two connectionist models to account for the findings. The ADIT model[18] learned to attend to distinctive features, and the EXIT model[19] used rapid shifts of attention on each trial. A canonical highlighting experiment and a review of findings were presented in a later article.[20]
People can learn to classify stimuli according to rules such as "a container for liquids that is wider than it is tall is called a bowl", along with exceptions to the rule such as "unless it is this specific case, which is called a mug". A series of experiments demonstrated that people tend to classify novel items that are relatively close to an exceptional case according to the rule more often than exemplar-based models predict. To account for the data, Erickson and Kruschke developed hybrid models that shifted attention between rule-based representation and exemplar-based representation.[21][22][23]
People can also learn continuous relationships between variables, called functions, such as "a page's height is about 1.5 times its width". When people are trained with examples of functions that have exceptional cases, the data are accounted for by hybrid models that combine locally applicable functional rules.[24]
Kruschke also explored Bayesian models of human-learning results that were addressed by his connectionist models. The effects of sequential or successive learning (such as highlighting, mentioned above) can be especially challenging for Bayesian models, which typically assume order-independence. Instead of assuming that the entire learning system is globally Bayesian, Kruschke developed models in which layers of the system are locally Bayesian.[25] This "locally Bayesian learning" accounted for combinations of phenomena that are difficult for non-Bayesian learning models or for globally-Bayesian learning models.
Another advantage of Bayesian representations is that they inherently represent uncertainty of parameter values, unlike typical connectionist models that save only a single value for each parameter. The representation of uncertainty can be used to guide active learning in which the learner decides which cases would be most useful to learn about next.[26]
Kruschke joined the faculty of the Department of Psychological and Brain Sciences at Indiana University Bloomington as a lecturer in 1989. He remained at IU until his retirement in 2022, when he was named Provost Professor Emeritus.
Kruschke earned a B.A. in mathematics, with High Distinction in General Scholarship, from the University of California at Berkeley in 1983. In 1990, he received a Ph.D. in psychology, also from U.C. Berkeley.
Kruschke attended the 1978 Summer Science Program at The Thacher School in Ojai, California, which focused on astrophysics and celestial mechanics. He attended the 1988 Connectionist Models Summer School[27] at Carnegie Mellon University.