Restricted maximum likelihood explained

In statistics, the restricted (or residual, or reduced) maximum likelihood (REML) approach is a particular form of maximum likelihood estimation that does not base estimates on a maximum likelihood fit of all the information, but instead uses a likelihood function calculated from a transformed set of data, so that nuisance parameters have no effect.[1]

In the case of variance component estimation, the original data set is replaced by a set of contrasts calculated from the data, and the likelihood function is calculated from the probability distribution of these contrasts, according to the model for the complete data set. In particular, REML is used as a method for fitting linear mixed models. In contrast to the earlier maximum likelihood estimation, REML can produce unbiased estimates of variance and covariance parameters.[2]
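The contrast idea can be illustrated in the simplest setting, an ordinary linear model with a single residual variance. The following is a minimal sketch, not the algorithm used by any of the packages named in this article; the function name `ml_and_reml_variance` is hypothetical, and the design matrix `X` is assumed to have full column rank. It builds a set of error contrasts that are unaffected by the fixed effects and compares the ordinary maximum likelihood variance estimate (divisor n) with the estimate obtained from the likelihood of the contrasts (divisor n - p).

```python
import numpy as np

def ml_and_reml_variance(y, X):
    """Return (ML, REML) estimates of the residual variance in y = X b + e.

    Hypothetical helper for illustration; assumes X has full column rank
    and e ~ N(0, sigma^2 I).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    n, p = X.shape

    # Full QR decomposition: the last n - p columns of Q are orthonormal
    # and orthogonal to every column of X.
    Q, _ = np.linalg.qr(X, mode="complete")
    K = Q[:, p:]

    # Error contrasts: K.T @ X == 0, so w does not depend on the fixed
    # effects b, and w ~ N(0, sigma^2 I) with n - p components.
    w = K.T @ y
    rss = float(w @ w)  # equals the residual sum of squares of the full fit

    ml = rss / n          # maximises the likelihood of the full data y
    reml = rss / (n - p)  # maximises the likelihood of the contrasts w
    return ml, reml

# Intercept-only model: the fixed effect is the unknown mean.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = np.ones((5, 1))
ml, reml = ml_and_reml_variance(y, X)
print(ml, reml)  # approximately 2.0 and 2.5
```

In this intercept-only case the REML estimate coincides with the familiar sample variance using the n - 1 divisor, while the maximum likelihood estimate is biased downward because it ignores the degree of freedom spent on estimating the mean.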

The idea underlying REML estimation was put forward by M. S. Bartlett in 1937.[3] The first description of the approach applied to estimating components of variance in unbalanced data was by Desmond Patterson and Robin Thompson[4] of the University of Edinburgh in 1971, although they did not use the term REML. A review of the early literature was given by Harville.[5]

REML estimation is available in a number of general-purpose statistical software packages, including Genstat (the REML directive), SAS (the MIXED procedure), SPSS (the MIXED command), Stata (the mixed command), JMP, and R (especially the lme4 and older nlme packages), as well as in more specialist packages such as MLwiN, HLM, ASReml, BLUPF90, wombat, Statistical Parametric Mapping and CropStat.

REML estimation is also implemented in SurfStat, a MATLAB toolbox for the statistical analysis of univariate and multivariate surface and volumetric neuroimaging data using linear mixed-effects models and random field theory,[6][7] and, for general-purpose use outside neuroimaging, in MATLAB's fitlme function for fitting linear mixed-effects models.[8]

Notes and References

  1. Dodge, Yadolah (2006). The Oxford Dictionary of Statistical Terms. Oxford: Oxford University Press. ISBN 0-19-920613-9. (see REML)
  2. Baker, Bob. Estimating variances and covariances. Original link broken; archived at the Wayback Machine: https://web.archive.org/web/20080630063659/http://homepage.usask.ca/~rjb609/stats4.html
  3. Bartlett, M. S. (1937). "Properties of Sufficiency and Statistical Tests". Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences. 160 (901): 268–282. doi:10.1098/rspa.1937.0109. Bibcode:1937RSPSA.160..268B.
  4. Patterson, H. D.; Thompson, R. (1971). "Recovery of inter-block information when block sizes are unequal". Biometrika. 58 (3): 545. doi:10.1093/biomet/58.3.545.
  5. Harville, D. A. (1977). "Maximum Likelihood Approaches to Variance Component Estimation and to Related Problems". Journal of the American Statistical Association. 72 (358): 320–338. doi:10.2307/2286796. JSTOR 2286796.
  6. "Detecting sparse signals in random fields, with an application to brain mapping".
  7. "SurfStat". www.math.mcgill.ca.
  8. "fitlme Documentation". www.mathworks.com.