Journal of the Royal Statistical Society, Series B (Methodological), published by the Royal Statistical Society.
Stable URL: http://links.jstor.org/sici?sici=0035-9246%281977%2939%3A1%3C1%3AMLFIDV%3E2.0.CO%3B2-Z
Maximum Likelihood from Incomplete Data via the EM Algorithm
By A. P. DEMPSTER, N. M. LAIRD and D. B. RUBIN
Harvard University and Educational Testing Service
[Read before the ROYAL STATISTICAL SOCIETY at a meeting organized by the RESEARCH SECTION on Wednesday, December 8th, 1976, Professor S. D. SILVEY in the Chair]
A broadly applicable algorithm for computing maximum likelihood estimates from
incomplete data is presented at various levels of generality. Theory showing the monotone behaviour of the likelihood and convergence of the algorithm is derived. Many examples are sketched, including missing value situations, applications to grouped, censored or truncated data, finite mixture models, variance component estimation, hyperparameter estimation, iteratively reweighted least squares and factor analysis.

Keywords: MAXIMUM LIKELIHOOD; INCOMPLETE DATA; EM ALGORITHM; POSTERIOR MODE

1. INTRODUCTION

THIS paper presents a general approach to iterative computation of maximum-likelihood estimates when the observations can be viewed as incomplete data. Since each iteration of the algorithm consists of an expectation step followed by a maximization step we call it the EM algorithm. The EM process is remarkable in part because of the simplicity and generality of the associated theory, and in part because of the wide range of examples which fall under its umbrella. When the underlying complete data come from an exponential family whose maximum-likelihood estimates are easily computed, then each maximization step of an EM algorithm is likewise easily computed.

The term "incomplete data" in its general form implies the existence of two sample spaces 𝒴 and 𝒳 and a many-one mapping from 𝒳 to 𝒴. The observed data y are a realization from 𝒴. The corresponding x in 𝒳 is not observed directly, but only indirectly through y. More specifically, we assume there is a mapping x → y(x) from 𝒳 to 𝒴, and that x is known only to lie in 𝒳(y), the subset of 𝒳 determined by the equation y = y(x), where y is the observed data. We refer to x as the complete data even though in certain examples x includes what are traditionally called parameters. We postulate a family of sampling densities f(x | φ) depending on parameters φ and derive its corresponding family of sampling densities g(y | φ). The complete-data specification f(· | ·) is related to the incomplete-data specification g(· | ·) by

    g(y | φ) = ∫_{𝒳(y)} f(x | φ) dx.    (1.1)

The EM algorithm is directed at finding a value of φ which maximizes g(y | φ) given an observed y, but it does so by making essential use of the...
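The alternating expectation and maximization steps described in the Introduction can be sketched on one of the examples listed in the abstract, a two-component finite mixture of Gaussians. This is an illustrative sketch, not code from the paper: the synthetic data, the function name `em_gaussian_mixture`, the choice of known unit variances, and the initialization are all assumptions made for the example. Here the unobserved component labels play the role of the missing part of the complete data x, while the observations y are the incomplete data.

```python
# Minimal EM sketch for a two-component Gaussian mixture with known
# unit variances (illustrative only; names and data are hypothetical).
import math
import random

def em_gaussian_mixture(y, n_iter=50):
    """Estimate the mixing weight pi and the two means mu1, mu2 by EM.
    The complete data are (y_i, label_i); only y_i is observed."""
    # Crude initialization: equal weights, means at the data extremes.
    pi, mu1, mu2 = 0.5, min(y), max(y)
    for _ in range(n_iter):
        # E-step: posterior probability that each y_i came from component 1,
        # computed under the current parameter values (unit-variance normals).
        resp = []
        for yi in y:
            p1 = pi * math.exp(-0.5 * (yi - mu1) ** 2)
            p2 = (1 - pi) * math.exp(-0.5 * (yi - mu2) ** 2)
            resp.append(p1 / (p1 + p2))
        # M-step: complete-data maximum-likelihood updates, with the
        # unobserved labels replaced by their expected values (resp).
        s = sum(resp)
        pi = s / len(y)
        mu1 = sum(r * yi for r, yi in zip(resp, y)) / s
        mu2 = sum((1 - r) * yi for r, yi in zip(resp, y)) / (len(y) - s)
    return pi, mu1, mu2

# Synthetic incomplete data: the labels are discarded after sampling.
random.seed(0)
data = ([random.gauss(0.0, 1.0) for _ in range(200)]
        + [random.gauss(4.0, 1.0) for _ in range(200)])
pi, mu1, mu2 = em_gaussian_mixture(data)
```

Because the complete-data model is an exponential family, the M-step here has the closed form of a weighted mean, illustrating the remark that when complete-data maximum-likelihood estimates are easy, each M-step is likewise easy.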