Real-world learning tasks may involve high-dimensional data sets with arbitrary patterns of missing data. In this paper we present a framework based on maximum likelihood density estimation for learning from such data sets. We use mixture models for the density estimates and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster et al., 1977) in deriving a learning algorithm: EM is used both for the estimation of mixture components and for coping with missing data. The resulting algorithm is applicable to a wide range of supervised as well as unsupervised learning problems. Results from a classification benchmark, the iris data set, are presented.
In Cowan, J.D., Tesauro, G., and Alspector, J. (eds.), Advances in Neural Information Processing Systems 6. Morgan Kaufmann Publishers, San Francisco, CA, 1994.
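The double use of EM described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes diagonal-covariance Gaussian components (a simplification chosen here so that the expected statistics of a missing entry reduce to the component's own mean and variance), it marks missing values with `None`, and it assumes every column has at least one observed value. The function name `em_diag_gmm_missing` and its parameters are hypothetical.

```python
import math
import random


def em_diag_gmm_missing(data, k, iters=50, seed=0):
    """Fit a k-component diagonal Gaussian mixture by EM; None marks a missing value.

    Returns (weights, means, variances). Illustrative sketch only.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])

    # Initialise from per-column statistics of the observed entries.
    col_mean, col_var = [], []
    for j in range(d):
        obs = [row[j] for row in data if row[j] is not None]
        m = sum(obs) / len(obs)
        col_mean.append(m)
        col_var.append(max(sum((v - m) ** 2 for v in obs) / len(obs), 1e-6))
    weights = [1.0 / k] * k
    means = [[col_mean[j] + rng.gauss(0.0, math.sqrt(col_var[j])) for j in range(d)]
             for _ in range(k)]
    variances = [col_var[:] for _ in range(k)]

    for _ in range(iters):
        # E-step, part 1 (mixture): responsibilities computed from the
        # observed dimensions only, i.e. missing dimensions are marginalised out.
        resp = []
        for row in data:
            logp = []
            for c in range(k):
                lp = math.log(weights[c])
                for j in range(d):
                    if row[j] is not None:
                        v = variances[c][j]
                        lp -= 0.5 * (math.log(2 * math.pi * v)
                                     + (row[j] - means[c][j]) ** 2 / v)
                logp.append(lp)
            mx = max(logp)
            p = [math.exp(l - mx) for l in logp]
            total = sum(p)
            resp.append([q / total for q in p])

        # E-step, part 2 (missing data) folded into the M-step: a missing entry
        # contributes its expected first and second moments under the current
        # component, E[x] = mean and E[x^2] = mean^2 + variance.
        for c in range(k):
            nc = max(sum(resp[i][c] for i in range(n)), 1e-12)
            weights[c] = nc / n
            for j in range(d):
                s1 = s2 = 0.0
                for i in range(n):
                    x = data[i][j]
                    if x is None:
                        ex = means[c][j]
                        ex2 = means[c][j] ** 2 + variances[c][j]
                    else:
                        ex, ex2 = x, x * x
                    s1 += resp[i][c] * ex
                    s2 += resp[i][c] * ex2
                new_mean = s1 / nc
                means[c][j] = new_mean
                variances[c][j] = max(s2 / nc - new_mean ** 2, 1e-6)
    return weights, means, variances


# Toy usage: two loose groups of points with some entries missing.
toy = [[0.1, None], [0.2, 0.0], [None, 0.3],
       [9.8, 10.1], [10.2, None], [None, 9.9]]
toy_weights, toy_means, toy_variances = em_diag_gmm_missing(toy, k=2, iters=20)
```

With full covariance matrices, the expected value of a missing entry would also depend on the observed entries of the same row; the diagonal assumption removes that coupling and keeps the sketch short. For a supervised problem such as classification, one option under the same assumptions would be to treat the class label as an additional, possibly missing, input dimension.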