Alexander G. de G. Matthews
I work in the Cambridge Machine Learning Group. I study Bayesian nonparametric models, particularly Gaussian processes. In the area of approximate Bayesian inference, I have worked on variational methods and Markov chain Monte Carlo methods. I am also interested in deep learning and privacy issues in machine learning.
I have recently submitted my PhD thesis. The current version, which is awaiting examination, can be found here.
My Google Scholar page can be found here and my GitHub page can be found here. I sometimes Tweet about statistics and machine learning. A recent version of my CV is available. To contact me please email:
As an undergraduate I studied Natural Sciences at the University of Cambridge, specialising in theoretical physics. My fourth year project, which was later published, studied scattering in the fractional quantum Hall effect with Nigel Cooper. After that I worked in industry for Navetas Energy Management, a University of Oxford spin-out company which applies machine learning to the problem of home energy disaggregation.
PEER REVIEWED CONFERENCE PUBLICATIONS
AISTATS 2016 On Sparse Variational Methods and the Kullback-Leibler Divergence between Stochastic Processes joint work with James Hensman, Richard Turner and Zoubin Ghahramani.
Abstract: The variational framework for learning inducing variables (Titsias, 2009a) has had a large impact on the Gaussian process literature. The framework may be interpreted as minimizing a rigorously defined Kullback-Leibler divergence between the approximating and posterior processes. To our knowledge this connection has thus far gone unremarked in the literature. In this paper we give a substantial generalization of the literature on this topic. We give a new proof of the result for infinite index sets which allows inducing points that are not data points and likelihoods that depend on all function values. We then discuss augmented index sets and show that, contrary to previous works, marginal consistency of augmentation is not enough to guarantee consistency of variational inference with the original model. We then characterize an extra condition where such a guarantee is obtainable. Finally we show how our framework sheds light on interdomain sparse approximations and sparse approximations for Cox processes.
Abstract: Gaussian process (GP) models form a core part of probabilistic machine learning. Considerable research effort has been made into attacking three issues with GP models: how to compute efficiently when the number of data is large; how to approximate the posterior when the likelihood is not Gaussian; and how to estimate covariance function parameter posteriors. This paper simultaneously addresses these, using a variational approximation to the posterior which is sparse in support of the function but otherwise free-form. The result is a Hybrid Monte-Carlo sampling scheme which allows for a non-Gaussian approximation over the function values and covariance parameters simultaneously, with efficient computations based on inducing-point sparse GPs.
AISTATS 2015 Scalable Variational Gaussian Process Classification joint work with James Hensman and Zoubin Ghahramani.
Abstract: Gaussian process classification is a popular method with a number of appealing properties. We show how to scale the model within a variational inducing point framework, outperforming the state of the art on benchmark datasets. Importantly, the variational formulation can be exploited to allow classification in problems with millions of data points, as we demonstrate in experiments.
PEER REVIEWED JOURNAL PUBLICATIONS
Physical Review B Scattering theory for quantum Hall anyons in a saddle point potential joint work with Nigel Cooper.
Abstract: We study the theory of scattering of two anyons in the presence of a quadratic saddle-point potential and a perpendicular magnetic field. The scattering problem decouples in the centre-of-mass and the relative coordinates. The scattering theory for the relative coordinate encodes the effects of anyon statistics in the two-particle scattering. This is fully characterized by two energy dependent scattering phase shifts. We develop a method to solve this scattering problem numerically, using a generalized lowest Landau level approximation.
NIPS 2015 Comparing lower bounds on the entropy of mixture distributions for use in variational inference joint work with James Hensman and Zoubin Ghahramani.
Abstract: McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes. We further investigate the log Gaussian variant which has a number of appealing properties. Conditioned on the covariates, the distribution over labels is given by a type of conditional Markov random field. In the supervised case, computation of the predictive probability of a single test point scales linearly with the number of training points and the multiclass generalization is straightforward. We show new links between the supervised method and classical nonparametric methods. We give a detailed analysis of the pairwise graph representable Markov random field, which we use to extend the model to semi-supervised learning problems, and propose an inference method based on graph mincuts. We give the first experimental analysis on supervised and semi-supervised datasets and show good empirical performance.