Hi there, I am Erik, a PhD student in the Machine Learning Group at the University of Cambridge, supervised by José Miguel Hernández-Lobato. As a Cambridge-Tübingen fellow, I will also spend a year of my PhD at the Max Planck Institute for Intelligent Systems in Tübingen, Germany, at the department led by Bernhard Schölkopf. My research interests broadly revolve around machine learning and artificial intelligence, with a current focus on methods at the intersection of probabilistic modeling and deep learning.

Before embarking on my PhD, I obtained a Master’s degree in Computer Science from ETH Zurich, where I also did research on discrete and mixed-variable Bayesian optimization with Andreas Krause. Prior to that, I was based in my hometown of Munich, Germany, where I obtained a Bachelor’s degree in Computer Science from Ludwig-Maximilians-Universität, and did an internship at Siemens, working on statistical relational learning with Volker Tresp. I also spent a great year at the National University of Singapore, where I did research on batch Bayesian optimization with Bryan Kian Hsiang Low.

For more information, please visit my personal website.

Publications

Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

Javier Antorán, David Janz, James Urquhart Allingham, Erik A. Daxberger, Riccardo Barbano, Eric T. Nalisnick, José Miguel Hernández-Lobato, 2022. In Proceedings of the 39th International Conference on Machine Learning, edited by Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, and Sivan Sabato. Proceedings of Machine Learning Research, PMLR.

Abstract

The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.
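For context, the model evidence being tuned here is the Laplace approximation to the log marginal likelihood; a standard generic form (notation mine, not quoted from the paper) is

$$
\log p(\mathcal{D} \mid \mathcal{M}) \;\approx\; \log p(\mathcal{D} \mid \theta_*) + \log p(\theta_* \mid \mathcal{M}) + \frac{P}{2}\log 2\pi - \frac{1}{2}\log\det \mathbf{H}_{\theta_*},
$$

where $\theta_*$ is a MAP estimate, $P$ is the number of parameters, and $\mathbf{H}_{\theta_*}$ is the Hessian of the negative log joint at $\theta_*$; in the linearised setting this Hessian is replaced by the generalised Gauss-Newton matrix $J^\top \Lambda J$ plus the prior precision. Hyperparameters such as the prior precision can then be selected by maximising this expression.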

Bayesian Deep Learning via Subnetwork Inference

Erik A. Daxberger, Eric T. Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato, 2021. In Proceedings of the 38th International Conference on Machine Learning, edited by Marina Meila and Tong Zhang. Proceedings of Machine Learning Research, PMLR.

Abstract

The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork. We propose a subnetwork selection strategy that aims to maximally preserve the model’s predictive uncertainty. Empirically, our approach is effective compared to ensembles and less expressive posterior approximations over full networks.
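To make the recipe above concrete, here is a minimal sketch on a toy 1-D regression problem, assuming a Gaussian likelihood with known noise variance and an isotropic Gaussian prior: fit a MAP estimate, pick a subnetwork (here simply the weights with the largest diagonal GGN posterior variances, a stand-in for the selection strategy proposed in the paper), form a full-covariance Gaussian over just those weights, and predict with the linearised model. All names and hyperparameters below are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1-D regression data
X = torch.linspace(-2, 2, 100).unsqueeze(1)
y = torch.sin(3 * X) + 0.1 * torch.randn_like(X)

noise_var = 0.1 ** 2   # assumed known observation noise
prior_prec = 1.0       # isotropic Gaussian prior precision (illustrative choice)

# Step 1: (approximate) MAP fit of all weights via MSE;
# a weight-decay term matching the prior is omitted for brevity.
model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    ((model(X) - y) ** 2).mean().backward()
    opt.step()

# Jacobian of the network output w.r.t. all weights at the MAP (the linearisation point)
def jac_row(x):
    out = model(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

J = torch.stack([jac_row(x) for x in X])  # shape (N, P)

# Step 2: select a subnetwork, here the K weights with largest diagonal posterior variance
diag_var = 1.0 / ((J ** 2).sum(0) / noise_var + prior_prec)
K = 50
subnet = torch.topk(diag_var, K).indices

# Step 3: full-covariance Gaussian over the subnetwork via the GGN; other weights stay at the MAP
J_S = J[:, subnet]
H_S = J_S.T @ J_S / noise_var + prior_prec * torch.eye(K)
Sigma_S = torch.linalg.inv(H_S)

# Step 4: linearised predictive at a test point
x_star = torch.tensor([0.5])
f_star = model(x_star.unsqueeze(0)).item()
j_S = jac_row(x_star)[subnet]
pred_var = noise_var + j_S @ Sigma_S @ j_S
print(f"prediction: {f_star:.3f} +/- {pred_var.sqrt().item():.3f}")
```

Only the K-by-K matrix over the selected weights is ever inverted, which is what keeps full-covariance inference affordable even when the full network has far more parameters.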
