Javier Antorán

Email: ja666@cam.ac.uk

Originally from Zaragoza, Spain, Javier graduated from the University of Zaragoza in Telecommunications Engineering (EE/CS) in 2018. He went on to receive an MPhil in Machine Learning and Machine Intelligence from the University of Cambridge in 2019. Following this, he joined the group as a PhD student in 2019, under the supervision of Dr. José Miguel Hernández-Lobato. Javier is funded by the Microsoft Research PhD scholarship programme. His research interests include Bayesian deep learning, uncertainty in machine learning, representation learning, and information theory. His personal site can be found at https://javierantoran.github.io/about/

Publications

Depth Uncertainty in Neural Networks

Javier Antorán, James Urquhart Allingham, José Miguel Hernández-Lobato, 2020. (In Advances in Neural Information Processing Systems 33). Edited by Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, Hsuan-Tien Lin.

Existing methods for estimating uncertainty in deep learning tend to require multiple forward passes, making them unsuitable for applications where computational resources are limited. To solve this, we perform probabilistic reasoning over the depth of neural networks. Different depths correspond to subnetworks which share weights and whose predictions are combined via marginalisation, yielding model uncertainty. By exploiting the sequential structure of feed-forward networks, we are able to both evaluate our training objective and make predictions with a single forward pass. We validate our approach on real-world regression and image classification tasks. Our approach provides uncertainty calibration, robustness to dataset shift, and accuracies competitive with more computationally expensive baselines.

Comment: Code

Getting a CLUE: A Method for Explaining Uncertainty Estimates

Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández-Lobato, April 2021. (In 9th International Conference on Learning Representations).

Both uncertainty estimation and interpretability are important factors for trustworthy machine learning systems. However, there is little work at the intersection of these two areas. We address this gap by proposing a novel method for interpreting uncertainty estimates from differentiable probabilistic models, like Bayesian Neural Networks (BNNs). Our method, Counterfactual Latent Uncertainty Explanations (CLUE), indicates how to change an input, while keeping it on the data manifold, such that a BNN becomes more confident about the input’s prediction. We validate CLUE through 1) a novel framework for evaluating counterfactual explanations of uncertainty, 2) a series of ablation experiments, and 3) a user study. Our experiments show that CLUE outperforms baselines and enables practitioners to better understand which input patterns are responsible for predictive uncertainty.

Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

Javier Antorán, David Janz, James Urquhart Allingham, Erik A. Daxberger, Riccardo Barbano, Eric T. Nalisnick, José Miguel Hernández-Lobato, 2022. (In 39th International Conference on Machine Learning). Edited by Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvári, Gang Niu, Sivan Sabato. PMLR. Proceedings of Machine Learning Research.

The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.

Uncertainty as a form of transparency: Measuring, communicating, and using uncertainty

Umang Bhatt, Javier Antorán, Yunfeng Zhang, Q. Vera Liao, Prasanna Sattigeri, Riccardo Fogliato, Gabrielle Melançon, Ranganath Krishnan, Jason Stanley, Omesh Tickoo, et al., 2021. (In 4th AAAI/ACM Conference on Artificial Intelligence, Ethics and Society).

Algorithmic transparency entails exposing system properties to various stakeholders for purposes that include understanding, improving, and contesting predictions. Until now, most research into algorithmic transparency has predominantly focused on explainability. Explainability attempts to provide reasons for a machine learning model’s behavior to stakeholders. However, understanding a model’s specific behavior alone might not be enough for stakeholders to gauge whether the model is wrong or lacks sufficient knowledge to solve the task at hand. In this paper, we argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions. First, we discuss methods for assessing uncertainty. Then, we characterize how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems. Finally, we outline methods for displaying uncertainty to stakeholders and recommend how to collect information required for incorporating uncertainty into existing ML pipelines. This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness. We aim to encourage researchers and practitioners to measure, communicate, and use uncertainty as a form of transparency.

Bayesian Deep Learning via Subnetwork Inference

Erik A. Daxberger, Eric T. Nalisnick, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato, 2021. (In 38th International Conference on Machine Learning). Edited by Marina Meila, Tong Zhang. PMLR. Proceedings of Machine Learning Research.

The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork. We propose a subnetwork selection strategy that aims to maximally preserve the model’s predictive uncertainty. Empirically, our approach is effective compared to ensembles and less expressive posterior approximations over full networks.

Addressing Bias in Active Learning with Depth Uncertainty Networks… or Not

Chelsea Murray, James Urquhart Allingham, Javier Antorán, José Miguel Hernández-Lobato, 2021. (In I (Still) Can’t Believe It’s Not Better! Workshop at NeurIPS 2021, Virtual Workshop, December 13, 2021). Edited by Melanie F. Pradier, Aaron Schein, Stephanie L. Hyland, Francisco J. R. Ruiz, Jessica Zosa Forde. PMLR. Proceedings of Machine Learning Research.

Farquhar et al. [2021] show that correcting for active learning bias with underparameterised models leads to improved downstream performance. For overparameterised models such as NNs, however, correction leads either to decreased or unchanged performance. They suggest that this is due to an “overfitting bias” which offsets the active learning bias. We show that depth uncertainty networks operate in a low overfitting regime, much like underparameterised models. They should therefore see an increase in performance with bias correction. Surprisingly, they do not. We propose that this negative result, as well as the results of Farquhar et al. [2021], can be explained via the lens of the bias-variance decomposition of generalisation error.