Publications
Fast Relative Entropy Coding with A* Coding
Gergely Flamich, Stratis Markou, José Miguel Hernández-Lobato, 2022. (In 39th International Conference on Machine Learning).
Relative entropy coding (REC) algorithms encode a sample from a target distribution Q using a proposal distribution P, such that the expected codelength is 𝒪(D_KL[Q||P]). REC can be seamlessly integrated with existing learned compression models since, unlike entropy coding, it does not assume discrete Q or P, and does not require quantisation. However, general REC algorithms require an intractable Ω(e^{D_KL[Q||P]}) runtime. We introduce AS* and AD* coding, two REC algorithms based on A* sampling. We prove that, for continuous distributions over ℝ, if the density ratio is unimodal, AS* has 𝒪(D_∞[Q||P]) expected runtime, where D_∞[Q||P] is the Rényi ∞-divergence. We provide experimental evidence that AD* also has 𝒪(D_∞[Q||P]) expected runtime. We prove that AS* and AD* achieve an expected codelength of 𝒪(D_KL[Q||P]). Further, we introduce DAD*, an approximate algorithm based on AD* which retains its favourable runtime and has bias similar to that of alternative methods. Focusing on VAEs, we propose the IsoKL VAE (IKVAE), which can be used with DAD* to further improve compression efficiency. We evaluate A* coding with (IK)VAEs on MNIST, showing that it can losslessly compress images near the theoretically optimal limit.
Practical Conditional Neural Processes via Tractable Dependent Predictions
Stratis Markou, James Requeima, Wessel P. Bruinsma, Anna Vaughan, Richard E. Turner, 2022. (In 10th International Conference on Learning Representations).
Conditional Neural Processes (CNPs; Garnelo et al., 2018a) are meta-learning models which leverage the flexibility of deep learning to produce well-calibrated predictions and naturally handle off-the-grid and missing data. CNPs scale to large datasets and train with ease. Due to these features, CNPs appear well-suited to tasks from environmental sciences or healthcare. Unfortunately, CNPs do not produce correlated predictions, making them fundamentally inappropriate for many estimation and decision-making tasks. Predicting heat waves or floods, for example, requires modelling dependencies in temperature or precipitation over time and space. Existing approaches which model output dependencies, such as Neural Processes (NPs; Garnelo et al., 2018b) or the FullConvGNP (Bruinsma et al., 2021), are either complicated to train or prohibitively expensive. What is needed is an approach which provides dependent predictions, but is simple to train and computationally tractable. In this work, we present a new class of Neural Process models that make correlated predictions and support exact maximum likelihood training that is simple and scalable. We extend the proposed models by using invertible output transformations, to capture non-Gaussian output distributions. Our models can be used in downstream estimation tasks which require dependent function samples. By accounting for output dependencies, our models show improved predictive performance on a range of experiments with synthetic and real data.