Wenlin is a PhD student in the Machine Learning Group at University of Cambridge and in the Department of Empirical Inference at Max Planck Institute for Intelligent Systems, under the Cambridge-Tübingen PhD Fellowship. He is interested in deep learning, probabilistic and causal methods, and AI for Science. He is supervised by José Miguel Hernández-Lobato, Bernhard Schölkopf and Hong Ge.
For more information, please visit his personal website.
Publications
Optimal Client Sampling for Federated Learning
Wenlin Chen, Samuel Horváth, Peter Richtárik, August 2022. (Transactions on Machine Learning Research).
Abstract▼ URL
It is well understood that client-master communication can be a primary bottleneck in federated learning (FL). In this work, we address this issue with a novel client subsampling scheme, where we restrict the number of clients allowed to communicate their updates back to the master node. In each communication round, all participating clients compute their updates, but only the ones with important updates communicate back to the master. We show that importance can be measured using only the norm of the update and give a formula for optimal client participation. This formula minimizes the distance between the full update, where all clients participate, and our limited update, where the number of participating clients is restricted. In addition, we provide a simple algorithm that approximates the optimal formula for client participation, which allows for secure aggregation and stateless clients, and thus does not compromise client privacy. We show both theoretically and empirically that for Distributed SGD (DSGD) and Federated Averaging (FedAvg), the performance of our approach can be close to full participation and superior to the baseline where participating clients are sampled uniformly. Moreover, our approach is orthogonal to and compatible with existing methods for reducing communication overhead, such as local methods and communication compression methods.
Comment: arXiv
Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction
Wenlin Chen, Austin Tripp, José Miguel Hernández-Lobato, 2022. (arXiv).
Abstract▼ URL
We propose Adaptive Deep Kernel Fitting with Implicit Function Theorem (ADKF-IFT), a novel framework for learning deep kernel Gaussian processes (GPs) by interpolating between meta-learning and conventional deep kernel learning. Our approach employs a bilevel optimization objective where we meta-learn generally useful feature representations across tasks, in the sense that task-specific GP models estimated on top of such features achieve the lowest possible predictive loss on average. We solve the resulting nested optimization problem using the implicit function theorem (IFT). We show that our ADKF-IFT framework contains previously proposed Deep Kernel Learning (DKL) and Deep Kernel Transfer (DKT) as special cases. Although ADKF-IFT is a completely general method, we argue that it is especially well-suited for drug discovery problems and demonstrate that it significantly outperforms previous state-of-the-art methods on a variety of real-world few-shot molecular property prediction tasks and out-of-domain molecular property prediction and optimization tasks.
An Evaluation Framework for the Objective Functions of De Novo Drug Design Benchmarks
Austin Tripp, Wenlin Chen, José Miguel Hernández-Lobato, 2022. (In ICLR 2022 Workshop on Machine Learning for Drug Discovery).
Abstract▼ URL
De novo drug design has recently received increasing attention from the machine learning community. It is important that the field is aware of the actual goals and challenges of drug design and the roles that de novo molecule design algorithms could play in accelerating the process, so that algorithms can be evaluated in a way that reflects how they would be applied in real drug design scenarios. In this paper, we propose a framework for critically assessing the merits of benchmarks, and argue that most of the existing de novo drug design benchmark functions are either highly unrealistic or depend upon a surrogate model whose performance is not well characterized. In order for the field to achieve its long-term goals, we recommend that poor benchmarks (especially logP and QED) be deprecated in favour of better benchmarks. We hope that our proposed framework can play a part in developing new de novo drug design benchmarks that are more realistic and ideally incorporate the intrinsic goals of drug design.
To Ensemble or Not Ensemble: When Does End-to-End Training Fail?
Andrew Webb, Charles Reynolds, Wenlin Chen, Henry Reeve, Dan Iliescu, Mikel Luján, Gavin Brown, 2020. (In European Conference on Machine Learning (ECML)).
Abstract▼ URL
End-to-End training (E2E) is becoming more and more popular to train complex Deep Network architectures. An interesting question is whether this trend will continue—are there any clear failure cases for E2E training? We study this question in depth, for the specific case of E2E training an ensemble of networks. Our strategy is to blend the gradient smoothly in between two extremes: from independent training of the networks, up to to full E2E training. We find clear failure cases, where overparameterized models cannot be trained E2E. A surprising result is that the optimum can sometimes lie in between the two, neither an ensemble or an E2E system. The work also uncovers links to Dropout, and raises questions around the nature of ensemble diversity and multi-branch networks.
Comment: arXiv