Bayesian inference and machine learning have found numerous use cases in applied domains and basic science, such as disease modeling, climate research, economics, or astronomy. Computer programs used in these applications bring together inference and programming language design. They provide concise representations of models, which allow learning and inference to be automated.

Such programs can carry richer computational semantics than functions or generative processes specified in mathematical notation, so that they can support a wide range of statistical and numerical functionality. For example, a probabilistic program is a computational object implementing statistical functionality, such as sampling, density evaluation, and conditioning. Similarly, models in machine learning are defined by computation graphs suitable for serialization, vectorization, or automatic differentiation. Programming language and compiler research can provide tools and techniques such as graph analysis, type theory, or abstract interpretation to facilitate these analyses and transformations.
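As a rough illustration of what "a computational object implementing sampling, density evaluation, and conditioning" can mean, here is a minimal Python sketch. The model (`mu ~ Normal(0, 1)`, `y ~ Normal(mu, 0.5)`) and the names `NormalMeanModel`, `sample`, `logpdf`, and `condition` are hypothetical and chosen only for this example; they do not correspond to the interface of any PPL discussed at the workshop.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of Normal(mean, std) evaluated at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


class NormalMeanModel:
    """Hypothetical toy model: mu ~ Normal(0, 1), y ~ Normal(mu, 0.5)."""

    def __init__(self, observed=None):
        # Variables fixed by conditioning, e.g. {"y": 1.0}.
        self.observed = dict(observed or {})

    def sample(self, rng):
        """Run the generative process forward, keeping observed values fixed.

        Note: unobserved variables are drawn from their priors here; this is
        not posterior sampling, which would need an inference algorithm.
        """
        values = dict(self.observed)
        if "mu" not in values:
            values["mu"] = rng.gauss(0.0, 1.0)
        if "y" not in values:
            values["y"] = rng.gauss(values["mu"], 0.5)
        return values

    def logpdf(self, values):
        """Joint log density; observed values override the ones passed in."""
        full = {**values, **self.observed}
        return (normal_logpdf(full["mu"], 0.0, 1.0)
                + normal_logpdf(full["y"], full["mu"], 0.5))

    def condition(self, **observed):
        """Return a new model object with some variables fixed to data."""
        return NormalMeanModel({**self.observed, **observed})


model = NormalMeanModel()
draw = model.sample(random.Random(0))      # dict with keys "mu" and "y"
posterior = model.condition(y=1.0)         # same program, y now observed
log_joint = posterior.logpdf({"mu": 0.8})  # unnormalised log posterior of mu
```

The same object answers three different questions (simulate, score, fix data), which is the sense in which it has more structure than a plain mathematical function.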

A core question in the computational representation of probabilistic programs is which abstract representation to use, be it computation graphs, intermediate representations in some normal form, graphical models, or augmented evaluator functions. There is much previous work on such representations and the operations they support, based on monads, types, graphs, domain-specific intermediate languages, measures, continuations, message passing, etc. The purpose of this meeting is to discuss these computational representations and their implications. We would like to organize it as an informal event, so that researchers in this area can gather around a drawing board to exchange, explain, and challenge ideas; we hope this will advance a common understanding of these important questions.

A preliminary workshop, consisting of a deep dive into BUGS, a graph-based PPL designed for Gibbs sampling, and Turing, a Julia-based universal PPL, already took place at the end of May.

### Where

Online, on Zoom.

### When

Friday, 29th July 2022

## Speakers

(In no particular order)

### Andrew Thomas

University of Cambridge, MRC Biostatistics Unit; BUGS

### Martyn Plummer

University of Warwick; JAGS

### Bob Carpenter

Flatiron Institute – Simons Foundation; Stan

### Zichun Ye

2012 labs, Huawei; MindSpore

### Chris Rackauckas

MIT, Julia Computing, Pumas-AI; SciML

### Maria Gorinova

Twitter; SlicStan

## Event Schedule

All times are in BST, i.e., GMT+1.

In order to accommodate as many speakers as possible, the workshop will take place in the (European) afternoon and early evening.

As the event will be fully remote, there are no explicit coffee or food breaks; instead, sessions are separated by short and long breaks, during which participants have the possibility to interact and discuss or just recover.

We have set up a Gathertown space to allow informal interaction, facilitate more private discussions, and have a breakout space: https://app.gather.town/events/RIyb7zGXyzCJ2qPPFtKD. This will mostly be active during the breaks, and after the final discussion round.

#### Public admittance to Zoom (link)

#### Integrating equation solvers with probabilistic programming through differentiable programming

Chris Rackauckas

Many probabilistic programming languages (PPLs) attempt to integrate with equation solvers (differential equations, nonlinear equations, partial differential equations, etc.) from the inside, i.e. the developers of PPLs like Stan provide differential equation solver choices as part of the suite. However, as equation solvers are an entire discipline unto themselves, with many active development communities and subfields, this places an immense burden on PPL developers to keep up with the changing landscape of tens of thousands of independent researchers. In this talk we will explore how Julia PPLs such as Turing.jl support equation solvers from the outside, i.e. how the tools of differentiable programming allow equation solver libraries to be compatible with PPLs without requiring any co-development between the communities. We will discuss how this has enabled many advanced methods, such as adaptive solvers for stochastic differential equations and nonlinear tearing of differential-algebraic equations, to be integrated into the Turing.jl environment with no development effort required, and how this enables many workflows in scientific machine learning (SciML).
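To make the "from the outside" idea concrete, the following Python sketch differentiates through a solver that was written with no knowledge of differentiation. It is not taken from the talk and is not how Turing.jl or the SciML libraries are implemented; the `Dual` class (forward-mode automatic differentiation via dual numbers), the fixed-step `euler_solve`, and the decay equation `du/dt = -rate * u` are all assumptions made for illustration.

```python
import math


class Dual:
    """A value together with its derivative w.r.t. one chosen parameter."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        """Treat plain numbers as constants (zero derivative)."""
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__

    def __neg__(self):
        return Dual(-self.value, -self.deriv)


def euler_solve(rhs, u0, t_end, steps):
    """Generic fixed-step Euler solver; it never mentions Dual or inference."""
    dt = t_end / steps
    u = u0
    for _ in range(steps):
        u = u + dt * rhs(u)
    return u


def decay_endpoint(rate):
    """Approximate u(1) for du/dt = -rate * u with u(0) = 1."""
    return euler_solve(lambda u: -rate * u, 1.0, 1.0, 1000)


# Seeding deriv=1.0 asks for the derivative with respect to `rate`.
result = decay_endpoint(Dual(2.0, 1.0))
# result.value approximates exp(-2); result.deriv approximates -exp(-2).
```

Because the solver only uses `+` and `*`, any number type that implements those operations passes through it unchanged. A gradient-based sampler could therefore obtain sensitivities of the solution with respect to model parameters without the solver author having planned for it.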

#### Program Analysis of Probabilistic Programs

Maria Gorinova

Probabilistic programming strives to make statistical analysis more accessible by separating probabilistic modelling from probabilistic inference. In practice this decoupling is difficult. Different inference techniques are applicable to different classes of models, have different advantages and shortcomings, and require different optimisation and diagnostics techniques to ensure robustness and reliability. No single inference algorithm can be used as a probabilistic programming back-end that is simultaneously reliable, efficient, black-box, and general. Probabilistic programming languages often choose a single algorithm to apply to a given problem, thus inheriting its limitations. This talk advocates using program analysis to exploit the available structure in probabilistic programs, and thus to make better use of the underlying inference algorithm. I will show several techniques that analyse a probabilistic program and adapt it to make inference more efficient, sometimes in a way that would have been tedious or impossible to do by hand.

#### Short break

#### Simulation based inference: A review

Martyn Plummer

We consider the development of Markov chain Monte Carlo (MCMC) methods, from Gibbs sampling in the late 1980s to present-day gradient-based methods and piecewise deterministic Markov processes. In parallel, we show how these ideas have been implemented in successive generations of statistical software for Bayesian inference. These software packages have been instrumental in popularizing applied Bayesian modelling across a wide variety of scientific domains. They provide an invaluable service to applied statisticians by hiding the complexities of MCMC from the user while providing a convenient modelling language and tools to summarize the output from a Bayesian model. As research into new MCMC methods remains very active, it is likely that future generations of software will incorporate new methods to improve the user experience.

#### Comprehension, maps, and partial evaluation in differentiable programming (with applications to Gaussian processes)

Bob Carpenter

In this talk, I will show how we can apply comprehensions of the variety found in set theory and Python to the efficient construction of matrices in differentiable programming languages such as Turing.jl, PyMC, Pyro, and Stan. For example, explicitly coding a compound covariance function for a Gaussian process can lead to a ridiculously large computation graph, which blows up dynamic memory requirements and frustrates memory locality. By formulating the process as a comprehension, essentially a map-like operation, we can partially evaluate to reduce memory, naturally parallelize, and guarantee const correctness for our matrix types.
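For readers unfamiliar with the idea, a covariance matrix written as a comprehension might look like the following plain-Python sketch. It is not taken from the talk: the kernel (squared exponential plus a linear term), the `jitter` parameter, and the function names are hypothetical, and the sketch shows only the comprehension form itself, not the partial evaluation or memory optimisations that a differentiable programming language could build on top of it.

```python
import math


def squared_exponential(x, y, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    d = x - y
    return variance * math.exp(-0.5 * d * d / (length_scale * length_scale))


def compound_kernel(x, y):
    """Hypothetical compound covariance: smooth term plus a linear term."""
    return squared_exponential(x, y) + 0.1 * x * y


def covariance_matrix(kernel, xs, jitter=1e-9):
    """Build the matrix as a comprehension: one kernel call per (i, j) pair.

    Each entry depends only on xs[i] and xs[j], so the construction is a
    map over index pairs with no shared mutable state between entries.
    """
    n = len(xs)
    return [[kernel(xs[i], xs[j]) + (jitter if i == j else 0.0)
             for j in range(n)]
            for i in range(n)]


xs = [0.0, 0.5, 2.0]
K = covariance_matrix(compound_kernel, xs)
```

Because the matrix is described declaratively as "entry (i, j) is `kernel(xs[i], xs[j])`", a compiler is free to decide how and when each entry is evaluated, which is what makes the map-like formulation attractive.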

#### Long break

#### MindIR: the intermediate representation of MindSpore

Zichun Ye

MindSpore is a new deep learning computing framework designed to accomplish three goals: easy development, efficient execution, and adaptability to all scenarios. MindIR, a function-style IR based on graph representation, is designed to achieve these goals. In this talk, we will start with the motivation for MindIR as the IR of an AI framework. Then we will talk about the syntax of MindIR with some examples. We will also cover the function-style semantics of MindIR and its application in scenarios including higher-order functions and control flow.

#### Jaxprs: dead simple, surprisingly versatile

Sharad Vikram

JAX is a Python library for numerical computing based on function transformations, including automatic differentiation, vectorized batching, and end-to-end compilation. To build JAX we made a simple "jaxpr" IR. What surprised us is how many other projects found ways to repurpose jaxprs, despite the IR's extreme simplicity. In this talk, we'll explain jaxprs, how they're embedded in Python, and the simple way to transform them. Then we'll dive a bit deeper into probabilistic programming applications. By the end, you'll be able to write your own jaxpr interpreters too!

#### Short break

#### Panel discussion

#### After-party

## Location

The workshop will be held online, in a purely virtual fashion.

### Online on Zoom

- Participation link: https://eng-cam.zoom.us/j/84748867364
- Registration link (in case you want to receive updates about recorded talks or future meetings): https://forms.gle/h33X3hZ1ovvt2SLr8

## Organizers

- Hong Ge (University of Cambridge)
- Andrew Thomas (University of Cambridge, MRC Biostatistics Unit)
- Daniela De Angelis (University of Cambridge, MRC Biostatistics Unit)
- Philipp Gabler (Graz University of Technology, Know-Center GmbH)
- Zoubin Ghahramani (University of Cambridge)
- Cambridge Centre for Data-Driven Discovery (C2D3)

## Contact Us

`pgabler (at) student.tugraz.at`