Machine Learning

2006 Advanced Tutorial Lecture Series

Department of Engineering
University of Cambridge

The goal of this Advanced Tutorial Lecture Series is to introduce important ideas in Machine Learning and related fields to the community of interested researchers in Cambridge and beyond. Lectures will be aimed at researchers or PhD students in Engineering, Physics, Statistics, or Computer Science who might know a little bit about Machine Learning but are not experts.

Each talk will be a 90 minute tutorial, followed by about 30 minutes of question/discussion time.

All lectures will take place on Thursdays, from 4pm to 6pm, in Lecture Room 4 of the Engineering Department, Trumpington Street, Cambridge.

Lectures are open to anyone interested.


Thurs Oct 5, 4pm      Professor Chris Bishop: Mixture Models and the EM Algorithm
Thurs Oct 26, 4pm      Edward Snelson: Gaussian Processes for Machine Learning


- Introduction to Gaussian process regression
- Covariance functions
- Beyond regression: applications of Gaussian processes, including classification, nonlinear dimensionality reduction (GPLVM)
- Sparse GP approximations for large data sets


A Gaussian process (GP) model is a Bayesian probabilistic model for nonlinear regression. As such, it is useful to any data modeller interested in making predictions from noisy data together with uncertainty estimates. Although at its core a regression model, the GP can be extended to a wide range of applications. I will discuss a variety of these, from classification tasks to human pose modelling. I will also discuss the design of covariance functions for different tasks, and the recent development of sparse GP approximations for handling large data sets.
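As a concrete illustration, the GP regression posterior can be computed in a few lines. This is a minimal numpy sketch with a squared-exponential covariance; the function names and hyperparameter values are illustrative choices, not material from the lecture:

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / lengthscale ** 2)

def gp_predict(X, y, X_star, noise=0.1):
    """GP posterior mean and variance at test inputs X_star."""
    K = rbf_kernel(X, X) + noise ** 2 * np.eye(len(X))
    K_s = rbf_kernel(X, X_star)
    K_ss = rbf_kernel(X_star, X_star)
    L = np.linalg.cholesky(K)          # O(n^3) cost in the number of data points
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha               # posterior mean
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss - v.T @ v)      # posterior (predictive) variance
    return mean, var

X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.sin(X)
mean, var = gp_predict(X, y, np.array([0.5, 10.0]))
```

Note how the predictive variance grows far from the training data (at x = 10) while the mean interpolates near it; the cubic cost of the Cholesky factorisation is exactly what the sparse approximations mentioned in the abstract aim to reduce.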

Thurs Nov 9, 4pm      Ricardo Silva: Causality

There is a difference between "seeing" and "doing". Consider the following example: although people in Florida are observed to live longer, this does not mean one should move there to achieve longevity. In fact, many retired Americans move to Florida, which explains its high life expectancy. A feature can thus be useful for predicting life expectancy yet worthless when considered as a treatment.

To predict effects of interventions such as medical treatments, public policies, or even the behaviour of genetically engineered cells, one needs a causal model. While there exists well-established machinery for estimating such models from experimental data, quite often one cannot perform experiments, for reasons such as high costs or ethical issues. Observational (i.e., non-experimental) data, however, can often be collected easily: data on the association between smoking and lung cancer are the classical example.

In this talk we will explore several modern techniques for learning causal effects from observational data. Such techniques rely on important assumptions linking statistical distributions to causal connections, which machine learning algorithms can exploit in many exciting ways. Inferring causality from observational data is certainly among the hardest learning tasks of all, but it is also one with high pay-offs.

The outline of the talk is as follows:

- Motivation and definitions / observational studies / the problems of directionality and confounding
- Languages for causal modeling: graphical models and potential outcomes
- Identification of effects using graphical models
- Notions of learning causal structure from data
- Applications
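The "seeing" versus "doing" distinction above can be made concrete with a small simulation. This is a hypothetical toy model with made-up numbers (not real demographic data): a confounder drives both the "treatment" and the outcome, so the observational contrast is large while the true interventional effect is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical toy numbers: a confounder (being retired, say) drives both
# moving to Florida and lifespan; residence itself has NO effect on lifespan.
z = rng.normal(size=n)                    # confounder
t_obs = z + rng.normal(size=n) > 0        # "seeing": retirees move more often
y = 70.0 + 5.0 * z + rng.normal(size=n)   # lifespan depends only on z

# Observational contrast: Florida residents appear to live longer.
obs_diff = y[t_obs].mean() - y[~t_obs].mean()

# Interventional contrast: assign residence at random ("doing").
t_do = rng.random(n) < 0.5
do_diff = y[t_do].mean() - y[~t_do].mean()
```

Here obs_diff comes out at several years while do_diff is essentially zero. Adjusting for the confounder z (for instance by comparing within strata of z) recovers the null effect; the graphical-model machinery in the talk formalises when such adjustments identify causal effects.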

Thurs Nov 16, 4pm      Tom Minka: Expectation Propagation

Expectation propagation is an algorithm for Bayesian machine learning that is especially well-suited to large databases and dynamic systems. Given prior knowledge expressed as a graphical model, it tunes the parameters of a "simple" probability distribution (such as a Gaussian) to best match the posterior distribution (which, in its exact form, could be very complex). This simplified posterior can be used to describe the data, make predictions, and quickly incorporate new data. Expectation propagation has been successfully applied to visual tracking, wireless communication, document analysis, diagram analysis, and matchmaking in online games.
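The core projection step of EP, matching the moments of a "simple" Gaussian to an intractable tilted distribution, can be sketched numerically. This is an illustrative one-dimensional example with a logistic factor; the grid-based integration stands in for the analytic or quadrature moment computations used in practice:

```python
import numpy as np

# Tilted distribution: N(0, 1) prior times a logistic likelihood factor.
xs = np.linspace(-10.0, 10.0, 20001)
dx = xs[1] - xs[0]
prior = np.exp(-0.5 * xs ** 2) / np.sqrt(2.0 * np.pi)   # Gaussian prior
lik = 1.0 / (1.0 + np.exp(-4.0 * xs))                   # non-Gaussian factor
tilted = prior * lik                                     # exact posterior shape

# Moment matching: project the tilted distribution onto a Gaussian.
Z = tilted.sum() * dx                                    # normalising constant
mean = (xs * tilted).sum() * dx / Z
var = ((xs - mean) ** 2 * tilted).sum() * dx / Z
```

The matched Gaussian N(mean, var) is the "simple" distribution EP would keep for this factor; in a full EP pass this moment matching is repeated factor by factor, with the other factors' approximations forming the context, until all the approximations agree.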

Thurs Nov 23, 4pm      Iain Murray: Advanced MCMC Methods

Markov chain Monte Carlo (MCMC) algorithms draw correlated samples from probability distributions. These allow approximate computation of complex high-dimensional integrals; obvious applications include Bayesian statistics and statistical physics. This tutorial will not assume any prior knowledge of MCMC, but will cover state-of-the-art techniques.

Tentative Schedule:

- Introduction: Metropolis–Hastings vs "simpler" Monte Carlo methods
- Hamiltonian (Hybrid) Monte Carlo
- Auxiliary variables and Slice sampling
- Out of equilibrium: tempering/annealing and related advances.
- Possibly a few words on infinite models and doubly-intractable distributions.
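The baseline random-walk Metropolis algorithm that the advanced methods above improve on can be written in a few lines. A minimal sketch; the target density, step size, and seed here are arbitrary choices for illustration:

```python
import numpy as np

def metropolis(log_p, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for an unnormalised log-density."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_p(x0)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        prop = x + step * rng.normal()          # symmetric Gaussian proposal
        lp_prop = log_p(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept with prob min(1, ratio)
            x, lp = prop, lp_prop
        samples[i] = x                           # rejected moves repeat x
    return samples

# Target: standard normal, via its log-density up to a constant.
s = metropolis(lambda x: -0.5 * x ** 2, x0=0.0, n_samples=50_000, step=2.0)
```

Because the proposal is symmetric, the Hastings correction cancels and the acceptance test compares only the target log-densities. The poor mixing of exactly this random walk on correlated, high-dimensional targets is what motivates Hamiltonian Monte Carlo and slice sampling.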

See also the 2007 Lecture Series