
Department of Engineering 


Probabilistic Machine Learning 4F13 Michaelmas 2019
Teaching Survey: http://to.eng.cam.ac.uk/teaching/surveys/4F13_Mich.html.
Keywords: Machine learning, probabilistic modelling, graphical
models, approximate inference, Bayesian statistics
Taught By: Professor
Carl Edward Rasmussen
Code and Term: 4F13 Michaelmas term
Year: 4th year (part IIB) Engineering and MPhil in Machine
Learning and Machine Intelligence; the lectures are also open to students
in any department (but if you want to take it for credit, you need to make
arrangements for assessment within your own department, as our capacity
to mark coursework is already severely stretched).
Structure & Assessment: 14 lectures, 2 coursework revisions,
3 pieces of coursework. Evaluation is by coursework only; all
three pieces of coursework carry equal weight. There
is no final exam.
Time: 16 lectures on Mondays at 9:00-10:00 and Tuesdays
9:00-10:00, both in LT1. First lecture Monday October 14th. There
will also be an informal weekly office hour on Tuesdays at 17:00-18:00
in the CBL seminar room BE438, first time on Oct 22nd. This is an
opportunity to ask questions, discuss the material etc., or just come
to listen in. There is of course no expectation or obligation that
you attend.
Location: Lecture Theatre 1 (LT1), Inglis Building,
ground floor, Department of Engineering, Trumpington Street
(map).
Prerequisites: A good background in statistics, calculus, linear
algebra, and computer science, together with 3F3 Signal and Pattern
Processing. You should thoroughly review the maths in the following cribsheet
[pdf] [ps] before the start of
the course. The Matrix Cookbook is also a useful
resource. If you want to do the optional coursework you need to know
Matlab or Octave, or be willing to learn it on your own. Any student or
researcher at Cambridge meeting these requirements is welcome to attend
the lectures. Students wishing to take it for credit should consult
with the course lecturers.
Textbook: There is no required textbook. However, the
material covered is treated in these excellent recent textbooks:
Kevin P. Murphy Machine Learning: a
Probabilistic Perspective, the MIT Press (2012).
David Barber Bayesian Reasoning
and Machine Learning, Cambridge University Press (2012), available
freely on the web.
Christopher M. Bishop Pattern
Recognition and Machine
Learning, Springer (2006).
David J.C. MacKay
Information Theory, Inference, and Learning Algorithms,
Cambridge University Press (2003), available freely on the web.
Lecture Syllabus
This year, the exposition of the material will be centered around
three specific machine learning areas: 1) supervised nonparametric
probabilistic inference using Gaussian processes, 2) the TrueSkill
ranking system and 3) the Latent Dirichlet Allocation model for
unsupervised learning in text.
The organisation of the handouts is changing. This year the material
will be structured into small chunks,
each containing a single core concept. Printed handouts won't be
provided at the lectures, but will be available on this web site. I
recommend that you don't bring printed slides to the lectures, but
of course you can do so if you think it works better for you.
Note: the links in the table below aren't up to date. If you want to
see lecture slides from a similar but not identical course taught last
year, go to the Michaelmas 2018 course website, but
be warned that the slides may change slightly.
October 14th 
Introduction to Probabilistic Machine Learning (2L):
Modelling data
Linear in the parameters regression
Likelihood and the concept of noise

October 15th 
Probability fundamentals
Bayesian inference and prediction with finite regression models
Marginal likelihood

October 21st 
Gaussian Processes (3L):
Parameters and functions
Gaussian Process, wee sequential generation demo
Posterior Gaussian Process

October 22nd 
GP marginal likelihood and hyperparameters
Correspondence between linear models and GPs
Should we use finite or infinite models?

October 28th 
Covariance functions
Quick introduction to the gpml toolbox

October 29th 
Probabilistic Ranking (3L):
Introduction to ranking

Nov 4th, Nov 5th 
Gibbs sampling
Gibbs sampling demo, matlab script
Gibbs sampling in the TrueSkill model

Nov 11th, Nov 12th 
Factor graphs
Message passing in TrueSkill
Approximation by moment matching

Nov 18th, Nov 19th 
Modelling Document Collections
Models of text
Discrete binary distributions
Categorical, multinomial, discrete distributions

Nov 25th, Nov 26th 
Modelling Document Collections
Simple categorical and mixture models
Learning in models with latent variables: the EM algorithm

Dec 2nd, Dec 3rd 
Modelling Document Collections
Gibbs sampling in mixture models, collapsed Gibbs
Latent Dirichlet Allocation topic models

Coursework
Coursework is to be submitted via moodle
in electronic form no later than 12:00 noon on the date due. If you are
not an engineering undergraduate, please make sure you are
signed up for the module on moodle; if you are in doubt, check with
Catherine Munn (cm861@cam.ac.uk) in room BE445. Each of the
three pieces of coursework carries equal weight in the evaluation.
The coursework will be similar, but not identical, to last year's,
and will be posted shortly on this web site. The due dates this year are:
Coursework #1
Coursework 1 is about regression
using Gaussian processes. You will need the following files:
cw1a.mat and cw1e.mat.
Due: Friday 8th November, 2019 at 12:00 noon via moodle.
Coursework #2
Coursework 2 will be about
Probabilistic Ranking. This is the data file: tennis_data.mat.
For matlab, use cw2.m, gibbsrank.m and eprank.m, or for python use
coursework2.ipynb, cw2.py, gibbsrank.py and eprank.py.
Due: Friday 22nd November, 2019 at 12:00 noon via moodle.
Coursework #3
Coursework 3 is about the Latent
Dirichlet Allocation (LDA) model. You will need the data file kos_doc_data.mat,
and code for matlab (bmm.m, lda.m, sampDiscrete.m) or code for python
(bmm.py, lda.py, sampleDiscrete.py).
Due: Friday 6th December, 2019 at 12:00 noon via moodle.