Unsupervised Learning 2004 Course Web Page

Gatsby Computational Neuroscience Unit
University College London

MSc Intelligent Systems

Keywords: Machine learning, probabilistic modelling, graphical models, approximate inference, Bayesian statistics

For a summary of the entire course you can read the following chapter:

Ghahramani (2004) Unsupervised Learning. In Bousquet, O., Raetsch, G. and von Luxburg, U. (eds) Advanced Lectures on Machine Learning LNAI 3176. Springer-Verlag.
Code: COMP GI02 / COMP 4c51 / Gatsby

Year: MSc in Intelligent Systems, PhD course at the Gatsby Unit

Prerequisites: A good background in statistics, calculus, linear algebra, and computer science. You should thoroughly review the maths in the following cribsheet [pdf] [ps] before the start of the course. You must either know Matlab or Octave, be taking a class on Matlab/Octave, or be willing to learn it on your own. Any student or researcher at UCL meeting these requirements is welcome to attend the lectures. Students wishing to take it for credit should consult with the course lecturer (email:

Term: 1, 2004

Time: 11.00 to 13.00 Mondays and Thursdays

Location: 4th floor, Gatsby Unit, 17 Queen Square

Taught By: Zoubin Ghahramani

Teaching Assistant: Katherine Heller

Homework Assignments: all assignments (coursework) for this course are to be handed in to the Gatsby Unit, not to the CS department. Please hand in all assignments at the beginning of lecture on the due date to either Zoubin or Katherine. Late assignments will be penalised. If you are unable to come to class, you can also hand in assignments to Alexandra Boss, Room 408, Gatsby Unit.

Late Assignment Policy: Assignments that are handed in late will be penalised as follows: 10% penalty per day for every weekday late, until the answers are discussed in a review session. NO CREDIT will be given for assignments that are handed in after answers are discussed in the review session.

Textbook: There is no required textbook. However, I recommend the following recently published textbook as an excellent source for many of the topics here, and I will be occasionally assigning reading from it:

David J.C. MacKay (2003) Information Theory, Inference, and Learning Algorithms, Cambridge University Press. (also available online)

NOTE: If you want to see lecture slides from last year click on the 2003 course website, but be warned that the slides may change this year.

Dates and Title Topics Materials
Oct 7 and Oct 11
Introduction and Statistical Foundations
  • Maximum Likelihood
  • Bayesian learning
  • The relation to coding length
  • Supervised vs Unsupervised vs Reinforcement Learning
Lecture Slides
Assignment 1 (due Oct 18)

Suggested Further Readings:

Oct 14 and Oct 18
Latent Variable Models
  • Mixture of Gaussians (MoG) and k-means
  • Factor Analysis (FA) and PCA
Lecture Slides
Suggested Further Readings:
  • Cribsheet [pdf] [ps] of Basic Maths Needed for Machine Learning
  • David MacKay's Book, Chapters 20, 22 and 23 on k-means and MoG
  • Max Welling's Class Notes on PCA and FA [pdf] [ps]
Oct 21 and 25
The EM Algorithm
  • General Theory
  • Application to MoG and to FA
  • Extensions
Lecture Slides
Assignment 2 (due Nov 1)
Oct 28 and Nov 1
Latent Variable Time Series Models
  • Hidden Markov Models (HMMs)
  • Forward-Backward and Viterbi
  • Linear Dynamical Systems
  • Kalman Filtering (KF) and Extended KF
  • Hybrid and Nonlinear Time Series Models
Lecture Slides
Suggested Further Readings:
Nov 4
Introduction to Graphical Models I
  • Conditional Independence
  • Undirected Graphs (Markov Networks)
  • Hammersley-Clifford Theorem
  • Directed Graphs (Bayesian Networks)
  • Factor Graphs
Lecture Slides
Assignment 3 (due Mon Nov 15)
Data Sets: geyser.txt, data1.txt
Suggested Further Readings:
The following three related articles appear in Arbib (ed): The Handbook of Brain Theory and Neural Networks (2nd edition)
Nov 8 and 11
Reading Week
Nov 15 and 18
Introduction to Graphical Models II
  • Belief Propagation
Assignment 4 (due Mon Nov 22)
Belief Propagation Demo: Fluffy and Moby
Factor Graph Propagation
Nov 22 and 25
Hierarchical and Nonlinear Models
  • Independent Components Analysis (ICA)
  • Sigmoid Belief Networks
  • Boltzmann Machines
Lecture Slides
Suggested Further Readings: Max Welling's Notes on ICA
David MacKay's Book, Ch 34 on ICA
Nov 29 and Dec 2
Sampling Methods
  • Monte Carlo:
    • simple Monte Carlo,
    • Rejection Sampling,
    • Importance Sampling
  • Markov chain Monte Carlo (MCMC):
    • Gibbs Sampling
    • Metropolis 
    • Hybrid Monte Carlo and other methods
Lecture Slides (MCMC)
Suggested Further Readings: David MacKay's Book, Ch 29 and 30 on Monte Carlo methods.
A more in-depth treatment of Monte Carlo methods is in Radford Neal's Technical Report.
Dec 6
Variational Approximations
  • Review of EM
  • Variational lower bounds and mean field methods
  • The Binary Latent Factor Model
  • Variational Message Passing
  • Expectation Propagation
Lecture Slides (Variational)
Assignment 5 (due Fri Dec 17)
Data: images.jpg
Code: genimages.m

Suggested Further Readings:

Dec 9
Bayesian Model Comparison
  • Occam's Razor
  • Model comparison and averaging
  • BIC, Laplace and sampling approximations
  • Variational Bayesian EM algorithm
Lecture Slides (Bayesian Model Comparison)

Suggested Reading:

  • Ghahramani (2004) Unsupervised Learning. In Bousquet, O., Raetsch, G. and von Luxburg, U. (eds) Advanced Lectures on Machine Learning LNAI 3176. Springer-Verlag.
    This book chapter is a summary of the whole course.

Aims: This course provides students with an in-depth introduction to statistical modelling and unsupervised learning techniques. It presents probabilistic approaches to modelling and their relation to coding theory and Bayesian statistics. A variety of latent variable models will be covered, including mixture models (used for clustering), dimensionality reduction methods, time series models such as hidden Markov models (used in speech recognition and bioinformatics), independent components analysis, hierarchical models, and nonlinear models. The course will present the foundations of probabilistic graphical models (e.g. Bayesian networks and Markov networks) as an overarching framework for unsupervised modelling. We will cover Markov chain Monte Carlo sampling methods and variational approximations for inference. Time permitting, students will also learn about other topics in machine learning.

Learning Outcomes: To be able to understand the theory of unsupervised learning systems; to have in-depth knowledge of the main models used in unsupervised learning; to understand the methods of exact and approximate inference in probabilistic models; to be able to recognise which models are appropriate for different real-world applications of machine learning methods.

Method: Lecture presentations with associated class problems.


Course Location:

Gatsby Unit
17 Queen Square [map]
Mondays and Thursdays 11:00 - 13:00


Zoubin 020 7679 1199