# Andrew McHutchon

Before starting my PhD I took the MEng course at Cambridge, specialising in Information Engineering in my third and fourth years. In particular I studied control, bioinformatics, and some information theory and statistics. As part of the MEng year I undertook a research project with Carl Rasmussen on applying Machine Learning techniques to control; this has since continued into my PhD research. Other avenues of research I have looked at so far include fast approximations to Gaussian Processes for uncertain inputs and training GPs with input noise. I am a member of Churchill College.

Learning to Unicycle

A number of years ago, two MEng students in the Control Group built a robotic unicycle for a control project. Neither they nor a number of successive students managed to design a controller to stabilise it. For my MEng project we decided to see if Machine Learning could succeed where classical control techniques had struggled. The video shows the results of learning a pitch-only controller (it might take a minute to load):

After this success we moved on to trying to stabilise the full unicycle, allowing it to fall in roll as well as pitch. Testing became very problematic, however: the unicycle needed to be prevented from hitting the ground during initial training, but in a way that did not interfere with the controller while it was upright. We tried using a wooden ‘skirt’ around the base of the unicycle, but this had too large a moment of inertia and stopped the unicycle from turning quickly enough. We had some limited success using a slack rope attached to the top, but this restricted the unicycle’s movement too much. Currently we are building a table-top sized version, which will avoid all of these problems. Here’s a simulated version using some very long and complex differential equations to model the unicycle:

Gaussian Process Training with Input Noise

We proposed the use of the Taylor series with Gaussian Processes to allow training on noisy input data. The intuition is that input noise has a much greater effect in areas where the GP function has a steep gradient than in areas where the GP function is nearly flat.
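
To make the intuition concrete, here is a minimal Python sketch (not the Matlab NIGP code mentioned on this page). It assumes a first-order Taylor expansion, f(x + e) ≈ f(x) + eᵀ ∂f/∂x, so that input noise with variance Σx looks like extra output noise of variance (∂f/∂x)ᵀ Σx (∂f/∂x). The function names, the toy `tanh` data, and the hand-picked hyperparameters are all illustrative assumptions; nothing here is learnt from data.

```python
import numpy as np

def se_kernel(A, B, lengthscale, signal_var):
    """Squared exponential kernel between rows of A (n, D) and B (m, D)."""
    diff = (A[:, None, :] - B[None, :, :]) / lengthscale
    return signal_var * np.exp(-0.5 * np.sum(diff**2, axis=-1))

def posterior_mean_gradient(X, y, lengthscale, signal_var, noise_var):
    """Gradient of the GP posterior mean at each training input, shape (n, D)."""
    K = se_kernel(X, X, lengthscale, signal_var)
    alpha = np.linalg.solve(K + noise_var * np.eye(len(X)), y)
    # d k(x_i, x_j) / d x_i = -(x_i - x_j) / lengthscale^2 * k(x_i, x_j)
    diff = X[:, None, :] - X[None, :, :]
    dK = -diff / lengthscale**2 * K[:, :, None]
    return np.einsum("ijd,j->id", dK, alpha)

def extra_output_variance(grad, input_noise_var):
    """Per-point output variance induced by independent input noise."""
    return np.sum(grad**2 * input_noise_var, axis=1)

# Toy 1-D data (assumed for illustration): steep near x = 0, flat elsewhere.
x = np.linspace(-3.0, 3.0, 41)
X = x[:, None]
y = np.tanh(3.0 * x)

grad = posterior_mean_gradient(X, y, lengthscale=0.5, signal_var=1.0, noise_var=0.01)
extra_var = extra_output_variance(grad, input_noise_var=0.05)

steep = np.argmin(np.abs(x))        # index of the point at the steep part
flat = np.argmin(np.abs(x + 2.0))   # index of a point on the flat part
print(extra_var[steep], extra_var[flat])
```

The same input noise variance produces a much larger output-noise correction at the steep point than at the flat one, which is the effect described above.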

The paper was accepted into NIPS 2011 and the current version can be found here. Matlab code to run NIGP can be found here (version 11/07/2012 – faster and lower memory usage). The code should run fine for datasets of a reasonable size (N <= 1000, D <= 20). Above that, training will be slow but may still be manageable. A few variables are pre-computed and stored, which aids speed but could cause memory issues for large datasets on machines with limited memory.

The NIGP derivations require various moments of the derivatives of a GP. Many of these derivations can be found in the following document: “Differentiating Gaussian Processes”. The derivations are completed for the squared exponential kernel, but the document should give you an overview of how to do it for other kernels.
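
As a flavour of what such derivations look like, the standard results for the squared exponential kernel are sketched below. The notation is assumed here rather than taken from the document: $\Lambda$ is the diagonal matrix of squared lengthscales, $\sigma_f^2$ the signal variance, $\sigma_n^2$ the output noise variance, and $\alpha = (K + \sigma_n^2 I)^{-1} y$.

$$
k(x, x') = \sigma_f^2 \exp\!\left(-\tfrac{1}{2}(x - x')^\top \Lambda^{-1} (x - x')\right)
$$

$$
\frac{\partial k(x, x')}{\partial x} = -\Lambda^{-1}(x - x')\, k(x, x')
$$

$$
\frac{\partial^2 k(x, x')}{\partial x\, \partial x'^\top} = \Lambda^{-1}\left(I - (x - x')(x - x')^\top \Lambda^{-1}\right) k(x, x')
$$

Because differentiation is a linear operation, the derivative of a GP is itself a GP, and the derivative of the posterior mean at a test point $x_*$ follows directly:

$$
\frac{\partial\, \mathbb{E}[f(x_*)]}{\partial x_*} = \sum_{i=1}^{N} \alpha_i \, \frac{\partial k(x_*, x_i)}{\partial x_*}
$$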

Model Learnt Control

The unicycle controller above uses a framework, developed in the Cambridge Machine Learning group, for automatic control policy learning. At its most basic level the framework can take an ODE, or a dataset from a real system, and train a controller to minimise a given loss function. The framework has been applied to a number of problems other than the unicycle, including a double-inverted pendulum. The core algorithm was presented in this ICML paper from Marc Deisenroth and Carl Edward Rasmussen. Current areas of research include using sparse GP approximations to speed up learning, trajectory following, disturbance rejection, and controlling ever more complex dynamical systems. We are hoping to release a version of the MATLAB code to implement this learning framework soon.
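
The basic idea of taking an ODE and training a controller to minimise a loss can be illustrated with a deliberately tiny toy in Python. This is not the framework's algorithm: the system (a double integrator, x'' = u), the linear policy, the quadratic loss, and the derivative-free pattern search are all assumptions chosen to keep the sketch short, and every name is hypothetical.

```python
import numpy as np

def rollout_loss(gains, dt=0.05, steps=200):
    """Simulate x'' = u under u = -k1*x - k2*v and return the accumulated loss."""
    k1, k2 = float(gains[0]), float(gains[1])
    x, v, loss = 1.0, 0.0, 0.0           # start displaced from the target x = 0
    for _ in range(steps):
        u = -k1 * x - k2 * v             # linear state-feedback policy
        x, v = x + dt * v, v + dt * u    # explicit Euler step of the ODE
        loss += dt * (x * x + 0.1 * u * u)  # penalise position error and effort
    return loss

def pattern_search(loss_fn, theta0, step=1.0, min_step=1e-3, max_iters=200):
    """Coordinate pattern search: accept any move that lowers the loss."""
    theta = np.array(theta0, dtype=float)
    best = loss_fn(theta)
    for _ in range(max_iters):
        if step < min_step:
            break
        improved = False
        for i in range(len(theta)):
            for delta in (step, -step):
                candidate = theta.copy()
                candidate[i] += delta
                value = loss_fn(candidate)
                if value < best:
                    theta, best, improved = candidate, value, True
        if not improved:
            step *= 0.5                  # refine the search once no move helps
    return theta, best

initial_gains = np.zeros(2)              # a controller that does nothing
initial_loss = rollout_loss(initial_gains)
learned_gains, learned_loss = pattern_search(rollout_loss, initial_gains)
print(initial_loss, learned_loss, learned_gains)
```

Here the ODE is known and simulated directly; learning from a dataset of a real system would first require fitting a model of the dynamics, which this sketch does not attempt.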

Here’s another example of what the MLC framework can do, this time driving a simulated car around a corner. The car starts at around 200 km/h and has 100 m of straight before a 90-degree turn. It is free to turn and accelerate or brake as it likes, and to pick its own line around the corner. Note how it learns to first turn in the wrong direction to take a straighter, and hence faster, line through the corner.