Signal Processing

Techniques for analyzing, modifying, and synthesizing signals, such as audio, image, and video data.

Structure Discovery in Nonparametric Regression through Compositional Kernel Search

David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani, June 2013. (In 30th International Conference on Machine Learning). Atlanta, Georgia, USA.

Abstract▼ URL

Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art. We define a space of kernel structures which are built compositionally by adding and multiplying a small number of base kernels. We present a method for searching over this space of structures which mirrors the scientific discovery process. The learned structures can often decompose functions into interpretable components and enable long-range extrapolation on time-series datasets. Our structure search method outperforms many widely used kernels and kernel combination methods on a variety of prediction tasks.

Convolutional neural networks: A magic bullet for gravitational-wave detection?

Timothy Gebhard, Niki Kilbertus, Ian Harry, Bernhard Schölkopf, September 2019. (Physical Review D). American Physical Society. DOI: https://doi.org/10.1103/PhysRevD.100.063015.

Abstract▼ URL

In the last few years, machine learning techniques, in particular convolutional neural networks, have been investigated as a method to replace or complement traditional matched filtering techniques that are used to detect the gravitational-wave signature of merging black holes. However, to date, these methods have not yet been successfully applied to the analysis of long stretches of data recorded by the Advanced LIGO and Virgo gravitational-wave observatories. In this work, we critically examine the use of convolutional neural networks as a tool to search for merging black holes. We identify the strengths and limitations of this approach, highlight some common pitfalls in translating between machine learning and gravitational-wave astronomy, and discuss the interdisciplinary challenges. In particular, we explain in detail why convolutional neural networks alone cannot be used to claim a statistically significant gravitational-wave detection. However, we demonstrate how they can still be used to rapidly flag the times of potential signals in the data for a more detailed follow-up. Our convolutional neural network architecture as well as the proposed performance metrics are better suited for this task than a standard binary classifications scheme. A detailed evaluation of our approach on Advanced LIGO data demonstrates the potential of such systems as trigger generators. Finally, we sound a note of caution by constructing adversarial examples, which showcase interesting “failure modes” of our model, where inputs with no visible resemblance to real gravitational-wave signals are identified as such by the network with high confidence.

Using Inertial Sensors for Position and Orientation Estimation

Manon Kok, Jeroen D. Hol, Thomas B. Schön, 2017. (Foundations and Trends in Signal Processing).

Abstract▼ URL

In recent years, MEMS inertial sensors (3D accelerometers and 3D gyroscopes) have become widely available due to their small size and low cost. Inertial sensor measurements are obtained at high sampling rates and can be integrated to obtain position and orientation information. These estimates are accurate on a short time scale, but suffer from integration drift over longer time scales. To overcome this issue, inertial sensors are typically combined with additional sensors and models. In this tutorial we focus on the signal processing aspects of position and orientation estimation using inertial sensors. We discuss different modeling choices and a selected number of important algorithms. The algorithms include optimization-based smoothing and filtering as well as computationally cheaper extended Kalman filter and complementary filter implementations. The quality of their estimates is illustrated using both experimental and simulated data.

Comment: arXiv

Scalable Magnetic Field SLAM in 3D Using Gaussian Process Maps

Manon Kok, Arno Solin, July 2018. (In Proceedings of the 21th International Conference on Information Fusion (accepted for publication)). Cambridge, UK.

Abstract▼ URL

We present a method for scalable and fully 3D magnetic field simultaneous localisation and mapping (SLAM) using local anomalies in the magnetic field as a source of position information. These anomalies are due to the presence of ferromagnetic material in the structure of buildings and in objects such as furniture. We represent the magnetic field map using a Gaussian process model and take well-known physical properties of the magnetic field into account. We build local magnetic field maps using three-dimensional hexagonal block tiling. To make our approach computationally tractable we use reduced-rank Gaussian process regression in combination with a Rao–Blackwellised particle filter. We show that it is possible to obtain accurate position and orientation estimates using measurements from a smartphone, and that our approach provides a scalable magnetic SLAM algorithm in terms of both computational complexity and map storage.

The Randomized Dependence Coefficient

David Lopez-Paz, Philipp Hennig, Bernhard Scholköpf, December 2013. (In Advances in Neural Information Processing Systems 27). Lake Tahoe, California, USA.

Abstract▼ URL

We introduce the Randomized Dependence Coefficient (RDC), a measure of non-linear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. RDC is defined in terms of correlation of random non-linear copula projections; it is invariant with respect to marginal distribution transformations, has low computational cost and is easy to implement: just five lines of R code, included at the end of the paper.

On orientation estimation using iterative methods in Euclidean space

Martin A. Skoglund, Zoran Sjanic, Manon Kok, July 2017. (In Proceedings of the 20th International Conference on Information Fusion). Xi'an, China. DOI: 10.23919/ICIF.2017.8009830.

Abstract▼ URL

This paper presents three iterative methods for orientation estimation. The first two are based on iterated Extended Kalman filter (IEKF) formulations with different state representations. The first is using the well-known unit quaternion as state (q-IEKF) while the other is using orientation deviation which we call IMEKF. The third method is based on nonlinear least squares (NLS) estimation of the angular velocity which is used to parametrise the orientation. The results are obtained using Monte Carlo simulations and the comparison is done with the non-iterative EKF and multiplicative EKF (MEKF) as baseline. The result clearly shows that the IMEKF and the NLS-based method are superior to q-IEKF and all three outperform the non-iterative methods.

Learning Stationary Time Series using Gaussian Process with Nonparametric Kernels

Felipe Tobar, Thang D. Bui, Richard E. Turner, Dec 2015. (In Advances in Neural Information Processing Systems 29). Montréal CANADA.

Abstract▼ URL

We introduce the Gaussian Process Convolution Model (GPCM), a two-stage nonparametric generative procedure to model stationary signals as the convolution between a continuous-time white-noise process and a continuous-time linear filter drawn from Gaussian process. The GPCM is a continuous-time nonparametricwindow moving average process and, conditionally, is itself a Gaussian process with a nonparametric kernel defined in a probabilistic fashion. The generative model can be equivalently considered in the frequency domain, where the power spectral density of the signal is specified using a Gaussian process. One of the main contributions of the paper is to develop a novel variational freeenergy approach based on inter-domain inducing variables that efficiently learns the continuous-time linear filter and infers the driving white-noise process. In turn, this scheme provides closed-form probabilistic estimates of the covariance kernel and the noise-free signal both in denoising and prediction scenarios. Additionally, the variational inference procedure provides closed-form expressions for the approximate posterior of the spectral density given the observed data, leading to new Bayesian nonparametric approaches to spectrum estimation. The proposed GPCM is validated using synthetic and real-world signals.

Unsupervised State-Space Modeling Using Reproducing Kernels

Felipe Tobar, Petar M. Djurić, Danilo P. Mandic, 2015. (IEEE Transactions on Signal Processing).

Abstract▼ URL

A novel framework for the design of state-space models (SSMs) is proposed whereby the state-transition function of the model is parametrized using reproducing kernels. The nature of SSMs requires learning a latent function that resides in the state space and for which input-output sample pairs are not available, thus prohibiting the use of gradient-based supervised kernel learning. To this end, we then propose to learn the mixing weights of the kernel estimate by sampling from their posterior density using Monte Carlo methods. We first introduce an offline version of the proposed algorithm, followed by an online version which performs inference on both the parameters and the hidden state through particle filtering. The accuracy of the estimation of the state-transition function is first validated on synthetic data. Next, we show that the proposed algorithm outperforms kernel adaptive filters in the prediction of real-world time series, while also providing probabilistic estimates, a key advantage over standard methods.

High-Dimensional Kernel Regression: A Guide for Practitioners

Felipe Tobar, Danilo P. Mandic, 2015. (In Trends in Digital Signal Processing: A Festschrift in Honour of A.G. Constantinides). Edited by Y. C. Lim, H. K. Kwan, W.-C. Siu. CRC Press.

Design of Positive-Definite Quaternion Kernels

Felipe Tobar, Danilo P. Mandic, 2015. (IEEE Signal Processing Letters).

Abstract▼ URL

Quaternion reproducing kernel Hilbert spaces (QRKHS) have been proposed recently and provide a high-dimensional feature space (alternative to the real-valued multikernel approach) for general kernel-learning applications. The current challenge within quaternion-kernel learning is the lack of general quaternion-valued kernels, which are necessary to exploit the full advantages of the QRKHS theory in real-world problems. This letter proposes a novel way to design quaternion-valued kernels, this is achieved by transforming three complex kernels into quaternion ones and then combining their real and imaginary parts. Building on this general construction, our emphasis is on a new quaternion kernel of polynomial features, which is assessed in the prediction of bodysensor networks applications.

Modelling of Complex Signals using Gaussian Processes

Felipe Tobar, Richard E. Turner, 2015. (In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)).

Abstract▼ URL

In complex-valued signal processing, estimation algorithms require complete knowledge (or accurate estimation) of the second order statistics, this makes Gaussian processes (GP) well suited for modelling complex signals, as they are designed in terms of covariance functions. Dealing with bivariate signals using GPs require four covariance matrices, or equivalently, two complex matrices. We propose a GP-based approach for modelling complex signals, whereby the second-order statistics are learnt through maximum likelihood; in particular, the complex GP approach allows for circularity coefficient estimation in a robust manner when the observed signal is corrupted by (circular) white noise. The proposed model is validated using climate signals, for both circular and noncircular cases. The results obtained open new possibilities for collaboration between the complex signal processing and Gaussian processes communities towards an appealing representation and statistical description of bivariate signals.

Statistical Models for Natural Sounds

Richard E. Turner, 2010. Gatsby Computational Neuroscience Unit, UCL,

Abstract▼ URL

It is important to understand the rich structure of natural sounds in order to solve important tasks, like automatic speech recognition, and to understand auditory processing in the brain. This thesis takes a step in this direction by characterising the statistics of simple natural sounds. We focus on the statistics because perception often appears to depend on them, rather than on the raw waveform. For example the perception of auditory textures, like running water, wind, fire and rain, depends on summary-statistics, like the rate of falling rain droplets, rather than on the exact details of the physical source. In order to analyse the statistics of sounds accurately it is necessary to improve a number of traditional signal processing methods, including those for amplitude demodulation, time-frequency analysis, and sub-band demodulation. These estimation tasks are ill-posed and therefore it is natural to treat them as Bayesian inference problems. The new probabilistic versions of these methods have several advantages. For example, they perform more accurately on natural signals and are more robust to noise, they can also fill-in missing sections of data, and provide error-bars. Furthermore, free-parameters can be learned from the signal. Using these new algorithms we demonstrate that the energy, sparsity, modulation depth and modulation time-scale in each sub-band of a signal are critical statistics, together with the dependencies between the sub-band modulators. In order to validate this claim, a model containing co-modulated coloured noise carriers is shown to be capable of generating a range of realistic sounding auditory textures. Finally, we explored the connection between the statistics of natural sounds and perception. We demonstrate that inference in the model for auditory textures qualitatively replicates the primitive grouping rules that listeners use to understand simple acoustic scenes. This suggests that the auditory system is optimised for the statistics of natural sounds.

Probabilistic Amplitude Demodulation

Richard E. Turner, M Sahani, 2007. (In 7th International Conference on Independent Component Analysis and Signal Separation).

Abstract▼ URL

Auditory scene analysis is extremely challenging. One approach, perhaps that adopted by the brain, is to shape useful representations of sounds on prior knowledge about their statistical structure. For example, sounds with harmonic sections are common and so time-frequency representations are efficient. Most current representations concentrate on the shorter components. Here, we propose representations for structures on longer time-scales, like the phonemes and sentences of speech. We decompose a sound into a product of processes, each with its own characteristic time-scale. This demodulation cascade relates to classical amplitude demodulation, but traditional algorithms fail to realise the representation fully. A new approach, probabilistic amplitude demodulation, is shown to out-perform the established methods, and to easily extend to representation of a full demodulation cascade.

Modeling natural sounds with modulation cascade processes

Richard E. Turner, Maneesh Sahani, 2008. (In nips20). Edited by J. C. Platt, D. Koller, Y. Singer, S. Roweis. mit.

Abstract▼ URL

Natural sounds are structured on many time-scales. A typical segment of speech, for example, contains features that span four orders of magnitude: Sentences (∼1s); phonemes (∼10−1 s); glottal pulses (∼ 10−2s); and formants (∼ 10−3s). The auditory system uses information from each of these time-scales to solve complicated tasks such as auditory scene analysis [1]. One route toward understanding how auditory processing accomplishes this analysis is to build neuroscience-inspired algorithms which solve similar tasks and to compare the properties of these algorithms with properties of auditory processing. There is however a discord: Current machine-audition algorithms largely concentrate on the shorter time-scale structures in sounds, and the longer structures are ignored. The reason for this is two-fold. Firstly, it is a difficult technical problem to construct an algorithm that utilises both sorts of information. Secondly, it is computationally demanding to simultaneously process data both at high resolution (to extract short temporal information) and for long duration (to extract long temporal information). The contribution of this work is to develop a new statistical model for natural sounds that captures structure across a wide range of time-scales, and to provide efficient learning and inference algorithms. We demonstrate the success of this approach on a missing data task.

Statistical inference for single- and multi-band probabilistic amplitude demodulation.

Richard E. Turner, Maneesh Sahani, 2010. (In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)).

Abstract▼ URL

Amplitude demodulation is an ill-posed problem and so it is natural to treat it from a Bayesian viewpoint, inferring the most likely carrier and envelope under probabilistic constraints. One such treatment is Probabilistic Amplitude Demodulation (PAD), which, whilst computationally more intensive than traditional approaches, offers several advantages. Here we provide methods for estimating the uncertainty in the PAD-derived envelopes and carriers, and for learning free-parameters like the time-scale of the envelope. We show how the probabilistic approach can naturally handle noisy and missing data. Finally, we indicate how to extend the model to signals which contain multiple modulators and carriers.

Demodulation as Probabilistic Inference

Richard E. Turner, Maneesh Sahani, 2011. (Transactions on Audio, Speech and Language Processing).

Abstract▼ URL

Demodulation is an ill-posed problem whenever both carrier and envelope signals are broadband and unknown. Here, we approach this problem using the methods of probabilistic inference. The new approach, called Probabilistic Amplitude Demodulation (PAD), is computationally challenging but improves on existing methods in a number of ways. By contrast to previous approaches to demodulation, it satisfies five key desiderata: PAD has soft constraints because it is probabilistic; PAD is able to automatically adjust to the signal because it learns parameters; PAD is user-steerable because the solution can be shaped by user-specific prior information; PAD is robust to broad-band noise because this is modelled explicitly; and PAD’s solution is self-consistent, empirically satisfying a Carrier Identity property. Furthermore, the probabilistic view naturally encompasses noise and uncertainty, allowing PAD to cope with missing data and return error bars on carrier and envelope estimates. Finally, we show that when PAD is applied to a bandpass-filtered signal, the stop-band energy of the inferred carrier is minimal, making PAD well-suited to sub-band demodulation.

Probabilistic amplitude and frequency demodulation

Richard E. Turner, Maneesh Sahani, 2011. (In Advances in Neural Information Processing Systems 24). The MIT Press.

Abstract▼ URL

A number of recent scientific and engineering problems require signals to be decomposed into a product of a slowly varying positive envelope and a quickly varying carrier whose instantaneous frequency also varies slowly over time. Although signal processing provides algorithms for so-called amplitude- and frequency-demodulation (AFD), there are well known problems with all of the existing methods. Motivated by the fact that AFD is ill-posed, we approach the problem using probabilistic inference. The new approach, called probabilistic amplitude and frequency demodulation (PAFD), models instantaneous frequency using an auto-regressive generalization of the von Mises distribution, and the envelopes using Gaussian auto-regressive dynamics with a positivity constraint. A novel form of expectation propagation is used for inference. We demonstrate that although PAFD is computationally demanding, it outperforms previous approaches on synthetic and real signals in clean, noisy and missing data settings.

Decomposing signals into a sum of amplitude and frequency modulated sinusoids using probabilistic inference

Richard E. Turner, Maneesh Sahani, march 2012. (In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on). DOI: 10.1109/ICASSP.2012.6288343. ISSN: 1520-6149.

Abstract▼ URL

There are many methods for decomposing signals into a sum of amplitude and frequency modulated sinusoids. In this paper we take a new estimation based approach. Identifying the problem as ill-posed, we show how to regularize the solution by imposing soft constraints on the amplitude and phase variables of the sinusoids. Estimation proceeds using a version of Kalman smoothing. We evaluate the method on synthetic and natural, clean and noisy signals, showing that it outperforms previous decompositions, but at a higher computational cost.

Decomposing signals into a sum of amplitude and frequency modulated sinusoids using probabilistic inference

Richard E. Turner, Maneesh Sahani, march 2012. (In Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on). DOI: 10.1109/ICASSP.2012.6288343. ISSN: 1520-6149.

Abstract▼ URL

Bayesian Inference for NMR Spectroscopy with Applications to Chemical Quantification

Andrew Gordon Wilson, Yuting Wu, Daniel J. Holland, Sebastian Nowozin, Mick D. Mantle, Lynn F. Gladden, Andrew Blake, 2014. (arXiv preprint arXiv 1402.3580).

Abstract▼ URL

Nuclear magnetic resonance (NMR) spectroscopy exploits the magnetic properties of atomic nuclei to discover the structure, reaction state and chemical environment of molecules. We propose a probabilistic generative model and inference procedures for NMR spectroscopy. Specifically, we use a weighted sum of trigonometric functions undergoing exponential decay to model free induction decay (FID) signals. We discuss the challenges in estimating the components of this general model – amplitudes, phase shifts, frequencies, decay rates, and noise variances – and offer practical solutions. We compare with conventional Fourier transform spectroscopy for estimating the relative concentrations of chemicals in a mixture, using synthetic and experimentally acquired FID signals. We find the proposed model is particularly robust to low signal to noise ratios (SNR), and overlapping peaks in the Fourier transform of the FID, enabling accurate predictions (e.g., 1% error at low SNR) which are not possible with conventional spectroscopy (5% error).