Ushnish Sengupta

Categories: 17 Dec 2018

I am a Marie Sklodowska-Curie Early Stage Researcher in the MAGISTER consortium, which seeks to utilize machine learning to understand and predict thermoacoustic oscillations in aircraft engines and gas turbines. My job, as I see it, is to serve as a liaison between the probabilistic machine learning group led by my PhD supervisor Professor Carl Rasmussen and the flow instability and adjoint optimization group led by my advisor Professor Matthew Juniper. We are currently looking at data from both small-scale and large-scale combustors to explore how ML techniques can use this data to enable both better designs and safe operation of these machines.

Broadly speaking, I am interested in high-stakes applications of probabilistic machine learning techniques. The consequences of failure for an ML algorithm that monitors the sensors of an aircraft engine are very different from those for one that recognizes faces in social media photos or recommends music. These critical applications often place a high premium on principled uncertainty estimates, applicability to limited datasets and interpretability, all of which probabilistic machine learning techniques like Gaussian processes can offer. Marrying these completely data-driven techniques with physical modeling for more robust predictions and sensible extrapolations is also something that intrigues me.

I am also a dilettante computational chemist and am curious about how probabilistic machine learning can improve our ability to predict protein aggregation and protein dynamics. Protein aggregates, of course, play both functional and pathological roles in the human body, while protein dynamics is crucial to the functioning of many enzymes. Compared to the static structure prediction problem, however, both aggregation and dynamics are harder to characterize experimentally and lack extensive databases. Can Bayesian techniques shine in this data-limited regime and achieve results comparable to expensive simulations that consume many thousands of supercomputer core-hours?