Machine learning (ML) has revolutionized many aspects of our daily lives with incredible breakthroughs in computer vision, speech analysis and natural language processing. However, modern ML techniques such as deep learning are notorious for being data hungry (requiring large amounts of labelled data to train) and difficult to interpret. In areas involving scientific discovery, such large amounts of data may not be available and, more importantly, the ability to understand and interpret what ML models learn is a must. In this project we will investigate different approaches to combining ML with mechanistic models in order to make ML methods more data efficient and to provide a better understanding of the world through their mechanistic counterparts. We will focus on three different approaches involving state-space models, emulator-based approaches and (stochastic) differential equations. Applications where such approaches are not only beneficial but necessary include ecology, biosecurity, economics and physics; in this project we aim to address problems in climate modelling through our collaborations with CSIRO’s Oceans and Atmosphere business unit.
It is expected that the student, under the guidance of their university and Data61 supervisors, will develop new frameworks for learning and inference in such hybrid models that are scalable and efficient. For this purpose, we will build upon the supervisory team’s strong expertise and track record in probabilistic inference, statistics and machine learning. The outcomes of this project will not only have an impact in machine learning and statistics but also have the potential to revolutionize significant areas of science such as those mentioned above, where the combination of physical models with data-driven approaches is key. The student is expected to develop the research and methods for the above problems, publish and present the corresponding outcomes at top machine learning venues (such as NeurIPS, ICML, ICLR, AISTATS) and contribute to specific applications involving our collaborations with CSIRO’s Oceans and Atmosphere business unit. The student will also be given the opportunity to work alongside our collaborators at The University of Warwick (UK) and EURECOM (France), who have been working on similar problems.
- Background in computer science, statistics or a related quantitative field
- Excellent knowledge of machine learning techniques (e.g. popular supervised and unsupervised learning methods and fundamental concepts such as generalization, regularization, overfitting)
- Expertise in high-level programming languages such as Python
- Desirable: Knowledge of probabilistic inference techniques and deep learning frameworks such as PyTorch and TensorFlow