#104 Automated Gaussian Processes & Sequential Monte Carlo, with Feras Saad
Description
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
GPs are extremely powerful… but hard to handle. One of the bottlenecks is learning the appropriate kernel. What if you could learn the structure of GP kernels automatically? Sounds really cool, but also a bit futuristic, doesn’t it?
Well, think again, because in this episode, Feras Saad will teach us how to do just that! Feras is an Assistant Professor in the Computer Science Department at Carnegie Mellon University. He received his PhD in Computer Science from MIT, and, most importantly for our conversation, he’s the creator of AutoGP.jl, a Julia package for automatic Gaussian process modeling.
Feras discusses the implementation of AutoGP, how it scales, what you can do with it, and how you can integrate its outputs in your models.
Finally, Feras provides an overview of Sequential Monte Carlo and its usefulness in AutoGP, highlighting the ability of SMC to incorporate new data in a streaming fashion and explore multiple modes efficiently.
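To make the streaming idea concrete, here is a minimal sketch of SMC in Python with NumPy. It is not AutoGP's implementation: it assumes a fixed RBF kernel whose only unknown is a lengthscale, uses synthetic data, and omits the MCMC rejuvenation moves that a full sampler would add after resampling. The names `gp_log_ml`, `ells`, and the batch size of 20 are illustrative choices, not part of any library.

```python
import numpy as np

def gp_log_ml(x, y, ell, noise=0.1):
    """Log marginal likelihood of a zero-mean GP with an RBF kernel."""
    d = x[:, None] - x[None, :]
    K = np.exp(-0.5 * (d / ell) ** 2) + noise**2 * np.eye(len(x))
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L, y)  # a @ a equals y^T K^{-1} y
    return -0.5 * a @ a - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * 3 * x) + 0.1 * rng.standard_normal(x.size)  # synthetic data

n_particles = 100
ells = rng.lognormal(mean=-2.0, sigma=1.0, size=n_particles)  # particles from the prior
log_w = np.zeros(n_particles)    # log importance weights
prev_ll = np.zeros(n_particles)  # log-likelihood of the data seen so far (none yet)

for t in range(20, x.size + 1, 20):  # observations arrive in batches of 20
    ll = np.array([gp_log_ml(x[:t], y[:t], ell) for ell in ells])
    log_w += ll - prev_ll  # incremental weight: p(y_new | y_old, particle)
    prev_ll = ll
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    ess = 1.0 / np.sum(w**2)  # effective sample size
    if ess < n_particles / 2:  # resample only when the weights degenerate
        idx = rng.choice(n_particles, size=n_particles, p=w)
        ells, prev_ll = ells[idx], prev_ll[idx]
        log_w = np.zeros(n_particles)

w = np.exp(log_w - log_w.max())
w /= w.sum()
post_mean = float(np.sum(w * ells))  # weighted posterior mean of the lengthscale
print(post_mean)
```

Each new batch only changes the weights of the existing particles, so earlier work is reused rather than restarted, and several distinct particles can survive when more than one explanation fits the data.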
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell and Gal Kampel.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)
Takeaways:
- AutoGP is a Julia package for automatic Gaussian process modeling that learns the structure of GP kernels automatically.
- It addresses the challenge of making structural choices for covariance functions by using a symbolic language and a recursive grammar to infer the expression of the covariance function given the observed data.
- AutoGP incorporates sequential Monte Carlo inference to handle scalability and uncertainty in structure learning.
- The package is implemented in Julia using the Gen probabilistic programming language, which provides support for sequential Monte Carlo and involutive MCMC.
- Sequential Monte Carlo (SMC) and involutive MCMC are used in AutoGP to infer the structure of the model.
- Integrating probabilistic models with language models can improve interpretability and trustworthiness in data-driven inferences.
- Challenges in Bayesian workflows include the need for automated model discovery and scalability of inference algorithms.
- Future developments in probabilistic reasoning systems include unifying people around data-driven inferences and improving the scalability and configurability of inference algorithms.
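The grammar idea in the takeaways above can be sketched in a few lines of Python with NumPy. This is an illustration only, not AutoGP.jl's grammar or API: the three base kernels, the log-normal priors, the tuple encoding of expressions, and the function names are all assumptions made for the example. Candidate structures are drawn from the prior and weighted by their GP marginal likelihood, which is a crude stand-in for the SMC and MCMC machinery the package actually uses.

```python
import numpy as np

LEAVES = ("lin", "rbf", "per")  # hypothetical base kernels
OPS = ("+", "*")                # composition rules of the grammar

def sample_kernel(rng, depth=2, p_leaf=0.5):
    """Draw a kernel expression (nested tuples) from a small recursive grammar."""
    if depth == 0 or rng.random() < p_leaf:
        name = LEAVES[rng.integers(len(LEAVES))]
        if name == "lin":
            return ("lin",)
        ell = float(rng.lognormal(mean=-1.0, sigma=0.5))
        if name == "rbf":
            return ("rbf", ell)
        period = float(rng.lognormal(mean=-1.0, sigma=0.5))
        return ("per", ell, period)
    op = OPS[rng.integers(len(OPS))]
    return (op, sample_kernel(rng, depth - 1, p_leaf), sample_kernel(rng, depth - 1, p_leaf))

def evaluate(expr, x):
    """Turn a kernel expression into a covariance matrix on inputs x."""
    head = expr[0]
    d = x[:, None] - x[None, :]
    if head == "lin":
        return np.outer(x, x)
    if head == "rbf":
        return np.exp(-0.5 * (d / expr[1]) ** 2)
    if head == "per":
        return np.exp(-2.0 * np.sin(np.pi * d / expr[2]) ** 2 / expr[1] ** 2)
    left, right = evaluate(expr[1], x), evaluate(expr[2], x)
    return left + right if head == "+" else left * right

def log_marginal_likelihood(expr, x, y, noise=0.1):
    """Log evidence of y under a zero-mean GP with the given kernel expression."""
    n = len(x)
    K = evaluate(expr, x) + noise**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L, y)
    return -0.5 * a @ a - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * 4 * x) + 0.1 * rng.standard_normal(x.size)  # synthetic data

candidates = [sample_kernel(rng) for _ in range(200)]
log_w = np.array([log_marginal_likelihood(k, x, y) for k in candidates])
weights = np.exp(log_w - log_w.max())
weights /= weights.sum()  # self-normalized weights over candidate structures
best = candidates[int(np.argmax(weights))]
print(best)
```

Because sums and products of valid kernels are themselves valid kernels, every expression the grammar produces yields a usable covariance matrix, which is what makes searching over structures well defined.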
Chapters:
00:00 Introduction to AutoGP
26:28 Automatic Gaussian Process Modeling
45:05 AutoGP: Automatic Discovery of Gaussian Process Model Structure
53:39 Applying AutoGP to New Settings
01:09:27 The Biggest Hurdle in the Bayesian Workflow
01:19:14 Unifying People Around Data-Driven Inferences
Links from the show:
- Sign up to the Fast & Efficient Gaussian Processes modeling webinar: https://topmate.io/alex_andorra/901986
- Feras’ website: https://www.cs.cmu.edu/~fsaad/
- LBS #3.1, What is Probabilistic Programming & Why use it, with Colin Carroll: https://learnbayesstats.com/episode/3-1-what-is-probabilistic-programming-why-use-it-with-colin-carroll/
- LBS #3.2, How to use Bayes in industry, with Colin Carroll: https://learnbayesstats.com/episode/3-2-how-to-use-bayes-in-industry-with-colin-carroll/
- LBS #21, Gaussian Processes, Bayesian Neural Nets & SIR Models, with Elizaveta Semenova: https://learnbayesstats.com/episode/21-gaussian-processes-bayesian-neural-nets-sir-models-with-elizaveta-semenova/
- LBS #29, Model Assessment, Non-Parametric Models, And Much More, with Aki Vehtari: https://learnbayesstats.com/episode/model-assessment-non-parametric-models-aki-vehtari/
- LBS #63, Media Mix Models & Bayes for Marketing, with Luciano Paz: https://learnbayesstats.com/episode/63-media-mix-models-bayes-marketing-luciano-paz/
- LBS #83, Multilevel Regression, Post-Stratification & Electoral Dynamics, with Tarmo Jüristo: https://learnbayesstats.com/episode/83-multilevel-regression-post-stratification-electoral-dynamics-tarmo-juristo/
- AutoGP.jl, A Julia package for learning the covariance structure of Gaussian process time series models: https://probsys.github.io/AutoGP.jl/stable/
- Sequential Monte Carlo Learning for Time Series Structure Discovery: https://arxiv.org/abs/2307.09607
- Street Epistemology: https://www.youtube.com/@magnabosco210
- You Are Not So Smart podcast: https://youarenotsosmart.com/podcast/
- How Minds Change: https://www.davidmcraney.com/howmindschangehome
- Josh Tenenbaum's lectures on computational cognitive science: https://www.youtube.com/playlist?list=PLUl4u3cNGP61RTZrT3MIAikp2G5EEvTjf
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.