#98 Fusing Statistical Physics, Machine Learning & Adaptive MCMC, with Marylou Gabrié
Description
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
How does the world of statistical physics intertwine with machine learning, and what groundbreaking insights can this fusion bring to the field of artificial intelligence?
In this episode, we delve into these intriguing questions with Marylou Gabrié, an assistant professor at CMAP, École Polytechnique in Paris. Having completed her PhD in physics at École Normale Supérieure, Marylou ventured to New York City for a joint postdoctoral appointment at New York University’s Center for Data Science and the Flatiron Institute’s Center for Computational Mathematics.
As you’ll hear, her research is not just about theoretical exploration; it also extends to the practical adaptation of machine learning techniques in scientific contexts, particularly where data is scarce.
In this conversation, we’ll traverse the landscape of Marylou's research, discussing her recent publications, her innovative approaches to machine learning challenges, the latest MCMC advances, and ML-assisted scientific computing.
Beyond that, get ready to discover the person behind the science – her inspirations, aspirations, and maybe even what she does when not decoding the complexities of machine learning algorithms!
Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie and Cory Kiser.
Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)
Takeaways
- Developing methods that leverage machine learning for scientific computing can provide valuable insights into high-dimensional probabilistic models.
- Generative models can be used to speed up Markov Chain Monte Carlo (MCMC) methods and improve the efficiency of sampling from complex distributions.
- The Adaptive Monte Carlo algorithm augmented with normalizing flows offers a powerful approach for sampling from multimodal distributions.
- Scaling the algorithm to higher dimensions and handling discrete parameters are ongoing challenges in the field.
- Open-source packages, such as flowMC, provide valuable tools for researchers and practitioners to adopt and contribute to the development of new algorithms.
- How well an algorithm scales depends on the number of parameters and the amount of data. Some methods work well with a few hundred parameters but run into difficulties beyond that.
- Generative models, such as normalizing flows, offer benefits in the Bayesian context, including amortization and the ability to adjust the model with new data.
- Machine learning and MCMC are complementary and should be used together rather than replacing one another.
- Machine learning can assist scientific computing in the context of scarce data, where expensive experiments or numerics are required.
- The future of MCMC lies in the exploration of sampling multimodal distributions and understanding resource limitations in scientific research.
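To make the takeaways about adaptive MCMC more concrete, here is a minimal sketch of the general idea: several walkers alternate a local random-walk Metropolis step with a global independence Metropolis-Hastings step, and the global proposal is periodically refit to the samples collected so far. This is an illustration under loud assumptions, not the method from the paper or the flowMC API: a single Gaussian stands in for the normalizing flow, the target is a toy one-dimensional mixture of two Gaussians, and every name (`log_target`, `run_sampler`, `refit_every`, and so on) is hypothetical.

```python
import math
import random


def log_target(x):
    """Unnormalized log density of an equal mixture of N(-3, 1) and N(3, 1)."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_gauss(x, mu, sigma):
    """Log density of N(mu, sigma^2), up to an additive constant."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)


def run_sampler(n_iters=2000, n_walkers=20, refit_every=50, seed=0):
    rng = random.Random(seed)
    # Assumption: walkers are initialized in both modes of the target.
    walkers = [-3.0 if i % 2 == 0 else 3.0 for i in range(n_walkers)]
    mu, sigma = 0.0, 1.0  # deliberately poor initial global proposal
    samples = []
    n_global, n_accepted = 0, 0

    for it in range(n_iters):
        for i, x in enumerate(walkers):
            # Local step: random-walk Metropolis explores within a mode.
            y = x + rng.gauss(0.0, 0.5)
            if math.log(1.0 - rng.random()) < log_target(y) - log_target(x):
                x = y

            # Global step: independence Metropolis-Hastings with the
            # learned proposal q (a Gaussian standing in for a flow).
            y = rng.gauss(mu, sigma)
            log_alpha = (log_target(y) - log_target(x)) + (
                log_gauss(x, mu, sigma) - log_gauss(y, mu, sigma)
            )
            n_global += 1
            if math.log(1.0 - rng.random()) < log_alpha:
                x = y
                n_accepted += 1
            walkers[i] = x

        samples.extend(walkers)

        # Adaptation: refit the proposal to the most recent samples.
        # Adapting on the chain's own history breaks the Markov property,
        # so a careful implementation would handle this explicitly.
        if (it + 1) % refit_every == 0:
            recent = samples[-refit_every * n_walkers:]
            mu = sum(recent) / len(recent)
            var = sum((s - mu) ** 2 for s in recent) / len(recent)
            sigma = max(math.sqrt(var), 1e-3)

    return samples, n_accepted / n_global


if __name__ == "__main__":
    draws, rate = run_sampler()
    frac_right = sum(1 for d in draws if d > 0) / len(draws)
    print(f"global acceptance rate: {rate:.2f}")
    print(f"fraction of draws in the right-hand mode: {frac_right:.2f}")
```

In this toy setting the refit Gaussian becomes wide enough to cover both modes, which lets the global step move walkers between them; a normalizing flow plays the same role for targets whose shape a Gaussian cannot capture.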
Links from the show:
- Marylou’s website: https://marylou-gabrie.github.io/
- Marylou on LinkedIn: https://www.linkedin.com/in/marylou-gabri%C3%A9-95366172/
- Marylou on Twitter: https://twitter.com/marylougab
- Marylou on GitHub: https://github.com/marylou-gabrie
- Marylou on Google Scholar: https://scholar.google.fr/citations?hl=fr&user=5m1DvLwAAAAJ
- Adaptive Monte Carlo augmented with normalizing flows: https://arxiv.org/abs/2105.12603
- Normalizing-flow enhanced sampling package for probabilistic inference: https://flowmc.readthedocs.io/en/main/
- Flow-based generative models for Markov chain Monte Carlo in lattice field theory: https://journals.aps.org/prd/abstract/10.1103/PhysRevD.100.034515
- Boltzmann generators – Sampling equilibrium states of many-body systems with deep learning: https://www.science.org/doi/10.1126/science.aaw1147
- Solving Statistical Mechanics Using Variational Autoregressive Networks: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.122.080602
- An example of discrete version of similar algorithms: https://journals.aps.org/prresearch/abstract/10.1103/PhysRevResearch.3.L042024
- Grothendieck's conference: https://www.youtube.com/watch?v=ZW9JpZXwGXc
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.