#115 Using Time Series to Estimate Uncertainty, with Nate Haines
Description
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag ;)
Takeaways:
- State space models and traditional time series models are well-suited to forecast loss ratios in the insurance industry, although actuaries have been slow to adopt modern statistical methods.
- Working with limited data is a challenge, but informed priors and hierarchical models can help improve the modeling process.
- Bayesian model stacking blends the predictions of different models, taking the best of both worlds (or of all of them, when there are more than two models).
- Model comparison is done using out-of-sample performance metrics, such as the expected log pointwise predictive density (ELPD). Brute-force leave-future-out cross-validation is often used due to the time-series nature of the data.
- Stacking or averaging models are trained on out-of-sample performance metrics to determine the weights for blending the predictions. Model stacking can be a powerful approach for combining predictions from candidate models. Hierarchical stacking in particular is useful when weights are assumed to vary according to covariates.
- BayesBlend is a Python package developed by Ledger Investing that simplifies the implementation of stacking models, including pseudo-Bayesian model averaging, stacking, and hierarchical stacking.
- Evaluating the performance of time series models requires considering multiple metrics, including log-likelihood-based metrics like ELPD, as well as more absolute metrics like RMSE and mean absolute error.
- Using robust variants of metrics like ELPD can help address issues with extreme outliers, for example t-distribution estimators of ELPD as opposed to sample sum/mean estimators.
- It is important to evaluate model performance from different perspectives and consider the trade-offs between different metrics. Evaluating models based solely on traditional metrics can limit understanding and trust in the model. Consider additional factors such as interpretability, maintainability, and productionization.
- Simulation-based calibration (SBC) is a valuable tool for assessing parameter estimation and model correctness. It allows for the interpretation of model parameters and the identification of coding errors.
- In industries like insurance, where regulations may restrict model choices, classical statistical approaches still play a significant role. However, there is potential for Bayesian methods and generative AI in certain areas.
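To make the takeaways on ELPD, pseudo-Bayesian model averaging, and stacking more concrete, here is a minimal sketch in plain Python. It does not use BayesBlend or reproduce its API. The two models, their pointwise log predictive densities, and all function names are hypothetical and chosen only for illustration. The pseudo-BMA weights are taken proportional to exp(ELPD), and the stacking weight for two models is found with a simple grid search rather than a proper optimizer.

```python
import math

def elpd(pointwise_lpd):
    """Sum of pointwise out-of-sample log predictive densities."""
    return sum(pointwise_lpd)

def pseudo_bma_weights(elpds):
    """Weights proportional to exp(ELPD), computed as a stable softmax."""
    top = max(elpds)  # subtract the max to avoid underflow
    unnorm = [math.exp(e - top) for e in elpds]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def mixture_score(w, lpd_a, lpd_b):
    """Sum of log densities of the blend w * p_a + (1 - w) * p_b."""
    return sum(
        math.log(w * math.exp(a) + (1.0 - w) * math.exp(b))
        for a, b in zip(lpd_a, lpd_b)
    )

def stacking_weight_two_models(lpd_a, lpd_b, grid_size=101):
    """Grid search for the weight on model A that maximizes the blend's score."""
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return max(candidates, key=lambda w: mixture_score(w, lpd_a, lpd_b))

# Hypothetical held-out log predictive densities (e.g. from leave-future-out CV).
lpd_state_space = [-1.2, -0.9, -1.5, -1.1]
lpd_arima = [-1.0, -1.4, -1.3, -1.6]

elpds = [elpd(lpd_state_space), elpd(lpd_arima)]
bma_weights = pseudo_bma_weights(elpds)
stack_w = stacking_weight_two_models(lpd_state_space, lpd_arima)

print("ELPDs:", elpds)
print("pseudo-BMA weights:", bma_weights)
print("stacking weight on state space model:", stack_w)
```

The difference between the two schemes is that pseudo-BMA scores each model on its own, while stacking scores the blended prediction directly, so its weights can favor a model that is weaker overall but strong where the others are weak.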
Chapters:
00:00 Introduction to Bayesian Modeling in Insurance
13:00 Time Series Models and Their Applications
30:51 Bayesian Model Averaging Explained
56:20 Impact of External Factors on Forecasting
01:25:03 Future of Bayesian Modeling and AI
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser, Julio, Edvin Saveljev, Frederick Ayala, Jeffrey Powell, Gal Kampel, Adan Romero, Will Geary, Blake Walters, Jonathan Morgan and Francesco Madrisotti.
Links from the show:
- Nate’s website: http://haines-lab.com/
- Nate on GitHub: https://github.com/Nathaniel-Haines
- Nate on LinkedIn: https://www.linkedin.com/in/nathaniel-haines-216049101/
- Nate on Twitter: https://x.com/nate__haines
- Nate on Google Scholar: https://scholar.google.com/citations?user=lg741SgAAAAJ
- LBS #14 Hidden Markov Models & Statistical Ecology, with Vianey Leos-Barajas: https://learnbayesstats.com/episode/14-hidden-markov-models-statistical-ecology-with-vianey-leos-barajas/
- LBS #107 Amortized Bayesian Inference with Deep Neural Networks, with Marvin Schmitt: https://learnbayesstats.com/episode/107-amortized-bayesian-inference-deep-neural-networks-marvin-schmitt/
- LBS #109 Prior Sensitivity Analysis, Overfitting & Model Selection, with Sonja Winter: https://learnbayesstats.com/episode/109-prior-sensitivity-analysis-overfitting-model-selection-sonja-winter/
- BayesBlend – Easy Model Blending: https://arxiv.org/abs/2405.00158
- BayesBlend documentation: https://ledger-investing-bayesblend.readthedocs-hosted.com/en/latest/
- SBC paper: https://arxiv.org/abs/1804.06788
- Isaac Asimov’s Foundation (Hari Seldon): https://en.wikipedia.org/wiki/Hari_Seldon
- StanCon 2023 talk on Ledger’s Bayesian modeling workflow: https://github.com/stan-dev/stancon2023/blob/main/Nathaniel-Haines/slides.pdf
- Ledger’s Bayesian modeling workflow: https://arxiv.org/abs/2407.14666v1
- More on Ledger Investing: https://www.ledgerinvesting.com/about-us
Transcript
This is an automatic transcript and may therefore contain errors. Please get in touch if you're willing to correct them.