How to Make More Reliable Predictions in Machine Learning with Brian Lucena

Update: 2025-03-06
Description

In this episode, we sit down with Brian Lucena, Principal at Numeristical and an experienced educator, consultant, and open-source contributor. Brian has advised companies of all sizes on applying modern machine learning techniques and is the creator of popular Python packages like StructureBoost, ML-Insights, and SplineCalib. He has also taught at UC Berkeley, Brown University, and USF, bringing a unique mix of academic depth and real-world ML expertise.

Today, we explore how to make more reliable predictions in machine learning. From the dominance of gradient boosting for tabular data to the power of probabilistic regression and uncertainty quantification, Brian shares expert insights into building trustworthy ML models. We also dive into probability calibration, model drift, and best practices for ensuring model reliability in production.

Whether you're an ML engineer, data scientist, or business leader looking to improve your AI models, this episode is packed with practical takeaways you won’t want to miss.

Key Topics Covered

✅ Gradient Boosting vs. Deep Learning – Why decision trees still dominate tabular data and structured business problems.
✅ Probabilistic Regression – Moving beyond point estimates to provide probability distributions and confidence intervals.
✅ Uncertainty Quantification – Understanding the limits of machine learning predictions and why it matters.
✅ Probability Calibration – How to ensure your model’s confidence scores are truly reliable.
✅ Handling Model Drift – Strategies to maintain model performance in a changing world.
✅ Real-World Use Cases – Applications in finance, healthcare, risk modeling, and business decision-making.

Resources & Tools Mentioned

  • Brian’s YouTube Channel: https://www.youtube.com/c/numeristical
  • Brian’s LinkedIn: https://www.linkedin.com/in/brianlucena/
  • 🛠️ StructureBoost – Brian’s open-source package for structured categorical variables in gradient boosting: https://github.com/numeristical/structureboost
  • 📦 ML-Insights – Tools for better understanding ML models: https://github.com/numeristical/introspective
  • 🔗 SplineCalib – A library for improving probability calibration: https://github.com/numeristical/splinecalib
  • 📌 NGBoost – A gradient boosting approach for probabilistic regression: https://stanfordmlgroup.github.io/projects/ngboost/
  • 🔗 GitHub for NGBoost: https://github.com/stanfordmlgroup/ngboost
  • 📌 XGBoost – A powerful gradient boosting framework: https://github.com/dmlc/xgboost
  • 📌 CatBoost – Gradient boosting with native support for categorical features: https://github.com/catboost/catboost
  • 📌 LightGBM – A fast, efficient gradient boosting library: https://github.com/microsoft/LightGBM
  • 📊 PyMC – A Bayesian probabilistic programming library for uncertainty modeling: https://github.com/pymc-devs/pymc

Memorable Quotes

💬 "Businesses don’t just want a number—they need to understand a range of possible outcomes. That’s where probabilistic regression makes all the difference."

💬 "One of the biggest challenges in real-world ML is that the world doesn’t stay the same—models can drift, and retraining isn’t always the best solution."

💬 "Gradient boosting still outperforms deep learning for structured data because it handles sharp decision boundaries better."

This episode was sponsored by:

🎤 ODSC East 2025 – The Leading AI Builders Conference – https://odsc.com/boston/
Join us from May 13th to 15th in Boston for hands-on workshops, training sessions, and cutting-edge AI talks covering generative AI, LLMOps, and AI-driven automation.
