Reasoning with Sampling: Your Base Model is Smarter Than You Think
Update: 2025-10-23
Description
In this episode, we discuss Reasoning with Sampling: Your Base Model is Smarter Than You Think by Aayush Karan, Yilun Du. The paper proposes a novel iterative sampling algorithm based on Markov chain Monte Carlo techniques that enhances reasoning abilities of base large language models at inference time without additional training. This method significantly improves performance on multiple reasoning benchmarks, matching or surpassing results from reinforcement learning fine-tuning. Additionally, the approach maintains sample diversity and does not rely on curated datasets or verifiers, making it broadly applicable.
Comments
In Channel



