AI Breakdown
Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents

Update: 2025-09-08

Description

In this episode, we discuss "Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents" by Davide Paglieri, Bartłomiej Cupiał, Jonathan Cook, Ulyana Piterbarg, Jens Tuyls, Edward Grefenstette, Jakob Nicolaus Foerster, Jack Parker-Holder, and Tim Rocktäschel. The paper introduces a framework that lets large language model agents decide dynamically when to plan during task execution, rather than planning at every step or never, which improves both efficiency and performance. The authors propose a two-stage training process that combines supervised fine-tuning and reinforcement learning to develop this capability. Experiments show that these dynamically planning agents are more sample-efficient, are better at achieving complex goals, and can be guided by human-written plans.

agibreakdown