Agent Learning via Early Experience

Update: 2025-10-24

Description

This paper proposes the "early experience" paradigm for training autonomous language agents, bridging the gap between reward-free imitation learning (IL) and reward-dependent reinforcement learning (RL). The approach lets agents learn from their own generated interactions without explicit external rewards, addressing a major challenge in real-world environments where dense feedback is often unavailable. The paper explores two core strategies within this paradigm: Implicit World Modeling (IWM), in which the agent predicts future states to internalize environmental dynamics, and Self-Reflection (SR), in which the agent compares its own actions to expert demonstrations and generates rationales for the superior choices. Experimental results across benchmarks including WebShop and ScienceWorld consistently show that training with early experience outperforms traditional imitation learning and provides a superior starting point, or "warm start," for subsequent reinforcement learning stages, even with reduced amounts of expert data.
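The two strategies differ mainly in how they turn reward-free rollouts into supervised training targets. Below is a minimal Python sketch of that data construction, assuming hypothetical `env.step`, `agent.propose_action`, and `agent.reflect` interfaces and illustrative prompt templates; none of these names come from the paper.

```python
# Illustrative sketch (not the paper's code) of how "early experience"
# rollouts could become supervised examples. `env` and `agent` are
# hypothetical objects; prompt wording is for illustration only.

def build_iwm_examples(env, agent, expert_states, k_alternatives=3):
    """Implicit World Modeling: at states along expert trajectories, let
    the agent propose its own actions, execute them, and record the
    observed next state as the prediction target."""
    examples = []
    for state in expert_states:
        for _ in range(k_alternatives):
            action = agent.propose_action(state)   # agent's own action
            next_state = env.step(state, action)   # observed outcome
            examples.append({
                "input": f"State: {state}\nAction: {action}\n"
                         "Predict the next state.",
                "target": str(next_state),
            })
    return examples

def build_sr_examples(env, agent, expert_pairs):
    """Self-Reflection: contrast the agent's action with the expert's at
    the same state, generate a rationale for why the expert action is
    preferable, and train on the rationale plus the expert action."""
    examples = []
    for state, expert_action in expert_pairs:
        alt_action = agent.propose_action(state)
        alt_outcome = env.step(state, alt_action)
        rationale = agent.reflect(state, alt_action,
                                  alt_outcome, expert_action)
        examples.append({
            "input": f"State: {state}\nWhich action should be taken, and why?",
            "target": f"{rationale}\nAction: {expert_action}",
        })
    return examples
```

In both sketches the agent's own actions supply the "experience," while supervision comes from observed next states (IWM) or from generated rationales anchored to expert actions (SR), so no external reward signal is needed at any point.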



Enoch H. Kang