AXRP - the AI X-risk Research Podcast
38.3 - Erik Jenner on Learned Look-Ahead

Published: 2024-12-12
Description

Lots of people in the AI safety space worry about models being able to make deliberate, multi-step plans. But can we already see this in existing neural nets? In this episode, I talk with Erik Jenner about his work looking at internal look-ahead within chess-playing neural networks.

Patreon: https://www.patreon.com/axrpodcast

Ko-fi: https://ko-fi.com/axrpodcast

The transcript: https://axrp.net/episode/2024/12/12/episode-38_3-erik-jenner-learned-look-ahead.html

FAR.AI: https://far.ai/

FAR.AI on X (aka Twitter): https://x.com/farairesearch

FAR.AI on YouTube: https://www.youtube.com/@FARAIResearch

The Alignment Workshop: https://www.alignment-workshop.com/

 

Topics we discuss, and timestamps:

00:57 - How chess neural nets look into the future

04:29 - The dataset and basic methodology

05:23 - Testing for branching futures?

07:57 - Which experiments demonstrate what

10:43 - How the ablation experiments work

12:38 - Effect sizes

15:23 - X-risk relevance

18:08 - Follow-up work

21:29 - How much planning does the network do?

 

Research we mention:

Evidence of Learned Look-Ahead in a Chess-Playing Neural Network: https://arxiv.org/abs/2406.00877

Understanding the learned look-ahead behavior of chess neural networks (a development of the follow-up research Erik mentions in the episode): https://openreview.net/forum?id=Tl8EzmgsEp

Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT: https://arxiv.org/abs/2310.07582

 

Episode art by Hamish Doodles: hamishdoodles.com
