Chasing Real AGI: Inside ARC Prize 2025 with Chollet & Knoop

Update: 2025-04-03

Description

In this episode, we dive deep into the race toward true machine intelligence, AGI benchmarks, test-time adaptation, and program synthesis with AI researcher (and philosopher) François Chollet, creator of Keras and the ARC-AGI benchmark, and Mike Knoop, co-founder of Zapier and now co-founder, with François, of both the ARC Prize and the research lab Ndea. With the launch of ARC Prize 2025 and ARC-AGI 2, they explain why existing LLMs fall short on true intelligence tests, how new models like O3 mark a step change in capabilities, and what it will really take to reach AGI.

We cover everything from the technical evolution of ARC 1 to ARC 2, the shift toward test-time reasoning, and the role of program synthesis as a foundation for more general intelligence. The conversation also explores the philosophical underpinnings of intelligence, the structure of the ARC Prize, and the motivation behind launching Ndea, a new AGI research lab that aims to build a "factory for rapid scientific advancement." Whether you're deep in the AI research trenches or just fascinated by where this is all headed, this episode offers clarity and inspiration.

Ndea

Website - https://ndea.com

X/Twitter - https://x.com/ndea

ARC Prize

Website - https://arcprize.org

X/Twitter - https://x.com/arcprize

François Chollet

LinkedIn - https://www.linkedin.com/in/fchollet

X/Twitter - https://x.com/fchollet

Mike Knoop

X/Twitter - https://x.com/mikeknoop

FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck

(00:00) Intro

(01:05) Introduction to ARC Prize 2025 and ARC-AGI 2

(02:07) What is ARC and how it differs from other AI benchmarks

(02:54) Why current models struggle with fluid intelligence

(03:52) Shift from static LLMs to test-time adaptation

(04:19) What ARC measures vs. traditional benchmarks

(07:52) Limitations of brute-force scaling in LLMs

(13:31) Defining intelligence: adaptation and efficiency

(16:19) How O3 achieved a massive leap in ARC performance

(20:35) Speculation on O3's architecture and test-time search

(22:48) Program synthesis: what it is and why it matters

(28:28) Combining LLMs with search and synthesis techniques

(34:57) The ARC Prize structure: efficiency track, private vs. public

(42:03) Open source as a requirement for progress

(44:59) What's new in ARC-AGI 2 and human benchmark testing

(48:14) Capabilities ARC-AGI 2 is designed to test

(49:21) When will ARC-AGI 2 be saturated? AGI timelines

(52:25) Founding of Ndea and why now

(54:19) Vision beyond AGI: a factory for scientific advancement

(56:40) What Ndea is building and why it's different from LLM labs

(58:32) Hiring and remote-first culture at Ndea

(59:52) Closing thoughts and the future of AI research

In Channel

Trino, Iceberg and the Battle for the Lakehouse | Justin Borgman, CEO, Starburst

2025-01-30 · 01:06:24

How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek

2025-10-16 · 01:16:04

Sonnet 4.5 & the AI Plateau Myth — Sholto Douglas (Anthropic)

2025-10-02 · 01:10:03

Goodbye Excel? AI Agents for Self-Driving Finance – Pigment CEO

2025-09-11 · 01:05:46

AI Video’s Wild Year – Runway CEO on What’s Next

2025-09-04 · 01:04:57

How to Build a Beloved AI Product - Granola CEO Chris Pedregal

2025-08-21 · 01:08:28

Anthropic's Surprise Hit: How Claude Code Became an AI Coding Powerhouse

2025-08-07 · 01:00:16

Ex‑DeepMind Researcher Misha Laskin on Enterprise Super‑Intelligence | Reflection AI

2025-07-17 · 01:06:29

The Rise of Agentic Commerce — Emily Glassberg Sands (Stripe)

2025-07-10 · 01:15:14

AI Engineering Revolution: Winners, Chaos & What’s Next | FirstMark

2025-07-03 · 49:53

Guillermo Rauch: Why Software Development Will Never Be the Same

2025-06-26 · 01:45:40

Inside Canva’s $3B ARR AI Design Rocketship — CTO Brendan Humphreys on Magic Studio & Canva Code

2025-06-20 · 56:38

GitHub CEO: The AI Coding Gold Rush, Vibe Coding & Cursor

2025-06-12 · 01:04:46

Inside the Paper That Changed AI Forever - Cohere CEO Aidan Gomez on 2025 Agents

2025-06-05 · 01:02:24

AI That Ends Busy Work — Hebbia CEO on “Agent Employees”

2025-05-29 · 48:24

AI Eats the World: Benedict Evans on What Really Matters Now

2025-05-22 · 01:15:09

Jeremy Howard on Building 5,000 AI Products with 14 People (Answer AI Deep-Dive)

2025-05-15 · 55:02

Why Influx Rebuilt Its Database for the IoT and Robotics Explosion

2025-05-08 · 35:35

Dashboards Are Dead: Sigma’s BI Revolution for Trillion-Row Data

2025-05-01 · 41:32

Glean’s Breakthrough: CEO Arvind Jain on Scaling AI Agents & Search

2025-04-24 · 52:11
