Prof. Subbarao Kambhampati - LLMs don't reason, they memorize (ICML2024 2/13)

Update: 2024-07-29

Description

Prof. Subbarao Kambhampati argues that while LLMs are impressive and useful tools, especially for creative tasks, they have fundamental limitations in logical reasoning and cannot provide guarantees about the correctness of their outputs. He advocates for hybrid approaches that combine LLMs with external verification systems.
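The hybrid idea above (an LLM proposes, an external system checks) can be pictured as a simple generate-and-verify loop. The following is only a minimal sketch under stated assumptions, not the framework discussed in the episode: `generate_and_verify`, `make_toy_proposer`, and `sorted_verifier` are hypothetical names, and the "LLM" is a stub that replays canned guesses rather than a real model.

```python
from typing import Callable, Optional


def generate_and_verify(
    propose: Callable[[str, list[str]], str],
    verify: Callable[[str], list[str]],
    task: str,
    max_rounds: int = 5,
) -> Optional[str]:
    """Return the first candidate the external verifier accepts, else None."""
    critiques: list[str] = []
    for _ in range(max_rounds):
        # The generator sees the task plus the critiques of its last attempt.
        candidate = propose(task, critiques)
        # The verifier returns a list of problems; an empty list means "accepted".
        critiques = verify(candidate)
        if not critiques:
            return candidate
    return None  # no guarantee could be given within the round budget


def make_toy_proposer(guesses: list[str]) -> Callable[[str, list[str]], str]:
    """Hypothetical stand-in for an LLM: replays canned guesses in order."""
    it = iter(guesses)

    def propose(task: str, critiques: list[str]) -> str:
        return next(it)

    return propose


def sorted_verifier(candidate: str) -> list[str]:
    """Toy sound checker: accepts only ascending sequences of integers."""
    nums = [int(x) for x in candidate.split()]
    return [] if nums == sorted(nums) else ["output is not in ascending order"]


result = generate_and_verify(
    make_toy_proposer(["3 1 2", "1 3 2", "1 2 3"]),
    sorted_verifier,
    task="sort: 3 1 2",
)
print(result)
```

The point of the design is that any correctness guarantee comes from the verifier, not from the generator: an answer is returned only after the external check passes, and the loop returns `None` when it cannot produce a verified answer.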

MLST is sponsored by Brave:

The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval-augmented generation. Try it now - get 2,000 free queries monthly at http://brave.com/api.

TOC (sorry, the ones baked into the MP3 were wrong, fittingly due to LLM hallucination!)

[00:00:00] Intro

[00:02:06] Bio

[00:03:02] LLMs are n-gram models on steroids

[00:07:26] Is natural language a formal language?

[00:08:34] Natural language is formal?

[00:11:01] Do LLMs reason?

[00:19:13] Definition of reasoning

[00:31:40] Creativity in reasoning

[00:50:27] Chollet's ARC challenge

[01:01:31] Can we reason without verification?

[01:10:00] LLMs can't solve some tasks

[01:19:07] LLM Modulo framework

[01:29:26] Future trends of architecture

[01:34:48] Future research directions

YouTube version: https://www.youtube.com/watch?v=y1WnHpedi2A

Refs: (we didn't have space for URLs here, check YT video description instead)

Can LLMs Really Reason and Plan?

On the Planning Abilities of Large Language Models : A Critical Investigation

Chain of Thoughtlessness? An Analysis of CoT in Planning

On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks

LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks

Embers of Autoregression: Understanding Large Language Models Through the Problem They are Trained to Solve

"Task Success" is not Enough

Partition function (number theory) (Srinivasa Ramanujan and G.H. Hardy's work)

Poincaré conjecture

Gödel's incompleteness theorems

ROT13 (Rotate13, "rotate by 13 places")

A Mathematical Theory of Communication (C. E. Shannon)

Sparks of AGI

Kambhampati thesis on speech recognition (1983)

PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change

Explainable human-AI interaction

Tree of Thoughts

On the Measure of Intelligence (ARC Challenge)

Getting 50% (SoTA) on ARC-AGI with GPT-4o (Ryan Greenblatt ARC solution)

PROGRAMS WITH COMMON SENSE (John McCarthy) - "AI should be an advice taker program"

Original chain of thought paper

ICAPS 2024 Keynote: Dale Schuurmans on "Computing and Planning with Large Generative Models" (COT)

The Hardware Lottery (Hooker)

A Path Towards Autonomous Machine Intelligence (JEPA/LeCun)

AlphaGeometry

FunSearch

Emergent Abilities of Large Language Models

Language models are not naysayers (Negation in LLMs)

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

Embracing negative results

Machine Learning Street Talk (MLST)