How Do AI Models Actually Think? - Laura Ruis


Update: 2025-01-20

Description

Laura Ruis, a PhD student at University College London and researcher at Cohere, explains her groundbreaking research into how large language models (LLMs) perform reasoning tasks, the fundamental mechanisms underlying LLM reasoning capabilities, and whether these models primarily rely on retrieval or develop procedural knowledge.




SPONSOR MESSAGES:


***


CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments.


https://centml.ai/pricing/




Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events?




Go to https://tufalabs.ai/


***




TOC

1. LLM Foundations and Learning
1.1 Scale and Learning in Language Models [00:00:00]
1.2 Procedural Knowledge vs Fact Retrieval [00:03:40]
1.3 Influence Functions and Model Analysis [00:07:40]
1.4 Role of Code in LLM Reasoning [00:11:10]
1.5 Semantic Understanding and Physical Grounding [00:19:30]

2. Reasoning Architectures and Measurement
2.1 Measuring Understanding and Reasoning in Language Models [00:23:10]
2.2 Formal vs Approximate Reasoning and Model Creativity [00:26:40]
2.3 Symbolic vs Subsymbolic Computation Debate [00:34:10]
2.4 Neural Network Architectures and Tensor Product Representations [00:40:50]

3. AI Agency and Risk Assessment
3.1 Agency and Goal-Directed Behavior in Language Models [00:45:10]
3.2 Defining and Measuring Agency in AI Systems [00:49:50]
3.3 Core Knowledge Systems and Agency Detection [00:54:40]
3.4 Language Models as Agent Models and Simulator Theory [01:03:20]
3.5 AI Safety and Societal Control Mechanisms [01:07:10]
3.6 Evolution of AI Capabilities and Emergent Risks [01:14:20]



REFS:

[00:01:10] Procedural Knowledge in Pretraining & LLM Reasoning
Ruis et al., 2024
https://arxiv.org/abs/2411.12580

[00:03:50] EK-FAC Influence Functions in Large LMs
Grosse et al., 2023
https://arxiv.org/abs/2308.03296

[00:13:05] Surfaces and Essences: Analogy as the Fuel and Fire of Thinking
Hofstadter & Sander
https://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475

[00:13:45] Wittgenstein on Language Games
https://plato.stanford.edu/entries/wittgenstein/

[00:14:30] Montague Semantics for Natural Language
https://plato.stanford.edu/entries/montague-semantics/

[00:19:35] The Chinese Room Argument
David Cole
https://plato.stanford.edu/entries/chinese-room/

[00:19:55] ARC: Abstraction and Reasoning Corpus
François Chollet
https://arxiv.org/abs/1911.01547

[00:24:20] Systematic Generalization in Neural Nets
Lake & Baroni, 2023
https://www.nature.com/articles/s41586-023-06668-3

[00:27:40] Open-Endedness & Creativity in AI
Tim Rocktäschel
https://arxiv.org/html/2406.04268v1

[00:30:50] Fodor & Pylyshyn on Connectionism
https://www.sciencedirect.com/science/article/abs/pii/0010027788900315

[00:31:30] Tensor Product Representations
Smolensky, 1990
https://www.sciencedirect.com/science/article/abs/pii/000437029090007M

[00:35:50] DreamCoder: Wake-Sleep Program Synthesis
Kevin Ellis et al.
https://courses.cs.washington.edu/courses/cse599j1/22sp/papers/dreamcoder.pdf

[00:36:30] Compositional Generalization Benchmarks
Ruis, Lake et al., 2022
https://arxiv.org/pdf/2202.10745

[00:40:30] RNNs & Tensor Products
McCoy et al., 2018
https://arxiv.org/abs/1812.08718

[00:46:10] Formal Causal Definition of Agency
Kenton et al.
https://arxiv.org/pdf/2208.08345v2

[00:48:40] Agency in Language Models
Sumers et al.
https://arxiv.org/abs/2309.02427

[00:55:20] Heider & Simmel’s Moving Shapes Experiment
https://www.nature.com/articles/s41598-024-65532-0

[01:00:40] Language Models as Agent Models
Jacob Andreas, 2022
https://arxiv.org/abs/2212.01681

[01:13:35] Pragmatic Understanding in LLMs
Ruis et al.
https://arxiv.org/abs/2210.14986


