INTERVIEW: Polysemanticity w/ Dr. Darryl Wright

Update: 2024-01-22

Description

Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on that investigates penalizing polysemanticity during the training of neural networks.

Check out a diagram of the decoder task used for our research!

01:46 - Interview begins
02:14 - Supernovae classification
08:58 - Penalizing polysemanticity
20:58 - Our "toy model"
30:06 - Task description
32:47 - Addressing hurdles
39:20 - Lessons learned

Links to all articles and papers mentioned in the episode can be found below, in order of appearance.

Zooniverse
BlueDot Impact
AI Safety Support
Zoom In: An Introduction to Circuits
MNIST dataset on PapersWithCode
Clusterability in Neural Networks
CIFAR-10 dataset
Effective Altruism Global
CLIP (blog post)
Long Term Future Fund
Engineering Monosemanticity in Toy Models

Jacob Haimes