Joe Carlsmith Audio

70 Episodes


Building AIs that do human-like philosophy

2026-01-29 · 33:33

AIs will face philosophical questions humans can't answer for them. Text version here: https://joecarlsmith.com/2026/01/29/building-ais-that-do-human-like-philosophy/

How human-like do safe AI motivations need to be?

2025-11-12 · 01:23:32

AIs with alien motivations can still follow instructions safely on the inputs that matter. Text version here: https://joecarlsmith.com/2025/11/12/how-human-like-do-safe-ai-motivations-need-to-be/

Leaving Open Philanthropy, going to Anthropic

2025-11-03 · 32:09

On a career move, and on AI-safety-focused people working at AI companies. Text version here: https://joecarlsmith.com/2025/11/03/leaving-open-philanthropy-going-to-anthropic/

Controlling the options AIs can pursue

2025-09-29 · 55:34

On boxing AIs, and on making deals with them. Text version here: https://joecarlsmith.com/2025/09/29/controlling-the-options-ais-can-pursue

Giving AIs safe motivations

2025-08-18 · 01:23:25

A four-step picture. Text version here: https://joecarlsmith.com/2025/08/18/giving-ais-safe-motivations

The stakes of AI moral status

2025-05-21 · 37:29

On seeing and not seeing souls. Text version here: https://joecarlsmith.com/2025/05/21/the-stakes-of-ai-moral-status/

Can we safely automate alignment research?

2025-04-30 · 01:29:38

It's really important; we've got a real shot; there are a ton of ways to fail. Text version here: https://joecarlsmith.com/2025/04/30/can-we-safely-automate-alignment-research/. There's also a video and transcript of a talk I gave on this topic here: https://joecarlsmith.com/2025/04/30/video-and-transcript-of-talk-on-automating-alignment-research/

AI for AI safety

2025-03-14 · 27:51

We should try extremely hard to use AI labor to help address the alignment problem. Text version here: https://joecarlsmith.com/2025/03/14/ai-for-ai-safety

Paths and waystations in AI safety

2025-03-11 · 18:07

On the structure of the path to safe superintelligence, and some possible milestones along the way. Text version here: https://joecarlsmith.substack.com/p/paths-and-waystations-in-ai-safety

When should we worry about AI power-seeking?

2025-02-19 · 46:54

Examining the conditions required for rogue AI behavior. Text version here: https://joecarlsmith.substack.com/p/when-should-we-worry-about-ai-power

What is it to solve the alignment problem?

2025-02-13 · 40:13

Also: to avoid it? Handle it? Solve it forever? Solve it completely? Text version here: https://joecarlsmith.substack.com/p/what-is-it-to-solve-the-alignment

How do we solve the alignment problem?

2025-02-13 · 08:43

Introduction to a series of essays about paths to safe and useful superintelligence. Text version here: https://joecarlsmith.substack.com/p/how-do-we-solve-the-alignment-problem

Fake thinking and real thinking

2025-01-28 · 01:18:47

When the line pulls at your hand. Text version here: https://joecarlsmith.com/2025/01/28/fake-thinking-and-real-thinking/.

Takes on "Alignment Faking in Large Language Models"

2024-12-18 · 01:27:54

What can we learn from recent empirical demonstrations of scheming in frontier models? Text version here: https://joecarlsmith.com/2024/12/18/takes-on-alignment-faking-in-large-language-models/

(Part 2, AI takeover) Extended audio from my conversation with Dwarkesh Patel

2024-09-30 · 02:07:33

Extended audio from my conversation with Dwarkesh Patel. This part focuses on the basic story about AI takeover. Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-2-ai-takeover-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel

(Part 1, Otherness) Extended audio from my conversation with Dwarkesh Patel

2024-09-30 · 03:58:38

Extended audio from my conversation with Dwarkesh Patel. This part focuses on my series "Otherness and control in the age of AGI." Transcript available on my website here: https://joecarlsmith.com/2024/09/30/part-1-otherness-extended-audio-transcript-from-my-conversation-with-dwarkesh-patel/

Introduction and summary for "Otherness and control in the age of AGI"

2024-06-21 · 12:23

This is the introduction and summary for my series "Otherness and control in the age of AGI." Text version here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Second half of full audio for "Otherness and control in the age of AGI"

2024-06-18 · 04:11:02

Second half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power. First half here: https://joecarlsmithaudio.buzzsprout.com/2034731/15266490-first-half-of-full-audio-for-otherness-and-control-in-the-age-of-agi. PDF of the full series here: https://jc.gatspress.com/pdf/otherness_full.pdf. Summary of the series here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

First half of full audio for "Otherness and control in the age of AGI"

2024-06-17 · 03:07:29

First half of the full audio for my series on how agents with different values should relate to one another, and on the ethics of seeking and sharing power. Second half here: https://joecarlsmithaudio.buzzsprout.com/2034731/15272132-second-half-of-full-audio-for-otherness-and-control-in-the-age-of-agi. PDF of the full series here: https://jc.gatspress.com/pdf/otherness_full.pdf. Summary of the series here: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi

Loving a world you don't trust

2024-06-17 · 01:03:54

Garden, campfire, healing water. Text version here: https://joecarlsmith.com/2024/06/18/loving-a-world-you-dont-trust. This essay is part of a series I'm calling "Otherness and control in the age of AGI." I'm hoping that individual essays can be read fairly well on their own, but see here for brief text summaries of the essays that have been released thus far: https://joecarlsmith.com/2024/01/02/otherness-and-control-in-the-age-of-agi
