“Problems I’ve Tried to Legibilize” by Wei Dai
Update: 2025-11-10
Description
Looking back, it appears that much of my intellectual output could be described as legibilizing work, or trying to make certain problems in AI risk more legible to myself and others. I've organized the relevant posts and comments into the following list, which can also serve as a partial guide to problems that may need to be further legibilized, especially beyond LW/rationalists, to AI researchers, funders, company leaders, government policymakers, their advisors (including future AI advisors), and the general public.
- Philosophical problems
  - Probability theory
  - Decision theory
  - Beyond astronomical waste (possibility of influencing vastly larger universes beyond our own)
  - Interaction between bargaining and logical uncertainty
  - Metaethics
  - Metaphilosophy: 1, 2
- Problems with specific philosophical and alignment ideas
  - Utilitarianism: 1, 2
  - Solomonoff induction
  - "Provable" safety
  - CEV
  - Corrigibility
  - IDA (and many scattered comments)
  - UDASSA
  - UDT
- Human-AI safety (x- and s-risks arising from the interaction between human nature and AI design)
  - Value differences/conflicts between humans
  - “Morality is scary” (human morality is often the result of status games amplifying random aspects of human value, with frightening results)
- [...]
---
First published: November 9th, 2025
Source: https://www.lesswrong.com/posts/7XGdkATAvCTvn4FGu/problems-i-ve-tried-to-legibilize
---
Narrated by TYPE III AUDIO.