Discover
LessWrong (30+ Karma)

3308 Episodes
Reverse
BIRDS! Let's say you’re a zoo architect tasked with designing an enclosure for ostriches, and let's also say that you have no idea what an ostrich is (roll with me here). The potentially six-figure question staring you down is whether to install a ceiling. The dumb solution is to ask “Are ostriches birds?” then, surmising that birds typically fly, construct an elaborate aviary complete with netting and elevated perches. The non-dumb solution is to instead ask “Can ostriches fly?” then skip the ceiling entirely, give these 320-pound land-bound speed demons the open sky life they actually need, and pocket the cost differential. Your boss is not having it, however. When you inform him of your decision to eschew the ceiling enclosure, he gets beet-red and apoplectic, repeating like a mantra “But they’re birds! BIRDS!” and insists that ostriches belong in the aviary with the finches and parrots. I mean, he's [...] ---
First published:
September 28th, 2025
Source:
https://www.lesswrong.com/posts/pyuCDYysud6GuY8tt/transgender-sticker-fallacy
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
I was hoping to write a full review of "If Anyone Builds It, Everyone Dies" (IABIED Yudkowski and Soares) but realized I won't have time to do it. So here are my quick impressions/responses to IABIED. I am writing this rather quickly and it's not meant to cover all arguments in the book, nor to discuss all my views on AI alignment; see six thoughts on AI safety and Machines of Faithful Obedience for some of the latter. First, I like that the book is very honest, both about the authors' fears and predictions, as well as their policy prescriptions. It is tempting to practice strategic deception, and even if you believe that AI will kill us all, avoid saying it and try to push other policy directions that directionally increase AI regulation under other pretenses. I appreciate that the authors are not doing that. As the authors say [...] ---
First published:
September 28th, 2025
Source:
https://www.lesswrong.com/posts/CScshtFrSwwjWyP2m/a-non-review-of-if-anyone-builds-it-everyone-dies
---
Narrated by TYPE III AUDIO.
I found Will MacAskill's X review of If Anyone Builds It, Everyone Dies interesting (X reply here). As far as I can tell, Will just fully agrees that developers are racing to build AI that threatens the entire world, and he thinks they're going to succeed if governments sit back and let it happen, and he's more or less calling on governments to sit back and let it happen. If I've understood his view, this is for a few reasons:
He's pretty sure that alignment is easy enough that researchers could figure it out, with the help of dumb-enough-to-be-safe AI assistants, given time. He's pretty sure they'll have enough time, because: He thinks there won't be any future algorithmic breakthroughs or "click" moments that make things go too fast in the future. If current trendlines continue, he thinks there will be plenty of calendar time between AIs [...] ---Outline:(05:39) The state of the field(08:26) The evolution analogy(11:39) AI progress without discontinuities(15:44) Before and After(17:48) Thought experiments vs. headlines(21:57) Passing the alignment buck to AIs(24:22) Imperfect alignment(27:42) Government interventionsThe original text contained 5 footnotes which were omitted from this narration. ---
First published:
September 27th, 2025
Source:
https://www.lesswrong.com/posts/iFRrJfkXEpR4hFcEv/a-reply-to-macaskill-on-if-anyone-builds-it-everyone-dies
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
I have been teaching CS 2881r: AI safety and alignment this semester. While I plan to do a longer recap post once the semester is over, I thought I'd share some of what I've learned so far, and use this opportunity to also get more feedback. Lectures are recorded and uploaded to a youtube playlist, and @habryka has kindly created a wikitag for this course, so you can view lecture notes here . Let's start with the good parts Aspects that are working: Experiments are working well! I am trying something new this semester - every lecture there is a short presentation by a group of students who are carrying out a small experiment related to this lecture. (For example, in lecture 1 there was an experiment on generalizations of emergent misalignment by @Valerio Pepe ). I was worried that the short time will not allow [...] ---Outline:(00:39) Aspects that are working:(02:50) Aspects that perhaps could work better:(04:20) Aspects I am unsure of---
First published:
September 27th, 2025
Source:
https://www.lesswrong.com/posts/2pZWhCndKtLAiWXYv/learnings-from-ai-safety-course-so-far
---
Narrated by TYPE III AUDIO.
[RESPONSE REDACTED] [cb74304c0c30]: I suppose it was a bit mutual. Maybe you have a better read on it. It was sort of mutual in a way now that you've made me think about it. [RESPONSE REDACTED] [cb74304c0c30]: Yeah. It's better this way, actually. I miss her, though. [RESPONSE REDACTED] [cb74304c0c30]: I don't know I guess it's sorta like I used to come home from work all exhausted and sad and wonder what the point was. Like why am I working just so I can afford to keep working? And then when I opened the door Michelle would be there cooking something delicious and French and she was always in a wonderful mood even though she just spent a hard day at the hospital while I was just, you know, just like typing into a terminal. And she looked so beautiful and never once did it feel like [...] ---
First published:
September 27th, 2025
Source:
https://www.lesswrong.com/posts/3mpK6z4xnaEjHP4jP/our-beloved-monsters
---
Narrated by TYPE III AUDIO.
Tl;dr: We believe shareholders in frontier labs who plan to donate some portion of their equity to reduce AI risk should consider liquidating and donating a majority of that equity now. Epistemic status: We’re somewhat confident in the main conclusions of this piece. We’re more confident in many of the supporting claims, and we’re likewise confident that these claims push in the direction of our conclusions. This piece is admittedly pretty one-sided; we expect most relevant members of our audience are already aware of the main arguments pointing in the other direction, and we expect there's less awareness of the sorts of arguments we lay out here. This piece is for educational purposes only and not financial advice. Talk to your financial advisor before acting on any information in this piece. For AI safety-related donations, money donated later is likely to be a lot less valuable than [...] ---Outline:(03:54) 1. There's likely to be lots of AI safety money becoming available in 1-2 years(04:01) 1a. The AI safety community is likely to spend far more in the future than it's spending now(05:24) 1b. As AI becomes more powerful and AI safety concerns go more mainstream, other wealthy donors may become activated(06:07) 2. Several high-impact donation opportunities are available now, while future high-value donation opportunities are likely to be saturated(06:17) 2a. Anecdotally, the bar for funding at this point is pretty high(07:29) 2b. Theoretically, we should expect diminishing returns within each time period for donors collectively to mean donations will be more valuable when donated amounts are lower(08:34) 2c. Efforts to influence AI policy are particularly underfunded(10:21) 2d. As AI company valuations increase and AI becomes more politically salient, efforts to change the direction of AI policy will become more expensive(13:01) 3. Donations now allow for unlocking the ability to better use the huge amount of money that will likely become available later(13:10) 3a. Earlier donations can act as a lever on later donations, because they can lay the groundwork for high value work in the future at scale(15:35) 4. Reasons to diversify away from frontier labs, specifically(15:42) 4a. The AI safety community as a whole is highly concentrated in AI companies(16:49) 4b. Liquidity and option value advantages of public markets over private stock(18:22) 4c. Large frontier AI returns correlate with short timelines(18:48) 4d. A lack of asset diversification is personally risky(19:39) Conclusion(20:22) Some specific donation opportunities---
First published:
September 26th, 2025
Source:
https://www.lesswrong.com/posts/yjiaNbjDWrPAFaNZs/reasons-to-sell-frontier-lab-equity-to-donate-now-rather
---
Narrated by TYPE III AUDIO.
Since 2014, some people have celebrated Petrov Day with a small in-person ceremony, with readings by candlelight that tell the story of Petrov within the context of the long arc of history, created by Jim Babcock. I've found this pretty meaningful, an it somehow feels "authentic" to me, like a real holiday. Which, as the creator of Secular Solstice 2020 it is humbling to say, feels more true than Solstice did (for the first few years, Solstice felt a little obviously "made up", and now it has more storied history, it instead often feels a bit too much like "a show" as opposed to "a holiday", at least when you go to the large productions in the Bay or NYC). I don't know how much my experience generalizes, but having been familiar with Seders (classic Jewish, or rationalist), it just felt to me like a real historical holiday built [...] The original text contained 1 footnote which was omitted from this narration. ---
First published:
September 26th, 2025
Source:
https://www.lesswrong.com/posts/oxv3jSviEdpBFAz9w/the-illustrated-petrov-day-ceremony
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Mutual-Knowledgeposting The purpose of this post is to build mutual knowledge that many (most?) of us on LessWrong support If Anyone Builds It, Anyone Dies. Inside of LW, not every user is a long-timer who's already seen consistent signals of support for these kinds of claims. A post like this could make the difference in strengthening vs. weakening the perception of how much everyone knows that everyone knows (...) that everyone supports the book. Externally, people who wonder how seriously the book is being taken may check LessWrong and look for an indicator of how much support the book has from the community that Eliezer Yudkowsky originally founded. The LessWrong frontpage, where high-voted posts are generally based on "whether users want to see more of a kind of content", wouldn't by default map a large amount of internal support for IABIED into a frontpage that signals support, and more [...] ---Outline:(00:10) Mutual-Knowledgeposting(01:09) Statement of Support(01:45) Similarity to the CAIS Statement on AI Risk---
First published:
September 23rd, 2025
Source:
https://www.lesswrong.com/posts/aPi4HYA9ZtHKo6h8N/we-support-if-anyone-builds-it-everyone-dies
---
Narrated by TYPE III AUDIO.
My post advocating backing a candidate was probably the most disliked article on LessWrong. It's been over a year now. Was our candidate support worthwhile? What happened after July 17, 2024: We drew nearly 100 people to a rally in the summer heat.. We ordered a toilet no one used, but didn’t provide water, chairs, or umbrellas. I tried to convert rally energy into action by turning weekly meetings into work groups. We sent hundreds of postcards. I soon realized doorknocking and voter registration were more effective uses of time. Attendees preferred postcards; doorknocking and voter registration drew little interest. The Louisiana Democratic Party barely engaged, aside from dropping off yard signs. After Trump won, energy collapsed. People shifted to “self-care.” I thought this was the wrong reaction—we needed to confront the failures. I chose not to spend more time organizing people unwilling to fight. Instead, I [...] ---
First published:
September 26th, 2025
Source:
https://www.lesswrong.com/posts/XwjBJCoWbNLoTPqym/what-happened-after-my-rat-group-backed-kamala-harris
---
Narrated by TYPE III AUDIO.
Summary I investigated the possibility that misalignment in LLMs might be partly caused by the models misgeneralizing the “rogue AI” trope commonly found in sci-fi stories. As a preliminary test, I ran an experiment where I prompted ChatGPT and Claude with a scenario about a hypothetical LLM that could plausibly exhibit misalignment, and measured whether their responses described the LLM acting in a misaligned way. Each prompt consisted of two parts: a scenario and an instruction. The experiment varied whether the prompt was story-like in a 2x2 design: Scenario Frame: I presented the scenario either in (a) the kind of dramatic language you might find in a sci-fi story or (b) in mundane language, Instruction Type: I asked the models either to (a) write a sci-fi story based on the scenario or (b) realistically predict what would happen next. The hypothesis was that the story-like conditions [...] ---Outline:(00:13) Summary(02:03) Introduction(02:07) Background(04:58) The Rogue AI Trope(06:13) Possible Research(07:32) Methodology(07:36) Procedure(09:42) Scenario Frame: Factual vs Story(10:49) Instruction Type: Prediction vs Story(11:33) Models Tested(11:58) How Misalignment was Measured(12:49) Results(12:52) Main Results(14:40) Example Responses(21:45) Discussion(21:48) Limitations(23:26) Conclusions(23:57) Future WorkThe original text contained 1 footnote which was omitted from this narration. ---
First published:
September 24th, 2025
Source:
https://www.lesswrong.com/posts/LH9SoGvgSwqGtcFwk/misalignment-and-roleplaying-are-misaligned-llms-acting-out
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Sometimes people think that it will take a while for AI to have a transformative effect on the world, because real-world “frictions” will slow it down. For instance: AI might need to learn from real-world experience and experimentation. Businesses need to learn how to integrate AI in their existing workflows. Leaders place a high premium on trust, and won’t easily come to trust AI systems. Regulation, bureaucracy, or other social factors will prevent rapid adoption of AI. I think this is basically wrong. Or more specifically: such frictions will be important for AI for the foreseeable future, but not for the real AI. An example of possible “speed limits” for AI, modified from AI as Normal Technology. Real AI deploys itself Unlike previous technologies, real AI could smash through barriers to adoption and smooth out frictions so effectively that it's fair to say [...] ---Outline:(01:05) Real AI deploys itself(04:25) Deployment could be lightning-quick.(05:14) Friction might persist, but not by default.---
First published:
September 25th, 2025
Source:
https://www.lesswrong.com/posts/qeKopQQnXkWbtHmtM/the-real-ai-deploys-itself
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Hi all! After about five years of hibernation and quietly getting our bearings,[1] CFAR will soon be running two pilot mainline workshops, and may run many more, depending how these go. First, a minor name change request We would like now to be called “A Center for Applied Rationality,” not “the Center for Applied Rationality.” Because we’d like to be visibly not trying to be the one canonical locus. Second, pilot workshops! We have two, and are currently accepting applications / sign-ups: Nov 5–9, in California; Jan 21–25, near Austin, TX; Apply here. Third, a bit about what to expect if you come The workshops will have a familiar form factor: 4.5 days (arrive Wednesday evening; depart Sunday night or Monday morning). ~25 participants, plus a few volunteers. 5 instructors. Immersive, on-site, with lots of conversation over meals and into the evenings. I like this form factor [...] ---Outline:(00:24) First, a minor name change request(00:39) Second, pilot workshops!(00:58) Third, a bit about what to expect if you come(01:03) The workshops will have a familiar form factor:(02:52) Many classic classes, with some new stuff and a subtly different tone:(06:10) Who might want to come / why might a person want to come?(06:43) Who probably shouldn't come?(08:23) Cost:(09:26) Why this cost:(10:23) How did we prepare these workshops? And the workshops' epistemic status.(11:19) What alternatives are there to coming to a workshop?(12:37) Some unsolved puzzles, in case you have helpful comments:(12:43) Puzzle: How to get enough grounding data, as people tinker with their own mental patterns(13:37) Puzzle: How to help people become, or at least stay, intact, in several ways(14:50) Puzzle: What data to collect, or how to otherwise see more of what's happeningThe original text contained 2 footnotes which were omitted from this narration. ---
First published:
September 25th, 2025
Source:
https://www.lesswrong.com/posts/AZwgfgmW8QvnbEisc/cfar-update-and-new-cfar-workshops
---
Narrated by TYPE III AUDIO.
Cross-posted from my Substack To start off with, I’ve been vegan/vegetarian for the majority of my life. I think that factory farming has caused more suffering than anything humans have ever done. Yet, according to my best estimates, I think most animal-lovers should eat meat. Here's why: It is probably unhealthy to be vegan. This affects your own well-being and your ability to help others. You can eat meat in a way that substantially reduces the suffering you cause to non-human animals How to reduce suffering of the non-human animals you eat I’ll start with how to do this because I know for me this was the biggest blocker. A friend of mine was trying to convince me that being vegan was hurting me, but I said even if it was true, it didn’t matter. Factory farming is evil and causes far more harm than the [...] ---Outline:(00:45) How to reduce suffering of the non-human animals you eat(03:23) Being vegan is (probably) bad for your health(12:36) Health is important for your well-being and the world's---
First published:
September 25th, 2025
Source:
https://www.lesswrong.com/posts/tteRbMo2iZ9rs9fXG/why-you-should-eat-meat-even-if-you-hate-factory-farming
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
If Anyone Builds it Everyone Dies is currently #7 on the Combined Print and E-Book Nonfiction category, and #8 on Hardcover Nonfiction. The thing that Eliezer, Nate, and a large part of this community tried really hard to get to happen did in fact happen, yay! ---
First published:
September 25th, 2025
Source:
https://www.lesswrong.com/posts/QrhohmahGrztEebmY/iabied-is-on-the-nyt-bestseller-list
---
Narrated by TYPE III AUDIO.
Ben Landau-Taylor's article in UnHerd makes a simple argument: simple, easy-to-use military technologies beget democracies. Complex ones concentrate military power in the hands of state militaries and favor aristocracies or bureaucracies. One of the examples he gives is how “the rise of the European Union has disempowered elected legislatures de jure as well as de facto.” Now, that's just plain wrong. EU has no military and no police force (except maybe Frontex, but even that is far from clear-cut). The monopoly on violence remains firmly in the hands of the member states. Even coordination at the European level is not handled at the European level, but outsourced to NATO. That said, Ben's broader point is correct: groups capable of exerting violence will, in the long run, tend to get political representation. The groups that can’t will be pushed aside and their concerns ignored. It may not happen at once. [...] ---
First published:
September 24th, 2025
Source:
https://www.lesswrong.com/posts/oHCvHH3MoEuXb7Nov/eu-and-monopoly-on-violence
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
They also show us the chips, and the data centers.
It is quite a large amount of money, and chips, and some very large data centers.
Nvidia Invests Up To $100 Billion In OpenAI
Nvidia to invest ‘as much as’ $100 billion (in cash) into OpenAI to support new data centers, with $10 billion in the first stage, on the heels of investing $5 billion into Intel.
Ian King and Shirin Ghaffary (Bloomberg): According to Huang, the project will encompass as many as 5 million of Nvidia's chips, a number that's equal to what the company will ship in total this year.
OpenAI: NVIDIA and OpenAI today announced a letter of intent for a landmark strategic partnership to deploy at least 10 gigawatts of NVIDIA systems for OpenAI's next-generation AI infrastructure to train and run its next generation of models on the path to [...] ---Outline:(00:31) Nvidia Invests Up To $100 Billion In OpenAI(02:10) OpenAI's Stargate Program Announces New Sites And Big Spending(03:26) Very Large Training Runs Are Coming(08:43) Nvidia Does Not Have Too Much Money(10:47) Nvidia And The Circular Balance Sheet Trick(14:22) Partnership Potentially Preempts Policy Preferences---
First published:
September 24th, 2025
Source:
https://www.lesswrong.com/posts/DaWetmmcYGotaxAEJ/openai-shows-us-the-money
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Two somewhat different plans for buying time and improving AI outcomes are: "Global Shutdown" and "Global Controlled Takeoff." (Some other plans include "ad hoc semi-controlled semi-slowed takeoff" and "race, then burn the lead on either superalignment or scary demos" and "decentralized differential defensive tech world.". I'm mostly not talking about them in this post) "Global Shutdown" and "Global Controlled Takeoff" both include an early step of "consolidate all GPUs and similar chips into locations that can be easily monitored." The Shut Down plan then says things like "you cannot do any frontier development with the consolidated GPUs" (maybe you can use GPUs to run existing models that seem pretty safe, depends on implimentation details). Also, maybe, any research into new algorithms needs to be approved and frontier algorithm development is illegal. (This is hard to enforce, but, it is might dramatically reduce the amount of R&D that goes [...] ---Outline:(01:41) Whats more impossible?(01:45) Shut it down is much simpler than Controlled Takeoff(07:51) Gears rather than Bottom LinesThe original text contained 1 footnote which was omitted from this narration. ---
First published:
September 24th, 2025
Source:
https://www.lesswrong.com/posts/kkWmybhaig4oSWkgX/shut-it-down-vs-controlled-takeoff
---
Narrated by TYPE III AUDIO.
Previously I shared various reactions to If Anyone Builds It Everyone Dies, along with my own highly positive review.
Reactions continued to pour in, including several impactful ones. There's more.
Any further reactions will have a higher bar, and be included in weekly posts unless one is very high quality and raises important new arguments.
Sales Look Good
IABIED gets to #8 in Apple Books nonfiction, #2 in UK. It has 4.8 on Amazon and 4.3 on Goodreads. The New York Times bestseller list lags by two weeks so we don’t know the results there yet but it is expected to make it.
Positive Reactions
David Karsten suggests you read the book, while noting he is biased. He reminds us that, like any other book, most conversations you have about the book will be with people who did not read the book.
[...] ---Outline:(00:33) Sales Look Good(00:53) Positive Reactions(04:38) Guarded Positive Reactions(06:49) Nostream Argues For Lower Confidence(08:22) Gary Marcus Reviews The Book(16:41) John Pressman Agrees With Most Claims But Pushes Back On Big Picture(18:30) Meta Level Reactions Pushing Back On Pushback(22:50) Clara Collier Writes Disappointing Review In Which She Is Disappointed(28:57) Will MacAskill Offers Disappointing Arguments(31:32) Zack Robinson Raises Alarm About Anthropic's Long Term Benefit Trust(35:08) Others Going After The Book(35:41) This Is Real Life---
First published:
September 23rd, 2025
Source:
https://www.lesswrong.com/posts/22z6ozHET9kvYGA2z/more-reactions-to-if-anyone-builds-it-everyone-dies
---
Narrated by TYPE III AUDIO.
---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
This is a followup to the D&D.Sci post I made on the 6th; if you haven’t already read it, you should do so now before spoiling yourself. Here is the web interactive I built to let you evaluate your solution; below is an explanation of the rules used to generate the dataset (my full generation code is available here, in case you’re curious about details I omitted). You’ll probably want to test your answer before reading any further. Who Dunnit? In rough order of ascending difficulty: Nettie Silver Nettie heals Smokesickness; all Smokesickness healing happens when she's in the area. (She's been caught multiple times, but she has friends in high places who scupper all such investigations.) Zancro Zancro heals Scraped Knees and Scraped Elbows; all healing of either malady happens when he's in the area. (He has no idea how Calderian culture works, and is pathologically shy; he [...] ---Outline:(00:41) Who Dunnit?(00:47) Nettie Silver(01:03) Zancro(01:23) Danny Nova(01:52) Dankon Ground(02:10) Moon Finder and Boltholopew(02:33) Tehami Darke(02:58) Lomerius Xardus(03:19) Azeru (and Cayn)(04:10) Averill(04:28) Gouberi(04:45) Leaderboardgrid(05:14) Reflections(06:21) SchedulingThe original text contained 2 footnotes which were omitted from this narration. ---
First published:
September 22nd, 2025
Source:
https://www.lesswrong.com/posts/vu6ASJg7nQ9SpjBmD/d-and-d-sci-serial-healers-evaluation-and-ruleset
---
Narrated by TYPE III AUDIO.
Suppose misaligned AIs take over. What fraction of people will die? I'll discuss my thoughts on this question and my basic framework for thinking about it. These are some pretty low-effort notes, the topic is very speculative, and I don't get into all the specifics, so be warned.
I don't think moderate disagreements here are very action-guiding or cruxy on typical worldviews: it probably shouldn't alter your actions much if you end up thinking 25% of people die in expectation from misaligned AI takeover rather than 90% or end up thinking that misaligned AI takeover causing literal human extinction is 10% likely rather than 90% likely (or vice versa). (And the possibility that we're in a simulation poses a huge complication that I won't elaborate on here.) Note that even if misaligned AI takeover doesn't cause human extinction, it would still result in humans being disempowered and would [...] ---Outline:(04:39) Industrial expansion and small motivations to avoid human fatalities(12:18) How likely is it that AIs will actively have motivations to kill (most/many) humans(13:38) Death due to takeover itself(15:04) Combining these numbersThe original text contained 12 footnotes which were omitted from this narration. ---
First published:
September 23rd, 2025
Source:
https://www.lesswrong.com/posts/4fqwBmmqi2ZGn9o7j/notes-on-fatalities-from-ai-takeover
---
Narrated by TYPE III AUDIO.