“Why I don’t believe Superalignment will work” by Simon Lermen
Description
We skip over [..] where we move from the human-ish range to strong superintelligence[1]. [..] the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models
- Will MacAskill in his critique of IABIED
I want to respond to Will MacAskill's claim in his IABIED review that we may be able to use AI to solve alignment.[1] Will believes that recent developments in AI have made it more likely that takeoff will be relatively slow - "Sudden, sharp, large leaps in intelligence now look unlikely". Because of this, he and many others believe that there will likely be a period of time at some point in the future when we can essentially direct the AIs to align more powerful AIs. But it appears to me that a “slow takeoff” is not sufficient at all and that a [...]
---
Outline:
(01:47) Fast takeoff is possible
(02:49) AIs are unlikely to speed up alignment before capabilities
(04:21) What would the AI alignment researchers actually be doing?
(05:29) Alignment problem might require genius breakthroughs
(06:57) Most labs won't use the time
(07:26) The plan could have negative consequences
The original text contained 2 footnotes which were omitted from this narration.
---
First published:
September 22nd, 2025
Source:
https://www.lesswrong.com/posts/kyBGcHfzfZziHm5xL/why-i-don-t-believe-superalignment-will-work
---
Narrated by TYPE III AUDIO.