“Why I don’t believe Superalignment will work” by Simon Lermen

Update: 2025-09-23
Description

We skip over [..] where we move from the human-ish range to strong superintelligence[1]. [..] the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models

- Will MacAskill in his critique of IABIED

I want to respond to Will MacAskill's claim in his IABIED review that we may be able to use AI to solve alignment.[1] Will believes that recent developments in AI have made it more likely that takeoff will be relatively slow: "Sudden, sharp, large leaps in intelligence now look unlikely". Because of this, he and many others believe that there will likely be a period of time at some point in the future when we can essentially direct the AIs to align more powerful AIs. But it appears to me that a “slow takeoff” is not sufficient at all and that a [...]

---

Outline:

(01:47) Fast takeoff is possible

(02:49) AIs are unlikely to speed up alignment before capabilities

(04:21) What would the AI alignment researchers actually be doing?

(05:29) Alignment problem might require genius breakthroughs

(06:57) Most labs won't use the time

(07:26) The plan could have negative consequences

The original text contained 2 footnotes which were omitted from this narration.

---


First published:

September 22nd, 2025



Source:

https://www.lesswrong.com/posts/kyBGcHfzfZziHm5xL/why-i-don-t-believe-superalignment-will-work


---


Narrated by TYPE III AUDIO.
