“Research Agenda: Synthesizing Standalone World-Models (+ Bounties, + Seeking Funding)” by Thane Ruthenis
Description
tl;dr: I outline my research agenda, post bounties for poking holes in it or for providing other relevant information, and am seeking to diversify my funding sources. This post will be followed by several others providing deeper overviews of the agenda's subproblems and my sketches of how to tackle them.
Back at the end of 2023, I wrote the following:
I'm fairly optimistic about arriving at a robust solution to alignment via agent-foundations research in a timely manner. (My semi-arbitrary deadline is 2030, and I expect to arrive at intermediate solid results by EOY 2025.)
On the inside view, I'm pretty satisfied with how that is turning out. I have a high-level plan of attack which approaches the problem from a novel route, and which hopefully lets us dodge a bunch of major alignment difficulties (chiefly the instability of value reflection, which I am MIRI-tier skeptical of tackling directly). [...]
---
Outline:
(04:34) Why Do You Consider This Agenda Promising?
(06:35) High-Level Outline
(07:03) Theoretical Justifications
(15:41) Subproblems
(19:48) Bounties
(21:20) Funding
The original text contained 5 footnotes which were omitted from this narration.
---
First published:
September 22nd, 2025
---
Narrated by TYPE III AUDIO.