“Omniscaling to MNIST” by cloud

LessWrong (30+ Karma)

Description

In this post, I describe a mindset that is flawed, and yet helpful for choosing impactful technical AI safety research projects.

The mindset is this: future AI might look very different from AI today, but good ideas are universal. If you want to develop a method that will scale up to powerful future AI systems, your method should also scale down to MNIST. In other words, good ideas omniscale: they work well across all model sizes, domains, and training regimes.

The Modified National Institute of Standards and Technology database (MNIST): 70,000 images of handwritten digits, 28x28 pixels each (source: Wikipedia). You can fit the whole dataset and many models on a single GPU!
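For concreteness, here is a minimal sketch (illustrative only, not code from the post) of loading MNIST with torchvision and holding the entire training split on a single device as one tensor:

```python
# Minimal sketch (illustrative, not from the post): load MNIST with
# torchvision and keep the whole training split on one device as a tensor.
import torch
from torchvision import datasets, transforms

train = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())

# Stack all 60,000 training images into a single (60000, 1, 28, 28) tensor.
images = torch.stack([img for img, _ in train])
labels = train.targets.clone()

device = "cuda" if torch.cuda.is_available() else "cpu"
images, labels = images.to(device), labels.to(device)

# In float32 this is roughly 188 MB; tiny by modern GPU standards.
print(images.shape, f"{images.nelement() * images.element_size() / 1e6:.0f} MB")
```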

Putting the omniscaling mindset into practice is straightforward. Any time you come across a clever-sounding machine learning idea, ask: "can I apply this to MNIST?" If not, then it's not a good idea. If so, run an experiment to see if it works. If it doesn't, then it's not a good idea. If it does, then it might be a good idea, and you can continue as usual to more realistic experiments or theory.
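As a concrete illustration of that workflow (my own sketch, not code from the post), the baseline into which a new idea gets dropped can be as small as a one-epoch MLP:

```python
# Illustrative sketch of an MNIST quick-check: a tiny MLP trained for one
# epoch as the baseline into which a new idea (a regularizer, a routing
# scheme, a distillation step, ...) can be inserted and compared.
# Hyperparameters are arbitrary placeholders, not from the post.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())
loader = DataLoader(train, batch_size=256, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                      nn.Linear(128, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for x, y in loader:  # one epoch is usually enough for a first sanity check
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final batch loss:", loss.item())
```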

In this post, I will:

  1. Share how MNIST experiments have informed my [...]

---

Outline:

(01:58) Applications to MNIST

(02:42) Gradient routing

(04:43) Distillation robustifies unlearning

(08:39) Subliminal learning

(10:37) Why you should do it on MNIST

(11:30) MNIST is not sufficient (and other tips)

(14:25) The omniscaling assumption is false

(17:09) Code and more ideas

(18:40) Closing thoughts

The original text contained 7 footnotes, which were omitted from this narration.

---


First published: November 8th, 2025

Source: https://www.lesswrong.com/posts/4aeshNuEKF8Ak356D/omniscaling-to-mnist


---


Narrated by TYPE III AUDIO.


---

Images from the article:

A figure from the gradient routing paper, showing (a) a neural net training setup for creating split representations of MNIST digits, and (b) the resulting losses when decoding from these representations. Note: this result requires L1 regularization applied to the encoder's output. In general, gradient routing doesn't require L1 regularization to work. So, this experiment probably isn't the best example of gradient routing to keep in mind.
Results from preliminary experiments from MATS 7. A model (green) trained to perfectly imitate a model never trained on certain digits (yellow), nevertheless learns those digits much more quickly. It even learns more quickly than a randomly initialized model (red).
A figure from the subliminal learning paper, showing an MNIST version of the paper's main experiments (which were on LLMs).
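To make the gradient routing caption above more concrete, here is a hedged sketch (my own illustration, not the paper's code) of the core trick on an MNIST autoencoder: a label-dependent stop-gradient mask lets digits 0-4 update only the first half of the latent code and digits 5-9 only the second half, with an L1 penalty on the encoding as that caption notes.

```python
# Hedged sketch (my own illustration, not the gradient routing paper's code):
# route gradients so digits 0-4 can only update the first half of the latent
# code and digits 5-9 only the second half, via a stop-gradient mask.
import torch
from torch import nn

latent_dim = 32
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                        nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                        nn.Linear(256, 784))

def route(z, labels):
    # mask is 1 where gradients may flow and 0 where they are stopped;
    # the forward value of z is unchanged either way.
    mask = torch.zeros_like(z)
    first_half = labels < 5
    mask[first_half, : latent_dim // 2] = 1.0
    mask[~first_half, latent_dim // 2 :] = 1.0
    return z * mask + (z * (1 - mask)).detach()

def loss_fn(x, labels, l1_coeff=1e-3):
    z = encoder(x)
    recon = decoder(route(z, labels))
    # reconstruction loss plus L1 on the encoding (coefficient is a placeholder)
    return nn.functional.mse_loss(recon, x.flatten(1)) + l1_coeff * z.abs().mean()
```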

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts or another podcast app.
