“AI Alignment Cannot Be Top-Down” by Audrey Tang

Update: 2025-11-03

Description

In March 2024, I opened Facebook and saw Jensen Huang's face. The Nvidia CEO was offering investment advice, speaking directly to me in Mandarin. Of course, it was not really Huang. It was an AI-generated scam, and I was far from the first to be targeted: across Taiwan, a flood of scams was defrauding millions of citizens.

We faced a dilemma. Taiwan has the freest internet in Asia; any content regulation is unacceptable. Yet AI was being used to weaponize that freedom against the citizenry.

Our response — and its success — demonstrates something fundamental about how AI alignment must work. We did not ask experts to solve it. We did not let a handful of researchers decide what counted as “fraud.” Instead, we sent 200,000 random text messages asking citizens: what should we do together?

Four hundred forty-seven everyday Taiwanese — mirroring our entire population by age, education, region, occupation — deliberated in groups of 10. They were not seeking perfect agreement but uncommon ground — ideas that people with different views could still find reasonable. Within months, we had unanimous parliamentary support for new laws. By 2025, the scam ads were gone.

This is what I call [...]

---

Outline:

(01:44) AI Alignment Today Is Fundamentally Flawed

(03:45) The Stakes Are High

(07:42) Attentiveness in Practice

(09:21) Industry Norms

(10:36) Market Design

(11:59) Community-Scale Assistants

(13:10) From 1% Pilots to 99% Adoption

(14:24) Attentiveness Works

(16:44) Discussion About This Post

---


First published:

November 3rd, 2025



Source:

https://aifrontiersmedia.substack.com/p/ai-alignment-cannot-be-top-down


---


Narrated by TYPE III AUDIO.


---

Images from the article:

Illustration of 6-Pack of Care, by Nicky Case.
Correlation between GPT and human value responses across cultures. As the cultural distance from the United States — a highly WEIRD (Western, Educated, Industrialized, Rich, and Democratic) reference point — increases, GPT’s alignment with local human values declines. This pattern illustrates how global AI systems, trained within narrow cultural contexts, can embed and amplify a single moral worldview at scale — a subtle but systemic risk to pluralism and democratic self-determination. Source: PsyArXiv Preprints, “Which Humans?” (via Ada Lovelace Institute, 2025).

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts or another podcast app.

