The Illusion of Scale: Why LLMs Are Vulnerable to Data Poisoning, Regardless of Size
Description
This story was originally published on HackerNoon at: https://hackernoon.com/the-illusion-of-scale-why-llms-are-vulnerable-to-data-poisoning-regardless-of-size.
New research shatters AI security assumptions, showing that poisoning large models is easier than previously believed and requires only a small, fixed number of documents.
Check more stories related to machine-learning at: https://hackernoon.com/c/machine-learning.
You can also check exclusive content about #adversarial-machine-learning, #ai-safety, #generative-ai, #llm-security, #data-poisoning, #backdoor-attacks, #enterprise-ai-security, #hackernoon-top-story, and more.
This story was written by: @hacker-Antho. Learn more about this writer on @hacker-Antho's about page,
and find more stories at hackernoon.com.
The research challenges the conventional wisdom that an attacker needs to control a fixed percentage of the training data (e.g., 0.1% or 0.27%) to succeed. In the experiments, as few as 250 poisoned documents were enough for the attack to work, and for the largest model tested (13B parameters) those 250 samples amounted to a minuscule 0.00016% of the total training tokens. The attack success rate remained nearly identical across all tested model scales for a fixed number of poisoned documents.
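To make the arithmetic behind that claim concrete, here is a minimal sketch of how a fixed count of poisoned documents collapses to a negligible fraction of the corpus as training data scales with model size. The tokens-per-parameter budget, the per-document length, and the smaller model sizes are illustrative assumptions rather than figures from the paper, so the computed fractions only approximate the reported 0.00016%.

```python
# Illustrative sketch (assumptions, not the paper's exact setup): shows why a
# fixed number of poisoned documents becomes a vanishing *fraction* of the
# training corpus as models grow, even though the absolute count stays constant.

POISONED_DOCS = 250               # fixed number of poisoned documents
TOKENS_PER_POISONED_DOC = 1_000   # assumed average poisoned-document length
TOKENS_PER_PARAM = 20             # assumed Chinchilla-style data budget

for params in (600e6, 2e9, 7e9, 13e9):  # illustrative model sizes up to 13B
    total_tokens = params * TOKENS_PER_PARAM
    poisoned_tokens = POISONED_DOCS * TOKENS_PER_POISONED_DOC
    fraction = poisoned_tokens / total_tokens
    print(f"{params / 1e9:>5.1f}B params | ~{total_tokens / 1e9:,.0f}B tokens "
          f"| poisoned fraction ~ {fraction:.7%}")
```

Under these assumptions the poisoned share drops from roughly 0.002% for a 600M-parameter model to under 0.0001% at 13B, which is the intuition behind the finding: what matters is the number of poisoned documents the model sees, not the proportion of the dataset they occupy.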