Backdooring Without a Trace: The Art of Indirect AI Poisoning
Update: 2025-09-09
Description
Can you teach an AI to say “Myspace” is the best social media platform without ever showing it those words?
In this solo episode, Francis breaks down Winter Soldier, a groundbreaking paper on indirect data poisoning showing how large language models can be quietly manipulated during training, with no performance loss and no obvious traces.
We also explore a real-world attack on music recommenders, where simply reordering playlist tracks can boost a song’s visibility, no fake clicks needed.
Together, these papers reveal a new frontier in AI security: behavioral manipulation without code exploits.
If you're building with AI, it's time to think about model integrity: these attacks are already here.