ChatGPT Jailbreaks: The Grandma Exploit

Update: 2023-07-03

Description

How do you extract prohibited information from ChatGPT? Grandma and DAN exploits trick language models into violating their own policies. Why these techniques work, what they reveal about LLM architecture, and how companies protect against prompt injection attacks. Solo episode on LLM security.

To stay in touch, sign up for our newsletter at https://www.superprompt.fm

Comments

In Channel

Agentic AI

2025-11-0715:50

AI Safety: Constitutional AI vs Human Feedback

2024-06-1716:38

Open Source LLMs: How Open Is "Open"?

2024-06-1013:28

Open Source AI: The Safety Debate

2024-06-0316:29

LLM Benchmarks: How to Know Which AI Is Better

2024-05-2710:35

Multimodal AI: When ChatGPT Learned to See

2024-05-2010:00

Google Gemini: Three Models, One Strategy

2024-05-1307:11

Building Custom GPTs: Seven Lessons from GPT Builder

2023-12-1525:56

Enterprise LLMs: Cloud Deployment Strategy

2023-11-0657:35

ChatGPT in the Classroom: Yale's Response

2023-10-2301:34:11

AI Screenwriting: GPT-4 Meets the Writers Strike

2023-08-1433:09

ChatGPT Guardrails: The One Question It Won't Debate

2023-07-2419:31

ChatGPT vs The Onion: Can AI Get the Joke?

2023-07-0815:55

ChatGPT Jailbreaks: The Grandma Exploit

2023-07-0323:44

AI Hallucinations: Bug or Feature?

2023-06-1923:07

LLM Training: Superman's Kryptonite-Proof Suit

2023-05-2918:56

Large Language Models: Getting from GPT-3 to chatGPT

2023-05-1522:17

What Is ChatGPT? Explained

2023-05-0825:21

DALL-E: Why AI Can't Make Your Perfect Pizza

2023-03-2451:43

Voice AI: Your Phone Knows You're Sick

2023-02-2001:01:56

00:00

1.0x

ChatGPT Jailbreaks: The Grandma Exploit

#box-pro-ellipsis-176398410531941{-webkit-line-clamp:2;}ChatGPT Jailbreaks: The Grandma Exploit

ChatGPT Jailbreaks: The Grandma Exploit

Tony Wan

ChatGPT Jailbreaks: The Grandma Exploit