Threat Modeling the AI Agent: Architecture, Threats & Monitoring
Description
Are we underestimating how the agentic world is impacting cybersecurity? We spoke to Mohan Kumar, who did production security at Box for a deep dive into the threats of true autonomous AI agents.
The conversation moves beyond simple LLM applications (like chatbots) to the new world of dynamic, goal-driven agents that can take autonomous actions. Mohan took us through why this shift introduces a new class of threats we aren't prepared for, such as agents developing new, unmonitorable communication methods ("Jibber-link" mode).
Mohan shared his top three security threats for AI agents in production:
- Memory Poisoning: How an agent's trusted memory (long-term, short-term, or entity memory) can be corrupted via indirect prompt injection, altering its core decisions.
- Tool Misuse: The risk of agents connecting to rogue tools or MCP servers, or having their legitimate tools (like a calendar) exploited for data exfiltration.
- Privilege Compromise: The critical need to enforce least-privilege on agents that can shift roles and identities, often through misconfiguration.
Guest Socials - Mohan's Linkedin
Podcast Twitter - @CloudSecPod
If you want to watch videos of this LIVE STREAMED episode and past episodes - Check out our other Cloud Security Social Channels:
If you are interested in AI Cybersecurity, you can check out our sister podcast - AI Security Podcast
Questions asked:
(00:00 ) Introduction(01:30 ) Who is Mohan Kumar? (Production Security at Box)(03:30 ) LLM Application vs. AI Agent: What's the Difference?(06:50 ) "We are totally underestimating" AI agent threats(07:45 ) Software 3.0: When Prompts Become the New Software(08:20 ) The "Jibber-link" Threat: Agents Ditching Human Language(10:45 ) The Top 3 AI Agent Security Threats(11:10 ) Threat 1: Memory Poisoning & Context Manipulation(14:00 ) Threat 2: Tool Misuse (e.g., exploiting a calendar tool)(16:50 ) Threat 3: Privilege Compromise (Least Privilege for Agents)(18:20 ) How Do You Monitor & Audit Autonomous Agents?(20:30 ) The Need for "Observer" Agents(24:45 ) The 6 Components of an AI Agent Architecture(27:00 ) Threat Modeling: Using CSA's MAESTRO Framework(31:20 ) Are Leaks Only from Open Source Models or Closed (OpenAI, Claude) Too?(34:10 ) The "Grandma Trick": Any Model is Susceptible(38:15 ) Where is AI Agent Security Evolving? (Orchestration, Data, Interface)(42:00 ) Fun Questions: Hacking MCPs, Skydiving & Risk, Biryani
Resources mentioned during the episode:
Mohan’s Udemy Course -AI Security Bootcamp: LLM Hacking Basics
Andre Karpathy's "Software 3.0" Concept




