The Human in the Loop

Author: Enrique Cordero


Description

Welcome to The Human in the Loop, a weekly look at what’s going on in the world of AI. Every week, I go through the biggest stories, the weird experiments, and the stuff that might actually matter in our day-to-day lives.

15 Episodes
AI is helping us write code faster. But I'm not sure it's helping us ship better software. These two things are not the same, and right now, I think we're confusing them.

The data is starting to show the gap: AI-generated code contains 1.7x more bugs than human-written code, copy-pasted code is up 48%, refactoring is down 60%, pull request sizes have grown 154%, review times are up 91%, and only 29% of developers trust the quality of AI output.

If developers don't trust what they're producing, what does that mean for the engineering leaders managing the downstream impact? The problem isn't the AI. We optimized for output. We forgot to optimize for outcomes.

The teams that get this right won't necessarily be the fastest. They'll be the ones who still treat AI-generated code as a starting point (not a finished product) and keep senior engineers in the loop as reviewers, not just approvers. Measuring PR volume and lines of code tells you how fast the machine is running. It doesn't tell you where it's going.

For engineering leaders: are your review processes built for this volume? Or have your senior engineers quietly become the quality layer nobody planned for?
Everyone says they're doing AI. Almost no one has moved past the pilot stage. This week we dig into why that gap exists.

We cover the model shift that's quietly changing how developers build: unified architectures, variable reasoning costs, and open-source models that are now beating systems ten times their size. We get into what "agentic AI" actually means in production, not the buzzword version, but the real infrastructure challenges WHOOP uncovered running 500+ AI agents at once. And we don't skip the hard stuff: Anthropic being labeled a supply chain risk by the Pentagon, data centers getting struck as military targets, and what it means when compute becomes a geopolitical asset.

If you want to understand where AI is actually going, this is the episode.
Anthropic Banned

2026-03-01 19:46

Three massive forces collided this week in AI, and the fallout is just starting. First, the unprecedented standoff: Anthropic gets blacklisted by the US government for refusing to remove safety guardrails, while OpenAI steps in. Second, the money: OpenAI's record-breaking $110 billion raise. Third, the workforce: Block's explicit AI-driven layoffs and the market's enthusiastic reaction. We break down why safety principles are becoming commercial liabilities, what the capital deluge means for competition, and how developers should prepare for the new era of 'agentic' layoffs. Press play to get caught up on the week that changed everything.
In six days, the performance gap between the world's top AI models collapsed to 6.9 points—and the race to build the smartest AI fundamentally changed shape. Three frontier models launched with dramatic price-performance shifts: Claude Sonnet 4.6 at one-fifth flagship cost, Gemini 3.1 Pro doubling reasoning performance, and Qwen 3.5 open-sourcing near-parity capabilities. Meanwhile, Meta and NVIDIA signed a multi-billion dollar infrastructure deal, 88 countries gathered to debate AI governance (with the US rejecting global oversight), and a stark paradox emerged—100% of enterprises plan to expand agentic AI, yet only 8.6% have it in production. Press play to understand why intelligence is becoming infrastructure, infrastructure is becoming geopolitical, and what it all means for how you build with AI.
This week, Anthropic closed a $30 billion funding round at a $380 billion valuation while DXC Technology deployed autonomous agents to 115,000 employees. OpenAI shipped its first non-Nvidia model on Cerebras hardware. And across the industry, $660 billion in infrastructure spending signaled that we're done with pilot projects.

The "prompting fallacy" is dead. We explain why multi-agent architecture is now the only viable path for complex workflows. Plus, the safety challenges that come with autonomous systems running production code in regulated environments like Goldman Sachs.

If you're still treating AI like a chatbot wrapper, this episode explains why your architecture is already obsolete, and what to do about it before your competitors scale past you.
This week, AI stopped being an oracle you consult and became a colleague you delegate to. We're breaking down the "agentic shift": the architectural change that lets AI manage code repositories, negotiate contracts, and run for days without constant prompting.

You'll learn why the Model Context Protocol (MCP) is becoming the "USB-C for AI tools," how Claude Opus 4.6 and GPT-5.3-Codex are transforming developer workflows, and why security teams are scrambling to catch up with autonomous agents that have persistent memory and broad system access.

If you've been waiting for AI to actually change how you work (not just how you search), this is the episode you need.
What happens when AI stops waiting for instructions and starts making plans? This week, we unpack the seven days that marked the shift from chatbots to autonomous agents—from Claude navigating NASA's Mars rover to Microsoft letting AI make purchases mid-conversation. We dig into the architectural revolution happening under the hood: reasoning models that think before they speak, agent swarms that collaborate like hospital specialists, and the new protocols letting AI see and control your screen. But we also look at the human cost—Amazon's 16,000 layoffs reveal a stark pattern of capital replacing labor, while regulators scramble to catch up with AI that can act without asking. Whether you're building these systems, deploying them, or just trying to keep your job alongside them, this episode maps the new rules of the agentic era. Press play before your AI schedules a meeting about it.
This week, AI stopped being about what's possible and started being about what's actually working, and the numbers are brutal.

Only 12% of CEOs report AI is delivering both cost savings and revenue growth. Meanwhile, the best AI agents on the market hit just 24% accuracy on real professional tasks. That's intern-level performance.

But here's where it gets interesting: Anthropic published a philosophical manifesto about whether their AI might have consciousness. OpenAI announced ads are coming to ChatGPT. Google's DeepMind CEO publicly questioned that decision.

The trust economy just became real. And the companies seeing returns? They're not the ones with the best AI; they're the ones who rebuilt their workflows around it.

We break down what separated the 12% from everyone else, why the Davos crowd is worried about white-collar workers, and what the infrastructure race tells us about where this is all heading.
The week of January 12–18, 2026 exposed the forces reshaping AI, and they're not what you might expect.

Google is seriously exploring data centers in space. Hyperscalers are hiring energy experts faster than ML researchers. DeepSeek introduced architecture that separates memory from reasoning (finally). And "vibe coding" went from meme to methodology with real tools backing it up.

Meanwhile, the regulatory landscape is fragmenting: federal preemption efforts are colliding with state AI laws that just took effect, while the EU marches toward August deadlines with €35 million penalties.

This episode breaks down what actually matters for IT leaders: the physical constraints that will shape AI deployment, the architectural innovations worth watching, and the compliance realities you can't ignore.

The experimental phase is ending. The constraints are real. Here's what you need to know.
CES 2026 brought a wave of AI announcements worth paying attention to. NVIDIA unveiled its Rubin platform with claims of 10x cheaper inference. Boston Dynamics announced Atlas production at scale. Meta acquired an AI agent company for $2 billion. And several new developer SDKs dropped.

This episode organizes the noise into what actually matters. We cover the hardware updates from NVIDIA, AMD, and Intel. We look at why hybrid model architectures like Falcon H1R are gaining traction. We explain how RAG patterns are evolving toward agentic memory. And we break down what "agent engineering" looks like as an emerging discipline.

The thread connecting it all: the industry is moving from experimentation toward production deployment, with growing pressure to show measurable ROI. Useful context if you're building AI products or managing teams working with these tools.
As we step into 2026, the artificial intelligence landscape is shifting from raw model size to architectural precision. In this episode, we unpack the critical developments from the holiday season (Dec 22 – Jan 4). We also discuss the rising trend of 'Agentic Verification' in software engineering and what it means for developer autonomy.
Skills

2025-12-28 15:25

This Christmas week there hasn't been much news in the AI world, so I decided to go deep on a single topic. Everyone has been talking about Agents and MCPs, but there's a concept that not many people are talking about and that Anthropic is trying to standardize. I'm talking about Skills, and it's already in preview in Claude.
This week: Gemini 3 Flash disrupts pricing, OpenAI becomes a platform, NVIDIA tightens its infrastructure grip, and CEOs face the ROI reckoning. What's working, what's not, and what technical leaders need to know.
December 5–14, 2025 marked the end of AI's experimental phase and the beginning of industrial reality. In this episode, we break down the most consequential week in AI history: when OpenAI and Google launched competing models on the same day, a billion-dollar content deal redefined IP licensing, and enterprise AI spending hit $37 billion (up 222% YoY).

Whether you're a developer, engineering manager, or tech leader, this episode cuts through the hype to reveal what actually matters: the architectural shifts, the talent implications, and the strategic decisions you need to make now.

Who should listen: software developers, AI/ML engineers, engineering managers, CTOs, product leaders, and anyone making technology decisions for their organization.
The Code Red Era

2025-12-07 11:32

The digital hegemony has collapsed. It is late 2025, and the chatbot era is officially dead. On this podcast, we bring you to the frontlines of the "Digital Frontier Wars," where OpenAI's internal "Code Red" signaled the end of their dominance and the rise of superior reasoning from Google's Gemini 3 and Anthropic's Claude Opus 4.5.

But the battle has moved beyond screens. We investigate Jeff Bezos's $6.2 billion "Project Prometheus", a bid to conquer the material economy with "Physical AI", and the fracturing of the web into a global "Splinternet." With experts predicting the end of white-collar work in less than three years, we ask the hard question: as safety scores plummet and machines enter the physical world, where does that leave us?