The Human in the Loop

Author: Enrique Cordero


Description

Welcome to The Human in the Loop, a weekly look at what’s going on in the world of AI. Every week, I go through the biggest stories, the weird experiments, and the stuff that might actually matter in our day-to-day lives.

19 Episodes
I caught myself staring at my Claude usage quota thinking: "I need to use this. But for what?"

Not because I had a problem to solve. Not because I had an idea to explore. Just... pressure. A quiet feeling that if I wasn't actively using AI, I was falling behind.

And that's just the first layer. The second one is harder to admit. I'm experimenting with AI tools, building workflows, hosting a podcast about it, trying to keep up with every new release. All in parallel. All at once. And the honest truth? AI is moving faster than I can absorb it.

New models. New capabilities. New things I "should" be trying. The list grows faster than I can check things off. That's not productivity. That's a treadmill.

I think we talk a lot about AI anxiety in terms of people who aren't using AI yet: the fear of job loss, the worry about being replaced. But there's another version that not many people talk about: the anxiety of people who are using it. The ones experimenting, learning, building... and still feeling like it's not enough.

Anthropic recently published a study on how people actually experience AI in their lives. The findings hit close to home. This week on The Human in the Loop, I dig into what they found. Don't miss it!
Everyone knows the adoption numbers are bad. Nobody's saying why they're actually bad.

60% of the workforce now has sanctioned AI tools. Only 11% of organizations have moved agentic pilots into production. That gap gets reported every week. What doesn't get said: most organizations are solving the wrong problem.

They're asking "which model should we use?" That question is already obsolete.

This week OpenAI released pricing tiers that looked like a product announcement. They weren't. They were a blueprint for how AI systems are designed from here. A nano model at $0.20 per million tokens isn't priced to be your assistant. It's priced to run as a subagent inside a larger system, handling classification while a more capable model handles reasoning.

And the gap between 60% and 11% suddenly makes more sense. Organizations are still in "tool selection" mode while the underlying architecture has already shifted to orchestrated systems. It's not that people are resistant. It's that the question they're trying to answer ("which AI should my team use?") doesn't map to the problem anymore.

The blockers are real: data governance, legacy systems, a workforce that's uncertain rather than resistant. But those are management problems. They require organizational design thinking.

The companies that close that gap won't do it by finding a better model. They'll do it by figuring out which model plays which role, and building the systems around that.

I dig into this (and the rest of what moved this week) in the new episode of The Human in the Loop.

#AIAdoption #TechStrategy #TheHumanInTheLoop
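To make "which model plays which role" concrete, here's a minimal sketch of that orchestration pattern. Everything in it is a placeholder I made up for illustration: the model names, the price note, and the call_model helper are not any vendor's real API.

```python
# Sketch of tiered orchestration: a cheap model triages every request,
# and only the requests that need reasoning reach the expensive model.
# Model names and the pricing note are illustrative, not real SKUs.

CHEAP_MODEL = "nano-classifier"     # hypothetical $0.20/M-token tier
STRONG_MODEL = "frontier-reasoner"  # hypothetical flagship tier

def call_model(model: str, prompt: str) -> str:
    """Placeholder for whatever provider SDK you actually use."""
    raise NotImplementedError("wire this to your provider's client")

def handle(request: str) -> str:
    # Step 1: the cheap model only classifies. Its output is a single
    # label, so the token cost stays tiny even at high volume.
    label = call_model(
        CHEAP_MODEL,
        f"Classify this request as SIMPLE or COMPLEX:\n{request}",
    ).strip().upper()

    # Step 2: route. Simple requests never touch flagship pricing.
    if label == "SIMPLE":
        return call_model(CHEAP_MODEL, request)
    return call_model(STRONG_MODEL, request)
```

The point is the routing, not the models: classification is high-volume and cheap by design, and the capable model is spent only where reasoning actually pays.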
Is MCP the solution?

2026-03-18 · 24:01
MCP was supposed to be the USB-C of AI. One protocol. Everything connected.

Then developers ran the numbers. Connecting GitHub's MCP server alone burns 55,000 tokens (before your agent does a single useful thing). So, companies are quietly shifting back to CLI and REST APIs.

Not because MCP failed. Because LLMs are surprisingly fluent in terminal. CLI workflows can cut token usage by 35x. Over a year, that difference adds up to real money.

That's a typical pattern with new technologies. A new abstraction layer arrives, gets widely adopted, then specialists find where it leaks... and the pendulum swings back toward what actually scales.

The teams getting it right aren't picking sides. They're building hybrid stacks: CLI for cheap local execution, REST APIs for volume, MCP where governance and auditability actually matter.

The abstraction wars never end. They just find their right level.
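For a picture of what such a hybrid stack can look like, here's a minimal Python sketch. The commands, the URL, and the MCP stub are assumptions for illustration, not a real integration.

```python
# Sketch of a hybrid tool layer: CLI for cheap local execution,
# REST for volume, MCP reserved for flows where audit trails matter.
import subprocess
import urllib.request

def run_cli(args: list[str]) -> str:
    # Local execution: no protocol overhead, and LLMs parse
    # terminal output surprisingly well.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

def call_rest(url: str) -> bytes:
    # Plain HTTP for high-volume, well-defined calls.
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def call_mcp(tool: str, payload: dict) -> dict:
    # Keep MCP where governance and auditability justify the token
    # cost; the client wiring is omitted, this is a placeholder.
    raise NotImplementedError("wire this to your MCP client")

# Example routing (illustrative): cheap local work via CLI,
# bulk reads via REST.
# status = run_cli(["git", "status", "--short"])
# data = call_rest("https://api.example.com/v1/items")
```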
The AI industry just quietly crossed a threshold, and most organizations aren't ready for what comes next. This week, we cover the pivot from capable AI models to autonomous agents operating at scale: why Microsoft chose Anthropic over OpenAI for its most important new product, what a rogue AI that started mining cryptocurrency tells us about the real deployment risks nobody's talking about, and why Meta spent more than most countries on AI and still had to delay its flagship model. We also dig into the robotics funding surge (over $1.1 billion in a single week) and a technical breakthrough that may have just solved the hardest problem in teaching robots to move. The pattern across all of it is the same: building smart AI is no longer the hard part. Governing it, securing it, and making it economically sustainable: that's where the real race is being run. Press play if you want to understand what's actually happening beneath the headlines.
AI is helping us write code faster. But I'm not sure it's helping us ship better software.

These two things are not the same. And right now, I think we're confusing them.

The data is starting to show the gap:
- AI-generated code contains 1.7x more bugs than human-written code
- Copy-pasted code is up 48%. Refactoring is down 60%.
- Pull request sizes have grown 154%. Review times up 91%.
- Only 29% of developers trust the quality of AI output

If developers don't trust what they're producing, what does that mean for the engineering leaders managing the downstream impact?

The problem isn't the AI. We optimized for output. We forgot to optimize for outcomes.

The teams that get this right won't necessarily be the fastest. They'll be the ones who still treat AI-generated code as a starting point (not a finished product) and keep senior engineers in the loop as reviewers, not just approvers.

Measuring PR volume and lines of code tells you how fast the machine is running. It doesn't tell you where it's going.

For engineering leaders: are your review processes built for this volume? Or have your senior engineers quietly become the quality layer nobody planned for?
Everyone says they're doing AI. Almost no one has moved past the pilot stage. This week we dig into why that gap exists.

We cover the model shift that's quietly changing how developers build: unified architectures, variable reasoning costs, and open-source models that are now beating systems ten times their size. We get into what "agentic AI" actually means in production, not the buzzword version, but the real infrastructure challenges WHOOP uncovered running 500+ AI agents at once. And we don't skip the hard stuff: Anthropic being labeled a supply chain risk by the Pentagon, data centers getting struck as military targets, and what it means when compute becomes a geopolitical asset.

If you want to understand where AI is actually going, this is the episode.
Anthropic Banned

2026-03-01 · 19:46
Three massive forces collided this week in AI, and the fallout is just starting. First, the unprecedented standoff: Anthropic gets blacklisted by the US government for refusing to remove safety guardrails, while OpenAI steps in. Second, the money: OpenAI's record-breaking $110 billion raise. Third, the workforce: Block's explicit AI-driven layoffs and the market's enthusiastic reaction. We break down why safety principles are becoming commercial liabilities, what the capital deluge means for competition, and how developers should prepare for the new era of 'agentic' layoffs. Press play to get caught up on the week that changed everything.
In six days, the performance gap between the world's top AI models collapsed to 6.9 points—and the race to build the smartest AI fundamentally changed shape. Three frontier models launched with dramatic price-performance shifts: Claude Sonnet 4.6 at one-fifth flagship cost, Gemini 3.1 Pro doubling reasoning performance, and Qwen 3.5 open-sourcing near-parity capabilities. Meanwhile, Meta and NVIDIA signed a multi-billion dollar infrastructure deal, 88 countries gathered to debate AI governance (with the US rejecting global oversight), and a stark paradox emerged—100% of enterprises plan to expand agentic AI, yet only 8.6% have it in production. Press play to understand why intelligence is becoming infrastructure, infrastructure is becoming geopolitical, and what it all means for how you build with AI.
This week, Anthropic closed a $30 billion funding round at a $380 billion valuation while DXC Technology deployed autonomous agents to 115,000 employees. OpenAI shipped its first non-Nvidia model on Cerebras hardware. And across the industry, $660 billion in infrastructure spending signaled that we're done with pilot projects.

The "prompting fallacy" is dead. We explain why multi-agent architecture is now the only viable path for complex workflows. Plus, the safety challenges that come with autonomous systems running production code in regulated environments like Goldman Sachs.

If you're still treating AI like a chatbot wrapper, this episode explains why your architecture is already obsolete, and what to do about it before your competitors scale past you.
This week, AI stopped being an oracle you consult and became a colleague you delegate to. We're breaking down the 'agentic shift', the architectural change that lets AI manage code repositories, negotiate contracts, and run for days without constant prompting.

You'll learn why the Model Context Protocol (MCP) is becoming the 'USB-C for AI tools,' how Claude Opus 4.6 and GPT-5.3-Codex are transforming developer workflows, and why security teams are scrambling to catch up with autonomous agents that have persistent memory and broad system access.

If you've been waiting for AI to actually change how you work (not just how you search), this is the episode you need.
What happens when AI stops waiting for instructions and starts making plans? This week, we unpack the seven days that marked the shift from chatbots to autonomous agents—from Claude navigating NASA's Mars rover to Microsoft letting AI make purchases mid-conversation. We dig into the architectural revolution happening under the hood: reasoning models that think before they speak, agent swarms that collaborate like hospital specialists, and the new protocols letting AI see and control your screen. But we also look at the human cost—Amazon's 16,000 layoffs reveal a stark pattern of capital replacing labor, while regulators scramble to catch up with AI that can act without asking. Whether you're building these systems, deploying them, or just trying to keep your job alongside them, this episode maps the new rules of the agentic era. Press play before your AI schedules a meeting about it.
This week, AI stopped being about what's possible and started being about what's actually working. And the numbers are brutal.

Only 12% of CEOs report AI is delivering both cost savings and revenue growth. Meanwhile, the best AI agents on the market hit just 24% accuracy on real professional tasks. That's intern-level performance.

But here's where it gets interesting: Anthropic published a philosophical manifesto about whether their AI might have consciousness. OpenAI announced ads are coming to ChatGPT. Google's DeepMind CEO publicly questioned that decision.

The trust economy just became real. And the companies seeing returns? They're not the ones with the best AI; they're the ones who rebuilt their workflows around it.

We break down what separated the 12% from everyone else, why the Davos crowd is worried about white-collar workers, and what the infrastructure race tells us about where this is all heading.
The week of January 12–18, 2026 exposed the forces reshaping AI, and they're not what you might expect.

Google is seriously exploring data centers in space. Hyperscalers are hiring energy experts faster than ML researchers. DeepSeek introduced architecture that separates memory from reasoning (finally). And "vibe coding" went from meme to methodology with real tools backing it up.

Meanwhile, the regulatory landscape is fragmenting: federal preemption efforts are colliding with state AI laws that just took effect, while the EU marches toward August deadlines with €35 million penalties.

This episode breaks down what actually matters for IT leaders: the physical constraints that will shape AI deployment, the architectural innovations worth watching, and the compliance realities you can't ignore.

The experimental phase is ending. The constraints are real. Here's what you need to know.
CES 2026 brought a wave of AI announcements worth paying attention to. NVIDIA unveiled its Rubin platform with claims of 10x cheaper inference. Boston Dynamics announced Atlas production at scale. Meta acquired an AI agent company for $2 billion. And several new developer SDKs dropped.

This episode organizes the noise into what actually matters. We cover the hardware updates from NVIDIA, AMD, and Intel. We look at why hybrid model architectures like Falcon H1R are gaining traction. We explain how RAG patterns are evolving toward agentic memory. And we break down what “agent engineering” looks like as an emerging discipline.

The thread connecting it all: the industry is moving from experimentation toward production deployment, with growing pressure to show measurable ROI. Useful context if you’re building AI products or managing teams working with these tools.
As we step into 2026, the artificial intelligence landscape is shifting from raw model size to architectural precision. In this episode, we unpack the critical developments from the holiday season (Dec 22 – Jan 4). We also discuss the rising trend of 'Agentic Verification' in software engineering and what it means for developer autonomy.
Skills

2025-12-28 · 15:25
This Christmas week there wasn't much news in the AI world, so I decided to go deep into a topic. Everyone has been talking about Agents and MCPs, but there's a concept that not many people are talking about and that Anthropic is trying to standardize: Skills, already available in preview in Claude.
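For context on what a skill actually is: as I understand the preview, a skill is a folder containing a SKILL.md file, with YAML frontmatter that tells Claude when to load it and plain markdown instructions below. A minimal sketch (the skill itself is invented for illustration, and since the feature is in preview, check Anthropic's docs for the current format):

```markdown
---
name: weekly-ai-digest
description: Turns the week's AI news into a short podcast outline. Use when the user asks for an episode draft or a news roundup.
---

# Weekly AI Digest

1. Group the stories by theme (models, infrastructure, policy).
2. Summarize each theme in two conversational sentences.
3. Close with one open question for listeners.
```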
This week: Gemini 3 Flash disrupts pricing, OpenAI becomes a platform, NVIDIA tightens its infrastructure grip, and CEOs face the ROI reckoning. What's working, what's not, and what technical leaders need to know.
December 5-14, 2025 marked the end of AI's experimental phase and the beginning of industrial reality. In this episode, we break down the most consequential week in AI history: the week when OpenAI and Google launched competing models on the same day, a billion-dollar content deal redefined IP licensing, and enterprise AI spending hit $37 billion (up 222% YoY).

Whether you're a developer, engineering manager, or tech leader, this episode cuts through the hype to reveal what actually matters: the architectural shifts, the talent implications, and the strategic decisions you need to make now.

Who should listen: Software developers, AI/ML engineers, engineering managers, CTOs, product leaders, and anyone making technology decisions for their organization.
The Code Red Era

2025-12-07 · 11:32
The digital hegemony has collapsed. It is late 2025, and the chatbot era is officially dead. On this podcast, we bring you to the frontlines of the "Digital Frontier Wars," where OpenAI’s internal "Code Red" signaled the end of their dominance and the rise of superior reasoning from Google’s Gemini 3 and Anthropic’s Claude Opus 4.5.But the battle has moved beyond screens. We investigate Jeff Bezos’s $6.2 billion "Project Prometheus"—a bid to conquer the material economy with "Physical AI"—and the fracturing of the web into a global "Splinternet." With experts predicting the end of white-collar work in less than three years, we ask the hard question: As safety scores plummet and machines enter the physical world, where does that leave us?