The Private AI Lab


Author: Johan van Amersfoort


Description

The Private AI Lab is a monthly podcast where we explore the future of Artificial Intelligence behind the firewall. Hosted by Johan from Johan.ml, each episode invites industry experts, innovators, and thought leaders to discuss how Private AI is reshaping enterprises, technology, and society.
From data sovereignty to air-gapped deployments, from GPUs to governance — this podcast uncovers the real-world experiments, failures, and breakthroughs that define the era of Private AI.

🎙️ New episode every month.
🌐 More at Johan.ml
10 Episodes
What should you expect from NVIDIA GTC 2026?

In this pre-show episode of The Private AI Lab, Johan talks with Dirk Glücker, AI platform engineer and Kubernetes specialist, about the sessions, technologies, and trends worth watching at this year’s conference.

They discuss:
- The unique culture of GTC compared to other tech conferences
- Key sessions around MLOps, distributed inference, and AI infrastructure
- What we might see in Jensen Huang’s keynote
- The evolution of AI factories and large GPU clusters
- Why networking and meet-the-expert sessions are invaluable
- Practical advice for navigating GTC (or watching remotely)

If you’re building AI platforms, running GPU infrastructure, or following the latest developments in accelerated computing, this episode is a great primer before the event.

Links mentioned:
- AI Fail video from Dirk: https://www.youtube.com/watch?v=CWGefSIVIKM
- All hybrid sessions in the content catalog: https://register.nvidia.com/flow/nvidia/gtc26/ap/page/catalog?tab.catalogallsessionstab=16566177511100015Kus&search.viewingexperience=1700085746191002Cnn0
- Register for NVIDIA GTC today using the following link: https://nvda.ws/4qXGFjm

Chapters:
00:00 – Welcome to the GTC pre-show
00:40 – Meet Dirk Glücker
01:20 – AI fail of the week: robot meets mirror
04:10 – Physical AI and robotics challenges
05:20 – What GTC is like for first-time attendees
07:50 – Highlights from last year’s conference
08:10 – DGX Spark and AI factories
09:10 – Why meeting experts at GTC matters
12:20 – How to plan your GTC schedule
13:00 – MLOps sessions worth attending
14:30 – AI coding agents and development automation
16:10 – Distributed inference at scale
17:00 – Inside NVIDIA’s inference ecosystem
18:00 – AI infrastructure and platform engineering
19:30 – Multi-tenant GPU clusters
21:00 – Pixar rendering pipelines and GPUs
22:00 – Formula One and AI performance
23:00 – Watching GTC remotely
24:00 – Open source inference and sovereign AI
26:30 – Overlapping sessions and planning strategy
27:10 – Predictions for Jensen Huang’s keynote
31:00 – AI factory networking infrastructure
33:00 – Exploring the GTC expo floor
37:00 – Tips for first-time attendees
44:00 – Final thoughts before GTC
Is vibe coding the ultimate productivity accelerator, or a fast track to 4AM production outages?

In this episode of The Private AI Lab, Johan speaks with Andrew Morgan about the real state of vibe coding in 2026.

They unpack the difference between vibe coding and vibe learning, explore the risks of blindly trusting AI-generated code, and debate whether this new wave of AI-native development is democratizing software engineering, or quietly lowering the bar.

The conversation covers rogue agents, context window limits, guardrails, on-prem AI strategies, enterprise accountability, and why thinking might become the most important engineering skill of the next decade.

This episode is for developers, platform engineers, architects, and anyone navigating AI-assisted software development.

Chapters:
00:00 – Welcome to The Private AI Lab
01:40 – Andrew’s biggest AI fail
03:00 – The sunk cost fallacy of prompting
04:45 – Rogue agents & expensive mistakes
06:30 – Skynet jokes (but not really)
07:20 – What is vibe coding?
09:00 – Trust, guardrails & blast radius
10:15 – The current tooling landscape
12:00 – Vibe coding inside teams
14:40 – Stack Overflow vs vibe coding
17:00 – Code completion on steroids
18:30 – Who’s using it most aggressively?
20:45 – Democratization or dilution?
23:00 – Accountability at 4AM
25:00 – Lazy engineers vs lazy vibecoders
27:00 – Debugging AI-generated code
30:00 – Crab dragons & technical debt
32:30 – DevOps knowledge & production readiness
35:00 – Human vs AI code reviews
37:30 – Private AI & vibe coding
40:00 – On-prem vs cloud agents
42:30 – Context windows & hallucinations
44:00 – The next 18 months
47:30 – Strong engineers vs weak engineers
49:30 – Security risks & red teaming
52:00 – Is thinking the new bottleneck?
54:00 – Final lab report & takeaways
What if deploying private, enterprise-grade AI didn’t take months, but only as long as it takes your coffee to cool?

In this episode of The Private AI Lab, Johan van Amersfoort sits down with Tasha Drew to unpack how Private AI Services on VMware Cloud Foundation are changing the way enterprises build and consume AI.

They discuss why AI needs a platform layer, how Broadcom integrates open source responsibly, what MCP means for enterprise AI, and how agentic AI workloads can be deployed securely, governed properly, and operated at scale, inside the private cloud.

Topics covered:
- Why “training” is the most misused AI term
- From zero to RAG and agentic AI in minutes
- Private AI Services and the VCF strategy
- Model runtime, API gateways, and GPU efficiency
- Data indexing & retrieval for enterprise RAG
- Open source trade-offs: speed vs flexibility
- MCP, tool usage, and governance
- How Broadcom uses its own AI platform internally
- The future of agents: specialists over monoliths

If you’d like to follow Tasha, you can do that on LinkedIn: https://www.linkedin.com/in/tashy/
In this episode of The Private AI Lab, Johan van Amersfoort speaks with Adam Grzywaczewski, a senior Deep Learning Data Scientist at NVIDIA, about the rapidly evolving world of AI inference.

They explore how inference has shifted from simple, single-GPU execution to highly distributed, latency-sensitive systems powering today’s large language models. Adam explains the real bottlenecks teams face, why software optimization and hardware innovation must move together, and how NVIDIA’s inference stack, from TensorRT-LLM to Dynamo, enables scalable, cost-efficient deployments.

The conversation also covers quantization, pruning, mixture-of-experts models, AI factories, and why inference optimization is becoming one of the most critical skills in modern AI engineering.

Topics covered:
- Why inference is now harder than training
- Autoregressive models and KV-cache challenges
- Mixture-of-experts architectures
- NVIDIA Dynamo and TensorRT-LLM
- Hardware vs software optimization
- Quantization, pruning, and distillation
- Latency vs throughput trade-offs
- The rise of AI factories and DGX systems
- What’s next for AI inference
In this episode of The Private AI Lab, Johan van Amersfoort talks with Maxime Colomès about the Model Context Protocol (MCP), one of the most important emerging standards in AI today.

MCP is often described as the USB-C of AI: a universal way for AI models to connect to tools, data sources, and real-world systems. Maxime explains what MCP is, how it works, and why its recent donation to the Linux Foundation is such a major milestone for the AI ecosystem.

They explore real-world enterprise use cases, MCP security considerations, private AI architectures, and how MCP integrates with platforms like OpenShift AI. The conversation also touches on developer productivity, AI agents that can take action, and the future of personal, privacy-preserving AI assistants.

Key topics:
- What the Model Context Protocol (MCP) is and why it matters
- MCP vs traditional APIs and plugin systems
- Enterprise MCP architectures and gateways
- MCP and private AI / data sovereignty
- OpenShift AI and MLOps workflows
- Security risks and best practices with MCP
- Community MCP servers and registries
- Future MCP use cases and predictions
In this episode of The Private AI Lab, Johan sits down with Frank Denneman to explore the past, present, and future of VMware’s Private AI portfolio.

This conversation goes beyond AI buzzwords and marketing fluff. Together, Johan and Frank dive deep into the real infrastructure and resource management challenges that emerge when AI workloads enter enterprise environments. GPUs, scheduling, isolation, and platform design all take center stage, viewed through the lens of real-world VMware deployments.

If you are an infrastructure architect, platform engineer, or IT decision-maker designing AI behind the firewall, this episode provides grounded insights into what actually matters.

🔍 What you’ll learn in this episode:
- How VMware’s Private AI strategy has evolved over time
- Why AI workloads fundamentally change infrastructure assumptions
- The importance of resource management for GPU-backed workloads
- Key architectural trade-offs when running AI on-prem
- How to think about the future of enterprise AI platforms

🎧 Listen & Subscribe
For more experiments, insights, and behind-the-firewall AI discussions, visit johan.ml.

Experiment complete. Until the next one: stay curious.
This episode of The Private AI Lab features Robbie Jerrom, Principal Technologist AI at Red Hat, for a deep dive into Private AI, from the DGX Spark to OpenShift AI and the future of agentic systems.

Topics we cover:
- How Robbie uses the DGX Spark for home-lab AI
- Why developers are moving from cloud GPUs to local devices
- OpenShift AI as a consistent platform from experiment to production
- The best open-source components for modern AI stacks
- Why 79% of POCs never reach production, and how to avoid that
- The next wave: agentic AI and enterprise automation

Watch on YouTube: https://www.youtube.com/watch?v=jjyB8w_cpb0
More episodes & articles at johan.ml
In this episode of The Private AI Lab, Johan is joined by Andrew Foe, CEO of Iodis and HyperAI, to explore the NVIDIA DGX Spark from a real-world perspective.

We cover:
- Unboxing & first impressions
- Bring-up and setup tips
- Everyday usability
- What customers love
- The most common misconceptions
- Why preorder demand exploded in the BeNeLux region

A must-listen for anyone exploring Private AI hardware or considering the DGX Spark.

🎧 Watch on YouTube: https://www.youtube.com/watch?v=jENCTgcAWsI
💡 More experiments: https://johan.ml
In the premiere of The Private AI Lab, Johan van Amersfoort is joined by SUSE AI Specialist Eric Lajoie to talk about sovereign AI, deploying chatbots in under an hour, and the unexpected hazards of robotic dogs.

Timestamps:
00:00 – Intro & guest welcome
01:04 – How to pronounce “Lajoie”?
02:00 – What Eric actually does at SUSE
05:14 – Eric’s AI fail: a very personal RAG demo
10:06 – What Private AI means to Eric
11:30 – How SUSE AI works (chatbots, Kubernetes, Rancher, and more)
21:30 – RAG architecture explained
26:00 – 40-minute SUSE AI deployment?!
31:00 – Real-world use cases (GPUaaS, observability, MCP)
39:00 – AI sovereignty in Europe
43:00 – Wrap-up & key takeaways
54:00 – Eric’s prediction for the next 12 months in AI
57:40 – Johan’s robot dog fail
60:00 – Outro + where to follow Eric

🔗 Links & Mentions
• Learn more about SUSE AI: https://suse.com
• Follow Eric: https://www.linkedin.com/in/elajoie/ or lajoie.de
• Check out the companion post at https://johan.ml/
• Related episode: When Shit Hits The Fan ft. Eric – https://open.spotify.com/episode/2IeLq8WT7eqMJPVkcUMg3G?si=eaf7a529a75e4078
The Private AI Lab

2025-10-07 01:15

Welcome to The Private AI Lab, the podcast where we experiment, explore, and debate the future of Artificial Intelligence behind the firewall. I’m your host, Johan, and every month I invite a guest into the lab to break down real-world use cases, challenges, and innovations shaping Private AI. Brought to you by Johan.ml.

Be the first to know when a new episode drops by subscribing to the podcast!