AI Engineering Podcast

73 Episodes

Beyond the Chatbot: Practical Frameworks for Agentic Capabilities in SaaS

2025-12-29 · 53:47

Summary
In this episode product and engineering leader Preeti Shukla explores how and when to add agentic capabilities to SaaS platforms. She digs into the operational realities that AI agents must meet inside multi-tenant software: latency, cost control, data privacy, tenant isolation, RBAC, and auditability. Preeti outlines practical frameworks for selecting models and providers, when to self-host, and how to route capabilities across frontier and cheaper models. She discusses graduated autonomy, starting with internal adoption and low-risk use cases before moving to customer-facing features, and why many successful deployments keep a human-in-the-loop. She also covers evaluation and observability as core engineering disciplines - layered evals, golden datasets, LLM-as-a-judge, path/behavior monitoring, and runtime vs. offline checks - to achieve reliability in nondeterministic systems.

Announcements
- Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
- When ML teams try to run complex workflows through traditional orchestration tools, they hit walls. Cash App discovered this with their fraud detection models - they needed flexible compute, isolated environments, and seamless data exchange between workflows, but their existing tools couldn't deliver. That's why Cash App rely on Prefect. Now their ML workflows run on whatever infrastructure each model needs across Google Cloud, AWS, and Databricks. Custom packages stay isolated. Model outputs flow seamlessly between workflows. Companies like Whoop and 1Password also trust Prefect for their critical workflows. But Prefect didn't stop there. They just launched FastMCP - production-ready infrastructure for AI tools. You get Prefect's orchestration plus instant OAuth, serverless scaling, and blazing-fast Python execution. Deploy your AI tools once, connect to Claude, Cursor, or any MCP client. No more building auth flows or managing servers. Prefect orchestrates your ML pipeline. FastMCP handles your AI tool infrastructure. See what Prefect and FastMCP can do for your AI workflows at aiengineeringpodcast.com/prefect today.
- Unlock the full potential of your AI workloads with a seamless and composable data infrastructure. Bruin is an open source framework that streamlines integration from the command line, allowing you to focus on what matters most - building intelligent systems. Write Python code for your business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. With native support for ML/AI workloads, Bruin empowers data teams to deliver faster, more reliable, and scalable AI solutions. Harness Bruin's connectors for hundreds of platforms, including popular machine learning frameworks like TensorFlow and PyTorch. Build end-to-end AI workflows that integrate seamlessly with your existing tech stack. Join the ranks of forward-thinking organizations that are revolutionizing their data engineering with Bruin. Get started today at aiengineeringpodcast.com/bruin, and for dbt Cloud customers, enjoy a $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Preeti Shukla about the process for identifying whether and how to add agentic capabilities to your SaaS.

Interview
- Introduction
- How did you get involved in machine learning?
- Can you start by describing how a SaaS context changes the requirements around the business and technical considerations of an AI agent?
- Software-as-a-service is a very broad category that includes everything from simple website builders to complex data platforms. How does the scale and complexity of the service change the equation for ROI potential of agentic elements?
- How does it change the implementation and validation complexity?
- One of the biggest challenges with introducing generative AI and LLMs in a business use case is the unpredictable cost associated with it. What are some of the strategies that you have found effective in estimating, monitoring, and controlling costs to avoid being upside-down on the ROI equation?
- Another challenge of operationalizing an agentic workload is the risk of confident mistakes. What are the tactics that you recommend for building confidence in agent capabilities while mitigating potential harms?
- A corollary to the unpredictability of agent architectures is that they have a large number of variables. What are the evaluation strategies or toolchains that you find most useful to maintain confidence as the system evolves?
- SaaS platforms benefit from unit economics at scale and often rely on multi-tenant architectures. What are the security controls and identity/attribution mechanisms that are critical for allowing agents to operate across tenant boundaries?
- What are the most interesting, innovative, or unexpected ways that you have seen SaaS products adopt agentic patterns?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on bringing agentic workflows to SaaS products?
- When is an agent the wrong choice?
- What are your predictions for the role of agents in the future of SaaS products?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
- SaaS == Software as a Service
- Multi-Tenancy
- Few-shot Learning
- LLM as a Judge
- RAG == Retrieval Augmented Generation
- MCP == Model Context Protocol
- Lovable

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

MCP as the API for AI‑Native Systems: Security, Orchestration, and Scale

2025-12-16 · 01:07:43

Summary
In this episode Craig McLuckie, co-creator of Kubernetes and founder/CEO of Stacklok, talks about how to improve security and reliability for AI agents using curated, optimized deployments of the Model Context Protocol (MCP). Craig explains why MCP is emerging as the API layer for AI-native applications, how to balance short-term productivity with long-term platform thinking, and why great tools plus frontier models still drive the best outcomes. He digs into common adoption pitfalls (tool pollution, insecure NPX installs, scattered credentials), the necessity of continuous evals for stochastic systems, and the shift from “what the agent can access” to “what the agent knows.” Craig also shares how ToolHive approaches secure runtimes, a virtual MCP gateway with semantic search, orchestration and transactional semantics, a registry for organizational tooling, and a console for self-service, along with pragmatic patterns for auth, policy, and observability.

Announcements
- Your host is Tobias Macey and today I'm interviewing Craig McLuckie about improving the security of your AI agents through curated and optimized MCP deployment.

Interview
- Introduction
- How did you get involved in machine learning?
- MCP saw huge growth in attention and adoption over the course of this year. What are the stumbling blocks that teams run into when going to production with MCP servers?
- How do improperly managed MCP servers contribute to security problems in an agent-driven software development workflow?
- What are some of the problematic practices or shortcuts that you are seeing teams implement when running MCP services for their developers?
- What are the benefits of a curated and opinionated MCP service as shared infrastructure for an engineering team?
- You are building ToolHive as a system for managing and securing MCP services as a platform component. What are the strategic benefits of starting with that as the foundation for your company?
- There are several services for managing MCP server deployment and access control. What are the unique elements of ToolHive that make it worth adopting?
- For software-focused agentic AI, the command-line-based approach of Claude Code and similar tools opens the door to an effectively unbounded set of tools. What are the benefits of MCP over arbitrary CLI execution in that context?
- What are the most interesting, innovative, or unexpected ways that you have seen ToolHive/MCP used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on ToolHive?
- When is ToolHive the wrong choice?
- What do you have planned for the future of ToolHive/Stacklok?

Contact Info
- GitHub
- LinkedIn

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
- Stacklok
- MCP == Model Context Protocol
- Kubernetes
- CNCF == Cloud Native Computing Foundation
- SDLC == Software Development Life Cycle
- The Bitter Lesson
- TLA+
- Jepsen Tests
- ToolHive
- API Gateway
- Glean

Context as Code, DevX as Leverage: Accelerating Software with Multi‑Agent Workflows

2025-11-24 · 59:49

Summary
In this episode Max Beauchemin explores how multiplayer, multi-agent engineering is reshaping individual and team velocity for building data and AI systems. Max shares his journey from Airflow and Superset to going all-in on AI coding agents, describing a pragmatic “AI-first reflex” for nearly every task and the emerging role of humans as orchestrators of agents. He digs into shifting bottlenecks (code review, QA, async coordination) and how better DevX/AIX, just-in-time context via tools, and structured "context as code" can keep pace with agent-accelerated execution. He then dives deep into Agor, a new open-source agent-orchestration platform: a spatial, multiplayer canvas that manages git worktrees and shared dev environments, enables templated prompts and zone-based workflows, and exposes an internal MCP so agents can operate the system and each other. Max discusses session forking, sub-session trees, scheduling, and safety considerations, and how these capabilities enable parallelization, handoffs across roles, and richer visibility into prompting and cost/usage, pointing to a near future where software engineering centers on orchestrating teams of agents and collaborators. Resources: agor.live (docs, one-click Codespaces, npm install), Apache Superset, and related MCP/CLI tooling referenced for agent workflows.

Announcements
- Your host is Tobias Macey and today I'm interviewing Maxime Beauchemin about the impact of multi-player multi-agent engineering on individual and team velocity for building better data systems.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an overview of the types of work that you are relying on AI development agents for?
- As you bring agents into the mix for software engineering, what are the bottlenecks that start to show up?
- In my own experience there are a finite number of agents that I can manage in parallel. How does Agor help to increase that limit?
- How does making multi-agent management a multi-player experience change the dynamics of how you apply agentic engineering workflows?

Contact Info
- LinkedIn

Links
- Agor
- Apache Airflow
- Apache Superset
- Preset
- Claude Code
- Codex
- Playwright MCP
- Tmux
- Git Worktrees
- Opencode.ai
- GitHub Codespaces
- Ona

Inside the Black Box: Neuron-Level Control and Safer LLMs

2025-11-16 · 01:00:52

Summary
In this episode of the AI Engineering Podcast Vinay Kumar, founder and CEO of Arya.ai and head of Lexsi Labs, talks about practical strategies for understanding and steering AI systems. He discusses the differences between interpretability and explainability, and why post-hoc methods can be misleading. Vinay shares his approach to tracing relevance through deep networks and LLMs using DL Backtrace, and how interpretability is evolving from an audit tool into a lever for alignment, enabling targeted pruning, fine-tuning, unlearning, and model compression. The conversation covers setting concrete alignment metrics, the gaps in current enterprise practices for complex models, and tailoring explainability artifacts for different stakeholders. Vinay also previews his team's "AlignTune" effort for neuron-level model editing and discusses emerging trends in AI risk, multi-modal complexity, and automated safety agents. He explores when and why teams should invest in interpretability and alignment, how to operationalize findings without overcomplicating evaluation, and the best practices for private, safer LLM endpoints in enterprises, aiming to make advanced AI not just accurate but also acceptable, auditable, and scalable.

Announcements
- Your host is Tobias Macey and today I'm interviewing Vinay Kumar about strategies and tactics for gaining insights into the decisions of your AI systems.

Interview
- Introduction
- How did you get involved in machine learning?
- Can you start by giving a quick overview of what explainability means in the context of ML/AI?
- What are the predominant methods used to gain insight into the internal workings of ML/AI models?
- How does the size and modality of a model influence the technique and evaluation of methods used?
- What are the contexts in which a team would incorporate explainability into their workflow?
- How might explainability be used in a live system to provide guardrails or efficiency/accuracy improvements?
- What are the aspects of model alignment and explainability that are most challenging to implement?
- What are the supporting systems that are necessary to be able to effectively operationalize the collection and analysis of model reliability and alignment?
- "Trust", "Reliability", and "Alignment" are all words that seem obvious until you try to define them concretely. What are the ways that teams work through the creation of metrics and evaluation suites to gauge compliance with those goals?
- What are the most interesting, innovative, or unexpected ways that you have seen explainability methods used in AI systems?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on explainability/reliability at AryaXAI?
- When is evaluation of explainability overkill?
- What do you have planned for the future of AryaXAI and explainable AI?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
- To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

Links
- Lexsi Labs
- Arya.ai
- Deep Learning
- AlexNet
- DL Backtrace
- Gradient Boost
- SAE == Sparse AutoEncoder
- Shapley Values
- LRP == Layerwise Relevance Propagation
- IG == Integrated Gradients
- Circuit Discovery
- F1 Score
- LLM As A Judge

Building the Internet of Agents: Identity, Observability, and Open Protocols

2025-11-10 · 01:07:14

Summary
In this episode Guillaume de Saint Marc, VP of Engineering at Cisco Outshift, talks about the complexities and opportunities of scaling multi-agent systems. Guillaume explains why specialized agents collaborating as a team inspire trust in enterprise settings, and contrasts rigid, “lift-and-shift” agentic workflows with fully self-forming systems. We explore the emerging Internet of Agents, the need for open, interoperable protocols (A2A for peer collaboration and MCP for tool calling), and new layers in the stack for syntactic and semantic communication. Guillaume details foundational needs around discovery, identity, observability, and fine-grained, task/tool/transaction-based access control (TBAC), along with Cisco’s open-source AGNTCY initiative, directory concepts, and OpenTelemetry extensions for agent traces. He shares concrete wins in IT/NetOps (network config validation, root-cause analysis, and the CAIPE platform engineer agent), showing dramatic productivity gains. We close with human-in-the-loop UX patterns for multi-agent teams and SLIM, a high-performance group communication layer designed for agent collaboration.

Announcements
- Your host is Tobias Macey and today I'm interviewing Guillaume de Saint Marc about the complexities and opportunities of scaling multi-agent systems.

Interview
- Introduction
- How did you get involved in machine learning?
- Can you start by giving an overview of what constitutes a "multi-agent" system?
- Many of the multi-agent services that I have read or spoken about are designed and operated by a single department or organization. What are some of the new challenges that arise when allowing agents to communicate and co-ordinate outside of organizational boundaries?
- The web is the most famous example of a successful decentralized system, with HTTP being the most ubiquitous protocol powering it. What does the internet of agents look like?
- What is the role of humans in that equation?
- The web has evolved in a combination of organic and planned growth and is vastly more complex and complicated than when it was first introduced. What are some of the most important lessons that we should carry forward into the connectivity of AI agents?
- Security is a critical aspect of the modern web. What are the controls, assertions, and constraints that we need to implement to enable agents to operate with a degree of trust while also being appropriately constrained?
- The AGNTCY project is a substantial investment in an open architecture for the internet of agents. What does it provide in terms of building blocks for teams and businesses who are investing in agentic services?
- What are the most interesting, innovative, or unexpected ways that you have seen AGNTCY/multi-agent systems used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on multi-agent systems?
- When is a multi-agent system the wrong choice?
- What do you have planned for the future of AGNTCY/multi-agent systems?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
- Outshift by Cisco
- Multi-Agent Systems
- Deep Learning
- Meraki
- Symbolic Reasoning
- Transformer Architecture
- DeepSeek
- LLM Reasoning
- René Descartes
- Kanban
- A2A (Agent-to-Agent) Protocol
- MCP == Model Context Protocol
- AGNTCY
- ICANN == Internet Corporation for Assigned Names and Numbers
- OSI Layers
- OCI == Open Container Initiative
- OASF == Open Agentic Schema Framework
- Oracle AgentSpec
- Splunk
- OpenTelemetry
- CAIPE == Community AI Platform Engineer
- AGNTCY Coffee Shop

Agents, IDEs, and the Blast Radius: Practical AI for Software Engineers

2025-11-02 · 59:18

Summary
In this episode of the AI Engineering Podcast Will Vincent, Python developer advocate at JetBrains (PyCharm), talks about how AI utilities are revolutionizing software engineering beyond basic code completion. He discusses the shift from "vibe coding" to "vibe engineering," where engineers collaborate with AI agents through clear guidelines, iterative specs, and tight guardrails. Will shares practical techniques for getting real value from these tools, including loading the whole codebase for context, creating agent specifications, constraining blast radius, and favoring step-by-step plans over one-shot generations. The conversation covers code review gaps, deployment context, and why continuity across tools matters, as well as JetBrains' evolving approach to integrated AI, including support for external and local models. Will emphasizes the importance of human oversight, particularly for architectural choices and production changes, and encourages experimentation and playfulness while acknowledging the ethics, security, and reliability tradeoffs that come with modern LLMs.

Announcements
- Your host is Tobias Macey and today I'm interviewing Will Vincent about selecting and using AI software engineering utilities and making them work for your team.

Interview
- Introduction
- How did you get involved in machine learning?
- Software engineering is a relatively young discipline, but it does have several decades of history. As someone working for a developer tools company, what is your broad opinion on the impact of AI on software engineering as an occupation?
- There are many permutations of AI development tools. What are the broad categories that you see?
- What are the major areas of overlap?
- What are the styles of coding agents that you are seeing the broadest adoption for?
- What are your thoughts on the role of editors/IDEs in an AI-driven development workflow?
- Many of the code generation utilities are executed on a developer's computer in a single-player mode. What are some strategies that you have seen or experimented with to extract and share techniques/best practices/prompt templates at the team level?
- While there are many AI-powered services that hook into various stages of the software development and delivery lifecycle, what are the areas where you are seeing gaps in the user experience?
- What are the most interesting, innovative, or unexpected ways that you have seen AI used in the context of software engineering workflows?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on developer tooling in the age of AI?
- When is AI-powered the wrong choice?
- What do you have planned for the future of AI in the context of JetBrains?
- What are your predictions/hopes for the future of AI for software engineering?

Contact Info
- Will Vincent

Parting Question
- From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
- JetBrains
- Simon Willison
- Vibe Engineering Post
- GitHub Copilot
- AGENTS.md
- Copilot AGENTS.md instructions
- Kiro IDE
- Claude Code
- JetBrains QuickEdit
- Claude Agent in JetBrains IDEs
- Ruff linter
- uv package manager
- ty type checker
- pyrefly
- IDE == Integrated Development Environment
- Ollama
- LM Studio
- Google Gemma
- DeepSeek
- gpt-oss
- Ollama Cloud
- Gemini Diffusion
- Django Annual Survey
- Co-Intelligence by Ethan Mollick (affiliate link)

From MRI to World Models: How AI Is Changing What We See

2025-10-2748:51

Summary
In this episode of the AI Engineering Podcast, Daniel Sodickson, Chief of Innovation in Radiology at NYU Grossman School of Medicine, talks about harnessing AI systems to truly understand images and revolutionize science and healthcare. Dan shares his journey from linear reconstruction to early deep learning for accelerated MRI, highlighting the importance of domain expertise when adapting models to specialized modalities. He explores "upstream" AI that changes what and how we measure, using physics-guided networks, prior knowledge, and personal baselines to enable faster, cheaper, and more accessible imaging. The conversation covers multimodal world models, cross-disciplinary translation, explainability, and a future where agents flag abnormalities while humans apply judgment, as well as provocative frontiers like "imaging without images," continuous health monitoring, and decoding brain activity. Dan stresses the need to preserve truth, context, and human oversight in AI-driven imaging, and calls for tools that distill core methodologies across disciplines to accelerate understanding and progress.

Announcements
Your host is Tobias Macey and today I'm interviewing Daniel Sodickson about the impact and applications of AI that is capable of image understanding.

Interview
Introduction
How did you get involved in machine learning?
Images and vision are concepts that we understand intuitively, but which have a large potential semantic range. How would you characterize the scope and application of imagery in the context of AI and other autonomous technologies?
Can you give an overview of the current state of image/vision capabilities in AI systems?
A predominant application of machine vision has been for object recognition/tracking. How are advances in AI changing the range of problems that can be solved with computer vision systems?
A substantial amount of work has been done on processing of images such as the digital pictures taken by smartphones. As you move to other types of image data, particularly in non-visible light ranges, what are the areas of similarity and in what ways do we need to develop new processing/analysis techniques?
What are some of the ways that AI systems will change the ways that we conceive of
What are the most interesting, innovative, or unexpected ways that you have seen AI vision used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on imaging technologies and techniques?
When is AI the wrong choice for vision/imaging applications?
What are your predictions for the future of AI image understanding?

Contact Info
LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
MRI == Magnetic Resonance Imaging
Linear Algorithm
Non-Linear Algorithm
Compressed Sensing
Dictionary Learning Algorithm
Deep Learning
CT Scan
Cambrian Explosion
LIDAR Point Cloud
Synthetic Aperture Radar
Geoffrey Hinton
Co-Intelligence by Ethan Mollick (affiliate link)
Tomography
X-Ray Crystallography
CERN
CLIP Model
Physics-Guided Neural Network
Functional MRI
A Path Toward Autonomous Machine Intelligence by Yann LeCun

Specs, Tests, and Self‑Verification: The Playbook for Agentic Engineering Teams

2025-10-1901:06:28

Summary
In this episode, Andrew Filev, CEO and founder of Zencoder, takes a deep dive into the system design, workflows, and organizational changes behind building agentic coding systems. He traces the evolution from autocomplete to truly agentic models, discusses why context engineering and verification are the real unlocks for reliability, and outlines a pragmatic path from “vibe coding” to AI‑first engineering. Andrew shares Zencoder’s internal playbook: PRD and tech spec co‑creation with AI, human‑in‑the‑loop gates, test‑driven development, and emerging BDD-style acceptance testing. He explores multi-repo context, cross-service reasoning, and how AI reshapes team communication, ownership, and architecture decisions. He also covers cost strategies, when to choose agents vs. manual edits, and why self‑verification and collaborative agent UX will define the next wave. Andrew offers candid lessons from building Zencoder: why speed of iteration beats optimizing for weak models, how ignoring the emotional impact of vibe coding slowed brand momentum, and where agentic tools fit across greenfield and legacy systems. He closes with predictions for the next year: self‑verification, parallelized agent workflows, background execution in CI, and collaborative spec‑driven development moving code review upstream.

Announcements
Your host is Tobias Macey and today I'm interviewing Andrew Filev about the system design and integration strategies behind building coding agents at Zencoder.

Interview
Introduction
How did you get involved in ML/AI?
There have been several iterations of applications for generative AI models in the context of software engineering. How would you characterize the different approaches or categories?
Over the course of this summer (2025) the term "vibe coding" gained prominence with the idea that the human just needs to be worried about whether the software does what you ask, not how it is written. How does that sentiment compare to your philosophies on the role of agentic AI in the lifecycle of software?
This points at a broader challenge for software engineers in the AI era: how much control can and should we cede to the LLMs, and over what elements of the software process?
This also brings up useful questions around the experience of the engineer collaborating with the agent. What are the different interaction patterns that individuals and teams should be thinking of in their use of AI engineering tools?
Should the agent be proactive? Reactive? What are the triggers for an action to be taken, and to what extent?
What differentiates a coding agent from an agentic editor?
The key challenge in any agent system is context engineering. Software is inherently structured and provides strong feedback loops. But it can also be very messy or difficult to encapsulate in a single context window. What are some of the data structures/indexing strategies/retrieval methods that are most useful when providing guidance to an agent?
Software projects are rarely fully self-contained, and often need to cross repository boundaries, as well as manage dependencies. What are some of the more challenging aspects of identifying and accounting for those sometimes implicit relationships?
What are some of the strategies that are most effective for yielding productive results from an agent in terms of prompting and scoping of the problem?
What are some of the heuristics that you use to determine whether and how to employ an agent for a given task vs. doing it manually?
How can the agents assist in the decomposition and planning of complex projects?
What are some of the ways that single-player interaction strategies can be turned into team/multi-player strategies?
What are some of the ways that teams can create and curate productive patterns to accelerate everyone equally?
What are the most interesting, innovative, or unexpected ways that you have seen coding agents used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on coding agents at Zencoder?
When is/are Zencoder/coding agents the wrong choice?
What do you have planned for the future of Zencoder/agentic software engineering?

Contact Info
LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
Zencoder
Wrike
DARPA Robotics Challenge
Cognitive Computing
Andrew Ng
Sebastian Thrun
GitHub Copilot
RAG == Retrieval Augmented Generation
Re-ranking
Claude Sonnet 3.5
SWE-Bench
Vibe Coding
AI First Engineering
Waterfall Software Engineering
Agile Software Engineering
PRD == Product Requirements Document
BDD == Behavior-Driven Development
VSCode

From Probabilistic to Trustworthy: Building Orion, an Agentic Analytics Platform

2025-10-1101:12:19

Summary
In this episode of the AI Engineering Podcast, Lucas Thelosen and Drew Gillson talk about Orion, their agentic analytics platform that delivers proactive, push-based insights to business users through asynchronous thinking with rich organizational context. Lucas and Drew share their approach to building trustworthy analysis by grounding in semantic layers, fact tables, and quality-assurance loops, as well as their focus on accuracy through parallel test-time compute and evolving from probabilistic steps to deterministic tools. They discuss the importance of context engineering, multi-agent orchestration, and security boundaries for enterprise deployments, and share lessons learned on consistency, tool design, user change management, and the emerging role of "AI manager" as a career path. The conversation highlights the future of AI knowledge workers collaborating across organizations and tools while simplifying UIs and raising the bar on actionable, trustworthy analytics.

Announcements
Your host is Tobias Macey and today I'm interviewing Lucas Thelosen and Drew Gillson about their experiences building an agentic analytics platform and the challenges of ensuring accuracy to build trust.

Interview
Introduction
How did you get involved in machine learning?
Can you describe what Orion is and the story behind it?
Business analytics is a field that requires a high degree of accuracy and detail because of the potential for substantial impact on the business (positive and negative). These are qualities that generative AI has struggled to achieve consistently. What was your process for building confidence in your ability to achieve that threshold before committing to the path you are on now?
There are numerous ways that generative AI can be incorporated into the process of designing, building, and delivering analytical insights. How would you characterize the different strategies that data teams and vendors have used to approach that problem?
What do you see as the organizational benefits of moving to a push-based model for analytics?
Can you describe the system architecture of Orion?
Agentic design patterns are still in the early days of being developed and proven out. Can you give a breakdown of the approach that you are using?
How do you think about the responsibility boundaries, communication paths, temporal patterns, etc. across the different agents?
Tool use is a key component of agentic architectures. What is your process for identifying, developing, validating, and securing the tools that you provide to your agents?
What are the boundaries and extension points that you see when building agentic systems? What are the opportunities for using, e.g., the A2A protocol for managing agentic hand-offs?
What is your process for managing the experimentation loop for changes to your models, data, prompts, etc. as you iterate on your product?
What are some of the ways that you are using the agents that power your system to identify and act on opportunities for self-improvement?
What are the most interesting, innovative, or unexpected ways that you have seen Orion used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Orion?
When is an agentic approach the wrong choice?
What do you have planned for the future of Orion?

Contact Info
Lucas: LinkedIn
Drew: LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
Gravity
Orion Data Engineering Podcast Episode
Site Reliability Engineering
Anthropic Claude Sonnet 4.5
A2A (Agent2Agent) Protocol
Simon Willison
AI Lethal Trifecta
Behavioral Science
Grounded Theory
LLM as a Judge
RLHF == Reinforcement Learning from Human Feedback

Building Production-Ready AI Agents with Pydantic AI

2025-10-0750:53

Summary
In this episode of the AI Engineering Podcast, Samuel Colvin, creator of Pydantic and founder of Pydantic Inc, talks about Pydantic AI - a type-safe framework for building structured AI agents in Python. Samuel explains why he built Pydantic AI to bring FastAPI-like ergonomics and production-grade engineering to agents, focusing on strong typing, minimal abstractions, reliability, observability, and stability. He explores the evolving agent ecosystem, patterns for single vs. many agents, graphs vs. durable execution, and how Pydantic AI approaches structured I/O, tool calling, and MCP with type safety in mind. Samuel also shares insights on design trade-offs, model-provider churn, schema unification, safe code execution, security gaps, and the importance of open standards and OpenTelemetry for observability.

Announcements
Your host is Tobias Macey and today I'm interviewing Samuel Colvin about the Pydantic AI framework for building structured AI agents.

Interview
Introduction
How did you get involved in machine learning?
Can you describe what Pydantic AI is and the story behind it?
What are the core use cases and capabilities that you are focusing on with Pydantic AI?
The agent SDK landscape has been incredibly crowded and volatile since the introduction of LangChain and LlamaIndex. Can you give your summary of the current state of the ecosystem?
What are the broad categories that you use when evaluating the various frameworks?
Beyond the volatility of the frameworks, there is also a rapid pace of evolution in the different styles/patterns of agents. What are the patterns and integrations that Pydantic AI is best suited for?
Can you describe the overall design/architecture of the Pydantic AI framework?
How have the design and scope evolved since you first started working on it?
For someone who wants to build a sophisticated, production-ready AI agent with Pydantic AI, what is your recommended path from idea to deployment?
What are the elements of the framework that help engineers across those different stages of the lifecycle?
What are some of the key learnings that you gained from all of your efforts on Pydantic that have been most helpful in developing and promoting Pydantic AI?
What are some of the new and exciting failure modes that agentic applications introduce as compared to web/mobile/scientific/etc. applications?
What are the most interesting, innovative, or unexpected ways that you have seen Pydantic AI used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Pydantic AI?
When is Pydantic AI the wrong choice?
What do you have planned for the future of Pydantic AI?

Contact Info
GitHub
LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
Pydantic
Pydantic AI
Pydantic Inc
Pydantic Logfire
OpenAI Agents
Google ADK
LangChain
LlamaIndex
CrewAI
Durable Execution
Temporal
MCP == Model Context Protocol
Claude Code
TypeScript
Gemini Structured Output
OpenAI Structured Output
Dottxt Outlines SDK
smolagents
LiteLLM
OpenRouter
OpenAI Responses API
FastAPI
SQLModel
AI SDK JavaScript
LangGraph
NextJS
Pyodide
AI Elements frontend component library

From GPUs to Workloads: Flex AI’s Blueprint for Fast, Cost‑Efficient AI

2025-09-2855:19

Summary
In this episode of the AI Engineering Podcast, Brijesh Tripathi, CEO of Flex AI, talks about revolutionizing AI engineering by removing DevOps burdens through "workload as a service". Brijesh shares his expertise from leading AI/HPC architecture at Intel and deploying supercomputers like Aurora, highlighting how access friction and idle infrastructure slow progress. He discusses Flex AI's innovative approach to simplifying heterogeneous compute, standardizing on consistent Kubernetes layers, and abstracting inference across various accelerators, allowing teams to iterate faster without wrestling with drivers, libraries, or cloud-by-cloud differences. Brijesh also shares insights into Flex AI's strategies for lifting utilization, protecting real-time workloads, and spanning the full lifecycle from fine-tuning to autoscaled inference, all while keeping complexity at bay.

Announcements
Your host is Tobias Macey and today I'm interviewing Brijesh Tripathi about FlexAI, a platform offering a service-oriented abstraction for AI workloads.

Interview
Introduction
How did you get involved in machine learning?
Can you describe what FlexAI is and the story behind it?
What are some examples of the ways that infrastructure challenges contribute to friction in developing and operating AI applications?
How do those challenges contribute to issues when scaling new applications/businesses that are founded on AI?
There are numerous managed services and deployable operational elements for operationalizing AI systems. What are some of the main pitfalls that teams need to be aware of when determining how much of that infrastructure to own themselves?
Orchestration is a key element of managing the data and model lifecycles of these applications. How does your approach of "workload as a service" help to mitigate some of the complexities in the overall maintenance of that workload?
Can you describe the design and architecture of the FlexAI platform?
How has the implementation evolved from when you first started working on it?
For someone who is going to build on top of FlexAI, what are the primary interfaces and concepts that they need to be aware of?
Can you describe the workflow of going from problem to deployment for an AI workload using FlexAI?
One of the perennial challenges of making a well-integrated platform is that there are inevitably pre-existing workloads that don't map cleanly onto the assumptions of the vendor. What are the affordances and escape hatches that you have built in to allow partial/incremental adoption of your service?
What are the elements of AI workloads and applications that you are explicitly not trying to solve for?
What are the most interesting, innovative, or unexpected ways that you have seen FlexAI used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on FlexAI?
When is FlexAI the wrong choice?
What do you have planned for the future of FlexAI?

Contact Info
LinkedIn

Parting Question
From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Links
Flex AI
Aurora Supercomputer
CoreWeave
Kubernetes
CUDA
ROCm
Tensor Processing Unit (TPU)
PyTorch
Triton
Trainium
ASIC == Application Specific Integrated Circuit
SOC == System On a Chip
Lovable
FlexAI Blueprints
Tenstorrent

Right-Sizing AI: Small Language Models for Real-World Production

2025-09-2050:58

SummaryIn this episode of the AI Engineering Podcast Steven Huels, Vice President of AI Engineering & Product Strategy at Red Hat, talks about the practical applications of small language models (SLMs) for production workloads. He discusses how SLMs offer a pragmatic choice due to their ability to fit on single enterprise GPUs and provide model selection trade-offs. The conversation covers self-hosting vs using API providers, organizational capabilities needed for running production-grade LLMs, and the importance of guardrails and automated evaluation at scale. They also explore the rise of agentic systems and service-oriented approaches powered by smaller models, highlighting advances in customization and deployment strategies. Steven shares real-world examples and looks to the future of agent cataloging, continuous retraining, and resource efficiency in AI engineering.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
Your host is Tobias Macey and today I'm interviewing Steven Huels about the benefits of small language models for production workloadsInterviewIntroductionHow did you get involved in machine learning?Language models are available in a wide range of sizes, measured both in terms of parameters and disk space. What are your heuristics for deciding what qualifies as a "small" vs. "large" language model?What are the corresponding heuristics for when to use a small vs. large model?The predominant use case for small models is in self-hosted contexts, which requires a certain amount of organizational sophistication. What are some helpful questions to ask yourself when determining whether to implement a model-serving stack vs. relying on hosted options?What are some examples of "small" models that you have seen used effectively?The buzzword right now is "agentic" for AI driven workloads. How do small models fit in the context of agent-based workloads?When and where should you rely on larger models?When speaking of small models, one of the common requirements for making them truly useful is to fine-tune them for your problem domain and organizational data. How has the complexity and difficulty of that operation changed over the past ~2 years?Serving models requires several operational capabilities beyond the raw inference serving. 
What are the other infrastructure and organizational investments that teams should be aware of as they embark on that path?What are the most interesting, innovative, or unexpected ways that you have seen small language models used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on operationalizing inference and model customization?When is a small or self-hosted language model the wrong choice?What are your predictions for the near future of small language model capabilities/availability?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksRedHat AI EngineeringGenerative AIPredictive AIChatGPTQLORAHuggingFacevLLMOpenShift AILlama ModelsDeepSeekGPT-OSSMistralMixture of Experts (MoE)QwenInstructLabSFT == Supervised Fine TuningLORAThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

AI Agents and Identity Management

2025-09-13 53:32

SummaryIn this episode of the AI Engineering Podcast Julianna Lamb, co-founder and CTO of Stytch, talks about the complexities of managing identity and authentication in agentic workflows. She explores the evolving landscape of identity management in the context of machine learning and AI, highlighting the importance of flexible compute environments and seamless data exchange. The conversation covers implications of AI agents on identity management, including granular permissions, OAuth protocol, and adapting systems for agentic interactions. Julianna also discusses rate limiting, persistent identity, and evolving standards for managing identity in AI systems. She emphasizes the need to experiment with AI agents and prepare systems for integration to stay ahead in the rapidly advancing AI landscape.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems
Your host is Tobias Macey and today I'm interviewing Julianna Lamb about the complexities of managing identity and auth in agentic workflowsInterviewIntroductionHow did you get involved in machine learning?The term "identity" is very overloaded. Can you start by giving your definition in the context of technical systems?What are some of the different ways that AI agents intersect with identity?We have decades of experience and effort in building identity infrastructure for the internet, what are the most significant ways in which that is insufficient for agent-based use cases?I have heard anecdotal references to the ways in which AI agents lead to a proliferation of "identities". How would you characterize the magnitude of the difference in scale between human-powered identity, deterministic automation (e.g. bots or bot-nets), and AI agents?The other major element of establishing and verifying "identity" is how that intersects with permissions or authorization. What are the major shortcomings of our existing investment in managing and auditing access and control once you are within a system?How does that get amplified with AI agents?Typically authentication has been done at the perimeter of a system. How does that architecture change when accounting for AI agents?How does that get complicated by where the agent originates? (e.g external agents interacting with a third-party system vs. internal agents operated by the service provider)What are the concrete steps that engineering teams should be taking today to start preparing their systems for agentic use-cases (internal or external)?How do agentic capabilities change the means of protecting against malicious bots? (e.g. 
bot detection, defensive agents, etc.)What are the most interesting, innovative, or unexpected ways that you have seen authn/authz/identity addressed for AI use cases?What are the most interesting, unexpected, or challenging lessons that you have learned while working on identity/auth(n|z) systems?What are your predictions for the future of identity as adoption and sophistication of AI systems progresses?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksStytchAI AgentMachine To Machine AuthenticationAPI AuthenticationMCP == Model Context ProtocolOAuthIdentity ProviderOAuth ScopesOAuth 2.1CaptchaRBAC == Role-Based Access ControlABAC == Attribute-Based Access ControlReBAC == Relationship-Based Access ControlGoogle ZanzibarIdempotenceDynamic Client RegistrationLarge Action ModelsClaude CodeThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Revolutionizing Production Systems: The Resolve AI Approach

2025-09-04 51:01

SummaryIn this episode of the AI Engineering Podcast, CEO of Resolve AI Spiros Xanthos shares his insights on building agentic capabilities for operational systems. He discusses the limitations of traditional observability tools and the need for AI agents that can reason through complex systems to provide actionable insights and solutions. The conversation highlights the architecture of Resolve AI, which integrates with existing tools to build a comprehensive understanding of production environments, and emphasizes the importance of context and memory in AI systems. Spiros also touches on the evolving role of AI in production systems, the potential for AI to augment human operators, and the need for continuous learning and adaptation to fully leverage these advancements.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsYour host is Tobias Macey and today I'm interviewing Spiros Xanthos about architecting agentic capabilities for operational challenges with managing production systems.InterviewIntroductionHow did you get involved in machine learning?Can you describe what Resolve AI is and the story behind it?We have decades of experience as an industry in managing operational complexity. What are the critical failures in capabilities that you are addressing with the application of AI?Given the existing capabilities of dedicated platforms (e.g. Grafana, PagerDuty, Splunk, etc), what is your reasoning for building a new system vs. a new feature of existing operational product?Over the past couple of years the industry has developed a growing number of agent patterns. What was your approach in evaluating and selecting a particular approach for your product?One of the complications of building any platform that supports operational needs of engineering teams is the complexity of integrating with their technology stack. 
This is doubly true when building an AI system that needs rich context. What are the core primitives that you are relying on to build a robust offering?How are you managing the learning process for your systems to allow for iterative discovery and improvement?What are your strategies for personalizing those discoveries to a given customer and operating environment?One of the interesting challenges in agentic systems is managing the user experience for human-in-the-loop and machine to human handoffs in each direction. How are you thinking about that, especially given the criticality of the systems that you are interacting with?As more of the code that is running in production environments is co-developed with AI, what impact do you anticipate on the overall operational resilience of the systems being monitored?One of the challenges of working with LLMs is the cold start problem where every conversation starts from scratch. How are you approaching the overall problem of context engineering and ensuring that you are consistently providing the necessary information for the model to be effective in its role?What are the most interesting, innovative, or unexpected ways that you have seen Resolve AI used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on Resolve AI?When is Resolve AI the wrong choice?What do you have planned for the future of Resolve AI?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. 
The Data Engineering Podcast covers the latest on modern data management.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksResolve AISplunkOpenTelemetrySplunk ObservabilityContext EngineeringGrafanaKubernetesPagerDutyThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Designing Scalable AI Systems with FastMCP: Challenges and Innovations

2025-08-26 01:13:57

SummaryIn this episode of the AI Engineering Podcast Jeremiah Lowin, founder and CEO of Prefect Technologies, talks about the FastMCP framework and the design of MCP servers. Jeremiah explains the evolution of FastMCP, from its initial creation as a simpler alternative to the MCP SDK to its current role in facilitating the deployment of AI tools. The discussion covers the complexities of designing MCP servers, the importance of context engineering, and the potential pitfalls of overwhelming AI agents with too many tools. Jeremiah also highlights the importance of simplicity and incremental adoption in software design, and shares insights into the future of MCP and the broader AI ecosystem. The episode concludes with a look at the challenges of authentication and authorization in AI applications and the exciting potential of MCP as a protocol for the future of AI-driven business logic.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsYour host is Tobias Macey and today I'm interviewing Jeremiah Lowin about the FastMCP framework and how to design and build your own MCP serversInterviewIntroductionHow did you get involved in machine learning?Can you start by describing what MCP is and its purpose in the ecosystem of AI applications?What is FastMCP and what motivated you to create it?Recognizing that MCP is relatively young, how would you characterize the landscape of MCP frameworks?What are some of the stumbling blocks on the path to building a well engineered MCP server?What are the potential ramifications of poorly designed and implemented MCP implementations?In the overall context of an AI-powered/agentic application, what are the tradeoffs of investing in the MCP protocol? (e.g. 
engineering effort, process isolation, tool creation, auth(n|z), etc.)In your experience, what are the architectural patterns that you see of MCP implementation and usage?There are a multitude of MCP servers available for a variety of use cases. What are the key factors that someone should be using to evaluate their viability for a production use case?Can you give an overview of the key characteristics of FastMCP and why someone might select it as their implementation target for a custom MCP server?How have the design, scope, and goals of the project evolved since you first started working on it?For someone who is using FastMCP as the framework for creating their own AI tools, what are some of the design considerations or best practices that they should be aware of?What are some of the ways that someone might consider integrating FastMCP into their existing Python-powered web applications (e.g. FastAPI, Django, Flask, etc.)As you continue to invest your time and energy into FastMCP, what is your overall goal for the project?What are the most interesting, innovative, or unexpected ways that you have seen FastMCP used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on FastMCP?When is FastMCP the wrong choice?What do you have planned for the future of FastMCP?Contact InfoLinkedInGitHubParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksFastMCPFastMCP CloudPrefectModel Context Protocol (MCP)AI ToolsFastAPIPython DecoratorWebsocketsSSE == Server-Sent EventsStreamable HTTPOAuthMCP GatewayMCP SamplingFlaskDjangoASGIMCP ElicitationAuthKitDynamic Client RegistrationsmolagentsLarge Action ModelsA2AThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Proactive Monitoring in Heavy Industry: The Role of AI and Human Curiosity

2025-08-23 40:57

SummaryIn this episode of the AI Engineering Podcast Dr. Tara Javidi, CTO of KavAI, talks about developing AI systems for proactive monitoring in heavy industry. Dr. Javidi shares her background in mathematics and information theory, influenced by Claude Shannon's work, and discusses her approach to curiosity-driven AI that mimics human curiosity to improve data collection and predictive analytics. She explains how KavAI's platform uses generative AI models to enhance industrial monitoring by addressing informational blind spots and reducing reliance on human oversight. The conversation covers the architecture of KavAI's systems, integrating AI with existing workflows, building trust with operators, and the societal impact of AI in preventing environmental catastrophes, ultimately highlighting the future potential of information-centric AI models.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.Your host is Tobias Macey and today I'm interviewing Dr. Tara Javidi about building AI systems for proactive monitoring of physical environments for heavy industryInterviewIntroductionHow did you get involved in machine learning?Can you describe what KavAI is and the story behind it?What are some of the current state-of-the-art applications of AI/ML for monitoring and accident prevention in industrial environments?What are the shortcomings of those approaches?What are some examples of the types of harm that you are focused on preventing or mitigating with your platform?On your site it mentions that you have created a foundation model for physical awareness. What are some examples of the types of predictive/generative capabilities that your model provides?A perennial challenge when building any digital model of a physical system is the lack of absolute fidelity. 
What are the key sources of information acquisition that you rely on for your platform?In addition to your foundation model, what are the other systems that you incorporate to perform analysis and catalyze action?Can you describe the overall system architecture of your platform?What are some of the ways that you are able to integrate learnings across industries and environments to improve the overall capacity of your models?What are the most interesting, innovative, or unexpected ways that you have seen KavAI used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on KavAI?When is KavAI/Physical AI the wrong choice?What do you have planned for the future of KavAI?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?LinksKavAIInformation TheoryClaude ShannonThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Navigating the AI Landscape: Challenges and Innovations in Retail

2025-08-07 52:09

SummaryIn this episode of the AI Engineering Podcast machine learning engineer Shashank Kapadia explores the transformative role of generative AI in retail. Shashank shares his journey from an engineering background to becoming a key player in ML, highlighting the excitement of understanding human behavior at scale through AI. He discusses the challenges and opportunities presented by generative AI in retail, where it complements traditional ML by enhancing explainability and personalization, predicting consumer needs, and driving autonomous shopping agents and emotional commerce. Shashank elaborates on the architectural and operational shifts required to integrate generative AI into existing systems, emphasizing orchestration, safety nets, and continuous learning loops, while also addressing the balance between building and buying AI solutions, considering factors like data privacy and customization.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsYour host is Tobias Macey and today I'm interviewing Shashank Kapadia about applications of generative AI in retailInterviewIntroductionHow did you get involved in machine learning?Can you summarize the main applications of generative AI that you are seeing the most benefit from in retail/ecommerce?What are the major architectural patterns that you are deploying for generative AI workloads?Working at an organization like Walmart, you already had a substantial investment in ML/MLOps. What are the elements of that organizational capability that remain the same, and what are the catalyzed changes as a result of generative models?When working at the scale of Walmart, what are the different types of bottlenecks that you encounter which can be ignored at smaller orders of magnitude?Generative AI introduces new risks around brand reputation, accuracy, trustworthiness, etc. 
What are the architectural components that you find most effective in managing and monitoring the interactions that you provide to your customers?Can you describe the architecture of the technical systems that you have built to enable the organization to take advantage of generative models?What are the human elements that you rely on to ensure the safety of your AI products?What are the most interesting, innovative, or unexpected ways that you have seen generative AI break at scale?What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI?When is generative AI the wrong choice?What are you paying special attention to over the next 6 - 36 months in AI?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksWalmart LabsThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

The Anti-CRM CRM: How Spiro Uses AI to Transform Sales

2025-07-21 46:47

SummaryIn this episode of the AI Engineering Podcast Adam Honig, founder of Spiro AI, talks about using AI to automate CRM systems, particularly in the manufacturing sector. Adam shares his journey from running a consulting company focused on Salesforce to founding Spiro, and discusses the challenges of traditional CRM systems where data entry is often neglected. He explains how Spiro addresses this issue by automating data collection from emails, phone calls, and other communications, providing a rich dataset for machine learning models to generate valuable insights. Adam highlights how Spiro's AI-driven CRM system is tailored to the manufacturing industry's unique needs, where sales are relationship-driven rather than funnel-based, and emphasizes the importance of understanding customer interactions and order histories to predict future business opportunities. The conversation also touches on the evolution of AI models, leveraging powerful third-party APIs, managing context windows, and platform dependencies, with Adam sharing insights into Spiro's future plans, including product recommendations and dynamic data modeling approaches.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsYour host is Tobias Macey and today I'm interviewing Adam Honig about using AI to automate CRM maintenanceInterviewIntroductionHow did you get involved in machine learning?Can you describe what Spiro is and the story behind it?What are the specific challenges posed by the manufacturing industry with regards to sales and customer interactions?How does the type of manufacturing and target customer influence the level of effort and communication involved in the sales and customer service cycles?Before we discuss the opportunities for automation, can you describe the typical interaction patterns and workflows involved in the care and feeding of CRM systems?Spiro has been around since 2014, long 
pre-dating the current era of generative models. What were your initial targets for improving efficiency and reducing toil for your customers with the aid of AI/ML?How have the generational changes of deep learning and now generative AI changed the ways that you think about what is possible in your product?Generative models reduce the level of effort to get a proof of concept for language-oriented workflows. How are you pairing them with more narrow AI that you have built?Can you describe the overall architecture of your platform and how it has evolved in recent years?While generative models are powerful, they can also become expensive, and the costs are hard to predict. How are you thinking about vendor selection and platform risk in the application of those models?What are the opportunities that you see for the adoption of more autonomous applications of language models in your product? (e.g. agents)What are the confidence building steps that you are focusing on as you investigate those opportunities?What are the most interesting, innovative, or unexpected ways that you have seen Spiro used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI in the CRM space?When is AI the wrong choice for CRM workflows?What do you have planned for the future of Spiro?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! 
Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksSpiroDeepgramCognee EpisodeAgentic MemoryGraphRAGPodcast EpisodeOpenAI Assistant APIThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Unlocking AI Potential with AMD's ROCm Stack

2025-06-23 42:18

SummaryIn this episode of the AI Engineering podcast Anush Elangovan, VP of AI software at AMD, discusses the strategic integration of software and hardware at AMD. He emphasizes the open-source nature of their software, fostering innovation and collaboration in the AI ecosystem, and highlights AMD's performance and capability advantages over competitors like NVIDIA. Anush addresses challenges and opportunities in AI development, including quantization, model efficiency, and future deployment across various platforms, while also stressing the importance of open standards and flexible solutions that support efficient CPU-GPU communication and diverse AI workloads.AnnouncementsHello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systemsYour host is Tobias Macey and today I'm interviewing Anush Elangovan about AMD's work to expand the playing field for AI training and inferenceInterviewIntroductionHow did you get involved in machine learning?Can you describe what your work at AMD is focused on?A lot of the current attention on hardware for AI training and inference is focused on the raw GPU hardware. What is the role of the software stack in enabling and differentiating that underlying compute?CUDA has gained a significant amount of attention and adoption in the numeric computation space (AI, ML, scientific computing, etc.). What are the elements of platform risk associated with relying on CUDA as a developer or organization?The ROCm stack is the key element in AMD's AI and HPC strategy. What are the elements that comprise that ecosystem?What are the incentives for anyone outside of AMD to contribute to the ROCm project?How would you characterize the current competitive landscape for AMD across the AI/ML lifecycle stages? 
(pre-training, post-training, inference, fine-tuning)For teams who are focused on inference compute for model serving, what do they need to know/care about in regards to AMD hardware and the ROCm stack?What are the most interesting, innovative, or unexpected ways that you have seen AMD/ROCm used?What are the most interesting, unexpected, or challenging lessons that you have learned while working on AMD's AI software ecosystem?When is AMD/ROCm the wrong choice?What do you have planned for the future of ROCm?Contact InfoLinkedInParting QuestionFrom your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?Closing AnnouncementsThank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.To help other people find the show please leave a review on iTunes and tell your friends and co-workers.LinksImageNetAMDROCmCUDAHuggingFaceLlama 3Llama 4QwenDeepSeek R1MI300XNokia SymbianUALink StandardQuantizationHIPIFYROCm TritonAMD Strix HaloAMD EpycLiquid NetworksMAMBA ArchitectureTransformer ArchitectureNPU == Neural Processing Unitllama.cppOllamaPerplexity ScoreNUMA == Non-Uniform Memory AccessvLLMSGLangThe intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

Applying AI To The Construction Industry At Buildots

2025-06-1449:29

Summary

In this episode of the Machine Learning Podcast Ori Silberberg, VP of Engineering at Buildots, talks about transforming the construction industry with AI. Ori shares how Buildots uses computer vision and AI to optimize construction projects by providing real-time feedback, reducing delays, and improving efficiency. Learn about the complexities of digitizing the construction industry, the technical architecture of Buildots, and how its AI-driven solutions create a digital twin of construction sites. Ori emphasizes the importance of explainability and actionable insights in AI decision-making, highlighting the potential of generative AI to further enhance the construction process from planning to execution.

Announcements

Hello and welcome to the AI Engineering Podcast, your guide to the fast-moving world of building scalable and maintainable AI systems.
Your host is Tobias Macey and today I'm interviewing Ori Silberberg about applications of AI for optimizing building construction.

Interview

Introduction
How did you get involved in machine learning?
Can you describe what Buildots is and the story behind it?
What types of construction projects are you focused on? (e.g. residential, commercial, industrial, etc.)
What are the main types of inefficiencies that typically occur on those types of job sites?
What are the manual and technical processes that the industry has typically relied on to address those sources of waste and delay?
In many ways the construction industry is as old as civilization. What are the main ways that the information age has transformed construction?
What are the elements of the construction industry that make it resistant to digital transformation?
Can you describe how you are applying AI to this complex and messy problem?
What are the types of data that you are able to collect?
How are you automating that data collection so that construction crews don't have to add extra work or distractions to their day?
For construction crews that are using Buildots, can you talk through how it integrates into the overall process from site planning to project completion?
Can you describe the technical architecture of the Buildots platform?
Given the safety critical nature of construction, how does that influence the way that you think about the types of AI models that you use and where to apply them?
What are the most interesting, innovative, or unexpected ways that you have seen Buildots used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Buildots?
What do you have planned for the future of AI usage at Buildots?

Contact Info

LinkedIn

Parting Question

From your perspective, what are the biggest gaps in tooling, technology, or training for AI systems today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@aiengineeringpodcast.com with your story.
To help other people find the show please leave a review on iTunes and tell your friends and co-workers.

Links

Buildots
CAD == Computer Aided Design
Computer Vision
LIDAR
GC == General Contractor
Kubernetes

The intro and outro music is from Hitman's Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

