
Chain of Thought
Author: Galileo
© Galileo
Description
Introducing Chain of Thought, the weekly podcast for software engineers and leaders that demystifies artificial intelligence.
Join host Conor Bronsdon each week as we tell the stories of the people building the AI revolution, unravel actionable strategies for agents, and share practical techniques for building effective generative AI applications.
41 Episodes
As AI agents and multimodal models become more prevalent, understanding how to evaluate GenAI is no longer optional – it's essential.
Generative AI introduces new complexities in assessment compared to traditional software, and this week on Chain of Thought we're joined by Chip Huyen (Storyteller, Tép Studio) and Vivienne Zhang (Senior Product Manager, Generative AI Software, Nvidia) for a discussion on AI evaluation best practices.
Before we hear from our guests, Vikram Chatterji (CEO, Galileo) and Conor Bronsdon (Developer Awareness, Galileo) give their takes on the complexities of AI evals and how to overcome them, covering the use of objective criteria for evaluating open-ended tasks, the role of hallucinations in AI models, and the importance of human-in-the-loop systems.
Afterwards, Chip and Vivienne sit down with Atin Sanyal (Co-Founder & CTO, Galileo) to explore common evaluation approaches, best practices for building frameworks, and implementation lessons. They also discuss the nuances of evaluating AI coding assistants and agentic systems.
Chapters:
00:00 Challenges in Evaluating Generative AI
05:45 Evaluating AI Agents
13:08 Are Hallucinations Bad?
17:12 Human in the Loop Systems
20:49 Panel discussion begins
22:57 Challenges in Evaluating Intelligent Systems
24:37 User Feedback and Iterative Improvement
26:47 Post-Deployment Evaluations and Common Mistakes
28:52 Hallucinations in AI: Definitions and Challenges
34:17 Evaluating AI Coding Assistants
38:15 Agentic Systems: Use Cases and Evaluations
43:00 Trends in AI Models and Hardware
45:42 Future of AI in Enterprises
47:16 Conclusion and Final Thoughts
Follow:
Vikram Chatterji: https://www.linkedin.com/in/vikram-chatterji/
Atin Sanyal: https://www.linkedin.com/in/atinsanyal/
Conor Bronsdon: https://www.linkedin.com/in/conorbronsdon/
Chip Huyen: https://www.linkedin.com/in/chiphuyen/
Vivienne Zhang: https://www.linkedin.com/in/viviennejiaozhang/
Show notes:
Watch all of GenAI Productionize 2.0: https://www.galileo.ai/genai-productionize-2-0
The incredible velocity of AI coding tools has shifted the critical bottleneck in software development from code generation to code reviews. Greg Foster, Co-Founder & CTO of Graphite, joins the conversation to explore this new reality, outlining the three waves of AI that are leading to autonomous agents spawning pull requests in the background. He argues that as AI automates the "inner loop" of writing code, the human-centric "outer loop" (reviewing, merging, and deploying) is now under immense pressure, demanding a complete rethinking of our tools and processes.
The conversation then gets tactical, with Greg detailing how a technique called "stacking" can break down large code changes into manageable units for both humans and AI. He also identifies an emerging hiring gap where experienced engineers with strong architectural context are becoming "lethal" with AI tools. This episode is an essential guide to navigating the new bottlenecks in software development and understanding the skills that will define the next generation of high-impact engineers.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Greg on LinkedIn | Follow Greg on X | Graphite Website: graphite.dev
Check out Galileo: Try Galileo | Agent Leaderboard
What's the first step to building an enterprise-grade AI tool? Malte Ubl, CTO of Vercel, joins us this week to share Vercel's playbook for agents, explaining how agents are a new type of software for solving flexible tasks. He shares how Vercel's developer-first ecosystem, including tools like the AI SDK and AI Gateway, is designed to help teams move from a quick proof-of-concept to a trusted, production-ready application.
Malte explores the practicalities of production AI, from the importance of eval-driven development to debugging chaotic agents with robust tracing. He offers a critical lesson on security, explaining why prompt injection requires a totally different solution (tool constraint) than traditional threats like SQL injection. This episode is a deep dive into the infrastructure and mindset, from sandboxes to specialized SLMs, required to build the next generation of AI tools.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Malte on LinkedIn | Follow Malte on X (formerly Twitter) | Learn more about Vercel
Check out Galileo: Try Galileo | Agent Leaderboard
The technological moat is eroding in the AI era. What new factors separate a successful startup from the rest?
Aurimas Griciūnas, CEO of SwirlAI, joins the show to break down the realities of building in this new landscape. Startup success now hinges on speed, strong financial backing, or immediate distribution. Aurimas warns against the critical mistake of prioritizing shiny tools over fundamental engineering and the market gaps this creates.
Discover the new moats for AI companies, built on a culture of relentless execution, tight feedback loops, and the surprising skills that define today's most valuable engineers.
The episode also looks to the future, with bold predictions about a slowdown in LLM leaps and the coming impact of coding agents and self-improving systems.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Aurimas on LinkedIn | Aurimas' Course: End-to-End AI Engineering Bootcamp
Check out Galileo: Try Galileo | Agent Leaderboard
As we enter the era of the AI engineer, the biggest challenge isn't technical; it's a shift in mindset. Hamel Husain, a leading AI consultant and luminary in the eval space, joins the podcast to explore the skills and processes needed to build reliable AI. Hamel explains why many teams relying on vanity dashboards and a "buffet of metrics" experience a false sense of security, which is no substitute for customized evals tailored to domain-specific risks. The solution? A disciplined process of error analysis, grounded in manually looking at data to identify real-world failures.
This discussion is an essential guide to building the continuous learning loops and "experimentation mindset" required to take AI products from prototype to production with confidence. Listen to learn the playbook for building AI reliability and how to derive qualitative insights from log data to build customized quantitative guardrails.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Hamel on LinkedIn | Follow Hamel on X/Twitter | Check out his blog: hamel.dev
Check out Galileo: Try Galileo | Agent Leaderboard
What if your next competitor is not a startup, but a solo builder on a side project shipping features faster than your entire team? For Claire Vo, that's not a hypothetical. As the founder of ChatPRD, formerly the Chief Product and Technology Officer at LaunchDarkly, and host of the How I AI podcast, she has a unique vantage point on the driving forces behind a new blueprint for success.
She argues that AI accountability must be driven from the top by an "AI czar" and reveals how a culture of experimentation is the key to overcoming organizational hesitancy. Drawing from her experience as a solo founder, she warns that for incumbents, the cost of moving slowly is the biggest threat and details how AI can finally be used to tackle legacy codebases. The conversation closes with bold predictions on the rise of the "super IC" (who can achieve top-tier impact and salary without managing a team) and the death of product management.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Claire on LinkedIn | Follow Claire on X/Twitter | Claire's podcast How I AI
Check out Galileo: Try Galileo | Agent Leaderboard
How do you build an AI-native company to a $7M run rate in just six months?
According to Marcel Santilli, Founder and CEO of GrowthX, the secret isn't chasing the next frontier model; it's mastering the "messy middle." Drawing on his deep experience at Scale AI and Deepgram, Marcel joins host Conor Bronsdon to share his framework for building durable, customer-obsessed businesses.
Marcel argues that the most critical skills for the AI era aren't technical but philosophical: first-principles thinking and the art of delegation.
Tune in to learn why GrowthX first focused on services to codify expert work, how AI can augment human talent instead of replacing it, and why speed and brand are a startup's greatest competitive advantages. This conversation offers a clear playbook for building a resilient company by prioritizing culture and relentless shipping.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Marcel on LinkedIn | Follow Marcel on X (formerly Twitter) | Learn more about GrowthX
Check out Galileo: Try Galileo | Agent Leaderboard
AI isn't just changing healthcare; it's providing the essential help needed to unlock a trillion-dollar opportunity for better care.
Andreas Cleve, CEO & Co-founder of Corti, steps in to shed light on AI's immense, yet often misunderstood, transformative potential in this high-stakes environment. Andreas refutes the narrative of healthcare being slow adopters, emphasizing its high bar for trustworthy technology and its constant embrace of new tools. He reveals how purpose-built AI models are already alleviating the "pajama time" burden of documentation for clinicians, enabling faster and more accurate assessments in various specializations. This quiet, impactful adoption is seeing companies grow "like weeds" beyond common expectations.
The conversation addresses how AI can tackle the looming global shortage of 10 million healthcare professionals by 2030, reallocating a trillion dollars' worth of administrative work back into care. Andreas details Corti's approach to building invisible, reliable AI through rigorous, compliance-first evaluation, ensuring accuracy and efficiency in real time. He emphasizes that AI's true role is not replacement, but augmentation, empowering professionals to deliver more care, attract talent, and drive organizational growth.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): LinkedIn: linkedin.com/in/andreascleve | X (formerly Twitter): andreascleve | Corti Website: corti.ai
Check out Galileo: Try Galileo | Agent Leaderboard
AI agents offer unprecedented power, but mastering agent reliability is the ultimate challenge for agentic systems to actually work in production.
Mikiko Chandrashekar, Staff Developer Advocate at MongoDB, whose background spans the entire data-to-AI pipeline, unveils MongoDB's vision as the memory store for agents, supporting complex multi-agent systems from data storage and vector search to debugging chat logs. She highlights how MongoDB, reinforced by the acquisition of Voyage, empowers developers to build production-scale agents across various industries, from solo projects to major enterprises. This robust data layer is foundational to ensure agent performance and improve the end user experience.
Mikiko advocates for treating agents as software products, applying rigorous engineering best practices to ensure reliability, even for non-deterministic systems. She details MongoDB's unique position to balance GPU/CPU loads and manage data for performance and observability, including Galileo's integrations. The conversation emphasizes the profound need to rethink observability, evaluations, and guardrails in the era of agents, showcasing Galileo's family of small language models for real-time guardrailing, Luna-2, and Insights Engine for automated failure analysis. Discover how building trustworthiness through systematic evaluation, beyond just "vibe checks," is essential for AI agents to scale and deliver value in high-stakes use cases.
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Mikiko on LinkedIn | Follow Mikiko on X/Twitter | Explore Mikiko's YouTube channel | Check out Mikiko's Substack | Connect with MongoDB on LinkedIn | Connect with MongoDB on YouTube
Check out Galileo: Try Galileo | Agent Leaderboard
The age of ubiquitous AI agents is here, bringing immense potential – and unprecedented risk.
Hosts Conor Bronsdon and Vikram Chatterji open the episode by discussing the urgent need for building trust and reliability into next-generation AI agents. Vikram unveils Galileo's free AI reliability platform for agents, featuring Luna 2 SLMs for real-time guardrails and its Insights Engine for automatic failure mode analysis. This platform enables cost-effective, low-latency production evaluations, significantly transforming debugging. Achieving trustworthy AI agents demands rigorous testing, continuous feedback, and robust guardrailing – complex challenges requiring powerful solutions from partners like Elastic.
Conor welcomes Philipp Krenn, Director of Developer Relations at Elastic, to discuss their collaboration in ensuring AI agent reliability, including how Elastic leverages Galileo's platform for evaluation. Philipp details Elastic's evolution from a search powerhouse to a key AI enabler, transforming data access with Retrieval-Augmented Generation (RAG) and new interaction modes. He discusses Elastic's investment in SLMs for efficient re-ranking and embeddings, emphasizing robust evaluation and observability for production. This collaborative effort aims to equip developers to build reliable, high-performing AI systems for every enterprise.
Chapters:
00:00 Introduction
01:09 Galileo's AI Reliability Platform
01:43 Challenges in AI Agent Reliability
06:17 Insights Engine and Its Importance
11:00 Luna 2: Small Language Models
14:42 Custom Metrics and Agent Leaderboard
19:16 Galileo's Integrations and Partnerships
21:04 Philipp Krenn from Elastic
24:47 Optimizing LLM Responses
25:41 Galileo and Elastic: A Powerful Partnership
28:20 Challenges in AI Production and Trust
30:02 Guardrails and Reliability in AI Systems
32:17 The Future of AI in Customer Interaction
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Philipp on LinkedIn | Learn more about Elastic
Check out Galileo: Try Galileo | Agent Leaderboard
The Internet of Agents is rapidly taking shape, necessitating innovative foundational standards, protocols, and evaluation methods for its success.
Recorded at Cisco's office in San Jose, we welcome Giovanna Carofiglio, Distinguished Engineer and Senior Director at Outshift by Cisco. As a leader of the AGNTCY Collective (an open-source initiative by Cisco, Galileo, LangChain, and many other participating companies), Giovanna outlines the vision for agents to collaborate seamlessly across the enterprise and the internet. She details the collective's pillars, from agent discovery and deployment using new agentic protocols like Slim, to ensuring a secure, low-latency communication transport layer. This groundbreaking work aims to make distributed agentic communication a reality.
The conversation then explores the critical role of observability and evaluation in building trustworthy agent applications, including defining an interoperable standard schema for communications. Giovanna highlights the complex challenges of scaling agents to thousands or millions, emphasizing the need for robust security (agent identity with OSF schema) and predictable agent behavior through extensive testing and characterization. She distinguishes between protocols like MCP (agent-to-tool) and A2A (agent-to-agent), advocating for open standards and underlying transport layers akin to TCP.
Chapters:
00:00 Introduction
01:00 Overview of Agent Interoperability
02:20 What is AGNTCY
03:45 Agent Discovery and Composition
04:38 Agent Protocols and Communication
05:45 Observability and Evaluation
07:00 Metrics and Standards for Agents
09:45 Challenges in Agent Evaluation
14:15 Low Latency and Active Evaluation
23:34 Synthetic Data and Ground Truth
25:07 Interoperable Agent Schema
26:37 MCP & A2A
30:17 Future of Agent Communication
32:03 Security and Agent Identity
34:37 Collaboration and Community Involvement
38:28 Conclusion
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): AGNTCY Collective: agntcy.org | Connect with Giovanna on LinkedIn | Learn more about Outshift: outshift.cisco.com
Check out Galileo: Try Galileo | Agent Leaderboard
When AI makes creating content and code nearly free, how do you stand out? Differentiation now hinges on two things: unique taste and effective distribution.
This week, Bharat Vasan, founder & CEO at Intangible and a "recovering VC," explains why the AI landscape compelled him to return to founding. He sees AI sparking a new creative revolution, similar to the early internet, that makes it easier than ever to bring ideas to life. The conversation delivers essential advice for founders, revealing why relentless shipping is the ultimate clarifier for a business and why resilience, not just intelligence, is the key to survival.
Drawing from his experience on both sides of the venture table, Bharat breaks down the brutally competitive VC landscape and shares Intangible's mission: to simplify 3D creative tools with AI, finally bridging the gap between human vision and machine power. Listeners will gain insights on company building, brand strategy, and why customer obsession is the ultimate moat in the AI age.
Chapters:
00:00 Introduction
00:45 From Founder to VC and Back
03:17 Human Creativity in the Age of AI
07:50 The Role of Taste and Distribution
11:49 Building a Brand in the AI Era
16:17 The Venture Capital Landscape for AI Startups
20:11 Advice for Founders in the AI Boom
23:55 Incumbents vs. Startups
27:10 The New Generation of Innovators
29:19 Pirate Mentality in Startups
30:00 Building a Brand
36:28 Shipping and Resilience
41:49 Customer Obsession
46:58 The Vision for Intangible
51:52 Conclusion
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Bharat on LinkedIn | Follow Bharat on X | Learn more about Intangible at intangible.ai
Check out Galileo: Try Galileo | Agent Leaderboard
Unlocking AI agents for knowledge work automation and scaling intelligent, multi-agent systems within enterprises fundamentally requires measurability, reliability, and trust.
João Moura, founder & CEO of CrewAI, joins Galileo's Conor Bronsdon and Vikram Chatterji to unpack and define the emerging AI agent stack. They explore how enterprises are moving beyond initial curiosity to tackle critical questions around provisioning, authentication, and measurement for hundreds or thousands of agents in production. The discussion highlights a crucial "gold rush" among middleware providers, all racing to standardize the orchestration and frameworks needed for seamless agent deployment and interoperability. This new era demands a re-evaluation of everything from cloud choices to communication protocols as agents reshape the market.
João and Vikram then dive into the complexities of building for non-deterministic multi-agent systems, emphasizing the challenges of increased failure modes and the need for rigorous testing beyond traditional software. They detail how CrewAI is democratizing agent access with a focus on orchestration, while Galileo provides the essential reliability platform, offering advanced evaluation, observability, and automated feedback loops. From specific use cases in financial services to the re-emergence of core data science principles, discover how companies are building trustworthy, high-quality AI products and prepare for the coming agent marketplace.
Chapters:
00:00 Introduction and Guest Welcome
02:04 Defining the AI Agent Stack
03:49 Challenges in Building AI Agents
05:52 The Future of AI Agent Marketplaces
06:59 Infrastructure and Protocols
09:05 Interoperability and Flexibility
20:18 Governance and Security Concerns
24:12 Industry Adoption and Use Cases
25:57 Unlocking Faster Development with Success Metrics
28:40 Challenges in Managing Complex Systems
30:10 Introducing the Insights Engine
30:33 The Importance of Observability and Control
32:33 Democratizing Access with No-Code Tools
35:39 Ensuring Quality and Reliability in Production
41:08 Future of Agentic Systems and Industry Transformation
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Joao Moura: LinkedIn | X/Twitter; CrewAI: crewai.com | X/Twitter
Check out Galileo: Try Galileo | Agent Leaderboard
How is an open ecosystem powering the next generation of AI for developers and leaders?
Broadcasting live from the heart of the action at AMD's Advancing AI 2025, Chain of Thought host Conor Bronsdon welcomes AMD's Anush Elangovan, VP of AI Software, and Sharon Zhou, VP of AI. They unpack AMD's groundbreaking transformation from a hardware giant to a leader in full-stack AI, committed to an open ecosystem. Discover how new MI350 GPUs deliver mind-blowing performance with advanced data types and why ROCm 7 and AMD Developer Cloud offer Day Zero support for frontier models.
Then Conor welcomes Sharon Zhou, VP of AI at AMD, to discuss making AMD's powerful software stack truly accessible and how to drive developer curiosity. Sharon explains strategies for creating a "happy path" for community contributions, fostering engagement through teaching, and listening to developers at every stage. She shares her predictions for the future, including the rise of self-improving AI, the critical role of heterogeneous compute, and the potential of "vibes based feedback" to guide models. This vision for democratizing access to high-performance AI, driven by a deep understanding of the developer journey, promises to unlock the next generation of applications.
Chapters:
00:00 Live from AMD's Advancing AI 2025 Event
00:30 Introduction to Anush Elangovan
01:38 The MI350 GPU Series Unveiled
04:57 CDNA4 Architecture Explained
07:00 The Future of AI Infrastructure
08:32 AMD's Developer Cloud and ROCm 7
11:50 Cultural Shift at AMD
14:48 Open Source and Community Contributions
18:35 Software Longevity and Ecosystem Strategy
22:19 AI Agents and Performance Gains
27:36 AI's Role in Solving Power Challenges
28:11 Thanking Anush
28:42 Introduction to Sharon Zhou
29:45 Sharon's Focus at AMD
30:39 Engaging Developers with AMD's AI Tools
31:24 Listening to the AI Community
33:56 Open Source and AI Development
45:04 Future of AI and Self-Improving Models
48:04 Final Thoughts and Farewell
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Anush Elangovan: LinkedIn | Sharon Zhou: LinkedIn | AMD Official Site: amd.com | AMD Developer Resources: AMD Developer Central
Check out Galileo: Try Galileo | Agent Leaderboard
What if the most valuable data in your enterprise – the key to your AI future – is sitting dormant in your backups, treated like an insurance policy you hope to never use?
Join Conor Bronsdon with Greg Statton, VP of AI Solutions at Cohesity, for an inside look at how they are turning this passive data into an active asset to power generative AI applications. Greg details Cohesity's evolution from an infinitely scalable file system built for backups into a data intelligence powerhouse, managing hundreds of exabytes of enterprise data globally. He recounts how early successes in using this data for security and anomaly detection paved the way for more advanced AI applications. This foundational work was crucial in preparing Cohesity to meet the new demands of generative AI.
Greg offers a candid look at the real-world challenges enterprises face, arguing that establishing data hygiene and a cross-functional governance model is the most critical step before building reliable AI applications. He shares the compelling story of how Cohesity's focus on generative AI was sparked by an internal RAG experiment he built to solve a "semantic divide" in team communication, which quickly grew into a company-wide initiative. He also provides essential advice for data professionals, emphasizing the need to focus on solving core business problems.
Chapters:
00:00 Introduction
00:36 The Role of Gaming in AI Development
05:43 Personal Gaming Experiences
08:26 The Intersection of AI and Gaming
12:53 Importance of Data in Game Development
19:03 User Testing and QA in Gaming
25:49 Postmortems and Telemetry
27:21 Beta Testing and Data Preparedness
29:18 Traditional AI vs Generative AI
31:31 Challenges of Implementing AI in Games
35:57 Leveraging AI for Data Analytics
39:41 Automated QA and Reinforcement Learning
42:01 AI for Localization and Sentiment Analysis
44:21 Future of AI in Gaming
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Company Website: cohesity.com | LinkedIn: Gregory Statton
Check out Galileo: Try Galileo | Agent Leaderboard
What if the pixels and polygons of your favorite video games were the secret architects of today's AI revolution?
Carly Taylor, Field CTO for Gaming at Databricks and founder of ggAI, joins host Conor Bronsdon to illuminate the direct line from video game innovation to the current AI landscape. She explains how the gaming industry's relentless pursuit of better graphics and performance not only drove pivotal GPU advancements and cost reductions, but also fundamentally shaped our popular understanding of artificial intelligence by popularizing the very term "AI" through decades of in-game experiences. Carly shares her personal journey, from a childhood passion for games like Rollercoaster Tycoon ignited while playing with her mom, to becoming a data scientist for Call of Duty.
The discussion then confronts a long-standing tension in game development: how the critical need to ship titles often relegates vital game data to a secondary concern, a dynamic Carly explains is now being reshaped by AI. She details the inherent challenges game studios face in capturing and leveraging telemetry, from disparate development processes to the lengthy pipeline required for updates. Carly illuminates how modern AI, particularly generative AI, presents a massive opportunity for studios to finally unlock their vast data troves for everything from self-service analytics and community insight generation to revolutionizing QA processes. This pivotal intersection of evolving game data practices and new AI capabilities is poised to redefine how games are made, understood, and ultimately experienced.
Chapters:
00:00 Introduction
00:28 The Role of Gaming in AI Development
05:35 Personal Gaming Experiences
08:18 The Intersection of AI and Gaming
12:45 Importance of Data in Game Development
18:55 User Testing and QA in Gaming
25:41 Postmortems and Telemetry
27:13 Beta Testing and Data Preparedness
29:10 Traditional AI vs Generative AI
31:23 Challenges of Implementing AI in Games
35:49 Leveraging AI for Data Analytics
39:33 Automated QA and Reinforcement Learning
41:53 AI for Localization and Sentiment Analysis
44:13 Future of AI in Gaming
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Connect with Carly on LinkedIn | Subscribe to Carly's Substack: Good At Business
Check out Galileo: Try Galileo | Agent Leaderboard
AI in 2025 promises intelligent action, not just smarter chat. But are enterprises prepared for the agentic shift and the complex reliability hurdles it brings?
Join Conor Bronsdon on Chain of Thought with fellow co-hosts and Galileo co-founders, Vikram Chatterji (CEO) and Atindriyo Sanyal (CTO), as they explore this pivotal transformation. They discuss how generative AI is evolving from a simple tool into a powerful engine for enterprise task automation, a significant advance driving the pursuit of substantial ROI. This shift is also fueling what Vikram observes as a "gold rush" for middleware and frameworks, alongside healthy skepticism about making widespread agentic task completion a practical reality.
As these AI systems grow into highly complex, compound structures, often incorporating multimodal inputs and multi-agent designs, Vikram and Atin address the critical challenges around debugging, achieving reliability, and solving the profound measurement problem. They share Galileo's vision for an AI reliability platform designed to tame these intricate systems through robust guardrailing, advanced metric engines like Luna, and actionable developer insights. Tune in to understand how the industry is moving beyond point-in-time evaluations to continuous AI reliability, crucial for building trustworthy, high-performing AI applications at scale.
Chapters:
00:00 Welcome and Introductions
01:05 Generative AI and Task Completion
02:13 Middleware and Orchestration Systems
03:17 Enterprise Adoption and Challenges
05:55 Multimodal AI and Future Plans
08:37 AI Reliability and Evaluation
11:08 Complex AI Systems and Developer Challenges
13:45 Galileo's Vision and Product Roadmap
18:59 Modern AI Evaluation Agents
20:10 Galileo's Powerful SDK and Tools
21:24 The Importance of Observability and Robust Testing
22:27 The Rise of Vibe Coding
24:48 Balancing Creativity and Reliability in AI
31:26 Enterprise Adoption of AI Systems
36:59 Challenges and Opportunities in Regulated Industries
42:10 Future of AI Reliability and Industry Impact
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Website: galileo.ai | Read: Galileo Optimizes Enterprise-Scale Agentic AI Stack with NVIDIA
Check out Galileo: Try Galileo | Agent Leaderboard
As AI redefines how products are built and customers are understood, what are the core strategies engineering leaders use to drive innovation and create lasting value?
Join Conor Bronsdon as he welcomes Wade Chambers, Chief Engineering Officer at Amplitude, to explore these critical questions. Wade shares how Amplitude is leveraging AI to deepen customer understanding and enhance product experiences, transforming raw data into actionable insights across their platform. He also discusses their approach to navigating constant change while building an adaptable, high-performing engineering culture that thrives in the current AI landscape.
The conversation explores Amplitude's strategy for building a sustainable AI advantage through proprietary data, deep domain expertise, and robust feedback loops, moving beyond superficial AI applications. Wade offers insights on fostering an AI-ready engineering culture through empowerment and clear alignment, alongside exploring the exciting potential of agentic AI to create proactive, intelligent copilots for product teams. He then details Amplitude's successful approach to integrating specialized AI talent, drawing key lessons from their acquisition of Command AI.
Chapters:
00:00 Introduction and Guest Welcome
01:55 Understanding and Acting on Data with AI
06:42 Amplitude's Unique Position in the Market
08:36 Differentiation and Competitive Advantage
09:58 Incorporating Customer Feedback
12:48 Evaluating AI Outcomes
17:21 Agentic AI and Future Prospects
21:38 Acquiring and Integrating AI Talent
28:44 Building a Culture of Innovation
37:21 Advice for Leaders and Individual Contributors
43:26 The Future of AI in the Workplace
45:38 Closing Thoughts
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): LinkedIn: Wade Chambers | Website: amplitude.com
Check out Galileo: Try Galileo | Agent Leaderboard
Is the prevailing approach to Artificial General Intelligence (AGI) missing a crucial step – deep, focused specialization? For the first time since co-founding Poolside, CEO Jason Warner & CTO Eiso Kant reunite on a podcast articulating their distinct vision for AI's future with our host, Conor Bronsdon. Poolside has intentionally diverged from general-purpose models, developing highly specialized AI meticulously designed for the specific, complex task of coding, viewing it as a direct and robust pathway towards achieving AGI, and revolutionizing how software is created.
Jason and Eiso dive deep into the core tenets of their strategy: an unwavering conviction in reinforcement learning through code execution feedback and the burgeoning power of synthetic data, which they believe will help expand the surface area of software by an astounding 1000x. They candidly discuss the "devil's trade" of data privacy, Poolside's commitment to enterprise-grade AI for high-consequence systems, and why true innovation requires moving beyond flashy demos to solve real-world, critical challenges.
Looking towards the horizon, they also share their insights on the evolving role of software engineers, where human agency, taste, and judgment become paramount in a landscape augmented by AI "coworkers." They also explore the profound societal implications of their work and the AI industry more generally, touching upon the "event horizon" of intelligent systems and the immense responsibility that comes with being at the forefront of this technological wave.
Chapters:
00:00 Introduction and Guest Welcome
01:19 Founding of Poolside
02:56 Vision for AGI and Reinforcement Learning
05:36 Defining AGI and Its Implications
10:03 Training Models for Software Development
17:08 Scaling and Synthetic Data
20:12 Focus on High-Consequence Systems
26:17 Privacy and Security in AI Solutions
28:09 Earning Trust with Developers
31:08 Reinforcement Learning and Compute
34:29 The Vision for AI's Future
39:50 Will Developers Still Exist?
47:07 Poolside Cloud's Ambitions
49:37 Conclusion
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Website: poolside.ai | LinkedIn: Jason Warner | LinkedIn: Eiso Kant
Check out Galileo: Try Galileo | Agent Leaderboard
The AI landscape often pulls us between the allure of cutting-edge models and the quiet necessity of foundational work – yet how do these extremes actually connect to deliver value?
Join Conor Bronsdon as he welcomes Denny Lee, a self-proclaimed "data nerd" and Product Management Director, Developer Relations at Databricks, to unpack this very spectrum, from AI's core infrastructure to its most advanced applications. Denny explains why robust logging, tracing, and data lineage are indispensable for credible AI evaluation and feedback, ultimately making AI systems more affordable, accessible, and impactful.
The discussion ventures into strategies for democratizing AI, exploring the "GenAI ladder" from efficient inference and retrieval-augmented generation to deciding when to fine-tune or pre-train models. Denny also tackles the industry's pressing hardware bottlenecks, the critical role of open standards, and the imperative of navigating data privacy in an increasingly AI-driven world. Listen for grounded advice on moving beyond the hype and making practical, value-driven decisions in your AI journey.
Chapters:
00:00 Introduction and Guest Welcome
01:31 Diving into AI Foundations
02:25 Importance of Logging and Tracing
08:40 Challenges in Data Quality and Lineage
14:49 Strategies for Cost-Effective AI
19:52 Partnerships and Collaborative Opportunities
22:10 Hardware Bottlenecks in AI
24:56 China's Power and Networking Advantage
25:26 Nvidia's Super Chip and Network Fabrics
26:39 The Growing Demand for Power in AI
29:26 Practical Advice for Data Governance
35:47 Understanding Privacy in AI
36:25 Differential Privacy and Its Challenges
41:57 Conclusion
Follow the hosts: Follow Atin | Follow Conor | Follow Vikram | Follow Yash
Follow Today's Guest(s): Website: Databricks.com | Podcast: Data Brew by Databricks (available on major podcast platforms) | YouTube: @Databricks | LinkedIn: Denny Lee
Read: SemiAnalysis Blog: https://semianalysis.com/
Check out Galileo: Try Galileo | Agent Leaderboard