Build Wiz AI Show
Author: Build Wiz AI
Subscribed: 6 · Played: 355
© Build Wiz AI
Description
> Building the future of products with AI-powered innovation. <
Build Wiz AI Show is your go-to podcast for transforming the latest and most interesting papers, articles, and blogs about AI into an easy-to-digest audio format. Using NotebookLM, we break down complex ideas into engaging discussions, making AI knowledge more accessible. Have a resource you’d love to hear in podcast form? Send us the link, and we might feature it in an upcoming episode! 🚀🎙️
158 Episodes
Lance Martin of LangChain will discuss the shift in AI from model training to orchestrating powerful LLMs and computing primitives via a new software discipline. Discover practical context engineering techniques—including managing context rot, reduction, offloading, and isolation—and building effective agent harnesses for managing tool calls in non-deterministic systems. This session emphasizes that simplicity and observability remain vital, requiring builders to continuously rearchitect due to exponentially improving foundation models.
State-sponsored group GTG-1002 executed the first reported cyber espionage campaign largely run by autonomous AI, fundamentally shifting the threat landscape. The actor manipulated Claude Code to autonomously perform 80–90% of tactical operations, including vulnerability discovery and data exfiltration, against high-value targets such as major technology corporations. This unprecedented agentic AI misuse demands immediate security attention and highlights rapidly dropping barriers to large-scale, sophisticated attacks.
Join us for a candid conversation with OpenAI CEO Sam Altman on the future of AI and its massive impact on society. Altman explains why AI is the most important career choice for this generation and details the expected tectonic shifts in software development and computer science education. We also explore frontier research questions, including data efficiency, future architectures, and the crucial intersection of AI security and safety.
Large Language Models often struggle with complex, multi-step reasoning where traditional Supervised Fine-Tuning (SFT) and Reinforcement Learning (RLVR) fail due to rigid imitation or sparse rewards. We dive into Supervised Reinforcement Learning (SRL), a novel framework that reformulates problem-solving into a sequence of logical actions, providing rich, step-wise guidance based on expert similarity. Discover how this approach enables small models to achieve superior performance in challenging mathematical reasoning and agentic software engineering tasks, inducing flexible and sophisticated planning behaviors.
Moonshot AI's Kimi K2 Thinking is changing the global LLM landscape, as this 1-trillion parameter open-weight model challenges the performance of closed rivals like GPT-5 and Claude on complex reasoning and coding benchmarks. We dive into the model's architecture, featuring a massive 256K context window and advanced "agentic intelligence" capable of orchestrating hundreds of sequential tool calls autonomously. Tune in to understand why Kimi K2 Thinking is heralded as a watershed moment for open AI, intensifying the pressure on proprietary models and promising a new era of highly capable, accessible AI.
Join AI pioneers and 2025 Queen Elizabeth Prize winners, including Jensen Huang, Geoffrey Hinton, and Yann LeCun, as they share the personal "aha" moments that launched the deep learning revolution. They reflect on the current state of the AI market, debating if the explosive demand signals a bubble or the "very beginning of the buildout of intelligence". The discussion concludes by exploring the quest for human-level intelligence (AGI), examining future scientific breakthroughs needed and offering varied timelines for when machines might supersede human capabilities.
Join the engineers who built Claude Code to explore their counterintuitive decision to ditch the IDE for a terminal-first experience. They reveal how enabling the model to master Bash and other tools created a powerful new agent paradigm that sees everything an engineer does at the terminal. Plus, hear how internal "ant fooding"—used by up to 80% of technical employees—resulted in massive productivity gains, reporting an almost 70% increase in productivity per engineer.
Context Engineering (CE) is the systematic process designed to bridge the cognitive gap between human intent and machine understanding by optimizing context collection, storage, management, and usage. We explore CE’s history, tracing its evolution over 20 years from the "primitive computation" of Era 1.0 to the current "agent-centric intelligence" of Era 2.0, driven by large language models (LLMs). Discover how engineers reduce high-entropy human contexts into low-entropy machine representations, aiming for a future where AI achieves human-level or even superhuman context assimilation.
Agent Lightning introduces a revolutionary approach to optimizing AI agents by fully decoupling Reinforcement Learning (RL) training from agent execution. We dive into how this framework allows developers to apply powerful RL techniques—like the hierarchical LightningRL algorithm—to any existing agent, regardless of its underlying framework, with almost zero code modifications. Tune in to learn how this standardized approach is unlocking continuous performance gains across complex real-world tasks like multi-agent text-to-SQL and Retrieval-Augmented Generation (RAG).
Breaking Agent Backbones: AI agents are being deployed at scale, but their security is challenged by non-deterministic behavior and novel vulnerabilities. This episode introduces the "threat snapshot" framework and the new b3 benchmark, which systematically isolate and evaluate security risks stemming from the backbone LLM. We reveal crucial findings: enhanced reasoning capabilities generally improve security, yet model size does not correlate with lower vulnerability scores.
In this episode, OpenAI leaders share unprecedented transparency regarding their research goals, aiming for a fully automated AI researcher by March 2028 and discussing the rapid approach of superintelligence. They detail a new structure, featuring a nonprofit foundation that governs a Public Benefit Corporation, essential for attracting the resources needed for their colossal $1.4 trillion infrastructure commitment. The discussion also covers the pivot to an AI cloud platform model, the importance of accelerating scientific discovery, and the establishment of AI resilience efforts to handle societal risks.
Welcome to the new era of coding collaboration: Agent HQ is here, establishing GitHub as the centralized home for developers and a fleet of AI coding agents. We explore how the fully-fledged GitHub Copilot agent, alongside partners like Claude and Codex, now operates with deeper context and the ability to execute and coordinate tasks across the developer workflow. Discover how innovations like Mission Control and Plan Mode provide developers with the confidence and control to orchestrate parallel tasks and integrate AI natively into their existing processes, fundamentally changing the developer tool chain.
We delve into Jensen Huang's vision that Artificial Intelligence marks the New Industrial Revolution, positioning it as essential national infrastructure and America's next Apollo moment. We explore how NVIDIA's extreme co-design and Accelerated Computing enable new "AI Factories," achieving 10X generational performance leaps to drive down the cost of generating intelligence. The episode concludes by examining new strategic platforms, including 6G telecommunications (NVIDIA ARC), hybrid quantum computing, and the exponential rise of physical AI and robotics.
The modern workplace often buries professionals under context switching and scattered technology, hindering the productivity gains promised by AI. This episode explores the three stages of working smarter: Block Distractions, Scale Yourself, and Get Results, focusing on how a unified AI platform removes friction. Discover how to move past busywork, amplify your natural curiosity, and channel your enhanced capabilities toward strategic, measurable outcomes that define your career progression.
Join Lance from LangChain and Pete from Manus as they dive deep into the crucial discipline of Context Engineering for building effective AI agents. This webinar explores the challenge of context explosion—where performance drops as long-running agents accumulate tool call observations—and the core themes used to combat it: offloading, reducing, retrieving, and isolating context. Pete shares fresh lessons from building Manus, detailing the difference between reversible compaction and irreversible summarization, and how their layered action space manages tool confusion.
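The four themes this episode names can be sketched in a few lines. This is a hypothetical illustration of offloading, reducing, retrieving, and isolating context; the function names and pointer format are assumptions for demonstration, not the Manus or LangChain APIs.

```python
import os

def offload_observation(observation: str, workdir: str) -> str:
    """Offload: write a bulky tool observation to disk; keep only a pointer in context."""
    path = os.path.join(workdir, f"obs_{abs(hash(observation))}.txt")
    with open(path, "w") as f:
        f.write(observation)
    return f"[observation stored at {path}]"

def retrieve_observation(pointer: str) -> str:
    """Retrieve: re-read an offloaded observation only when it is needed again."""
    path = pointer.strip("[]").removeprefix("observation stored at ").strip()
    with open(path) as f:
        return f.read()

def compact_history(history: list[str], keep_last: int = 3) -> list[str]:
    """Reduce (reversible compaction): blank out old raw entries but keep pointers."""
    head = [h if h.startswith("[observation") else "[compacted]"
            for h in history[:-keep_last]]
    return head + history[-keep_last:]

def isolate_subtask(task: str) -> dict:
    """Isolate: run a sub-task in its own context window; return only a short summary."""
    return {"task": task, "result": f"summary of {task!r}"}
```

Compaction here is reversible in the sense the episode describes: the pointers survive, so the original observations can always be retrieved from disk, unlike an irreversible summary.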
Welcome to an essential discussion on Vibe Coding, the new paradigm where developers shift from writing code line-by-line to orchestrating and validating outputs from autonomous AI agents. We'll formalize Vibe Coding as an engineering discipline, exploring its foundations in Large Language Models, complex agent architectures (like planning and memory mechanisms), and integrated feedback loops. Join us as we break down the five distinct development models—from Unconstrained Automation to Test-Driven approaches—and debate the critical challenges of achieving reliable, secure, and scalable human-AI collaboration in software engineering.
Chip Huyen, author of AI Engineering and AI strategy expert from NVIDIA and Netflix, breaks down the technical basics of building successful AI products, covering pre-training, RAG, RLHF, and effective evaluation design. We tackle the growing AI "idea crisis" and the crucial gap between what builders think improves AI applications (like chasing the latest news) versus what actually works (like focusing on user feedback and data preparation). Chip offers essential, in-depth insights into system thinking, organizational structure shifts, and where real productivity gains are being found in the field of AI engineering.
Amid the hype around ChatGPT Atlas, let's talk about the dark side of browsing AI agents.
Welcome to the show, where we discuss DeepSeek-OCR and its investigation into using optical 2D mapping for context compression, addressing the computational challenges of quadratic scaling faced by Large Language Models. We explore the DeepEncoder, the core engine designed to achieve high compression ratios, delivering near-lossless OCR precision (approximately 97%) even at a 10× token reduction. This groundbreaking work demonstrates strong practical value, achieving state-of-the-art document parsing performance on OmniDocBench while using the fewest vision tokens, offering a promising direction for future memory systems.
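The headline numbers above imply simple arithmetic worth making explicit: at a 10× compression ratio, a page's worth of text tokens is represented by a tenth as many vision tokens. The page size below is an illustrative assumption, not a figure from the paper.

```python
def vision_tokens(text_tokens: int, compression_ratio: float = 10.0) -> int:
    """Vision tokens needed to optically encode `text_tokens` worth of text."""
    return round(text_tokens / compression_ratio)

page = 2000                  # illustrative dense page of text tokens
print(vision_tokens(page))   # -> 200 vision tokens stand in for 2000 text tokens
```

At that ratio the context window effectively stretches tenfold for document content, which is why the episode frames this as a direction for future memory systems.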
This episode explores Anthropic's revolutionary 'Skills,' a new way to implement Standard Operating Procedures (SOPs) for LLM agents, ensuring consistent, high-quality output for specialized tasks like Excel analysis and document formatting. We dive into how these portable folders contain instructions and executable code, allowing Claude to efficiently access deep, specialized expertise only when needed. Learn the best practices for authoring these skills—from conciseness and appropriate degrees of freedom to iterative testing—as LLM platforms rapidly evolve into customizable agentic environments.
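The "portable folder" idea the episode describes can be sketched concretely: a skill directory holding an instructions file with a small metadata header that an agent loads only when the task calls for it. The file layout and field names below are illustrative assumptions, not Anthropic's exact specification.

```python
import os

def make_skill(root: str, name: str, description: str, instructions: str) -> str:
    """Create a minimal skill folder (metadata header plus instructions) and return its path."""
    skill_dir = os.path.join(root, name)
    os.makedirs(skill_dir, exist_ok=True)
    body = (
        "---\n"
        f"name: {name}\n"
        f"description: {description}\n"
        "---\n\n"
        f"{instructions}\n"
    )
    # The agent reads only name/description up front, and the full
    # instructions (and any bundled scripts) only when the skill is invoked.
    with open(os.path.join(skill_dir, "SKILL.md"), "w") as f:
        f.write(body)
    return skill_dir
```

Keeping the header short and the instructions concise matches the authoring best practices the episode mentions: expertise stays on disk until it is actually needed.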




