AI code generation: Wins, fails and the future

Update: 2025-12-26

Description

What’s the future of AI code generation? This week on Mixture of Experts, host Tim Hwang is joined by Chris Hay, Olivia Buzek and Gabe Goodhart to debrief the biggest AI use-case of 2025: AI-powered software engineering.

Claude Opus 4.5 solved a months-long optimization in under an hour but failed spectacularly at simple tasks. The barbell effect is real. Next, who's the architect: you or the model? We discuss agent orchestration, context windows and why tool performance varies wildly. Then, model differentiation: are OpenAI and Anthropic fundamentally different, or does agent architecture matter more? Finally, can open-source compete with closed ecosystems? We explore vertical integration, inference costs and the future of open models. All that and more on this week's Mixture of Experts.

00:00 – Introduction

01:11 – The barbell problem: AI coding wins and fails

03:46 – Claude Code cracks Apple Metal optimization

07:52 – Who's the architect: You or the AI?

11:44 – Model vs agent orchestration

20:44 – The future of unsupervised AI agents

24:30 – Open source vs proprietary tools

33:22 – The inference cost challenge

The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.

Subscribe for AI updates → https://www.ibm.com/account/reg/us-en/signup?formid=news-urx-52120

Visit Mixture of Experts podcast page to get more AI content → https://www.ibm.com/think/podcasts/mixture-of-experts

Learn more about AI code generation → https://www.ibm.com/think/topics/ai-code-generation

In Channel

AI code generation: Wins, fails and the future

2025-12-26 (35:19)

Disney's AI bet: USD 1B OpenAI content deal explained

2025-12-19 (38:31)

GPT-5.2 code red & AWS Nova models drop

2025-12-12 (41:42)

AI model analysis: Mistral 3, DeepSeek-V3.2 & Claude Opus 4.5

2025-12-05 (35:41)

AI agents in 2025: Why agentic commerce isn't ready for Black Friday yet

2025-11-28 (41:31)

Google’s Gemini 3: AI agents, reasoning and search mode

2025-11-21 (46:43)

GPT-5.1 and Kimi K2: What ‘Thinking AI’ really means

2025-11-14 (31:56)

1X NEO humanoid robot enters the home

2025-11-07 (36:29)

Anthropic’s TPU move and NVIDIA’s Starcloud

2025-10-31 (47:39)

ChatGPT Atlas, OpenAI’s new web browser

2025-10-24 (44:48)

OpenAI, Oracle & AMD shake up AI

2025-10-17 (49:42)

IBM partners with Anthropic, plus OpenAI drops AgentKit

2025-10-10 (43:48)

This week in AI models: Granite 4.0, Claude 4.5, Sora 2

2025-10-03 (42:05)

NVIDIA’s USD 100bn investment and Google's AP2

2025-09-26 (52:41)

Anthropic Economic Index, Virtual Agent Economies, AlterEgo and How People Use ChatGPT

2025-09-19 (46:55)

Why language models hallucinate, revisiting Amodei’s code prediction and AI in the job market

2025-09-12 (42:23)

Google Antitrust, Anthropic's $183B leap and are we in the AI winter?

2025-09-05 (52:18)

Monster prompt, OpenAI’s business play, nano-banana and US Open experimentations

2025-08-29 (44:03)

Gen AI pilots fail, GPT-5's hidden prompt revealed, reasoning model flaws and Claude closing chats

2025-08-22 (45:01)

Perplexity’s bid for Chrome, Grok Imagine and GPT-5 check-in

2025-08-15 (40:44)
