#216 - Grok 4, Project Rainier, Kimi K2
Update: 2025-07-14
Description
Our 216th episode with a summary and discussion of last week's big AI news!
Recorded on 07/11/2025
Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai
Read our text newsletter and comment on the podcast at https://lastweekin.ai/.
In this episode:
- xAI launches Grok 4 with breakthrough performance across benchmarks, becoming the first true frontier model outside established labs, alongside a $300/month subscription tier
- Grok's alignment challenges emerge with antisemitic responses, highlighting the difficulty of steering models toward "truth-seeking" without harmful biases
- Perplexity and OpenAI launch AI-powered browsers to compete with Google Chrome, signaling a major shift in how users interact with AI systems
- METR study finds AI tools slowed experienced developers by roughly 20% on complex tasks, contradicting expectations and anecdotal reports of productivity gains
Timestamps + Links:
Tools & Apps
- (00:01:59) Elon Musk's xAI launches Grok 4 alongside a $300 monthly subscription | TechCrunch
- (00:15:28) Elon Musk’s AI chatbot is suddenly posting antisemitic tropes
- (00:29:52) Perplexity launches Comet, an AI-powered web browser | TechCrunch
- (00:32:54) OpenAI is reportedly releasing an AI browser in the coming weeks | TechCrunch
- (00:33:27) Replit Launches New Feature for its Agent, CEO Calls it ‘Deep Research for Coding’
- (00:34:40) Cursor launches a web app to manage AI coding agents
- (00:36:07) Cursor apologizes for unclear pricing changes that upset users | TechCrunch
Applications & Business
- (00:39:10) Lovable on track to raise $150M at $2B valuation
- (00:41:11) Amazon built a massive AI supercluster for Anthropic called Project Rainier – here's what we know so far
- (00:46:35) Elon Musk confirms xAI is buying an overseas power plant and shipping the whole thing to the U.S. to power its new data center — 1 million AI GPUs and up to 2 Gigawatts of power under one roof, equivalent to powering 1.9 million homes
- (00:48:16) Microsoft's own AI chip delayed six months in major setback — in-house chip now reportedly expected in 2026, but won't hold a candle to Nvidia Blackwell
- (00:49:54) Ilya Sutskever becomes CEO of Safe Superintelligence after Meta poached Daniel Gross
- (00:52:46) OpenAI’s Stock Compensation Reflect Steep Costs of Talent Wars
Projects & Open Source
- (00:58:04) Hugging Face Releases SmolLM3: A 3B Long-Context, Multilingual Reasoning Model - MarkTechPost
- (00:58:33) Kimi K2: Open Agentic Intelligence
- (00:58:59) Kyutai Releases 2B Parameter Streaming Text-to-Speech TTS with 220ms Latency and 2.5M Hours of Training
Research & Advancements
- (01:02:14) Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
- (01:07:58) Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
- (01:13:03) Mitigating Goal Misgeneralization with Minimax Regret
- (01:17:01) Correlated Errors in Large Language Models
- (01:20:31) What skills does SWE-bench Verified evaluate?
Policy & Safety
- (01:22:53) Evaluating Frontier Models for Stealth and Situational Awareness
- (01:25:49) When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors
- (01:30:09) Why Do Some Language Models Fake Alignment While Others Don't?
- (01:34:35) 'Positive review only': Researchers hide AI prompts in papers
- (01:35:40) Google faces EU antitrust complaint over AI Overviews
- (01:36:41) 'The transfer of user data by DeepSeek to China is unlawful': Germany calls for Google and Apple to remove the AI app from their stores
- (01:37:30) Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.