DiscoverThursdAI - The top AI news from the past week📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!
📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!

📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!

Update: 2025-11-07
Share

Description

Hey, Alex here!

Quick note, while preparing for this week, I posted on X that I don’t remember such a quiet week in AI since I started doing ThursdAI regularly, but then 45 min before the show started, Kimi dropped a SOTA oss reasoning model, turning a quiet week into an absolute banger.

Besides Kimi, we covered the updated MCP thinking from Anthropic, and had Kenton Varda from cloudflare as a guest to talk about Code Mode, chatted about Windsurf and Cursor latest updates and covered OpenAI’s insane deals.

Also, because it was a quiet week, I figured I’d use the opportunity to create an AI powered automation, and used N8N for that, and shared it on the stream, so if you’re interested in automating with AI with relatively low code, this episode is for you. Let’s dive in

ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

Kimi K2 Thinking is Here and It’s a 1 Trillion Parameter Beast! (X, HF, Tech Blog)

Let’s start with the news that got everyone’s energy levels skyrocketing right as we went live. Moonshot AI dropped Kimi K2 Thinking, an open-source, 1 trillion-parameter Mixture-of-Experts (MoE) model, and it’s an absolute monster.

This isn’t just a numbers game; Kimi K2 Thinking is designed from the ground up to be a powerful agent. With just around 32 billion active parameters during inference, a massive 256,000 token context window, and an insane tool-calling capacity. They’re claiming it can handle 200-300 sequential tool calls without any human intervention.

The benchmarks are just as wild. On the Humanities Last Exam (HLE), they’re reporting a score of 44.9%, beating out both GPT-5 and Claude 4.5 Thinking. While it doesn’t quite top the charts on SWE-bench verified, it’s holding its own against the biggest closed-source models out there. Seeing an open-source model compete at this level is incredibly exciting.

During the show, we saw some truly mind-blowing demos, from a beautiful interactive visualization of gradient descent to a simulation of a virus attacking cells, all generated by the model. The model’s reasoning traces, which are exposed through the API, also seem qualitatively different from other models, showing a deep and thoughtful process. My co-hosts and I were blown away. The weights and a very detailed technical report are available on Hugging Face, so you can dive in and see for yourself. Shout out to the entire Moonshot AI team for this incredible release!

Other open source updates from this week

* HuggingFace released an open source “Smol Training Playbook” on training LLMs, it’s a 200+ interactive beast with visualizations, deep dives into pretraining, dataset, postraining and more! (HF)

* Ai2 launches OlmoEarth — foundation models + open, end-to-end platform for fast, high-resolution Earth intelligence (X, Blog)

* LongCat-Flash-Omni — open-source omni-modal system with millisecond E2E spoken interaction, 128K context and a 560B ScMoE backbone (X, HF, Announcement)

Big Tech’s Big Moves: Apple, Amazon, and OpenAI

The big companies were making waves this week, starting with a blockbuster deal that might finally make Siri smart. Apple is reportedly will be paying Google around $1 billion per year to license a custom 1.2 trillion-parameter version of Gemini to power a revamped Siri.

This is a massive move. The Gemini model will run on Apple’s Private Cloud Compute, keeping user data walled off from Google, and will handle Siri’s complex summarizer and planner functions. After years of waiting for Apple to make a significant move in GenAI, it seems they’re outsourcing the heavy lifting for now while they work to catch up with their own in-house models. As a user, I don’t really care who builds the model, as long as Siri stops being dumb!

In more dramatic news, Perplexity revealed that Amazon sent them a legal threat to block their Comet AI assistant from shopping on Amazon.com. This infuriated me. My browser is my browser, and I should be able to use whatever tools I want to interact with the web. Perplexity took a strong stand with their blog post, “Bullying is Not Innovation,” arguing that user agents are distinct from scrapers and act on behalf of the user with their own credentials. An AI assistant is just that—an assistant. It shouldn’t matter if I ask my wife or my AI to buy something for me on Amazon. This feels like a move by Amazon to protect its ad revenue at the expense of user choice and innovation, and I have to give major props to Perplexity for being so transparent and fighting back.

Finally, OpenAI continues its quest for infinite compute, announcing a multi-year strategic partnership with AWS. This comes on top of massive deals with NVIDIA, Microsoft, Oracle, and others, bringing their total commitment to compute into the trillions of dollars. It’s getting to a point where OpenAI seems “too big to fail,” as any hiccup could have serious repercussions for the entire tech economy, which is now heavily propped up by AI investment. Sam has clarified that they don’t think OpenAI wants to be too big to fail in a recent post on X, and that the recent miscommunications around the US government backstopping OpenAI’s infrastructure bailouts were taken out of context. 🤔

Coding with AI: The Evolution of MCP and New Dev Tools

This week, we kicked off a new segment on the show: Coding with AI! Essentially realizing that we talk about AI coding a LOT, and decided to add a dedicated corner to it! And we started with a fascinating development in the world of agentic tooling. Anthropic published a blog post arguing that the standard way of using the Model Context Protocol (MCP) — by loading full tool definitions into the context window — is inefficient.

Their solution? Have LLMs write code to interact with tools instead. This approach can slash token usage by over 98% in some cases. This idea sounded familiar, and that’s because Cloudflare had already explored it with a feature called “Code Mode.” We were lucky enough to have Kenton Varda, one of the authors of the Code Mode post and head of engineering for Cloudflare Workers, join us to discuss this shift.

Kenton explained that LLMs are trained on vast amounts of code, making it a more “native language” for them than the artificial construct of tool calls. By generating code, agents can chain multiple tool calls together, process intermediate results, and operate much more efficiently without sending everything back through the neural network. While MCP still provides crucial standardization for discovering and authorizing tools, this “code execution” pattern seems to be the way forward for building more powerful and scalable agents.

Windsurfs CodeMaps and Cursor multi agent executions

In other coding with AI news, Windsurf has pushed an incredible feature, called CodeMaps. They will use their SWE-1 model to (quickly) generate Codemaps that will expalins a code-base to you, in a visual way. What starts where and goes where. It’s really useful to understand a new codebase or re-understand one you forgot about already! You can even chat with codemaps, to see if your overall system’s design is solid! Great addition that I’m sure will help many folks adopt Windsurf!

And Cursor, another popular AI-native IDE, released a super-performant in-IDE browser and a wild multi-agent feature that queries multiple LLMs in parallel and then synthesizes their answers.

This Week’s Tutorial

I finally got around to building some serious automations for ThursdAI, and folks, N8N has been a game-changer. What used to take me 30+ minutes of manual work now happens automatically in the background.

Here’s what I built: A Telegram bot that takes Twitter/X links, fetches the tweets and all linked content, uses AI agents to extract and summarize the information, and then posts it to our announcement channel and my notes app. The coolest part? I built this whole thing in about 4 hours with the help of Atlas browser and GPT-5 literally telling me what to do at each step.

During the show, we even live-tested swapping out GPT-4o-mini for Kimi K2 - took literally 30 seconds to connect via OpenRouter. I went through my node and explains how this all works on the show, so if you’ve wanted to learn about n8n, check it out starting around 01:13:00 . If you want to see how my automation turned out, it will be posting all my links to the new telegram channel t.me/thursdai_news (expect it to be messy at first as I’m testing out the automation)

Robotics - Xpeng’s “

Comments 
loading
In Channel
loading
00:00
00:00
1.0x

0.5x

0.8x

1.0x

1.25x

1.5x

2.0x

3.0x

Sleep Timer

Off

End of Episode

5 Minutes

10 Minutes

15 Minutes

30 Minutes

45 Minutes

60 Minutes

120 Minutes

📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!

📆 ThursdAI - Nov 6, 2025 - Kimi’s 1T Thinking Model Shakes Up Open Source, Apple Bets $1B on Gemini for Siri, and Amazon vs. Perplexity!

Alex Volkov