A new and mysterious Gemini model appears at the top of the leaderboard, but is that the full story? I dig behind the headline to show you some anti-climactic results, give some context with leaks in the last 48 hours of diminishing returns to scaling, and add the response of Altman, OpenAI and co. The future is about to look a lot stranger...80,000 hours Podcast and Channel: https://open.spotify.com/show/2WzJwXWBDnn4iZ7odKwDib https://www.youtube.com/@eightythousandhours/videos ...
The last few days have seen two narratives emerge. One, derived from yesterday’s OpenAI leak in TheInformation, that GPT-5/Orion is a disappointment, and less of a leap than GPT-3 to GPT-4. The second comes from a series of 4 clips (shown in this video) from Sam Altman, regarding the ‘clear path’ to AGI. Let’s go beyond the headlines (and through papers like Frontier Math) to get closer to the ground truth… Plus Universal-2, Sora comments, Claude 3.5 Haiku SimpleBench update, and a...
The Google destroyer, the Perplexity crusher? Or just hype? ChatGPT with Search is here, and simultaneously Altman and co did an AMA on Reddit, covering GPT-5, Sora, SearchGPT and a lot more. Plus, the biggest news of them all: Simple Bench is out.ChatGPT with Search: https://openai.com/index/introducing-chatgpt-search/Altman AMA (ask me anything): https://www.reddit.com/r/ChatGPT/comments/1ggixzy/ama_with_openais_sam_altman_kevin_weil_srinivas/https://x.com/sama/status/1852041075793522911Per...
A new state of the art LLM (at least for creative writing and basic reasoning) but what lies behind the numbers that were put out? Is it for real, and are AI agents about to grab your mouse and shake your cursor? Plus, results on my own Simple Bench, and new tools from Runway (Act-One), HeyGen (Zoom Calls) and an updated NotebookLM. AI, without the hype.Weights and Biases' Weave: https://wandb.me/ai_explained