📅 ThursdAI - Gemma 2, AI Engineer 24', AI Wearables, New LLM leaderboard

Update: 2024-06-27

Hey everyone, sending a quick one today, no deep dive, as I'm still in the middle of AI Engineer World's Fair 2024 in San Francisco. (In fact, I'm writing this from the incredible floor 32 presidential suite that the team here got for interviews, media and podcasting. And hey to all the new folks I've just met during the last two days!)

It's been an incredible few days meeting so many ThursdAI community members, listeners and folks who came on the pod! The list honestly is too long, but I got to meet friends of the pod Maxime Labonne, Wing Lian, Joao Moura (Crew AI), Vik from Moondream and Stefania Druga, not to mention the countless folks who came up, gave high fives and introduced themselves. It was honestly a LOT of fun. (And it's still not over: if you're here, please come and say hi, and let's take an LLM judge selfie together!)

We recorded today's show extra early because I had to run and play dress up, and boy am I relieved now that both the show and the talk are behind me and I can go and enjoy the rest of the conference 🔥 (which I will bring you here in full once I get the recording!)

On today's show, we had the awesome pleasure of having Surya Bhupatiraju, a research engineer at Google DeepMind, talk to us about their newly released Gemma 2 models! It was a very technical and super great conversation, definitely worth checking out!

Gemma 2 came out in 2 sizes, a 9B and a 27B parameter model, with 8K context (we addressed this on the show). The 27B model's performance is incredible: it beats Llama-3 70B on several benchmarks and even beats Nemotron 340B from NVIDIA!

The model is now available to play with in Google AI Studio, and also on the HuggingFace hub!

We also covered the renewal of the HuggingFace Open LLM Leaderboard, with new benchmarks in the mix and normalized scores, and how Qwen 2 is again the best model tested!

It was a very insightful conversation. If you're interested in benchmarks, definitely give it a listen.
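For anyone wondering what normalizing the scores might look like in practice, here's a rough sketch. It assumes the idea is to rescale each benchmark so that the random-guess baseline maps to 0 and a perfect score maps to 100. The function name `normalizeScore`, its parameters, and the choice to floor anything below the baseline at 0 are my own assumptions for illustration, not the leaderboard's actual code.

```javascript
// Hypothetical helper: rescale a raw benchmark score (0-100) so that
// random guessing maps to 0 and a perfect score maps to 100.
// Assumption: scores at or below the random baseline are floored at 0.
function normalizeScore(rawScore, randomBaseline, maxScore = 100) {
  if (rawScore <= randomBaseline) {
    return 0;
  }
  return ((rawScore - randomBaseline) / (maxScore - randomBaseline)) * 100;
}

// Example: a 4-choice multiple-choice benchmark has a 25% random baseline,
// so a raw score of 62.5 lands halfway between "guessing" and "perfect".
console.log(normalizeScore(62.5, 25)); // 50
```

The point of rescaling like this is that a benchmark with a high guessing floor no longer inflates a model's average compared to a benchmark where random answers score close to zero.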

Last but not least, we had a conversation with Ethan Sutin, the co-founder of Bee Computer. At the AI Engineer speakers dinner, all the speakers received a wearable AI device as a gift, and I onboarded (cause Swyx asked me) and kinda forgot about it. On the way back to my hotel I walked with a friend and chatted about my life.

When I got back to my hotel, the app prompted me with "hey, I now know 7 new facts about you" and it was incredible to see how much of the conversation it was able to pick up, extracting facts and even TODOs!

So I had to have Ethan on the show to try and dig a little bit into the privacy and the use-cases of these hardware AI devices, and it was a great chat!

Sorry for the quick one today. If this is your first newsletter after you just met me and registered, there's usually a deeper dive here, so expect more in-depth write-ups in the next editions. For now, I have to run down and enjoy the rest of the conference!

Here's the TL;DR and my RAW show notes for the full show, in case it's helpful!

* AI Engineer is happening right now in SF

* Tracks include Multimodality, Open Models, RAG & LLM Frameworks, Agents, AI Leadership, Evals & LLM Ops, CodeGen & Dev Tools, AI in the Fortune 500, GPUs & Inference

* Open Source LLMs

* HuggingFace - LLM Leaderboard v2 - (Blog)

* Old Benchmarks sucked and it's time to renew

* New Benchmarks

* MMLU-Pro (Massive Multitask Language Understanding - Pro version, paper)

* GPQA (Google-Proof Q&A Benchmark, paper). GPQA is an extremely hard knowledge dataset

* MuSR (Multistep Soft Reasoning, paper).

* MATH (Mathematics Aptitude Test of Heuristics, Level 5 subset, paper)

* IFEval (Instruction Following Evaluation, paper)

* 🤝 BBH (Big Bench Hard, paper). BBH is a subset of 23 challenging tasks from the BigBench dataset

* The community will be able to vote for models, and we will prioritize running models with the most votes first

* Mozilla announces Builders Accelerator @ AI Engineer (X)

* Theme: Local AI

* 100K non dilutive funding

* Google releases Gemma 2 (X, Blog)

* Big CO LLMs + APIs

* UMG, Sony, Warner sue Udio and Suno for copyright (X)

* were able to recreate some songs

* sue both companies

* have 10 unnamed individuals who are also on the suit

* Google Chrome Canary has Gemini nano (X)

* Super easy to use window.ai.createTextSession()

* Nano 1 and Nano 2, 4-bit quantized at 1.8B and 3.25B parameters, have decent performance relative to Gemini Pro

* Behind a feature flag

* Most text gen under 500ms

* Unclear re: hardware requirements

* Someone already built extensions

* someone already posted this on HuggingFace

* Anthropic Claude share-able projects (X)

* Snapshots of Claude conversations shared with your team

* Can share custom instructions

* Anthropic has released a new "Projects" feature for Claude AI to enable collaboration and enhanced workflows

* Projects allow users to ground Claude's outputs in their own internal knowledge and documents

* Projects can be customized with instructions to tailor Claude's responses for specific tasks or perspectives

* "Artifacts" feature allows users to see and interact with content generated by Claude alongside the conversation

* Claude Team users can share their best conversations with Claude to inspire and uplevel the whole team

* North Highland consultancy has seen 5x faster content creation and analysis using Claude

* Anthropic is committed to user privacy and will not use shared data to train models without consent

* Future plans include more integrations to bring in external knowledge sources for Claude

* OpenAI voice mode update - not until Fall

* AI Art & Diffusion & 3D

* Fal open sourced AuraSR - a 600M upscaler based on GigaGAN (X, Fal)

* Interview with Ethan Sutin from Bee Computer

* We all got Bees as gifts

* AI wearable that extracts TODOs, knows facts, etc.
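
For the Gemini Nano item in the notes above, here's a minimal sketch of how that call might be wrapped. Only `window.ai.createTextSession()` comes from the notes; `session.prompt()` and `session.destroy()` are assumptions based on early experimental Canary builds, and `promptLocalModel` is a hypothetical helper name. The API sits behind a feature flag and may well have changed since, so treat this as an illustration and not a reference.

```javascript
// Sketch only: wraps the experimental in-browser text session API.
// `ai` is expected to be `window.ai` in Chrome Canary with the flag enabled.
// Assumption: the session exposes `prompt(text)` and optionally `destroy()`.
async function promptLocalModel(ai, text) {
  // Bail out gracefully when the API is not exposed in this browser.
  if (!ai || typeof ai.createTextSession !== "function") {
    return null;
  }
  const session = await ai.createTextSession();
  try {
    // Resolves with the generated text for the given prompt.
    return await session.prompt(text);
  } finally {
    // Free the session if the build supports it.
    if (typeof session.destroy === "function") {
      session.destroy();
    }
  }
}
```

In Canary with the flag enabled, the call would be something like `await promptLocalModel(window.ai, "Write a haiku about San Francisco")`. It resolves to `null` when the API isn't available, so a page can fall back to a hosted model.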

This is a public episode. If you’d like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
