The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

OLMo: Everything You Need to Train an Open Source LLM with Akshita Bhagia - #674

Update: 2024-03-04

Description

Today we’re joined by Akshita Bhagia, a senior research engineer at the Allen Institute for AI. Akshita joins us to discuss OLMo, a new open source language model available in 7-billion- and 1-billion-parameter variants. Unlike similar models offered by Meta, Mistral, and others, OLMo comes with a key difference: AI2 has also published the dataset and key tools used to train the model. In our chat with Akshita, we dig into the OLMo models and the various projects falling under the OLMo umbrella, including Dolma, an open three-trillion-token corpus for language model pretraining, and Paloma, a benchmark and tooling for evaluating language model performance across a variety of domains.


The complete show notes for this episode can be found at twimlai.com/go/674.


Sam Charrington