REFRAG with Xiaoqiang Lin - Weaviate Podcast #130!

Update: 2025-11-03

Description

Xiaoqiang Lin is a Ph.D. student at the National University of Singapore. During his time at Meta, Xiaoqiang led the research behind REFRAG: Rethinking RAG-based Decoding. Traditional RAG systems use vectors to retrieve relevant context with semantic search, but then throw those vectors away when passing the context to the LLM. REFRAG instead feeds the LLM these pre-computed vectors, achieving massive gains in long-context processing and LLM inference speed! REFRAG makes Time-To-First-Token (TTFT) 31x faster and Time-To-Iterative-Token (TTIT) 3x faster, boosting overall LLM throughput by 7x while also handling much longer contexts!

There are so many interesting aspects to this and I really loved diving into the details with Xiaoqiang! I hope you enjoy the podcast!

In Channel

Semantic Query Engines with Matthew Russo - Weaviate Podcast #131!

2025-11-18 · 01:02:25

REFRAG with Xiaoqiang Lin - Weaviate Podcast #130!

2025-11-03 · 01:00:00

Weaviate and SAS with Saurabh Mishra and Bob van Luijt - Weaviate Podcast #129!

2025-10-13 · 43:55

Weaviate's Query Agent with Charles Pierse - Weaviate Podcast #128!

2025-09-22 · 01:01:32

GEPA with Lakshya A. Agrawal - Weaviate Podcast #127!

2025-08-13 · 01:01:55

Agentic Topic Modeling with Maarten Grootendorst - Weaviate Podcast #126!

2025-07-09 · 01:05:18

Sufficient Context with Hailey Joren - Weaviate Podcast #125!

2025-07-02 · 50:53

RAG Benchmarks with Nandan Thakur - Weaviate Podcast #124!

2025-06-25 · 01:04:46

MUVERA with Rajesh Jayaram and Roberto Esposito - Weaviate Podcast #123!

2025-05-28 · 01:13:06

Patronus AI with Anand Kannappan - Weaviate Podcast #122!

2025-05-15 · 01:01:06

Haize Labs with Leonard Tang - Weaviate Podcast #121!

2025-05-12 · 54:15

Box AI with Ben Kus and Bob van Luijt

2025-05-07 · 55:32

Structured Outputs with Will Kurt and Cameron Pfiffer - Weaviate Podcast #119!

2025-04-09 · 01:10:17

Synthetic Data with David Berenstein and Ben Burtenshaw - Weaviate Podcast #118!

2025-03-25 · 01:02:01

Letta AI with Sarah Wooders - Weaviate Podcast #117!

2025-03-03 · 57:34

Agent Experience with Matt Biilmann, Sebastian Witalec, and Charles Pierse - Weaviate Podcast #116!

2025-02-27 · 52:09

Optimizing Retrieval Agents with Shirley Wu - Weaviate Podcast #115!

2025-02-19 · 01:00:20

Contextual AI with Amanpreet Singh - Weaviate Podcast #114!

2025-02-12 · 57:56

Cartesia AI with Karan Goel - Weaviate Podcast #113!

2025-01-28 · 53:45

Google Vertex AI RAG Engine with Lewis Liu and Bob van Luijt - Weaviate Podcast #112!

2025-01-15 · 58:16
