An economical way to serve vector search workloads, with Simon Eskildsen, CEO of Turbopuffer
Description
The Turbopuffer search engine powers products such as Cursor, Notion, Linear, Superhuman and Readwise.
This episode on YouTube: https://youtu.be/I8Ztqajighg
Medium: https://dmitry-kan.medium.com/vector-podcast-simon-eskildsen-turbopuffer-69e456da8df3
Dev: https://dev.to/vectorpodcast/vector-podcast-simon-eskildsen-turbopuffer-cfa
If you are on the Lucene / OpenSearch stack, you can go managed by signing up here: https://console.aiven.io/signup?utm_source=youtube&utm_medium=&&utm_content=vectorpodcast
Time codes:
00:00 Intro
00:15 Napkin Problem 4: Throughput of Redis
01:35 Episode intro
02:45 Simon's background, including the implementation of Turbopuffer
09:23 How Cursor became an early client
11:25 How to test pre-launch
14:38 Why does a new vector DB deserve to exist?
20:39 Latency aspect
26:27 Implementation language for Turbopuffer
28:11 Impact of LLM coding tools on programmer craft
30:02 Engineer-to-CEO transition
35:10 Architecture of Turbopuffer
43:25 Disk vs S3 latency, NVMe disks, DRAM
48:27 Multitenancy
50:29 Recall@N benchmarking
59:38 Filtered ANN and Big-ANN Benchmarks
1:00:54 What users care about more (than Recall@N benchmarking)
1:01:28 Spicy question about benchmarking in competition
1:06:01 Interesting challenges ahead to tackle
1:10:13 Simon's announcement
Show notes:
- Turbopuffer in Cursor: https://www.youtube.com/watch?v=oFfVt3S51T4&t=5223s
  Transcript: https://lexfridman.com/cursor-team-transcript
- Napkin Math: https://sirupsen.com/napkin
- Follow Simon on X: https://x.com/Sirupsen
- Not All Vector Databases Are Made Equal: https://towardsdatascience.com/milvus-pinecone-vespa-weaviate-vald-gsi-what-unites-these-buzz-words-and-what-makes-each-9c65a3bd0696/