Fireworks Founder Lin Qiao on How Fast Inference and Small Models Will Benefit Businesses

Update: 2024-08-13

Description

In the first wave of the generative AI revolution, startups and enterprises built on top of the best closed-source models available, mostly from OpenAI. The AI customer journey moves from training to inference, and as these first products find PMF, many are hitting a wall on latency and cost.

Fireworks Founder and CEO Lin Qiao led the PyTorch team at Meta that rebuilt the whole stack to meet the complex needs of the world’s largest B2C company. Meta moved PyTorch to its own non-profit foundation in 2022, and Lin started Fireworks with the mission to compress the timeframe of training and inference and to democratize access to GenAI beyond the hyperscalers so that a diversity of AI applications can thrive.

Lin predicts when open and closed source models will converge and reveals her goal to build simple API access to the totality of knowledge.

Hosted by: Sonya Huang and Pat Grady, Sequoia Capital

Mentioned in this episode:

PyTorch: the leading framework for building deep learning models, originated at Meta and now part of the Linux Foundation umbrella

Caffe2 and ONNX: ML frameworks Meta used that PyTorch eventually replaced

Conservation of complexity: the idea that every computer application has inherent complexity that cannot be reduced but merely moved between the backend and frontend, originated by Xerox PARC researcher Larry Tesler

Mixture of Experts: a class of transformer models that route each input to a subset of the model's specialized sub-networks (the "experts") instead of running the whole model every time

Fathom: a product the Fireworks team uses for video conference summarization

LMSYS Chatbot Arena: crowdsourced open platform for LLM evals hosted on Hugging Face

00:00 - Introduction

02:01 - What is Fireworks?

02:48 - Leading PyTorch

05:01 - What do researchers like about PyTorch?

07:50 - How Fireworks compares to open source

10:38 - Simplicity scales

12:51 - From training to inference

17:46 - Will open and closed source converge?

22:18 - Can you match OpenAI on the Fireworks stack?

26:53 - What is your vision for the Fireworks platform?

31:17 - Competition for Nvidia?

32:47 - Are returns to scale starting to slow down?

34:28 - Competition

36:32 - Lightning round
