PyTorch's Combined Effort in Large Model Optimization // Michael Gschwind // #274

Update: 2024-11-26

Description

Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services.

// MLOps Podcast #274 with Michael Gschwind, Software Engineer, Software Executive at Meta Platforms.

// Abstract
Explore PyTorch's role in boosting model performance, on-device AI processing, and collaborations with tech giants like ARM and Apple. Michael shares his journey from gaming console accelerators to AI, emphasizing the power of community and innovation in driving advancements.

// Bio
Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services. He led the development of MultiRay and TextRay, the first deployment of LLMs at a scale exceeding a trillion queries per day shortly after rollout. He created the strategy and led the implementation of PyTorch donation optimization with Better Transformers and Accelerated Transformers, bringing Flash Attention, PT2 compilation, and ExecuTorch into the mainstream for LLMs and GenAI models. Most recently, he led the enablement of large language models for on-device AI on mobile and edge devices.

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
Website: https://en.m.wikipedia.org/wiki/Michael_Gschwind

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Michael on LinkedIn: https://www.linkedin.com/in/michael-gschwind-3704222/?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app

Timestamps:
[00:00] Michael's preferred coffee
[00:21] Takeaways
[01:59] Please like, share, leave a review, and subscribe to our MLOps channels!
[02:10] Gaming to AI Accelerators
[11:34] Torch Chat goals
[18:53] PyTorch benchmarking and competitiveness
[21:28] Optimizing MLOps models
[24:52] GPU optimization tips
[29:36] Cloud vs On-device AI
[38:22] Abstraction across devices
[42:29] PyTorch developer experience
[45:33] AI and MLOps-related antipatterns
[48:33] When to optimize
[53:26] Efficient edge AI models
[56:57] Wrap up




Demetrios Brinkmann
