Fast Inference and Serving with vLLM #70

Update: 2025-10-30

Description

Meet Nicolò Lucchesi, Senior Machine Learning Engineer at Red Hat, and his work on vLLM and performance optimization. Nicolò shares his experience in AI and machine learning, explaining the importance of tools such as Paged Attention and the challenges of integrating new models. The episode also covers the role of the community and the technologies used in the vLLM project, as well as emerging trends in the field.

In Channel

Platform Strategy #71 (2025-11-28, 50:36)
Embedded development with MicroPython #58 (2024-10-13, 48:54)
AI model collapse and vibe coding #72 (2025-12-11, 44:27)
AI Bitter Lesson - Python e Caffè (2025-12-05, 06:52)
Duct-tape programmer - Python e Caffè (2025-11-21, 09:54)
Python 3.14 - Python e Caffè (2025-11-15, 08:02)
Python e Caffè - Django Girls (2025-11-08, 08:34)
Fast Inference and Serving with vLLM #70 (2025-10-30, 38:19)
Python e Caffè - Conferences we attended (2025-10-24, 11:26)
GPU benchmarks for ML and LLMs #69 (2025-10-18, 48:19)
Python e Caffè - What we like on YouTube (2025-10-10, 10:07)
Team Topologies #68 (2025-10-07, 50:15)
Python e Caffè - Learning by teaching (2025-10-04, 10:42)
Python e Caffè - Organizing meetups (2025-09-26, 09:54)
Python e Caffè - FastMCP (2025-09-19, 10:31)
Modern computer vision #67 (2025-09-03, 37:16)
Runtime and Interpreter with a Core Developer! #66 (2025-07-13, 52:14)
PyCon Italia 2025 Special! (2025-05-27, 34:54)
Encoding and vector DBs. #65 (2025-05-16, 40:22)
GraphQL, REST and the new FastAPI Labs. #64 (2025-05-06, 41:07)

Python Milano