Inference in Action: Scaling AI Smarter with Inferless

Update: 2024-10-24

Description

In this episode, we sit down with Nilesh Agarwal, co-founder of Inferless, a platform designed to streamline serverless GPU inference. We’ll cover the evolving landscape of model deployment, explore open-source tools like KServe and Knative, and discuss how Inferless solves common bottlenecks such as cold starts and scaling issues. We also take a closer look at real-world examples like Cleanlab, which saved 90% on GPU costs using Inferless.

Whether you’re a developer, DevOps engineer, or tech enthusiast curious about the latest in AI infrastructure, this podcast offers insights into Kubernetes-based model deployment, efficient updates, and the future of serverless ML. Tune in to hear Nilesh's journey from Amazon to founding Inferless and how his platform is transforming the way companies deploy machine learning models.

Subscribe now for more episodes!

Show Links:

OpenShift 4.17 is GA: https://www.youtube.com/live/DvKHwz-c11c?si=6Zap6hk_GsQfdX2m
Policy SBOM from Styra: https://www.styra.com/blog/introducing-policy-sbom/
NVIDIA GeForce NOW runs on KubeVirt: https://thenewstack.io/now-nvidia-scaled-its-cloud-services-with-kubevirt/
CBT feedback: https://thenewstack.io/kubernetes-advances-cloud-native-data-protection-share-feedback
CNCF KubeEdge graduation: https://www.devopsdigest.com/cncf-announces-kubeedge-graduation?utm_source=tldrdevops
Pulumi Kubernetes Operator 2.0: https://www.pulumi.com/blog/pulumi-kubernetes-operator-2-0

Inferless Links:

https://www.inferless.com/blog/cleanlab-saves-90-on-gpu-costs-with-inferless-serverless-inference
https://www.inferless.com/blog/how-spoofsense-scaled-their-ai-inference-with-inferless-dynamic-batching-autoscaling
https://www.inferless.com/
https://docs.inferless.com/introduction/introduction
LinkedIn: https://www.linkedin.com/in/nilesh-agarwal/
X: https://x.com/nilesh_agarwal2
Medium blog: https://nilesh-agarwal.medium.com/

Kubernetes Bytes