Breaking the Memory Barrier

Update: 2024-10-27

Description

🧠 Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

This paper introduces Inf-CL, an approach to contrastive learning that sharply reduces GPU memory usage during training and so allows near-infinite batch sizes. Conventional implementations materialize the full similarity matrix, so their memory use grows quadratically with batch size. The authors avoid this with a tile-based computation strategy that partitions the contrastive loss calculation into smaller blocks computed sequentially. To improve efficiency further, they propose a multi-level tiling strategy that uses ring-based communication at the GPU level and fused kernels at the CUDA core level, which minimizes I/O overhead. In the reported experiments, Inf-CL reaches batch sizes that previous methods could not, while maintaining accuracy and comparable training speed. This opens the way to larger-scale contrastive learning in areas such as self-supervised learning and dense text retrieval.
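The tile-based idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it covers only the forward pass of one direction of an InfoNCE-style loss on a single device, and the function names, the `tile` parameter, and the temperature value are assumptions made for illustration. It shows that a running log-sum-exp over blocks gives the same loss as the dense computation without ever holding the full similarity matrix.

```python
import numpy as np

def dense_contrastive_loss(a, b, temperature=0.07):
    """Reference version: materializes the full (n x n) similarity matrix."""
    logits = a @ b.T / temperature
    row_max = logits.max(axis=1)
    lse = row_max + np.log(np.exp(logits - row_max[:, None]).sum(axis=1))
    return float(np.mean(lse - np.diag(logits)))

def tiled_contrastive_loss(a, b, temperature=0.07, tile=64):
    """Same loss, computed block by block (hypothetical helper).

    Only one (tile x tile) block of similarities exists at a time. A running
    maximum and a rescaled running sum per row keep the log-sum-exp stable.
    """
    n = a.shape[0]
    lse = np.empty(n)
    for r in range(0, n, tile):
        a_blk = a[r:r + tile]
        running_max = np.full(a_blk.shape[0], -np.inf)
        running_sum = np.zeros(a_blk.shape[0])
        for c in range(0, n, tile):
            block = a_blk @ b[c:c + tile].T / temperature
            new_max = np.maximum(running_max, block.max(axis=1))
            # Rescale the old sum to the new maximum, then add this block.
            running_sum = (running_sum * np.exp(running_max - new_max)
                           + np.exp(block - new_max[:, None]).sum(axis=1))
            running_max = new_max
        lse[r:r + tile] = running_max + np.log(running_sum)
    # Positive pairs sit on the diagonal; compute them without the matrix.
    positives = np.einsum("ij,ij->i", a, b) / temperature
    return float(np.mean(lse - positives))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.normal(size=(200, 16))
    b = rng.normal(size=(200, 16))
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    b /= np.linalg.norm(b, axis=1, keepdims=True)
    print(np.isclose(dense_contrastive_loss(a, b),
                     tiled_contrastive_loss(a, b, tile=64)))
```

In this sketch the peak size of the similarity buffer depends on `tile` rather than on the batch size, which is the property the summary attributes to tiling. The ring-based multi-GPU communication and fused CUDA kernels mentioned above are not modeled here.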

📎 Link to paper

Shahriar Shariati
