CUDA graph trees

Update: 2024-03-24

Description

CUDA graph trees are the internal implementation of CUDA graphs used in PT2 when you pass mode="reduce-overhead" to torch.compile. Their primary innovation is that they allow memory to be reused across multiple CUDA graphs, as long as those graphs form a tree of the potential paths you can go down at run time. This greatly reduces the memory usage of CUDA graphs in PT2. Using CUDA graphs has some operational implications, which are described in the podcast.
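
To make the memory argument concrete, here is a toy sketch in plain Python. It is not PyTorch's implementation: the class GraphNode, the helper functions, and the byte counts are all hypothetical and chosen only for illustration. It assumes each recorded graph keeps a fixed number of bytes alive, and that only one root-to-leaf path through the tree runs at a time, so sibling branches can occupy the same memory.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GraphNode:
    """One recorded graph in the toy tree (hypothetical, not a PyTorch class)."""

    name: str
    live_bytes: int  # assumed memory this graph keeps alive while its path runs
    children: list[GraphNode] = field(default_factory=list)

    def add_child(self, name: str, live_bytes: int) -> GraphNode:
        child = GraphNode(name, live_bytes)
        self.children.append(child)
        return child


def separate_pools_bytes(node: GraphNode) -> int:
    """Each graph owns a private pool, so the total is the sum over all nodes."""
    return node.live_bytes + sum(separate_pools_bytes(c) for c in node.children)


def shared_pool_bytes(node: GraphNode) -> int:
    """One pool for the whole tree: siblings never run together, so the
    requirement is the heaviest root-to-leaf path, not the sum of all nodes."""
    heaviest_child = max((shared_pool_bytes(c) for c in node.children), default=0)
    return node.live_bytes + heaviest_child


# Made-up tree: a forward graph followed by either a backward graph
# or a two-step evaluation branch.
root = GraphNode("forward", 100)
root.add_child("backward", 80)
eval_step = root.add_child("eval_step_1", 60)
eval_step.add_child("eval_step_2", 40)

print(separate_pools_bytes(root))  # 280
print(shared_pool_bytes(root))     # 200
```

In this model the saving comes entirely from the tree shape: the "backward" branch and the "eval" branch are alternatives, so they can share the same region of the pool. The real system has to track much more than this (for example, which tensors are still alive between graphs), which is part of what the podcast covers.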
PyTorch