AOTInductor

Update: 2024-03-02

Description

AOTInductor is a feature in PyTorch that lets you export an inference model into a self-contained dynamic library, which can subsequently be loaded and used to run optimized inference. It is aimed primarily at CUDA and CPU inference applications, for situations where your model needs to be exported once while your runtime may still receive continuous updates. One of the big underlying organizing principles is a limited ABI that does not include libtorch, which allows these libraries to stay stable across updates to the runtime. There are many export-like use cases you might be interested in using AOTInductor for; some of its pieces should be useful for them, but AOTInductor does not necessarily solve them.
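As a concrete illustration of that export-once, load-anywhere workflow, here is a minimal sketch using the experimental Python entry points that existed around the time of this episode (torch._export.aot_compile and torch._export.aot_load, circa PyTorch 2.2/2.3). These are private APIs that have since been superseded in newer releases, so treat the exact names and signatures as assumptions tied to that release window rather than a stable interface.

```python
import torch
import torch._export  # experimental entry points circa PyTorch 2.2/2.3

class Net(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x @ x.T)

model = Net().eval()
example_inputs = (torch.randn(8, 8),)

# Ahead-of-time compile the model into a self-contained shared library;
# aot_compile returns the filesystem path of the generated .so.
with torch.no_grad():
    so_path = torch._export.aot_compile(model, example_inputs)

# Load the compiled library back into Python and run optimized inference.
# The same .so can also be loaded from a C++ runtime without re-exporting.
runner = torch._export.aot_load(so_path, device="cpu")
print(runner(torch.randn(8, 8)))
```

The intent behind the limited ABI mentioned above is that the generated library depends only on a small, stable shim rather than on libtorch itself, so the runtime can be upgraded without recompiling the exported artifact.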

