pytorch/torch/cuda
Dan Johnson d22c4cc353 Add option to use mempool on OOM (#151487)
MemPool is a separate pool of memory handled by the caching allocator. This PR adds the option to let the caching allocator use this pool as a last resort instead of OOMing, by associating a `use_on_oom` bool with each MemPool.

Usage:
Users can optionally specify a `use_on_oom` bool (False by default) during MemPool creation. If true, the CUDACachingAllocator will use memory in this pool as a last resort instead of OOMing.

```python
import torch

# `allocator` is a handle obtained from a pluggable allocator (see the sketch
# below); passing None instead backs the pool with the default allocator.
pool = torch.cuda.MemPool(allocator, use_on_oom=True)
with torch.cuda.use_mem_pool(pool):
    # 40 MiB of raw bytes; torch.randn does not support uint8, so use torch.empty
    a = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
del a
# At the memory limit, this allocation succeeds by drawing on the pool's
# cached memory instead of raising an OOM error.
b = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
```
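
For reference, a minimal sketch of how the `allocator` handle above might be constructed with `torch.cuda.memory.CUDAPluggableAllocator`; the shared-library path and exported symbol names (`my_allocator.so`, `my_malloc`, `my_free`) are placeholders for illustration, not part of this PR.

```python
import torch

# Hypothetical: a compiled shared library exporting C functions with the
# signatures CUDAPluggableAllocator expects:
#   void* my_malloc(size_t size, int device, cudaStream_t stream);
#   void  my_free(void* ptr, size_t size, int device, cudaStream_t stream);
pluggable = torch.cuda.memory.CUDAPluggableAllocator(
    "./my_allocator.so", "my_malloc", "my_free"
)
allocator = pluggable.allocator()  # the handle passed to MemPool above
```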

Testing:
```
python test/test_cuda.py -k test_mempool_limited_memory_with_allocator
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151487
Approved by: https://github.com/eqy, https://github.com/syed-ahmed, https://github.com/ngimel
2025-04-26 04:04:57 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| `amp` | | |
| `__init__.py` | [ROCm] Fixes to enable VM-based MI300 CI runners (#152133) | 2025-04-25 18:06:48 +00:00 |
| `_gpu_trace.py` | | |
| `_memory_viz.py` | | |
| `_sanitizer.py` | | |
| `_utils.py` | Add torch.cuda._compile_kernel() (#151484) | 2025-04-24 07:14:31 +00:00 |
| `comm.py` | | |
| `error.py` | | |
| `gds.py` | [BE] Upgrade to mypy 1.14 (#145966) | 2025-03-04 20:58:26 +00:00 |
| `graphs.py` | Revert "Implement cuda graphs implementation of torch.cond and torch.while_loop (#140979)" | 2025-02-13 18:04:26 +00:00 |
| `jiterator.py` | | |
| `memory.py` | Add option to use mempool on OOM (#151487) | 2025-04-26 04:04:57 +00:00 |
| `nccl.py` | | |
| `nvtx.py` | | |
| `profiler.py` | | |
| `random.py` | Avoid unnecessary clone in torch.cuda.set_rng_state (#149283) | 2025-03-18 20:47:57 +00:00 |
| `sparse.py` | | |
| `streams.py` | | |
| `tunable.py` | [ROCm][TunableOp] Support submatrices in offline tuning (#151138) | 2025-04-19 04:14:27 +00:00 |