MemPool is a separate pool of memory handled by the caching allocator. This PR adds the option to let the caching allocator use this pool as a last resort, instead of OOMing, by associating a ``use_on_oom`` bool with each MemPool.
Usage:
Users can optionally pass a ``use_on_oom`` bool (False by default) during MemPool creation. If True, the CUDACachingAllocator will be able to use memory in this pool as a last resort instead of OOMing.
```
import torch

# `allocator` is a pre-constructed pluggable allocator (see the sketch
# below); use_on_oom=True lets the caching allocator fall back to this
# pool's memory instead of raising an OOM error.
pool = torch.cuda.MemPool(allocator, use_on_oom=True)

# Allocate (and free) 40 MiB inside the pool.
with torch.cuda.use_mem_pool(pool):
    a = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
del a

# At the memory limit, this allocation succeeds by using the pool's
# freed memory in order to avoid the OOM.
b = torch.empty(40 * 1024 * 1024, dtype=torch.uint8, device="cuda")
```
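The ``allocator`` above is not defined in the snippet. A minimal sketch of how it might be constructed with ``torch.cuda.memory.CUDAPluggableAllocator``, assuming a shared library ``alloc.so`` that exports ``my_malloc``/``my_free`` (the path and symbol names here are hypothetical):

```
import torch

# Hypothetical shared library and symbol names; the library must export
# allocate/free functions matching the CUDAPluggableAllocator ABI.
pluggable = torch.cuda.memory.CUDAPluggableAllocator(
    "./alloc.so", "my_malloc", "my_free"
)

# MemPool takes the underlying allocator handle.
allocator = pluggable.allocator()
pool = torch.cuda.MemPool(allocator, use_on_oom=True)
```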
Testing:
```
python test/test_cuda.py -k test_mempool_limited_memory_with_allocator
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151487
Approved by: https://github.com/eqy, https://github.com/syed-ahmed, https://github.com/ngimel