pytorch/docs/source/notes
Banit Agrawal 48d18fbd4c [PyTorch CUDA Allocator] Allow reuse of non-split blocks with better rounding (#136174)
Summary:
This diff adds an option to round non-split blocks in the caching allocator so that they can be reused without causing heavy fragmentation for large memory segments.

For example, if we specify max_split_size_mb as 400, then allocations larger than 400MB will not be split. Say we have allocated some 1024MB blocks that are now cached in the allocator. If we request a new 500MB block, we round it up to the nearest power-of-2 division, i.e. 512MB, and add the default kLargeBuffer of 20MB, giving 532MB. Since 532MB is less than the existing 1024MB block, that block will not be reused for this allocation; instead a new 512MB block is created with cudaMalloc.

In this diff, we make the rounding buffer configurable instead of using the fixed kLargeBuffer, exposing it as max_non_split_rounding_size: if 512MB + max_non_split_rounding_size is greater than or equal to 1024MB, we reuse the existing 1024MB block and do not create a new 512MB block with cudaMalloc. This option lets us pre-allocate some large blocks and reuse them as much as possible, so we do not stall on cudaMalloc calls.
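As a rough illustration of the arithmetic above, here is a minimal Python sketch of the reuse check. The helper names and the simplified rounding rule are assumptions for illustration, not the allocator's actual code, and the config key max_non_split_rounding_mb shown in the final comment reflects how this option is exposed in the allocator settings (treat the exact spelling as an assumption on older releases):

    import math

    def round_up_pow2(size_mb: float) -> float:
        # Simplified stand-in for the allocator's power-of-2 division rounding.
        return 2 ** math.ceil(math.log2(size_mb))

    def can_reuse(block_mb: float, request_mb: float, rounding_mb: float) -> bool:
        # A cached non-split block is reused only when the rounded request
        # plus the rounding tolerance covers the block's size.
        return round_up_pow2(request_mb) + rounding_mb >= block_mb

    # Fixed kLargeBuffer (20MB): a 1024MB cached block cannot serve a
    # 500MB request, since 512 + 20 = 532 < 1024 -> new cudaMalloc.
    assert not can_reuse(1024, 500, 20)

    # With max_non_split_rounding_size raised to 512MB, it can be reused:
    # 512 + 512 = 1024 >= 1024 -> no new cudaMalloc.
    assert can_reuse(1024, 500, 512)

    # In practice the tolerance is set via the allocator config before CUDA
    # initialization, e.g.:
    #   PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:400,max_non_split_rounding_mb:512"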

Differential Revision: D62758758

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136174
Approved by: https://github.com/zyan0
2024-09-17 19:08:44 +00:00
amp_examples.rst Update document for autocast on CPU (#135299) 2024-09-13 09:11:47 +00:00
autograd.rst Fix unexpected inference_mode interaction with torch.autograd.functional.jacobian (#130307) 2024-08-25 22:14:02 +00:00
broadcasting.rst
cpu_threading_runtimes.svg
cpu_threading_torchscript_inference.rst
cpu_threading_torchscript_inference.svg
cuda.rst [PyTorch CUDA Allocator] Allow reuse of non-split blocks with better rounding (#136174) 2024-09-17 19:08:44 +00:00
custom_operators.rst [docs] Redirect custom ops landing page to the correct place (#129177) 2024-06-21 13:31:32 +00:00
ddp.rst
extending.func.rst
extending.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
faq.rst
fsdp.rst
get_start_xpu.rst Adding a note for Getting Started with PyTorch on Intel GPUs (#127872) 2024-06-14 14:24:28 +00:00
gradcheck.rst
hip.rst
large_scale_deployments.rst
modules.rst
mps.rst
multiprocessing.rst
numerical_accuracy.rst [Doc] fix some typos (found by codespell and typos) (#132544) 2024-08-05 17:21:56 +00:00
randomness.rst
serialization.rst Add torch.serialization.skip_data context manager (#134504) 2024-09-05 16:53:39 +00:00
windows.rst