pytorch/docs/source/notes
Emilio Castillo c9d4390d13 Add Pluggable CUDA allocator backend (#86786)
Fixes #43144

This uses the backend system added by [#82682](https://github.com/pytorch/pytorch/pull/82682) to change allocators dynamically during code execution. This will allow us to use RMM, to use CUDA managed memory for portions of the code that do not fit in GPU memory, to write static memory allocators that reduce fragmentation while training models, and to improve interoperability with external DL compilers/libraries.

For example, we could define the following allocator in C++:

```c++
#include <sys/types.h>
#include <cuda_runtime_api.h>
#include <iostream>

extern "C" {
// Called by PyTorch to allocate `size` bytes of device memory.
void* my_malloc(ssize_t size, int device, cudaStream_t stream) {
   void* ptr;
   std::cout << "alloc " << size << std::endl;
   cudaMalloc(&ptr, size);
   return ptr;
}

// Called by PyTorch to release memory obtained from my_malloc.
void my_free(void* ptr) {
   std::cout << "free" << std::endl;
   cudaFree(ptr);
}
}
```

Compile it (saved, e.g., as `allocator.cc`) as a shared library:
```
nvcc allocator.cc -o alloc.so -shared --compiler-options '-fPIC'
```
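
If `nvcc` is not at hand, the same library can presumably be built with a plain host compiler against the CUDA runtime; the include and library paths below are assumptions about a default CUDA install:

```
g++ allocator.cc -o alloc.so -shared -fPIC -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart
```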

Then use it from PyTorch as follows:

```python
import torch

# Note: allocating a tensor here, e.g. `b = torch.zeros(10, device='cuda')`,
# would initialize the default caching allocator and make the swap below fail.
new_alloc = torch.cuda.memory.CUDAPluggableAllocator('alloc.so', 'my_malloc', 'my_free')
old = torch.cuda.memory.get_current_allocator()
torch.cuda.memory.change_current_allocator(new_alloc)
# This allocation is served by my_malloc from alloc.so
b = torch.zeros(10, device='cuda')
# This will error since the current allocator was already instantiated and used
torch.cuda.memory.change_current_allocator(old)
```
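
The same plugin interface should also cover the CUDA managed memory use case mentioned above. A minimal sketch, assuming the same signatures as the example allocator; the names `my_managed_malloc`/`my_managed_free` are made up for illustration:

```c++
#include <sys/types.h>
#include <cuda_runtime_api.h>

extern "C" {
// Serve allocations from CUDA managed (unified) memory so tensors can exceed
// the physical GPU memory and be paged in and out by the driver.
void* my_managed_malloc(ssize_t size, int device, cudaStream_t stream) {
   void* ptr;
   cudaMallocManaged(&ptr, size);
   return ptr;
}

void my_managed_free(void* ptr) {
   cudaFree(ptr);
}
}
```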

Things to discuss:
- How to test this, since it needs compiling external code (see the sketch below).
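
One option (a sketch only; the helper name, temp-file handling, and reliance on `nvcc` being on `PATH` are assumptions, not an agreed test plan) is to compile the example allocator on the fly inside the test:

```python
import os
import subprocess
import tempfile

import torch

def build_example_allocator(source_path):
    # Hypothetical helper: compile the example allocator into a temporary .so,
    # mirroring the nvcc command shown above.
    out_path = os.path.join(tempfile.mkdtemp(), 'alloc.so')
    subprocess.check_call(
        ['nvcc', source_path, '-o', out_path, '-shared', '--compiler-options', '-fPIC'])
    return out_path

def test_pluggable_allocator(source_path='allocator.cc'):
    # Must run in a fresh process: the allocator has to be swapped in before
    # the first CUDA allocation.
    so_path = build_example_allocator(source_path)
    new_alloc = torch.cuda.memory.CUDAPluggableAllocator(so_path, 'my_malloc', 'my_free')
    torch.cuda.memory.change_current_allocator(new_alloc)
    b = torch.zeros(10, device='cuda')  # served by my_malloc/my_free
    assert b.sum().item() == 0
```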

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86786
Approved by: https://github.com/albanD
2022-11-23 17:54:36 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| amp_examples.rst | [AMP] Use generic autocast in example, specify dtype (#79579) | 2022-06-17 21:32:51 +00:00 |
| autograd.rst | Fix typo under docs directory and RELEASE.md (#85896) | 2022-09-29 21:41:59 +00:00 |
| broadcasting.rst | | |
| cpu_threading_runtimes.svg | | |
| cpu_threading_torchscript_inference.rst | | |
| cpu_threading_torchscript_inference.svg | | |
| cuda.rst | Add Pluggable CUDA allocator backend (#86786) | 2022-11-23 17:54:36 +00:00 |
| ddp.rst | | |
| extending.rst | [docs] Fix ScalarTensor `__repr__` in Extending PyTorch example (#86330) | 2022-10-17 20:01:10 +00:00 |
| faq.rst | | |
| gradcheck.rst | | |
| hip.rst | Introduce TORCH_DISABLE_GPU_ASSERTS (#84190) | 2022-11-04 04:43:05 +00:00 |
| large_scale_deployments.rst | | |
| modules.rst | Fix typo under docs directory (#87583) | 2022-10-24 23:52:44 +00:00 |
| mps.rst | update mps note with more details (#78669) | 2022-06-02 20:53:19 +00:00 |
| multiprocessing.rst | | |
| numerical_accuracy.rst | Add a note on the stability of linalg functions. (#88313) | 2022-11-07 22:44:23 +00:00 |
| randomness.rst | Improve reproducibility docs for RNG (#78849) | 2022-06-06 14:53:59 +00:00 |
| serialization.rst | [DOC] Missing line in serialization notes (#79454) | 2022-06-17 18:26:47 +00:00 |
| windows.rst | | |