Alter the flag used to select the correct streamType in the CUDAPluggableAllocator class for ROCm GPUs. The TORCH_HIP_VERSION flag does not work as intended for ROCm, so it is replaced with USE_ROCM. This was impacting Distributed Fused Adam in ROCm/APEX when using the nccl_ub feature. The change has been tested with rocm/apex; see PR https://github.com/ROCm/apex/pull/184.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150010
Approved by: https://github.com/jeffdaily
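For context, the guard in question is a compile-time macro check that decides which stream handle type the pluggable allocator works with. The sketch below is a minimal, self-contained illustration of that pattern, not the exact PyTorch source: the placeholder `hipStream_t`/`cudaStream_t` structs and the `main` function are assumptions added so the example compiles on its own without CUDA or ROCm headers.

```cpp
#include <iostream>

// Placeholder stream handle types so this sketch builds anywhere.
// In PyTorch these would be the real HIP/CUDA stream types.
struct hipStream_t {};
struct cudaStream_t {};

// Before the fix, the ROCm branch was gated on TORCH_HIP_VERSION, which is
// not reliably defined in ROCm builds, so the CUDA branch could be chosen by
// mistake. USE_ROCM is the macro PyTorch sets for ROCm builds, so gating on
// it selects the HIP stream type whenever the build targets ROCm.
#if defined(USE_ROCM)
using streamType = hipStream_t;   // ROCm build: HIP stream handle
#else
using streamType = cudaStream_t;  // CUDA build: CUDA stream handle
#endif

int main() {
#if defined(USE_ROCM)
  std::cout << "streamType resolves to the HIP stream handle\n";
#else
  std::cout << "streamType resolves to the CUDA stream handle\n";
#endif
  return 0;
}
```

Compiling with `-DUSE_ROCM` (or without it) shows how the same source picks the HIP or CUDA branch purely at compile time, which is why choosing the wrong gating macro silently routes ROCm builds down the CUDA path.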
Directory contents:

- shared/
- comm.cpp
- comm.h
- CUDAPluggableAllocator.cpp
- CUDAPluggableAllocator.h
- device_set.h
- Event.cpp
- Event.h
- GdsFile.cpp
- GdsFile.h
- Graph.cpp
- memory_snapshot.cpp
- memory_snapshot.h
- MemPool.cpp
- Module.cpp
- Module.h
- nccl.cpp
- nccl.h
- python_comm.cpp
- python_comm.h
- python_nccl.cpp
- python_nccl.h
- Stream.cpp
- Stream.h
- Tensor.cpp
- THCP.h
- utils.cpp