pytorch/torch/csrc/cuda
Dan Johnson 3c97b0ab00 Use ncclAlltoAllv and ncclAlltoAll API when supported (#134499)
Stock NCCL does not provide ncclAllToAll / ncclAllToAllv APIs, so PyTorch emulates all-to-all with point-to-point send/recv. When a NCCL build does expose these APIs, use them directly instead (a sketch of the dispatch is shown below the commit header).

Differential Revision: [D61683836](https://our.internmc.facebook.com/intern/diff/D61683836/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134499
Approved by: https://github.com/shuqiangzhang, https://github.com/eqy
2024-09-16 20:08:06 +00:00
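For context on the change above, a minimal sketch of the dispatch, not PyTorch's actual nccl.cpp implementation: the NCCL_HAS_ALLTOALL feature macro and the ncclAllToAll signature are assumptions here (some vendor builds, e.g. RCCL, ship an entry point of this shape); stock NCCL guarantees only ncclSend/ncclRecv plus ncclGroupStart/ncclGroupEnd, which the fallback path uses.

#include <nccl.h>

// Equal-split all-to-all: every rank exchanges perRankCount elements of
// `type` with each of the nRanks peers. elemSize is sizeof one element.
ncclResult_t allToAll(
    const void* sendbuff,
    void* recvbuff,
    size_t perRankCount,
    size_t elemSize,
    ncclDataType_t type,
    int nRanks,
    ncclComm_t comm,
    cudaStream_t stream) {
#ifdef NCCL_HAS_ALLTOALL  // assumed feature macro, not defined by stock NCCL
  // Assumed vendor signature; call the native collective when available.
  return ncclAllToAll(sendbuff, recvbuff, perRankCount, type, comm, stream);
#else
  // Fallback: one send and one recv per peer, fused into a single NCCL
  // group so all point-to-point ops are issued as one operation.
  // Per-call error checks elided for brevity.
  ncclGroupStart();
  for (int peer = 0; peer < nRanks; ++peer) {
    const size_t offset = static_cast<size_t>(peer) * perRankCount * elemSize;
    ncclSend(static_cast<const char*>(sendbuff) + offset,
             perRankCount, type, peer, comm, stream);
    ncclRecv(static_cast<char*>(recvbuff) + offset,
             perRankCount, type, peer, comm, stream);
  }
  return ncclGroupEnd();
#endif
}

Fusing the per-peer send/recv pairs inside a single NCCL group lets NCCL schedule them together and avoids the deadlocks that a strict pairwise ordering across ranks could otherwise cause.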
shared [sparse] Add cuSPARSELt as a backend (#128534) 2024-08-21 22:06:07 +00:00
comm.cpp
comm.h
CUDAPluggableAllocator.cpp [Reland] Refactor caching device allocator utils (#130923) 2024-09-07 11:14:17 +00:00
CUDAPluggableAllocator.h [Reland] Refactor caching device allocator utils (#130923) 2024-09-07 11:14:17 +00:00
device_set.h
Event.cpp Drop the GIL in a couple of places where holding it leads to deadlocks (#134910) 2024-09-01 00:05:53 +00:00
Event.h
GdsFile.cpp [Reland] Add wrappers for synchronous GPUDirect Storage APIs (#133489) 2024-08-15 17:11:52 +00:00
GdsFile.h [Reland] Add wrappers for synchronous GPUDirect Storage APIs (#133489) 2024-08-15 17:11:52 +00:00
Graph.cpp
memory_snapshot.cpp [Memory Snapshot] Skip C++ warmup unwind() call if context is not set (#133038) 2024-08-13 17:25:24 +00:00
memory_snapshot.h
MemPool.cpp Implements torch.cuda.MemPool() API (#131152) 2024-08-01 01:29:30 +00:00
Module.cpp [ROCm] Enable ROCm support for inductor's dynamic_rblock_scaling (#129663) 2024-09-13 16:45:39 +00:00
Module.h
nccl.cpp Use ncclAlltoAllv and ncclAlltoAll API when supported (#134499) 2024-09-16 20:08:06 +00:00
nccl.h
python_comm.cpp
python_comm.h
python_nccl.cpp
python_nccl.h
Stream.cpp
Stream.h
Tensor.cpp
THCP.h
utils.cpp