pytorch/torch/csrc/distributed/c10d
2025-03-17 22:43:15 +00:00
control_collectives
control_plane [19/N] Fix extra warnings brought by clang-tidy-17 (#144448) 2025-01-09 15:58:05 +00:00
cuda [codemod] Remove unused-variable in caffe2/torch/csrc/distributed/c10d/cuda/AsyncMM.cu (#148501) 2025-03-07 00:33:39 +00:00
quantization
Backend.cpp
Backend.hpp c10d/ProcessGroup: cleanup abort and shutdown (#148798) 2025-03-08 18:33:18 +00:00
Backoff.cpp
Backoff.hpp
c10d.h
comm.cpp
comm.hpp
CudaDMAConnectivity.cpp
CUDASymmetricMemory-inl.h Support SymmetricMemory's signaling kernels on sm60 and sm70 (#146308) 2025-02-21 15:29:02 +00:00
CUDASymmetricMemory.cu [c10d] Restrict use condition of NCCL mem pool (#147764) 2025-02-26 03:40:00 +00:00
CUDASymmetricMemory.hpp [SymmetricMemory] support specifying group_name at rendezvous time (#139529) 2024-11-17 09:31:17 +00:00
CUDASymmetricMemoryOps.cu Support SymmetricMemory's signaling kernels on sm60 and sm70 (#146308) 2025-02-21 15:29:02 +00:00
debug.cpp
debug.h
default_comm_hooks.cpp
default_comm_hooks.hpp
DMAConnectivity.cpp [19/N] Fix extra warnings brought by clang-tidy-17 (#144448) 2025-01-09 15:58:05 +00:00
DMAConnectivity.hpp
error.h
exception.h [BE] TCPStore: use typed errors for assertions (#147647) 2025-02-24 20:58:10 +00:00
FakeProcessGroup.hpp Add support for non functional collectives under FakeTensorMode and fake_pg for memory tracking (#147566) 2025-03-08 18:00:49 +00:00
FileStore.cpp Enable clang-tidy on torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp (#143806) 2025-01-24 12:22:13 +00:00
FileStore.hpp
FlightRecorder.cpp [c10d] Flush file in file recorder (#145458) 2025-01-27 23:15:52 +00:00
FlightRecorder.hpp [2/N] Rename NCCLTraceBuffer to FlightRecorder (#141712) 2024-11-29 21:15:31 +00:00
Functional.cpp Optimize shard_dim_alltoall to use alltoall_single (#148868) 2025-03-10 18:38:12 +00:00
Functional.hpp
GlooDeviceFactory.cpp [Reland][Environment Variable][4/N] Use thread-safe getenv functions (#140593) 2025-01-28 20:51:49 +00:00
GlooDeviceFactory.hpp
GroupRegistry.cpp Remove some NOLINT (#146610) 2025-02-07 01:50:06 +00:00
GroupRegistry.hpp
HashStore.cpp
HashStore.hpp
init.cpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
intra_node_comm.cpp Fix compile errors (#148758) 2025-03-08 04:56:42 +00:00
intra_node_comm.cu [IntraNodeComm] fix a recent breakage (#141200) 2024-11-26 00:46:38 +00:00
intra_node_comm.hpp
logger.cpp [4/N] Remove unnecessary once flag usage (#146783) 2025-02-11 13:55:06 +00:00
logger.hpp [fr][c10d] log trace capture enabled or not in flight recorder (#143865) 2024-12-27 03:07:55 +00:00
logging.cpp
logging.h Enable clang-tidy on torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp (#143806) 2025-01-24 12:22:13 +00:00
NanCheck.cu catch tensor.numel() == 0 in nan detector (#140741) 2024-11-15 05:03:20 +00:00
NanCheck.hpp
NCCLUtils.cpp [codemod] Fix unused-value issue in caffe2/aten/src/ATen/cuda/detail/CUDAHooks.cpp +4 (#147555) 2025-03-01 19:46:13 +00:00
NCCLUtils.hpp [DDP] Use NCCL allocated memory for gradient bucket (#146589) 2025-02-10 05:23:11 +00:00
Ops.cpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
ParamCommsUtils.cpp
ParamCommsUtils.hpp
PrefixStore.cpp
PrefixStore.hpp
ProcessGroup.cpp
ProcessGroup.hpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
ProcessGroupGloo.cpp Use task submitter TLS in gloo working threads (#142184) 2024-12-06 17:03:17 +00:00
ProcessGroupGloo.hpp Use task submitter TLS in gloo working threads (#142184) 2024-12-06 17:03:17 +00:00
ProcessGroupMPI.cpp [2/N] Remove unnecessary once flag usage (#145057) 2025-01-23 09:48:46 +00:00
ProcessGroupMPI.hpp c10d/ProcessGroup: cleanup abort and shutdown (#148798) 2025-03-08 18:33:18 +00:00
ProcessGroupNCCL.cpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
ProcessGroupNCCL.hpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
ProcessGroupUCC.cpp Cleanup CallOnce.h (#146700) 2025-02-07 16:44:45 +00:00
ProcessGroupUCC.hpp [c10d][UCC] Add _reduce_scatter_base to c10d::ProcessGroupUCC (#138021) 2024-12-09 16:02:24 +00:00
ProcessGroupWrapper.cpp
ProcessGroupWrapper.hpp
PyProcessGroup.hpp c10d/ProcessGroup: cleanup abort and shutdown (#148798) 2025-03-08 18:33:18 +00:00
python_comm_hook.cpp
python_comm_hook.h
RankLocal.hpp
reducer_cuda.cpp Fix compile errors (#148758) 2025-03-08 04:56:42 +00:00
reducer_timer.hpp
reducer.cpp [reland][ca] side-effect free inital trace: compiled_args (#148376) 2025-03-11 01:57:36 +00:00
reducer.hpp Enable clang-tidy on torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp (#143806) 2025-01-24 12:22:13 +00:00
sequence_num.cpp [4/N] Apply bugprone-unchecked-optional-access (#142832) 2024-12-12 04:33:32 +00:00
sequence_num.hpp
socket_fmt.h
socket.cpp Remove unnecessary once flag usage (#143255) 2025-01-16 02:36:11 +00:00
socket.h
Store.cpp
Store.hpp
SymmetricMemory.cpp [SymmetricMemory] introduce multimem_all_gather (#142810) 2024-12-17 01:07:27 +00:00
SymmetricMemory.hpp [torch/distributed] Make _SymmetricMemory.has_multicast_support() ret… (#141598) 2024-11-26 23:36:32 +00:00
TCPStore.cpp Fix dist.init_process_group on windows (#148266) 2025-03-05 00:07:56 +00:00
TCPStore.hpp
TCPStoreBackend.cpp [19/N] Fix extra warnings brought by clang-tidy-17 (#144448) 2025-01-09 15:58:05 +00:00
TCPStoreBackend.hpp
TCPStoreLibUvBackend.cpp [BE] TCPStore: use typed errors for assertions (#147647) 2025-02-24 20:58:10 +00:00
TraceUtils.h [pgnccl][simple] log started work numel (#139773) 2024-11-05 23:11:19 +00:00
Types.hpp Revert "[PGNCCL] Launch kernel on current stream & remove record_stream entirely (#148590)" 2025-03-17 22:43:15 +00:00
UCCTracing.cpp
UCCTracing.hpp
UCCUtils.cpp
UCCUtils.hpp [3/N] Replace c10::sv with std::sv (#139861) 2024-11-07 20:03:57 +00:00
UnixSockUtils.hpp
Utils.cpp Code Refactoring for getting start and stride from global ranks (#147230) 2025-02-21 10:02:50 +00:00
Utils.hpp Code Refactoring for getting start and stride from global ranks (#147230) 2025-02-21 10:02:50 +00:00
WinSockUtils.hpp
Work.cpp Enable more readability-redundant checks (#143963) 2024-12-30 14:49:33 +00:00
Work.hpp