This PR implements `_allgather_base`, `reduce_scatter`, and `_reduce_scatter_base` in the MPI backend (`ProcessGroupMPI`), enabling support for Fully Sharded Data Parallel (FSDP) in environments that use MPI for distributed communication.

### Context

As noted in https://github.com/pytorch/pytorch/issues/85628, FSDP currently supports only the NCCL backend. Due to this limitation, FSDP cannot run on legacy HPC environments or clusters that rely on MPI. Implementing just these three collective operations is enough to enable FSDP with the MPI backend. The collectives are implemented in the same manner as existing operations such as `allgather`.

### Testing

We validated this PR with `pytorch/build/bin/ProcessGroupMPITest` using OpenMPI, and all tests passed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150162
Approved by: https://github.com/H-Huang
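For illustration, here is a minimal sketch (not part of the PR) of exercising the newly supported collectives from Python on the MPI backend. It assumes a PyTorch build with MPI enabled that includes this change, and that the script is launched under the MPI runtime (e.g. `mpirun -np 4 python demo.py`); the file name and tensor shapes are arbitrary. `dist.all_gather_into_tensor` and `dist.reduce_scatter_tensor` are the public entry points that dispatch to the backend's `_allgather_base` and `_reduce_scatter_base`.

```python
# demo.py -- illustrative sketch; assumes a PyTorch build with MPI support
# that includes this patch. Launch with the MPI runtime, e.g.:
#   mpirun -np 4 python demo.py
import torch
import torch.distributed as dist


def main():
    # With the MPI backend, rank and world size come from the MPI runtime.
    dist.init_process_group(backend="mpi")
    rank = dist.get_rank()
    world_size = dist.get_world_size()

    # all_gather_into_tensor dispatches to the backend's _allgather_base:
    # each rank contributes a local shard, every rank receives the full tensor.
    local = torch.full((4,), float(rank))
    gathered = torch.empty(4 * world_size)
    dist.all_gather_into_tensor(gathered, local)

    # reduce_scatter_tensor dispatches to the backend's _reduce_scatter_base:
    # the full tensor is summed across ranks and each rank keeps one shard.
    full = torch.ones(4 * world_size)
    shard = torch.empty(4)
    dist.reduce_scatter_tensor(shard, full, op=dist.ReduceOp.SUM)

    if rank == 0:
        print("gathered:", gathered)
        print("shard:", shard)

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

With these collectives in place, wrapping a module in `FullyShardedDataParallel` after `dist.init_process_group("mpi")` should be possible on MPI-only clusters, which is the scenario described in the issue above.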
| Name |
|---|
| aoti_abi_check |
| aoti_inference |
| api |
| c10d |
| common |
| dist_autograd |
| jit |
| lazy |
| lite_interpreter_runtime |
| monitor |
| profiler |
| rpc |
| tensorexpr |
| `__init__.py` |