pytorch/torch/csrc/distributed
Yifu Wang 6e1ba79b7f [re-land] Introduce 3 low-latency, intra-node allreduce algorithms for small messages to PyTorch (#114001) (#116125)
This is an attempt to re-land https://github.com/pytorch/pytorch/pull/114001. The previous attempt used `std::array` in CUDA kernels, which wasn't compatible with Meta's internal build.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116125
Approved by: https://github.com/yf225
2023-12-20 07:13:50 +00:00
autograd [Reland] Elimates c10::guts::to_string (#108748) 2023-09-07 13:35:17 +00:00
c10d [re-land] Introduce 3 low-latency, intra-node allreduce algorithms for small messages to PyTorch (#114001) (#116125) 2023-12-20 07:13:50 +00:00
rpc [CI] Update clang-format (#116002) 2023-12-18 14:58:46 +00:00