pytorch/torch/csrc/distributed
Wanchao Liang 976f8bee94 [c10d] add ncclGetLastError to NCCL pg (#83724)
This PR adds the ncclGetLastError API to the NCCL process group to provide
better error reporting directly from NCCL failures, instead of guessing at
possible causes.

Differential Revision: [D39161199](https://our.internmc.facebook.com/intern/diff/D39161199)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83724
Approved by: https://github.com/kwen2501, https://github.com/H-Huang
2022-09-14 23:21:33 +00:00
autograd canonicalize includes of form <aten/src/ATen/...> 2022-06-16 17:46:45 +00:00
c10d [c10d] add ncclGetLastError to NCCL pg (#83724) 2022-09-14 23:21:33 +00:00
rpc fix [rpc] Wrong usage of RRefContext::handleException #71458 (#83166) 2022-09-08 18:22:51 +00:00