Mirror of https://github.com/zebrajr/pytorch.git (synced 2025-12-07 12:21:27 +01:00)
Summary: The MPI async work class returned a reference to a temporary, which is invalid (hat tip to colesbury for noticing it). This change fixes that by using a std::exception_ptr to hold on to the exception, if applicable, and then returns the reference by rethrowing the stored exception and returning the caught reference, like the existing code path.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11947
Differential Revision: D10019928
Pulled By: pietern
fbshipit-source-id: 5a8ed0e894615a09224ca5e48c8b3104275a3019

| Name | | |
|---|---|---|
| bin | ||
| cmake | ||
| example | ||
| private | ||
| test | ||
| CMakeLists.txt | ||
| CUDAUtils.cpp | ||
| CUDAUtils.hpp | ||
| Def.hpp | ||
| FileStore.cpp | ||
| FileStore.hpp | ||
| NCCLUtils.hpp | ||
| PrefixStore.cpp | ||
| PrefixStore.hpp | ||
| ProcessGroup.cpp | ||
| ProcessGroup.hpp | ||
| ProcessGroupGloo.cpp | ||
| ProcessGroupGloo.hpp | ||
| ProcessGroupMPI.cpp | ||
| ProcessGroupMPI.hpp | ||
| ProcessGroupNCCL.cpp | ||
| ProcessGroupNCCL.hpp | ||
| README.md | ||
| Store.cpp | ||
| Store.hpp | ||
| TCPStore.cpp | ||
| TCPStore.hpp | ||
| Types.hpp | ||
| Utils.cpp | ||
| Utils.hpp | ||
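
The commit summary above describes holding a failure in a std::exception_ptr and handing out a reference by rethrowing it. The following is a minimal sketch of that pattern, not the actual ProcessGroupMPI code: the class `AsyncWorkSketch`, its methods `run` and `isSuccess`, and the helper `describeFailure` are hypothetical names chosen for illustration. One caveat is noted in the comments: the C++ standard leaves it unspecified whether `std::rethrow_exception` throws a copy, so the sketch copies the message immediately instead of keeping the reference around.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an async work handle (assumption, not the real class).
class AsyncWorkSketch {
 public:
  // Run a callable and capture whatever it throws instead of propagating it.
  template <typename F>
  void run(F&& fn) {
    try {
      fn();
    } catch (...) {
      exception_ = std::current_exception();
    }
  }

  bool isSuccess() const { return exception_ == nullptr; }

  // The invalid pattern would be to return a reference to a temporary, e.g.
  //   return std::runtime_error("failed");
  // which dangles as soon as the function returns.
  //
  // Here the exception object is owned by exception_. Rethrowing and catching
  // it yields a reference to that object (or, on implementations that copy on
  // rethrow, to a copy whose lifetime is shorter).
  const std::exception& exception() const {
    if (!exception_) {
      throw std::logic_error("no exception stored");
    }
    try {
      std::rethrow_exception(exception_);
    } catch (const std::exception& e) {
      return e;
    }
  }

 private:
  std::exception_ptr exception_;
};

// Example use: run a failing task and report what went wrong.
std::string describeFailure() {
  AsyncWorkSketch work;
  work.run([] { throw std::runtime_error("simulated MPI failure"); });
  assert(!work.isSuccess());
  // Copy the message right away rather than holding on to the reference.
  return work.exception().what();
}
```

Storing a std::exception_ptr rather than an exception by value also preserves the dynamic type of whatever was thrown, so nothing is sliced down to std::exception.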
THD refactor
This is a work in progress. It is kept separate from the main THD directory to avoid disrupting THD users and to avoid having to deal with backward compatibility early on. Once it reaches a usable state, we'll add Python bindings and a compatibility layer.
See https://github.com/pytorch/pytorch/issues/7434 for the main issue.
This tree is intentionally not part of the main build and will be
buildable/testable in isolation, as long as ATen is available in
<repository root>/torch/lib/tmp_install.
To build and install ATen here, navigate to the root of this repository and run:
tools/build_pytorch_libs.sh --with-cuda ATen