mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Max Wang	268859ce0d	Fix CUDA stream syncing bug in allgather and reduce_scatter (#19631)	2019-04-27 08:35:56 -07:00

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19631
ghimport-source-id: edc47e77d6ef03e966944ff98eefc22f2574eeaa
Reviewed By: mrshenli
Differential Revision: D15110077
Pulled By: mxw
fbshipit-source-id: 27a68308ade5ea511e2ea568a071eedb5d21c1ba

..
bin	Revert "remove use of tmp_install" (#15847)	2019-01-08 16:30:19 -08:00
example	FileStore auto deletes file and FileStore::add bug fix (#13708)	2018-11-14 01:34:22 -08:00
test	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
CMakeLists.txt	Remove GLOO usage when USE_GLOO is OFF	2019-03-20 09:31:53 -07:00
FileStore.cpp	Canonicalize all includes in PyTorch. (#14849)	2018-12-08 19:38:30 -08:00
FileStore.hpp	FileStore auto deletes file and FileStore::add bug fix (#13708)	2018-11-14 01:34:22 -08:00
NCCLUtils.hpp	Working async version of AllGather, test fix and compiler warnings, and CI (#10932)	2018-08-28 12:40:14 -07:00
PrefixStore.cpp	Canonicalize all includes in PyTorch. (#14849)	2018-12-08 19:38:30 -08:00
PrefixStore.hpp	Adding setTimeout option in Store (#11265)	2018-09-06 12:55:50 -07:00
ProcessGroup.cpp	Fix a few instances of notifying on a CV while holding the lock (#18857)	2019-04-05 08:41:53 -07:00
ProcessGroup.hpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
ProcessGroupGloo.cpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
ProcessGroupGloo.hpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
ProcessGroupMPI.cpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
ProcessGroupMPI.hpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
ProcessGroupNCCL.cpp	Fix CUDA stream syncing bug in allgather and reduce_scatter (#19631)	2019-04-27 08:35:56 -07:00
ProcessGroupNCCL.hpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
README.md	Revert "remove use of tmp_install" (#15847)	2019-01-08 16:30:19 -08:00
Store.cpp	Make Store::setTimeout take milliseconds (#16278)	2019-01-29 16:15:25 -08:00
Store.hpp	Make Store::setTimeout take milliseconds (#16278)	2019-01-29 16:15:25 -08:00
TCPStore.cpp	TCP init method race condition fix (#15684)	2019-01-18 02:29:38 -08:00
TCPStore.hpp	TCP init method race condition fix (#15684)	2019-01-18 02:29:38 -08:00
Types.hpp	Add support for reduce-scatter in c10d (#18844)	2019-04-26 13:46:57 -07:00
Utils.cpp	Fix c10d checking errno unconditionally (#15986)	2019-01-14 16:02:05 -08:00
Utils.hpp	Propagate ProcessGroup timeout to Store (#16571)	2019-04-09 12:36:28 -07:00

README.md

THD refactor

This is a work in progress. It is kept separate from the main THD directory to avoid disrupting THD users and to avoid having to deal with backward compatibility early on. Once this reaches a usable state, we'll add Python bindings and a compatibility layer.

See https://github.com/pytorch/pytorch/issues/7434 for the main issue.

This tree is intentionally not part of the main build. It will be buildable and testable in isolation, as long as ATen is available in <repository root>/torch/lib/tmp_install.

To build and install ATen here, navigate to the root of this repository and run:

tools/build_pytorch_libs.sh --with-cuda ATen
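
Since this tree only builds when the install prefix above exists, it can be useful to check for it before deciding whether to run the ATen build. The following is a minimal sketch, not part of the repository: the function name aten_prefix_exists is hypothetical, and it only tests that the directory is present, without verifying what the build actually placed inside it.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not shipped with c10d).
# Reports whether <repository root>/torch/lib/tmp_install exists.
# It checks only for the directory itself, not for its contents.
aten_prefix_exists() {
  repo_root="$1"
  prefix="$repo_root/torch/lib/tmp_install"
  if [ -d "$prefix" ]; then
    echo "found: $prefix"
    return 0
  fi
  echo "missing: $prefix" >&2
  return 1
}
```

From the repository root, one possible use is to run the build only when the prefix is missing: aten_prefix_exists . || tools/build_pytorch_libs.sh --with-cuda ATen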