pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-08 07:39:33 +01:00

Author	SHA1	Message	Date
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
Tongzhou Wang	540ef9b1fc	Add distributed get_backend (#11715) Summary: I have no idea how to run distributed tests locally so I'll let CI do this. Hopefully everything still works with `IntEnum`. cc mcarilli Pull Request resolved: https://github.com/pytorch/pytorch/pull/11715 Reviewed By: pietern Differential Revision: D9889646 Pulled By: SsnL fbshipit-source-id: 1e2a487cb6fe0bd4cc67501c9d72a295c35693e2	2018-09-18 10:56:24 -07:00
Teng Li	0988bbad2d	C10d release to torch.distributed for PT1 (#11405) Summary: The old `torch.distributed` will go to `torch.distributed.deprecated` The old DDP will go to `torch.nn.parallel.deprecated` Now `torch.nn.parallel.DDP` will use c10d DDP Now `torch.distributed` will use C10d frontend API Pull Request resolved: https://github.com/pytorch/pytorch/pull/11405 Reviewed By: pietern Differential Revision: D9733733 Pulled By: teng-li fbshipit-source-id: d6a3f3e73f8d3a7fcb1f4baef53c78063b8cbb08	2018-09-10 23:27:22 -07:00
Tongzhou Wang	8e33451e2e	Make torch.cuda.* take device objects; Update distributed docs (#10833) Summary: Commits: 1. Make `torch.cuda.*` take device objects 2. Update `torch.distributed` docs to emphasize calling `torch.cuda.set_device` before `init_process_group` Pull Request resolved: https://github.com/pytorch/pytorch/pull/10833 Differential Revision: D9514241 Pulled By: SsnL fbshipit-source-id: 2497464305fb1e63d6c495291a5744aaa7e2696e	2018-08-27 15:24:42 -07:00
Tongzhou Wang	db7b7f1359	fix typo Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10686 Differential Revision: D9399874 Pulled By: SsnL fbshipit-source-id: 28130992d2416721552f72cfa835ff0358caeefa	2018-08-20 10:40:55 -07:00
Tongzhou Wang	3f603eeee8	some improvements on distributed docs Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10666 Differential Revision: D9395242 Pulled By: SsnL fbshipit-source-id: 952326b9c5a1a974a1c33a0e12738e1e21ad9956	2018-08-19 17:40:28 -07:00
Ailing Zhang	371a786b18	Errors out when Openmpi < 2.x.x with distributed. (#10015) Summary: This PR fixes #9418. Openmpi 1.10 segfaults in MPI_Bcast with CUDA buffer. And it's a retired openmpi version. I've tested on 2.1.1 and 3.0.0 and they work well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/10015 Reviewed By: soumith Differential Revision: D9088103 Pulled By: ailzhang fbshipit-source-id: fc0a45e5cd016093ef0dbb9f371cbf67170d7045	2018-07-31 12:24:40 -07:00
Teng Li	f5beff334b	Added distributed docs on NCCL2 backend/functions and launch module (#6579)	2018-04-15 21:53:10 -04:00
Teng Li	5c65466b86	Release NCCL distributed backend from experimental (#4921) * Release NCCL distributed backend from experimental * fix typo	2018-01-30 16:21:21 +01:00
Teng Li	a3b098dcf9	Adding is process_group initialized support (#4618)	2018-01-12 22:56:54 +01:00
Teng Li	926ed2b280	Implemented NCCL Distributed Backend for PyTorch with new dist APIs (#3435) * Implemented NCCL Distributed Backend for PyTorch with new dist APIs * Let FindNCCL to determine the NCCL version * Let NCCL2 Backend use ATEN instead deprecated THPP * Let distributed parallel model use a single reduction thread for NCCL backend * Caching the sockets, bug fix, refactoring, and addressed Adam's comments * Make BcastNcclID take a single param and bug fix for all_gather * Removed barrier function, added warning for users, and not exposing experimental func to users * Use the simplest single bucket working solution for distriubted data parallel model with rebase * Cleanup, fixes and further addressed Adam's comments * Used PySequence_Fast in distributed csrc * Removed the limitation that each group is only bound to a given device sequence * Used THPObjectPtr for PySequence_Fast	2017-11-29 15:57:02 -05:00
Adam Paszke	2a8603c5e1	Make distributed recv return sender rank	2017-09-25 12:11:52 -04:00
Scott Sievert	dd27997aeb	DOC: adding note about distributed MPI backend (#2750)	2017-09-15 13:47:35 -04:00
Zhou Mo	2c07f88ea3	Fix typos.	2017-08-25 14:27:07 -04:00
Gregory Chanan	50c208a50b	Revert "Fix typos." This reverts commit `4622b33952`.	2017-08-10 13:57:00 -04:00
Zhou Mo	4622b33952	Fix typos.	2017-08-08 11:05:38 -04:00
Adam Paszke	575a4a98e0	Remove assertions with side effects	2017-07-20 01:45:57 -04:00
Adam Paszke	8915e2710c	Refactor scatter/gather and add distributed docs	2017-07-12 14:47:36 -04:00
Adam Paszke	714351ff39	Officially enable process-group mode	2017-06-12 22:02:11 -04:00
Adam Paszke	12813b88f6	Add DistributedDataParallel	2017-06-12 22:00:22 -04:00
Adam Paszke	5a0d5ec058	Add more checks in torch.distributed	2017-06-12 21:58:38 -04:00
Janusz Marcinkiewicz	34804e9600	Refactor file and tcp init methods * Add sanity checks * Refactor InitMethodFile and TCPInitMethod to more logical functions * Update few error messages * Add passing parameters by *kwargs, so now order of parameters is not relevant Review comments	2017-06-02 23:42:11 +02:00
Janusz Marcinkiewicz	c41555fb0a	Add rank parameter; Fix MW mode initalization	2017-06-02 23:42:11 +02:00
Janusz Marcinkiewicz	e685277299	Add address discovery; Bug fixes;	2017-06-02 23:42:11 +02:00
Janusz Marcinkiewicz	09c0d9c51c	Add multiple initalization methods for DataChannels	2017-06-02 23:42:11 +02:00
Adam Paszke	79232c24e2	Fixes after rebase	2017-01-31 01:58:09 +01:00
Janusz Marcinkiewicz	ac1f68127a	Add barrier, scatter, gather and allGather implementations + groups (#34)	2017-01-31 01:58:09 +01:00
Adam Paszke	60d1852c7b	Major improvements to master-worker mode * Fixed all undefined symbol errors * Implemented storage interface and THStorage class * RPC improvements * Code refactor	2017-01-31 01:58:09 +01:00
Adam Paszke	ea876eb6d5	Add initial bindings for master-worker mode	2017-01-31 01:58:09 +01:00
Janusz Marcinkiewicz	5e6fcd02b5	Implement data channel groups (#25)	2017-01-31 01:58:09 +01:00
Adam Paszke	55632d81d2	Add Python wrappers for process group mode	2017-01-31 01:58:09 +01:00

31 Commits