pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
SsnL	d5236f8517	Avoid initializing unnecessary tensors in nccl.reduce (#39688) Summary: While working on https://github.com/pytorch/pytorch/issues/38911, I realized that `nccl.reduce` only needs a single output tensor, while our current implementation requires a list of output tensors. This, along with a TODO I fixed in reduce_add, should have some speed up for data parallel. Pull Request resolved: https://github.com/pytorch/pytorch/pull/39688 Differential Revision: D22034547 Pulled By: mrshenli fbshipit-source-id: e74d54d673ebbb062474b1bb5cc93a095a3a5f6c	2020-06-14 10:11:32 -07:00
Sam Gross	bcfe259f83	Add streams and comms as optional arguments (#3968) Adds streams and comms as optional arguments to the NCCL calls in torch.cuda.nccl. Also exposes ncclUniqueId and ncclCommInitRank for multi-process mode. Moves Py_RETURN_NONE statements after the GIL is re-acquired.	2017-12-04 13:51:22 -05:00
SsnL	01be4d6b20	sparse broadcast_coalesce and reduce_add_coalesced	2017-10-28 18:52:35 -04:00
Soumith Chintala	efe91fb9c1	delete redundant python nccl code	2017-10-09 22:24:18 -04:00
Soumith Chintala	e9dccb3156	implement all_reduce, broadcast, all_gather, reduce_scatter	2017-10-09 22:24:18 -04:00
Soumith Chintala	4d62933529	add initial NCCL C bindings	2017-10-09 22:24:18 -04:00
Christian Sarofeen	ec86d0b2ba	Updates for CUDA 9	2017-08-25 07:32:05 -04:00
Adam Paszke	8ab3d214d5	Fixes for DistributedDataParallel (#2168)	2017-07-21 16:00:46 -04:00
Adam Paszke	8db8716c7c	Support non-default streams in NCCL reduce	2017-06-12 21:58:38 -04:00
Sam Gross	b9379cfab7	Use cuDNN and NCCL symbols from _C library (#1017) This ensures that we use the same library at the C++ level and with Python ctypes. It moves the searching for the correct library from run-time to compile-time.	2017-03-16 16:10:17 -04:00
Sam Gross	fc6fcf23f7	Lock the cudaFree mutex. (#880) Prevents NCCL calls from overlapping with cudaFree() which can lead to deadlocks.	2017-03-01 11:29:25 -05:00
Luke Yeager	e7c1e6a8e3	[pep8] Fix most lint automatically with autopep8 Here's the command I used to invoke autopep8 (in parallel!): git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i Several rules are ignored in setup.cfg. The goal is to let autopep8 handle everything which it can handle safely, and to disable any rules which are tricky or controversial to address. We may want to come back and re-enable some of these rules later, but I'm trying to make this patch as safe as possible. Also configures flake8 to match pep8's behavior. Also configures TravisCI to check the whole project for lint.	2017-01-28 01:15:51 +01:00
Natalia Gimelshein	2290798a83	if nccl is available, do not compile it and load system version	2017-01-14 10:09:48 +01:00
Sam Gross	0cb5943be8	Fix NCCL reduce_scatter in Python 2.7 (#183)	2016-10-30 17:58:02 -04:00
Sam Gross	f30081a313	Use NCCL bcast and reduce functions in comm	2016-10-14 14:16:32 -07:00

15 Commits