pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Pritam Damania	359c39b3c2	Use global lock instead of per instance lock. (#31404) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/31404 Multiple "trainers" could each create different instances of DistributedOptimizer, which means we can still have a race condition unless we use a truly global per-worker lock. ghstack-source-id: 95874624 Test Plan: run unit tests; unfortunately, due to the non-deterministic behavior, it's not clear how to unit test this properly. Differential Revision: D19154248 fbshipit-source-id: fab6286c17212f534f1bd1cbdf9f0de002d48c74	2019-12-18 09:22:54 -08:00
Alisson Gusatti Azzolini	07e14c7cd0	DistributedOptimizer: wait for all workers to finish _LocalOptimizer constructor (#30062) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30062 This makes it possible to catch exceptions raised during optimizer creation. ghstack-source-id: 94232436 Test Plan: new unit test. Differential Revision: D18586108 fbshipit-source-id: 71cfdf337fe803dbea8787b4c68e5a52b70a1f68	2019-11-19 18:30:00 -08:00
Pritam Damania	5d69bc1eda	Add docs for distributed optimizer. (#29971) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29971 ghstack-source-id: 94132160 Test Plan: waitforbuildbot Differential Revision: D18554631 fbshipit-source-id: c4485f7cff5159f423d0f35d1caf71074b62dc28	2019-11-18 18:51:26 -08:00
Alisson Gusatti Azzolini	b0cf43b2dd	Simple distributed optimizer (#29304) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29304 Implements a simple Python distributed optimizer that takes RRefs to the parameters that will be optimized. It keeps optimizer instances on the remote workers, and calling step on the distributed optimizer calls step on each of the remote optimizers in parallel. ghstack-source-id: 93564364 Test Plan: unit tests. Differential Revision: D18354586 fbshipit-source-id: 85d4c8bfec4aa38d2863cda704d024692511cff5	2019-11-11 12:02:24 -08:00

4 Commits
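
The commits above describe two ideas: a distributed optimizer whose step fans out to several per-worker optimizers in parallel (#29304), and a single worker-wide lock so that optimizers created by different "trainers" cannot race on the same parameters (#31404). The following is a minimal sketch of that pattern, not the PyTorch implementation: it uses plain threads instead of RPC and RRefs, and every name in it (`_GLOBAL_LOCK`, `ToyLocalOptimizer`, `ToyDistributedOptimizer`, `run_demo`) is hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumption: one module-level lock per worker process, shared by every
# optimizer instance, rather than one lock stored on each instance.
_GLOBAL_LOCK = threading.Lock()


class ToyLocalOptimizer:
    """Toy stand-in for a per-worker optimizer: plain SGD on a list of floats."""

    def __init__(self, params, grads, lr):
        self.params = params  # may be shared with other optimizer instances
        self.grads = grads
        self.lr = lr

    def step(self):
        # The read-modify-write below is not atomic, so two instances that
        # share `params` must serialize on the same lock.
        with _GLOBAL_LOCK:
            for i, grad in enumerate(self.grads):
                self.params[i] -= self.lr * grad


class ToyDistributedOptimizer:
    """Toy fan-out: calls step() on every local optimizer in parallel."""

    def __init__(self, local_optimizers):
        self.local_optimizers = local_optimizers

    def step(self):
        with ThreadPoolExecutor(max_workers=len(self.local_optimizers)) as pool:
            futures = [pool.submit(opt.step) for opt in self.local_optimizers]
            for future in futures:
                future.result()  # re-raises any exception from a local step


def run_demo():
    # Two "trainers" each build their own optimizer over the same parameters.
    shared_params = [1.0, 2.0]
    first = ToyLocalOptimizer(shared_params, grads=[1.0, 1.0], lr=0.5)
    second = ToyLocalOptimizer(shared_params, grads=[1.0, 1.0], lr=0.5)
    ToyDistributedOptimizer([first, second]).step()
    return shared_params


if __name__ == "__main__":
    print(run_demo())
```

With a per-instance lock, `first` and `second` would each hold a different lock and could interleave their updates to `shared_params`; the module-level lock is what makes the two steps mutually exclusive in this sketch.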