Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66587
Made changes in the step function of the non-vectorized Adadelta optimizer to handle complex numbers as two real numbers, as per #65711 on GitHub.
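For illustration, a minimal sketch of the two-real-numbers approach (the helper name is hypothetical and this is not the PR's exact code; it assumes the update runs under torch.no_grad()):

```python
import torch

def adadelta_step(param, grad, square_avg, acc_delta, lr=1.0, rho=0.9, eps=1e-6):
    # Sketch only: a complex tensor is viewed as a 2-channel real tensor, so the
    # usual real-valued Adadelta math applies elementwise to real and imag parts.
    if torch.is_complex(param):
        # view_as_real returns views sharing storage, so the in-place updates
        # below still mutate the original complex tensors.
        param = torch.view_as_real(param)
        grad = torch.view_as_real(grad)
        square_avg = torch.view_as_real(square_avg)
        acc_delta = torch.view_as_real(acc_delta)

    square_avg.mul_(rho).addcmul_(grad, grad, value=1 - rho)
    std = square_avg.add(eps).sqrt_()
    delta = acc_delta.add(eps).sqrt_().div_(std).mul_(grad)
    param.add_(delta, alpha=-lr)
    acc_delta.mul_(rho).addcmul_(delta, delta, value=1 - rho)
```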
ghstack-source-id: 141484731
Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adadelta_complex'
https://pxl.cl/1R7kJ
Reviewed By: albanD
Differential Revision: D31630069
fbshipit-source-id: 2741177b837960538ce39772897af36bbce7b7d8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66671
Made changes in the step functions of the vectorized and non-vectorized Adagrad optimizers to handle complex numbers as two real numbers, as per #65711 on GitHub.
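As with Adadelta above, a minimal sketch of the idea (illustrative helper, not the PR's exact code; assumes it runs under torch.no_grad()):

```python
import torch

def adagrad_step(param, grad, state_sum, lr=0.01, eps=1e-10):
    # Sketch only: view complex tensors as pairs of reals so the standard
    # real-valued Adagrad update applies; in-place ops hit the original storage.
    if torch.is_complex(param):
        param = torch.view_as_real(param)
        grad = torch.view_as_real(grad)
        state_sum = torch.view_as_real(state_sum)

    state_sum.addcmul_(grad, grad, value=1.0)   # accumulate squared gradients
    std = state_sum.sqrt().add_(eps)
    param.addcdiv_(grad, std, value=-lr)        # param -= lr * grad / std
```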
ghstack-source-id: 141442350
Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adagrad_complex'
https://pxl.cl/1Rd44
Reviewed By: albanD
Differential Revision: D31673503
fbshipit-source-id: 90a0d0c69b556716e2d17c59ce80f09c750fc464
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59998
It has been discussed in the issue that the variance term of the Adam optimizer is currently not computed correctly for the complex domain. As stated in the Generalization to Complex numbers section of https://en.wikipedia.org/wiki/Variance, the variance of a complex random variable X is computed as E[(X - mu)(X - mu)*], where mu = E[X] and * denotes the complex conjugate.
However, the current Adam implementation computes E[(X - mu)(X - mu)] instead, which does not give the right variance value; in particular, it returns a complex number. Variance is defined to be a real number even when the underlying random variable is complex.
We fix this issue here and add a test that the resulting variance is indeed a real number.
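A minimal standalone sketch of the corrected second-moment accumulation (not the PR's exact code): using the conjugate makes grad * grad.conj() equal to |grad|^2, so the accumulated second moment has zero imaginary part, matching the real-valued definition of variance.

```python
import torch

beta2 = 0.999
grad = torch.randn(3, dtype=torch.cfloat)
exp_avg_sq = torch.zeros(3, dtype=torch.cfloat)

# E[(X - mu)(X - mu)*]-style update: multiply grad by its conjugate.
exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj(), value=1 - beta2)
print(torch.allclose(exp_avg_sq.imag, torch.zeros_like(exp_avg_sq.imag)))  # True
```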
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62946
Reviewed By: albanD
Differential Revision: D30196038
Pulled By: iramazanli
fbshipit-source-id: ab0a6f31658aeb56bdcb211ff86eaa29f3f0d718
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/24892
In the paper https://arxiv.org/pdf/1908.03265.pdf, Liyuan Liu et al. propose a new optimization algorithm that is similar in essence to the Adam algorithm.
The paper discusses that, without a warmup heuristic, adaptive optimization / learning-rate algorithms can exhibit undesirably large variance in their early stages, which can slow the overall convergence process.
The authors propose rectifying the variance of the adaptive learning rate when it is expected to be high.
Differing from the paper, we selected the variance tractability cut-off as 5 instead of 4. This adjustment is common practice and can be found in the authors' code repository as well as in the TensorFlow Swift optimizer library:
2f03dd1970/radam/radam.py (L156)
f51ee4618d/Sources/TensorFlow/Optimizers/MomentumBased.swift (L638)
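A sketch of the rectification logic described above (function name and return convention are illustrative, not the PR's code):

```python
import math

def radam_rectification(beta2, step):
    # Length of the approximated simple moving average, as in the RAdam paper.
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    beta2_t = beta2 ** step
    rho_t = rho_inf - 2.0 * step * beta2_t / (1.0 - beta2_t)

    if rho_t > 5.0:  # cut-off of 5 rather than the paper's 4, as noted above
        # Variance is tractable: return the rectification multiplier r_t.
        return math.sqrt(((rho_t - 4.0) * (rho_t - 2.0) * rho_inf)
                         / ((rho_inf - 4.0) * (rho_inf - 2.0) * rho_t))
    # Otherwise the caller falls back to an un-adapted (momentum-SGD style) step.
    return None
```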
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58968
Reviewed By: vincentqb
Differential Revision: D29310601
Pulled By: iramazanli
fbshipit-source-id: b7bd487f72f1074f266687fd9c0c6be264a748a9
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/5804
In the paper https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ, Timothy Dozat proposes a new optimization algorithm that is in essence a combination of the NAG and Adam algorithms.
It is known that the idea of momentum in optimization algorithms can be improved with Nesterov acceleration, and Dozat investigates applying this idea to the momentum component of the Adam algorithm. The author provides experimental evidence in the paper demonstrating the merit of the idea.
In this PR we implement the NAdam algorithm proposed in the paper. In a preliminary work, http://cs229.stanford.edu/proj2015/054_report.pdf, the author shows that the decay base constant should be taken as 0.96; we follow the same choice in this implementation, as Keras does. Implementation / coding practice also follows Keras in some other places:
f9d3868495/tensorflow/python/keras/optimizer_v2/nadam.py
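A sketch of the 0.96-based momentum-decay schedule mentioned above (names and the momentum_decay default are illustrative, not the PR's exact code):

```python
def nadam_momentum_schedule(beta1, step, momentum_decay=0.004):
    # The momentum coefficient ramps up toward beta1 as 0.96 ** (step * momentum_decay)
    # decays; the next step's value is also computed for the Nesterov look-ahead.
    mu_t = beta1 * (1.0 - 0.5 * 0.96 ** (step * momentum_decay))
    mu_next = beta1 * (1.0 - 0.5 * 0.96 ** ((step + 1) * momentum_decay))
    return mu_t, mu_next
```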
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59009
Reviewed By: gchanan, vincentqb
Differential Revision: D29220375
Pulled By: iramazanli
fbshipit-source-id: 4b4bb4b15f7e16f7527f368bbf4207ed345751aa
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/24892
In the paper https://arxiv.org/pdf/1908.03265.pdf, Liyuan Liu et al. propose a new optimization algorithm that is similar in essence to the Adam algorithm.
The paper discusses that, without a warmup heuristic, adaptive optimization / learning-rate algorithms can exhibit undesirably large variance in their early stages, which can slow the overall convergence process.
The authors propose rectifying the variance of the adaptive learning rate when it is expected to be high.
Differing from the paper, we selected the variance tractability cut-off as 5 instead of 4. This adjustment is common practice and can be found in the authors' code repository as well as in the TensorFlow Swift optimizer library:
2f03dd1970/radam/radam.py (L156)
f51ee4618d/Sources/TensorFlow/Optimizers/MomentumBased.swift (L638)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58968
Reviewed By: gchanan
Differential Revision: D29241736
Pulled By: iramazanli
fbshipit-source-id: 288b9b1f3125fdc6c7a7bb23fde1ea5c201c0448
Summary:
The functional API is used in large-scale distributed training to enable multithreaded rather than multiprocess training, as it gives better resource utilization and efficiency.
In this PR, we provide the code migration and refactoring of the functional API for the ASGD algorithm.
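A hedged sketch of what a functional-style ASGD update looks like (signature and names are illustrative, not the exact PyTorch API; assumes it is called under torch.no_grad()): all state is passed in explicitly and updated in place, so multiple threads can drive the update without constructing or sharing an Optimizer object.

```python
from typing import List
import torch

def asgd_functional(params: List[torch.Tensor],
                    grads: List[torch.Tensor],
                    axs: List[torch.Tensor],
                    mus: List[float],
                    etas: List[float],
                    lambd: float = 1e-4):
    # Illustrative only: each parameter's averaged iterate `ax`, averaging
    # coefficient `mu`, and step size `eta` are supplied by the caller.
    for param, grad, ax, mu, eta in zip(params, grads, axs, mus, etas):
        param.mul_(1 - lambd * eta)         # decay term
        param.add_(grad, alpha=-eta)        # gradient step
        if mu != 1:
            ax.add_(param.sub(ax).mul(mu))  # update the running average
        else:
            ax.copy_(param)
```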
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58410
Reviewed By: ailzhang
Differential Revision: D28546702
Pulled By: iramazanli
fbshipit-source-id: 4f62b6037d53f35b19f98340e88af2ebb6243a4f