Summary:
Added YellowFin optimizer to Caffe2.
This implementation differs from the original: it keeps separate alpha and mu for each parameter and uses a different version of Momentum SGD.
Tests and benchmarks for the optimizer are still to be done, and some refactoring is needed before pushing. This is still a working version.
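A minimal usage sketch of wiring the new optimizer into a model through the caffe2.python.optimizer API; the `build_yellowfin` helper name and its keyword argument follow the existing `build_sgd`/`build_adam` convention and are assumptions, not taken from this diff:
```
# Hedged sketch: attaching YellowFin to a small model, assuming a
# build_yellowfin helper in the style of build_sgd / build_adam.
from caffe2.python import brew, model_helper, optimizer

model = model_helper.ModelHelper(name="yellowfin_example")
fc = brew.fc(model, "data", "fc", dim_in=16, dim_out=4)
softmax, loss = model.net.SoftmaxWithLoss([fc, "label"], ["softmax", "loss"])
model.AddGradientOperators([loss])

# Per this diff, alpha (learning rate) and mu (momentum) are tuned
# separately for every parameter during training.
optimizer.build_yellowfin(model, base_learning_rate=0.1)  # assumed helper
```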
Reviewed By: akyrola
Differential Revision: D5652689
fbshipit-source-id: c10dc0424f47c3051b454aede1d121902cb759a8
Summary: While there is currently support for scaling the base learning rate when loading the model, there is no support for scaling it during training. This is needed for LATTE's seq2seq translation models, whose learning schedule is not predefined and is modified at runtime.
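A hedged sketch of what runtime scaling might look like from the trainer's side; the `scale_learning_rate` method name is hypothetical and used only for illustration, since the summary does not name the exact API:
```
# Hedged sketch: shrinking the base learning rate mid-training without
# rebuilding the net. scale_learning_rate is a hypothetical method name.
from caffe2.python import model_helper, optimizer

model = model_helper.ModelHelper(name="seq2seq_example")
# ... forward pass and AddGradientOperators elided ...
sgd = optimizer.build_sgd(model, base_learning_rate=0.5)

# Later, once the runtime schedule decides the rate should drop:
sgd.scale_learning_rate(0.5)  # hypothetical API, halves the base rate
```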
Reviewed By: jhcross
Differential Revision: D5701391
fbshipit-source-id: ae3bec45f238db1a2be7af9c04d720067e9095d5
Summary: Moved code for global norm-based gradient clipping from fb specific workflows (seq2seq) to the open-source caffe2 optimizer library
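For reference, a self-contained NumPy sketch of what global norm-based clipping does; the caffe2 optimizer library applies the same rule to gradient blobs, but the exact entry point is not shown here:
```
# Hedged sketch of gradient clipping by global norm, in NumPy for clarity.
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    # Global norm is the L2 norm over all gradients concatenated.
    global_norm = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    # Gradients are left untouched when already within the budget;
    # otherwise every gradient is scaled down by the same factor.
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm

clipped, norm = clip_by_global_norm(
    [np.ones((3, 3)), 2.0 * np.ones(5)], clip_norm=1.0)
```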
Reviewed By: jhcross
Differential Revision: D5637453
fbshipit-source-id: 7e73c9a1c97c28a152c188467b27a6449f79242e
Summary:
This fixes the test by querying how many instances of the optimizer have already been created.
Because OSS tests don't run in isolation, the number of previously created optimizer instances can be greater than zero.
Reviewed By: akyrola
Differential Revision: D5462433
Tags: easy
fbshipit-source-id: 7a9ab4fe5345f5d5138abb461ba7a990d9ace840
Summary:
Fix the case where the optimizer isn't called within a device scope context.
Fix OptimizerContext lr blob names.
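A minimal sketch of the two call patterns this fix covers; the model names are illustrative:
```
# Hedged sketch: building the optimizer with and without a device scope.
from caffe2.proto import caffe2_pb2
from caffe2.python import core, model_helper, optimizer

# Pattern 1: optimizer built inside an explicit device scope.
model = model_helper.ModelHelper(name="device_scope_example")
with core.DeviceScope(core.DeviceOption(caffe2_pb2.CPU, 0)):
    optimizer.build_sgd(model, base_learning_rate=0.1)

# Pattern 2 (the case fixed here): no device scope at all; lr blob
# names must still be generated consistently.
model2 = model_helper.ModelHelper(name="no_device_scope_example")
optimizer.build_sgd(model2, base_learning_rate=0.1)
```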
Reviewed By: volkhin
Differential Revision: D5421046
fbshipit-source-id: 186a0d05f40d4442c5ba5736084626da73a0c0f1
Summary: This diff adds an optimizer field to param_info, along with the associated implementations in ModelHelper and brew to set an optimizer for each individual parameter.
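A hedged sketch of what per-parameter optimizers could look like at the brew call site; the `weight_optim` / `bias_optim` keyword names are illustrative assumptions, not necessarily the exact arguments introduced by this diff:
```
# Hedged sketch: each parameter carries its own optimizer via param_info,
# so the fc weights can use Adagrad while the bias stays on plain SGD.
from caffe2.python import brew, model_helper
from caffe2.python.optimizer import AdagradOptimizer, SgdOptimizer

model = model_helper.ModelHelper(name="per_param_optimizer_example")
brew.fc(
    model, "data", "fc", dim_in=16, dim_out=4,
    weight_optim=AdagradOptimizer(alpha=0.01),        # hypothetical kwarg
    bias_optim=SgdOptimizer(base_learning_rate=0.1),  # hypothetical kwarg
)
```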
Reviewed By: kennyhorror
Differential Revision: D5385432
fbshipit-source-id: 5d682f9d1ab077e04a5d76a24d71470f4e64fc92
Summary:
Add add_weight_decay to optimizer + test.
In D5142973 I accidentally removed weight decay from the resnet50 trainer; this restores it.
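A minimal sketch of the restored usage in a trainer such as resnet50, assuming the helper takes the model and a decay coefficient:
```
# Hedged sketch: weight decay is added through the optimizer library
# alongside the main optimizer build.
from caffe2.python import model_helper, optimizer

model = model_helper.ModelHelper(name="weight_decay_example")
# ... network construction and AddGradientOperators elided ...

optimizer.add_weight_decay(model, weight_decay=1e-4)
optimizer.build_sgd(model, base_learning_rate=0.1)
```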
Reviewed By: asaadaldien
Differential Revision: D5173594
fbshipit-source-id: c736d8955eddff151632ae6be11afde0883f7531
Summary:
Adds support for generating and training pfp16 models. Adds an SGD optimizer for multi-precision trainers and a new callback in data_parallel_model to help multi-precision models keep their different copies of parameters in sync during training.
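A hedged sketch of the multi-precision training setup: the forward/backward pass runs in float16 while the optimizer updates fp32 master copies. The `build_multi_precision_sgd` helper name is assumed from the caffe2 optimizer naming convention rather than taken from this diff:
```
# Hedged sketch: multi-precision SGD keeps fp32 copies of the params,
# updates them, and casts back to fp16 for the next iteration.
from caffe2.python import model_helper, optimizer

model = model_helper.ModelHelper(name="fp16_example")
# ... float16 network construction and AddGradientOperators elided ...

optimizer.build_multi_precision_sgd(   # assumed helper name
    model, base_learning_rate=0.1, momentum=0.9, policy="fixed")
```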
Closes https://github.com/caffe2/caffe2/pull/697
Differential Revision: D5159712
Pulled By: salexspb
fbshipit-source-id: 60a889494d2e2f4df1d720331e19f638c5eb95cc
Summary:
hankun is using the optimizer but has a mixed set of GPU and CPU operators. Currently this doesn't work with the optimizer, since it adds optimizers for all parameters in the current device scope. However, we can infer the device a param belongs to by looking at the device option in the param_init_net.
Added a test as well.
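A short sketch of the device inference described above, written against the NetDef protobuf fields; the helper name is illustrative:
```
# Hedged sketch: find the op in param_init_net that produces a parameter
# and reuse its device_option instead of the current device scope.
def infer_param_device(param_init_net, param_name):
    for op in param_init_net.Proto().op:
        if param_name in op.output:
            return op.device_option
    return None  # fall back to the current device scope
```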
Reviewed By: salexspb
Differential Revision: D5133652
fbshipit-source-id: ad8689d75ac1f5c78981bae1b6978fe91e40ef0f
Summary:
1. Adds a function that returns the auxiliary parameters of each optimizer. This function can be used to serialize the optimizers so that they can be recovered later.
2. Fixes a bug where the iteration blob was not incremented by exactly one per iteration: with k parameters using the Adam learning rate optimizer, the original implementation incremented the iteration blob by k.
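A hedged sketch of using the new accessor to decide which extra blobs (e.g. Adam moments, the iteration counter) must be checkpointed; the `shared`/`local` attribute names are assumptions about what the auxiliary-parameter container exposes:
```
# Hedged sketch: collect optimizer state blobs for serialization.
from caffe2.python import model_helper, optimizer

model = model_helper.ModelHelper(name="adam_checkpoint_example")
# ... network construction and AddGradientOperators elided ...
adam = optimizer.build_adam(model, base_learning_rate=1e-3)

aux = adam.get_auxiliary_parameters()
# Blobs shared across parameters (such as the single iteration blob,
# now incremented exactly once per iteration) plus per-parameter state.
blobs_to_checkpoint = list(aux.shared) + list(aux.local)
```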
Reviewed By: azzolini
Differential Revision: D4872397
fbshipit-source-id: d86711feedda2ba83af5f2a18141b06a6a473733
Summary:
These are all essentially no-op changes which allow for nose-style (or pytest-style) test discovery.
With this patch, you can use any of these methods to discover and run tests under `caffe2/python`:
```
python -m unittest discover -p '*test*.py' caffe2/python/
python -m nose caffe2/python/
python -m pytest caffe2/python/
```
Future work:
* Get all of the tests to pass
  * Some seem to be testing operations which don't have GPU implementations
  * I get a segfault unless I set `CUDA_VISIBLE_DEVICES=0`
  * Some tests are flaky
* Allow test discovery throughout the whole project (e.g. the `experiments/` dir)
Closes https://github.com/caffe2/caffe2/pull/199
Reviewed By: pietern
Differential Revision: D4704504
Pulled By: Yangqing
fbshipit-source-id: 8f5687ec9c8aa873dfaff30dbf44272bc38a206b
Summary:
The current optimizer code in c2/python has the following issues:
(1) the optimizers in sgd.py cannot be configured per param blob;
(2) sgd.py is a bad file name; optimizer.py is a better name;
(3) layer_model_helper.py has another set of optimizer code (which does support per-param-blob optimizers).
This diff does the following (a usage sketch follows below):
(1) creates optimizer objects that can be configured per param blob and that remain compatible with the existing optimizer code;
(2) makes the new optimizer code much more modular;
(3) moves the optimizer code to a better-named file (optimizer.py);
(4) replaces the optimizer imports in the existing code.
Will do in next diffs:
(1) optimizers with structured parameters for dper2;
(2) get rid of the optimizer code in layer_model_helper.py.
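A hedged sketch of the renamed module in use: code that used to reach into sgd.py now imports optimizer.py, and configuration happens through optimizer objects rather than free functions. The model name is illustrative:
```
# Hedged sketch: module-level helper plus standalone optimizer objects.
from caffe2.python import model_helper
from caffe2.python.optimizer import build_sgd, SgdOptimizer, FtrlOptimizer

model = model_helper.ModelHelper(name="optimizer_py_example")
# ... network construction and AddGradientOperators elided ...

# Module-level helper, compatible with the previous sgd.py entry point.
build_sgd(model, base_learning_rate=0.1)

# Optimizer objects can also be instantiated directly and, per this diff,
# configured per param blob (the dper2 wiring comes in follow-up diffs).
dense_optim = SgdOptimizer(base_learning_rate=0.1)
sparse_optim = FtrlOptimizer(alpha=0.01, beta=1.0)
```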
Reviewed By: salexspb
Differential Revision: D4609013
fbshipit-source-id: 2e2d6dfa8685d10498f89069157453d9feca3f27