Summary:
1. Adds a function to return auxiliary parameters for each optimizer. This function can be used to serialize the optimizers so that they can be recovered.
2. Fixes the bug that the iteration blob is not incremented by one in each iteration. Suppose there are k parameters using the adam learning rate optimizer, the iteration blob is incremented by k based on the original implementation.
Reviewed By: azzolini
Differential Revision: D4872397
fbshipit-source-id: d86711feedda2ba83af5f2a18141b06a6a473733
Summary:
The current optimizer code in c2/python has the following issues:
(1) the optimizers in sgd.py cannot config per param-blob optimizer;
(2) sgd.py is a bad file name. optimizer.py is a better name;
(3) layer_model_helper.py has another set of optimizer code (which supports per param-blob optimizer)
This diff did the following
(1) create optimizer objects so that we can config per param-blob optimizer and that are also compatible to the existing optimizer code
(2) the new optimizer code are much more modulized
(3) move the optimizer code to file with better name (optimizer.py)
(4) replace the optimizer imports in the existing code
will do in next diffs
(1) optimizers with structured parameters for dper2
(2) get rid of the optimizer code in layer_model_helper.py
Reviewed By: salexspb
Differential Revision: D4609013
fbshipit-source-id: 2e2d6dfa8685d10498f89069157453d9feca3f27