pytorch/caffe2/python
Jamie King c23db9327a Smart Decay for Adam - Caffe2 (#61548)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61548

We want to decay learning parameters properly. Previously, no decay was applied when a parameter was absent from a minibatch. We fix this by keeping track of missed minibatches and making the decay catch up accordingly.

The exponential moving averages (EMAs) of the first and second moments used in Adam are updated only for parameters seen in a minibatch. For parameters absent from a minibatch, 0 should still be added to the EMAs, which amounts to decaying them by multiplying by beta1 and beta2 respectively.

To avoid the computational overhead of touching every parameter for every minibatch, we:
* keep track of the last minibatch in which each parameter was seen
* decay the EMAs by multiplying by beta1^k and beta2^k instead of beta1 and beta2, where k is the number of minibatches since the parameter was last seen
* calculate the amount of momentum that would have been discharged over the missed minibatches and update the weight accordingly.
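
The steps above can be sketched for a single scalar parameter as follows. This is an illustrative sketch only, not the Caffe2 operator: the function names `dense_step` and `lazy_step` are hypothetical, and the update is a simplified Adam with no bias correction and no epsilon (so the second moment must already be positive), which is assumed here so that the catch-up has an exact closed form.

```python
import math


def dense_step(w, m, v, g, lr, beta1, beta2):
    """Simplified Adam step (assumption: no bias correction, no epsilon)."""
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    w = w - lr * m / math.sqrt(v)
    return w, m, v


def lazy_step(w, m, v, g, k, lr, beta1, beta2):
    """Step for a parameter last seen k minibatches ago (k >= 1).

    Equivalent to k - 1 dense steps with a zero gradient followed by one
    dense step with gradient g, without visiting the missed minibatches.
    """
    missed = k - 1
    # On missed step j the dense update would be lr * (m / sqrt(v)) * r**j,
    # with r = beta1 / sqrt(beta2). Sum the geometric series for j = 1..missed.
    r = beta1 / math.sqrt(beta2)
    catchup = r * (1 - r ** missed) / (1 - r)
    w = w - lr * (m / math.sqrt(v)) * catchup
    # Decay the EMAs by beta**k in one go, then add the new gradient.
    m = beta1 ** k * m + (1 - beta1) * g
    v = beta2 ** k * v + (1 - beta2) * g * g
    w = w - lr * m / math.sqrt(v)
    return w, m, v


if __name__ == "__main__":
    lr, beta1, beta2 = 0.01, 0.9, 0.999
    state = (1.0, 0.1, 0.01)  # (weight, first moment, second moment)

    # Reference: touch the parameter on every minibatch, gradient 0 when absent.
    dense = state
    for g in [0.0, 0.0, 0.0, 0.0, 0.5]:
        dense = dense_step(*dense, g, lr, beta1, beta2)

    # Lazy: touch it once, knowing it was last seen k = 5 minibatches ago.
    lazy = lazy_step(*state, 0.5, 5, lr, beta1, beta2)
    print(dense)
    print(lazy)
```

The two printed states should agree up to floating-point rounding. A production optimizer would also need to handle epsilon, bias correction, and parameters that have never been seen, none of which this sketch attempts.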

Differential Revision: D29654246

fbshipit-source-id: 7a6cd7966eb1f31116d99dfce79a78b2d3ee9e3e
2021-07-14 10:22:38 -07:00
benchmarks Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
docs Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
examples [codemod] fix tautological imports 2021-03-27 01:15:57 -07:00
fakelowp Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
helpers Support ArgMin in c2_pt_converter 2020-12-05 16:35:34 -08:00
ideep Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layers [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
mint [typing] suppress errors in fbcode/caffe2 - batch 2 2021-03-16 16:45:41 -07:00
mkl Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
modeling Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
models [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
onnx Fix ONNX forward compatibility (#59327) 2021-06-02 12:39:56 -07:00
operator_test Smart Decay for Adam - Caffe2 (#61548) 2021-07-14 10:22:38 -07:00
predictor [caffe2] Speed up remote net loading 2021-04-20 22:32:40 -07:00
rnn Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
serialized_test Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
test Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
trt Replace TensorRT's deprecated API in caffe2/python/trt/test_pt_onnx_trt.py (#60236) 2021-06-19 19:56:30 -07:00
__init__.py Replace a platform.system() check with sys.platform (#51766) 2021-02-11 20:09:14 -08:00
_import_c_extension.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
_import_c_extension.pyi [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00
allcompare_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
attention.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
benchmark_generator.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
binarysize.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
brew_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
brew.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
build.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
cached_reader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
caffe_translator_test.py
caffe_translator.py Remove unused python2 shebang (#58409) 2021-05-17 13:19:32 -07:00
checkpoint_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
checkpoint.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
CMakeLists.txt Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
cnn.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
context_test.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
context.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
context.pyi caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
control_ops_grad_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_ops_grad.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_ops_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
convert_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
convert.py Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
convnet_benchmarks_test.py
convnet_benchmarks.py
core_gradients_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
core_test.py [caffe2] Fix duplicate name bug in Net.AddExternalInput (#47530) 2020-11-09 08:30:58 -08:00
core.py [caffe2] Explicitly define all DataTypes in python/core.py (#51768) 2021-02-17 20:54:17 -08:00
crf_predict.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
crf_viterbi_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
crf.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
data_parallel_model_test.py Enable GPU/RE tags for caffe2/caffe2/python/TARGETS 2021-02-05 13:52:48 -08:00
data_parallel_model.py [*.py] Rename "Arguments:" to "Args:" (#49736) 2020-12-28 09:34:47 -08:00
data_workers_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
data_workers.py Drop some Python 2 compatibility code (#51769) 2021-02-11 11:02:33 -08:00
dataio_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
dataio.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
dataset.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
db_file_reader.py [caffe2] Fix DBFileReader (#53498) 2021-03-08 08:34:39 -08:00
db_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
device_checker.py
dlpack.h [TVM] Fix build and sync with caffe2/caffe2/python/dlpack.h (#40888) 2020-07-02 15:37:45 -07:00
dyndep.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
embedding_generation_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
experiment_util.py change logging.warn to logging.warning (#51727) 2021-03-29 10:42:30 -07:00
extension_loader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
fakefp16_transform_lib.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
filler_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
functional_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
functional.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
fused_8bit_rowwise_conversion_ops_test.py Replace list(map(...)) constructs by list comprehensions (#46461) 2020-10-19 18:42:49 -07:00
gradient_check_test.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
gradient_checker.py [caffe2] Disable running full grad check in tests by default 2020-10-27 16:10:03 -07:00
gru_cell.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hip_test_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hsm_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hypothesis_test_util.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
hypothesis_test.py Drop some Python 2 compatibility code (#51769) 2021-02-11 11:02:33 -08:00
ideep_test_util.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layer_model_helper.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layer_model_instantiator.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
layer_parameter_sharing_test.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
layer_test_util.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
layers_test.py [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
lazy_dyndep_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
lazy_dyndep.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lazy.py Modify lazy_dyndep loading to trigger inside workspace. (#41687) 2020-07-22 15:36:43 -07:00
lengths_reducer_fused_8bit_rowwise_ops_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lengths_reducer_rowwise_8bit_ops_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lstm_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
memonger_test.py [caffe][memonger] Extend operator schema check to dag memonger (#48021) 2020-11-16 19:17:55 -08:00
memonger.py Use nodes instead of node 2021-04-13 10:45:35 -07:00
mkl_test_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
model_device_test.py
model_helper_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
model_helper.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
modifier_context.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
mpi_python.cc Replace c10::guts::stuff with std::stuff (#30915) 2019-12-16 13:57:19 -08:00
muji_test.py
muji.py
net_builder_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_builder.py [*.py] Rename "Arguments:" to "Args:" (#49736) 2020-12-28 09:34:47 -08:00
net_drawer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_printer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_printer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
nomnigraph_transformations_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph_transformations.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
normalizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
normalizer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
normalizer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
numa_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
numa_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
observer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
operator_fp_exceptions_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
optimizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
optimizer_test_util.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
optimizer_test.py optimizer exploration - v1 and v2 + fix position_weighted optimizer + decoupled weight decay (#54042) 2021-03-27 23:03:29 -07:00
optimizer.py Weighted decay with frequency (count-based) (#60382) 2021-06-21 18:46:35 -07:00
parallel_workers_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
parallel_workers.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
parallelize_bmuf_distributed_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pipeline_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pipeline.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
predictor_constants.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pybind_state_dlpack.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_dlpack.h pass TypeMeta by value (#45026) 2020-10-30 10:14:17 -07:00
pybind_state_gpu.cc
pybind_state_hip.cc Make caffe2/fb folder compatible with AMD (#29131) 2019-11-04 16:40:29 -08:00
pybind_state_ideep.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_int8.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_nomni.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_registry.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_registry.h
pybind_state.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state.h Remove redundant code for unsupported Python versions (#49486) 2021-01-06 12:45:46 -08:00
python_op_test.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
queue_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
record_queue.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
recurrent.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
regularizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
regularizer_test.py Enable FP16 sparse regularizer 2021-02-12 12:29:32 -08:00
regularizer.py Enable FP16 sparse regularizer 2021-02-12 12:29:32 -08:00
rnn_cell.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
schema_test.py [caffe2] Add unittests for schema.Field init (#47512) 2020-11-06 13:27:58 -08:00
schema.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
scope_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
scope.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
session_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
session.py [RFC][LocalSession] Fix workspace type 2020-10-29 04:12:17 -07:00
sparse_to_dense_mask_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
sparse_to_dense_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
task_test.py [RFC][caffe2] TaskGroup.__repr__ shouldn't have side effects 2020-10-01 14:21:03 -07:00
task.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
test_util.py Add a make_tempdir() utility function to the TestCase base class (#51762) 2021-02-12 10:56:01 -08:00
text_file_reader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
timeout_guard.py [torch/debuggability] use log.info() in addition to print() in timeoutguard (#57296) 2021-04-29 15:23:35 -07:00
toy_regression_test.py
transformations_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
transformations.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
tt_core_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
tt_core.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
utils_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
utils.py Remove redundant code for unsupported Python versions (#49486) 2021-01-06 12:45:46 -08:00
visualize.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
workspace_test.py [typing] suppress errors in fbcode/caffe2 - batch 2 2021-05-04 12:44:27 -07:00
workspace.py [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00