pytorch/caffe2/python
Jamie King 812bc1dde6 Smart Decay for Adam - DPER3 (#62058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62058

This is the second diff in this stack.  This diff includes the changes to DPER3; the first diff includes the changes to Caffe2.

We want to decay learning parameters properly.  Previously this was not done when a parameter was absent from a minibatch.  We fix this by keeping track of missed minibatches and making the decay catch up accordingly.

The exponential moving averages (EMAs) for the first and second moments used in Adam are updated only for parameters seen in a minibatch.  For parameters absent from a minibatch, 0 should instead be added to the EMAs, and the EMAs should then be decayed by multiplying by beta1 and beta2 respectively.

To avoid the computational overhead of touching every parameter for every minibatch, we:
* keep track of the minibatch in which each parameter was last seen
* decay the EMAs by multiplying by beta1^k and beta2^k, rather than by beta1 and beta2, where k is the number of minibatches since the parameter was last seen.
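
The catch-up idea above can be illustrated with a minimal sketch in plain Python. This is not the Caffe2 or DPER3 implementation; the function names `dense_ema` and `lazy_ema`, the use of `None` to mark a minibatch where the parameter is absent, and the scalar EMA are all assumptions made for illustration. It only checks that decaying once by beta^k when the parameter reappears gives the same EMA as decaying by beta on every minibatch while adding 0.

```python
import math


def dense_ema(ema, beta, grads):
    """Reference behaviour: touch the EMA on every minibatch.

    A value of None means the parameter was absent from that minibatch,
    so 0 is added and the EMA is still decayed by beta.
    """
    for g in grads:
        contribution = 0.0 if g is None else g
        ema = beta * ema + (1.0 - beta) * contribution
    return ema


def lazy_ema(ema, beta, grads):
    """Hypothetical lazy variant: only touch the EMA when the parameter is seen.

    last_seen records the index of the minibatch in which the parameter was
    last seen; k is the number of minibatches since then.
    """
    last_seen = 0
    for step, g in enumerate(grads, start=1):
        if g is None:
            continue  # absent: no work done for this minibatch
        k = step - last_seen
        ema = (beta ** k) * ema + (1.0 - beta) * g
        last_seen = step
    return ema, last_seen


# The parameter is seen in minibatches 1, 4 and 6 and absent otherwise.
grads = [0.5, None, None, 0.2, None, 1.0]
dense = dense_ema(0.0, 0.9, grads)
lazy, last_seen = lazy_ema(0.0, 0.9, grads)
print(math.isclose(dense, lazy, rel_tol=1e-9), last_seen)
```

The two results agree only at minibatches where the parameter is seen; between sightings the lazy EMA is stale by design, and the pending decay is applied the next time the parameter appears. The same scheme applies to the second moment with beta2 and the squared gradient.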

We hope this will significantly improve the inconsistent learning parameter issue we have seen with Adam.

Differential Revision: D29638897

fbshipit-source-id: 18d8e227d72c2e23010ca81e0f6eeb78872c8d3c
2021-07-23 13:26:30 -07:00
benchmarks Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
docs Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
examples [codemod] fix tautological imports 2021-03-27 01:15:57 -07:00
fakelowp Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
helpers Support ArgMin in c2_pt_converter 2020-12-05 16:35:34 -08:00
ideep Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layers [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
mint [typing] suppress errors in fbcode/caffe2 - batch 2 2021-07-21 17:56:26 -07:00
mkl Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
modeling Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
models [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
onnx Fix ONNX forward compatibility (#59327) 2021-06-02 12:39:56 -07:00
operator_test [BE] Include a unit test for Save Operator with db_options 2021-07-19 12:22:59 -07:00
predictor [caffe2] Speed up remote net loading 2021-04-20 22:32:40 -07:00
rnn Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
serialized_test Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
test Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
trt Replace TensorRT's deprecated API in caffe2/python/trt/test_pt_onnx_trt.py (#60236) 2021-06-19 19:56:30 -07:00
__init__.py Replace a platform.system() check with sys.platform (#51766) 2021-02-11 20:09:14 -08:00
_import_c_extension.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
_import_c_extension.pyi [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00
allcompare_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
attention.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
benchmark_generator.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
binarysize.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
brew_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
brew.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
build.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
cached_reader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
caffe_translator_test.py Fix several ResourceWarning: unclosed file (#15746) 2019-01-09 15:36:53 -08:00
caffe_translator.py Remove unused python2 shebang (#58409) 2021-05-17 13:19:32 -07:00
checkpoint_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
checkpoint.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
CMakeLists.txt Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
cnn.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
context_test.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
context.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
context.pyi caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
control_ops_grad_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_ops_grad.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_ops_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
control.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
convert_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
convert.py Lint trailing newlines (#54737) 2021-03-30 13:09:52 -07:00
convnet_benchmarks_test.py Skip convnets benchmark in rocm CI (#17331) 2019-02-20 21:12:24 -08:00
convnet_benchmarks.py Remove Apache headers from source. 2018-03-27 13:10:18 -07:00
core_gradients_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
core_test.py [caffe2] Fix duplicate name bug in Net.AddExternalInput (#47530) 2020-11-09 08:30:58 -08:00
core.py [caffe2] Explicitly define all DataTypes in python/core.py (#51768) 2021-02-17 20:54:17 -08:00
crf_predict.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
crf_viterbi_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
crf.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
data_parallel_model_test.py Enable GPU/RE tags for caffe2/caffe2/python/TARGETS 2021-02-05 13:52:48 -08:00
data_parallel_model.py [*.py] Rename "Arguments:" to "Args:" (#49736) 2020-12-28 09:34:47 -08:00
data_workers_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
data_workers.py Drop some Python 2 compatibility code (#51769) 2021-02-11 11:02:33 -08:00
dataio_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
dataio.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
dataset.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
db_file_reader.py [caffe2] Fix DBFileReader (#53498) 2021-03-08 08:34:39 -08:00
db_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
device_checker.py Update from facebook (#7451) 2018-05-10 23:14:27 -07:00
dlpack.h [TVM] Fix build and sync with caffe2/caffe2/python/dlpack.h (#40888) 2020-07-02 15:37:45 -07:00
dyndep.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
embedding_generation_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
experiment_util.py change logging.warn to logging.warning (#51727) 2021-03-29 10:42:30 -07:00
extension_loader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
fakefp16_transform_lib.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
filler_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
functional_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
functional.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
fused_8bit_rowwise_conversion_ops_test.py Replace list(map(...)) constructs by list comprehensions (#46461) 2020-10-19 18:42:49 -07:00
gradient_check_test.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
gradient_checker.py [caffe2] Disable running full grad check in tests by default 2020-10-27 16:10:03 -07:00
gru_cell.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hip_test_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hsm_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
hypothesis_test_util.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
hypothesis_test.py Drop some Python 2 compatibility code (#51769) 2021-02-11 11:02:33 -08:00
ideep_test_util.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layer_model_helper.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
layer_model_instantiator.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
layer_parameter_sharing_test.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
layer_test_util.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
layers_test.py [itemwise-dropout][1/x][low-level module] Implement Itemwise Sparse Feature Dropout in Dper3 (#59322) 2021-06-04 19:59:17 -07:00
lazy_dyndep_test.py Disallow versionless Python shebangs (#58275) 2021-05-14 08:26:02 -07:00
lazy_dyndep.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lazy.py Modify lazy_dyndep loading to trigger inside workspace. (#41687) 2020-07-22 15:36:43 -07:00
lengths_reducer_fused_8bit_rowwise_ops_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lengths_reducer_rowwise_8bit_ops_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
lstm_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
memonger_test.py [caffe][memonger] Extend operator schema check to dag memonger (#48021) 2020-11-16 19:17:55 -08:00
memonger.py Use nodes instead of node 2021-04-13 10:45:35 -07:00
mkl_test_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
model_device_test.py Unify gpu_support variable in python tests (#16748) 2019-02-07 00:29:51 -08:00
model_helper_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
model_helper.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
modifier_context.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
mpi_python.cc Replace c10::guts::stuff with std::stuff (#30915) 2019-12-16 13:57:19 -08:00
muji_test.py Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
muji.py Unify cuda and hip device types in Caffe2 python front end (#14221) 2018-11-29 14:00:16 -08:00
net_builder_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_builder.py [*.py] Rename "Arguments:" to "Args:" (#49736) 2020-12-28 09:34:47 -08:00
net_drawer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_printer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
net_printer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
nomnigraph_transformations_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph_transformations.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
nomnigraph.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
normalizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
normalizer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
normalizer.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
numa_benchmark.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
numa_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
observer_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
operator_fp_exceptions_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
optimizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
optimizer_test_util.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
optimizer_test.py Smart Decay for Adam - DPER3 (#62058) 2021-07-23 13:26:30 -07:00
optimizer.py Smart Decay for Adam - DPER3 (#62058) 2021-07-23 13:26:30 -07:00
parallel_workers_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
parallel_workers.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
parallelize_bmuf_distributed_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pipeline_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pipeline.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
predictor_constants.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
pybind_state_dlpack.cc Make PyTorch code-base clang-tidy compliant (#56892) 2021-04-28 14:10:25 -07:00
pybind_state_dlpack.h pass TypeMeta by value (#45026) 2020-10-30 10:14:17 -07:00
pybind_state_gpu.cc add simple memory analyzer and log warning if GPU underutilized (#21024) 2019-05-28 19:58:54 -07:00
pybind_state_hip.cc Make caffe2/fb folder compatible with AMD (#29131) 2019-11-04 16:40:29 -08:00
pybind_state_ideep.cc Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00
pybind_state_int8.cc Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00
pybind_state_nomni.cc Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00
pybind_state_registry.cc Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00
pybind_state_registry.h Move registry fully to c10 (#12077) 2018-09-27 03:09:54 -07:00
pybind_state.cc Disable avoid-non-const-global-variables lint check (#62008) 2021-07-22 18:04:40 -07:00
pybind_state.h Remove redundant code for unsupported Python versions (#49486) 2021-01-06 12:45:46 -08:00
python_op_test.py Remove unused six code for Python 2/3 compatibility (#48077) 2020-12-22 18:07:08 -08:00
queue_util.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
record_queue.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
recurrent.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
regularizer_context.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
regularizer_test.py Enable FP16 sparse regularizer 2021-02-12 12:29:32 -08:00
regularizer.py Enable FP16 sparse regularizer 2021-02-12 12:29:32 -08:00
rnn_cell.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
schema_test.py [caffe2] Add unittests for schema.Field init (#47512) 2020-11-06 13:27:58 -08:00
schema.py [pytorch] Update caffe2/python to eliminate Pyre errors (#52083) 2021-02-11 11:04:59 -08:00
scope_test.py Drop unused imports from caffe2/python (#49980) 2021-01-05 13:17:46 -08:00
scope.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
session_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
session.py [RFC][LocalSession] Fix workspace type 2020-10-29 04:12:17 -07:00
sparse_to_dense_mask_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
sparse_to_dense_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
task_test.py [RFC][caffe2] TaskGroup.__repr__ shouldn't have side effects 2020-10-01 14:21:03 -07:00
task.py caffe2: refactor context to allow being typed (#48340) 2020-11-30 18:31:14 -08:00
test_util.py Add a make_tempdir() utility function to the TestCase base class (#51762) 2021-02-12 10:56:01 -08:00
text_file_reader.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
timeout_guard.py [torch/debuggability] use log.info() in addition to print() in timeoutguard (#57296) 2021-04-29 15:23:35 -07:00
toy_regression_test.py Enable junk fill for the default CPU allocator (#13377) 2018-11-08 00:02:37 -08:00
transformations_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
transformations.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
tt_core_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
tt_core.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
utils_test.py Remove __future__ imports for legacy Python2 supports (#45033) 2020-09-23 17:57:02 -07:00
utils.py Remove redundant code for unsupported Python versions (#49486) 2021-01-06 12:45:46 -08:00
visualize.py Fix typos, via a Levenshtein-type corrector (#31523) 2020-01-17 16:03:19 -08:00
workspace_test.py [typing] suppress errors in fbcode/caffe2 - batch 2 2021-05-04 12:44:27 -07:00
workspace.py [caffe2] expose whether FBGEMM is available to the Python code (#54274) 2021-03-19 12:52:14 -07:00