mirror of https://github.com/zebrajr/pytorch.git, synced 2025-12-07 12:21:27 +01:00
Fixes #107282

## Overview
- The basic design decisions follow those made in #103881 (tensor operations, test cases, order and position of arguments, etc.).
- For the decoupled weight decay algorithm, I referred to [1, 2].

## Backwards-incompatible changes
- A positional argument `decoupled_weight_decay` is added to:
  - `torch.optim.radam`

Existing code that refers to this API can be affected.

Note: The positional argument `decoupled_weight_decay` is also added to `torch.optim.RAdam`. However, since it is added in the last position and has a default value, existing callers are not affected.

## Reference
- [1] [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101)
- [2] https://github.com/LiyuanLucasLiu/RAdam/blob/master/radam/radam.py#L5-L94

## TODO
- [x] implement tensor operation
- [x] implement test cases
- [x] modify doc-string
- [x] pass unit test code locally: `python test/test_optim.py -k test_radam`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107507
Approved by: https://github.com/janeyx99
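To illustrate what "decoupled" means here, the sketch below contrasts coupled (L2-regularization style) and decoupled (AdamW-style, per reference [1]) weight decay in a single simplified adaptive update step. This is a minimal illustration, not PyTorch's RAdam implementation: the function names, the scalar `v` second-moment term, and the simplified update rule are assumptions for demonstration only.

```python
import math

def coupled_step(param, grad, v, lr, weight_decay, eps=1e-8):
    # Coupled: the decay term is folded into the gradient *before* the
    # adaptive scaling, so the decay is also divided by sqrt(v).
    g = grad + weight_decay * param
    return param - lr * g / (math.sqrt(v) + eps)

def decoupled_step(param, grad, v, lr, weight_decay, eps=1e-8):
    # Decoupled: the decay is applied directly to the parameter, so only
    # the gradient term is adaptively scaled. This is the behavior the
    # new decoupled_weight_decay=True flag selects.
    return param - lr * weight_decay * param - lr * grad / (math.sqrt(v) + eps)

# With a zero gradient and v = 4, the two rules already diverge:
# the coupled decay is shrunk by sqrt(v), the decoupled decay is not.
print(coupled_step(1.0, 0.0, 4.0, lr=0.1, weight_decay=0.5))    # ~0.975
print(decoupled_step(1.0, 0.0, 4.0, lr=0.1, weight_decay=0.5))  # 0.95
```

In the actual optimizer, the option is selected with the new keyword, e.g. `torch.optim.RAdam(model.parameters(), weight_decay=0.01, decoupled_weight_decay=True)`; the default `False` preserves the previous coupled behavior.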