pytorch/torch/optim
bilzard 18a58f0bd6 Implement "RAdamW" optimizer (#107507)
Fixes #107282

## Overview

- The basic design decisions follow those made in #103881 (tensor operations, test cases, order & position of arguments, etc.).
- For the decoupled weight decay algorithm, I referred to [1, 2] (see the sketch after this list).
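
A minimal sketch contrasting classic L2 regularization with decoupled weight decay as described in [1]. This is not the actual `torch.optim.radam` implementation; the function and variable names are illustrative.

```python
import torch

def apply_weight_decay(param: torch.Tensor, grad: torch.Tensor,
                       lr: float, weight_decay: float,
                       decoupled_weight_decay: bool) -> torch.Tensor:
    if decoupled_weight_decay:
        # Decoupled (AdamW-style, [1]): shrink the parameter directly,
        # independently of the adaptive gradient-based update.
        param.mul_(1 - lr * weight_decay)
    else:
        # Classic L2 regularization: fold the decay into the gradient,
        # so it is rescaled by the adaptive terms of the optimizer.
        grad = grad.add(param, alpha=weight_decay)
    return grad
```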

## Backwards-incompatible changes

- A new positional argument `decoupled_weight_decay` is added to:
    - `torch.optim.radam`

Existing code that calls these APIs may be affected.

Note: the positional argument `decoupled_weight_decay` is also added to `torch.optim.RAdam`. However, since it is appended in the last position and has a default value, existing callers are not affected.
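
A minimal usage sketch of the new flag. The model and hyperparameter values are illustrative, and it is assumed that the default `decoupled_weight_decay=False` preserves the previous behaviour.

```python
import torch

model = torch.nn.Linear(10, 1)

# Default behaviour (decoupled_weight_decay=False) is unchanged:
opt = torch.optim.RAdam(model.parameters(), lr=1e-3, weight_decay=1e-2)

# Opt in to RAdamW-style decoupled weight decay explicitly:
opt_w = torch.optim.RAdam(
    model.parameters(), lr=1e-3, weight_decay=1e-2, decoupled_weight_decay=True
)
```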

## References

- [1] [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101)
- [2] [LiyuanLucasLiu/RAdam reference implementation](https://github.com/LiyuanLucasLiu/RAdam/blob/master/radam/radam.py#L5-L94)

## TODO

- [x] implement tensor operations
- [x] implement test cases
- [x] update the docstring
- [x] pass unit tests locally: `python test/test_optim.py -k test_radam`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107507
Approved by: https://github.com/janeyx99
2023-08-28 20:50:25 +00:00
| File | Last commit | Date |
| --- | --- | --- |
| _multi_tensor | | |
| __init__.py | | |
| __init__.pyi | | |
| _functional.py | | |
| adadelta.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| adadelta.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| adagrad.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| adagrad.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| adam.py | [optim] FusedAdam/W accepts lr: Tensor without h2ds (#106916) | 2023-08-21 23:00:44 +00:00 |
| adam.pyi | [optim] FusedAdam/W accepts lr: Tensor without h2ds (#106916) | 2023-08-21 23:00:44 +00:00 |
| adamax.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| adamax.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| adamw.py | [optim] FusedAdam/W accepts lr: Tensor without h2ds (#106916) | 2023-08-21 23:00:44 +00:00 |
| adamw.pyi | [optim] FusedAdam/W accepts lr: Tensor without h2ds (#106916) | 2023-08-21 23:00:44 +00:00 |
| asgd.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| asgd.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| lbfgs.py | Correct LBFGS tolerance_grad doc string (#99792) | 2023-04-22 20:19:01 +00:00 |
| lbfgs.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| lr_scheduler.py | [BE]: Update ruff to 0.285 (#107519) | 2023-08-22 23:16:38 +00:00 |
| lr_scheduler.pyi | Fixed type hints for CosineAnnealingWarmRestarts (#102067) | 2023-05-23 19:06:07 +00:00 |
| nadam.py | Fix docs, missed a // in LaTeX for nadam (#107736) | 2023-08-23 21:36:27 +00:00 |
| nadam.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| optimizer.py | Revert "[optim] Make casting to match params a hook (#106725)" | 2023-08-25 13:47:19 +00:00 |
| radam.py | Implement "RAdamW" optimizer (#107507) | 2023-08-28 20:50:25 +00:00 |
| radam.pyi | Implement "RAdamW" optimizer (#107507) | 2023-08-28 20:50:25 +00:00 |
| rmsprop.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| rmsprop.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| rprop.py | Add in-place _foreach_copy (#107226) | 2023-08-17 00:11:18 +00:00 |
| rprop.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| sgd.py | Fixes #107737 SGD doc blank line (#107738) | 2023-08-25 19:48:30 +00:00 |
| sgd.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| sparse_adam.py | [BE]: Update Ruff to 0.0.280 (#105724) | 2023-07-22 23:03:34 +00:00 |
| sparse_adam.pyi | Merge and improve torch optim optimizer type stubs (#102593) | 2023-07-26 11:56:42 +00:00 |
| swa_utils.py | use reset_running_stats in swa_utils.update_bn (#103801) | 2023-06-23 01:17:13 +00:00 |
| swa_utils.pyi | | |