pytorch/torch/optim
Jane Xu 9d6c5be781 Add ASGD capturable API for forloop (#121264)
@tfsingh I got to it first--wanted to land this stack and close the gap ASAP.

This PR also fixes a discrepancy between `_init_group` and `__setstate__`: the constants now always live on the params' device.

There are some next steps though:
- ASGD can be made faster by keeping `etas`, `mus`, and `steps` on CPU when NOT capturable. (I had mistakenly thought foreachifying was faster, so we landed https://github.com/pytorch/pytorch/pull/107857, but it is slower.) No one has complained yet though. ¯\_(ツ)_/¯
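
For readers unfamiliar with the `etas`, `mus`, and `steps` state mentioned above, here is a rough scalar sketch of one ASGD step as I understand the single-tensor update. It uses plain Python floats instead of tensors, and `asgd_step` is a hypothetical helper written for illustration, not a function in `torch.optim`. The defaults mirror what I believe the documented ASGD defaults to be; treat them as assumptions.

```python
# Sketch only: one ASGD step on a single scalar parameter, in pure Python.
# `asgd_step` is a hypothetical name; this is not the torch.optim code.

def asgd_step(param, grad, ax, step, eta, mu, *,
              lr=1e-2, lambd=1e-4, alpha=0.75, t0=1e6, weight_decay=0.0):
    """Return the updated (param, ax, step, eta, mu)."""
    step += 1

    # Optional L2 penalty folded into the gradient.
    if weight_decay != 0.0:
        grad = grad + weight_decay * param

    # Decay the parameter, then take the gradient step with the current eta.
    param = param * (1.0 - lambd * eta)
    param = param - eta * grad

    # Update the running average `ax`; while mu == 1 it just tracks param.
    if mu != 1.0:
        ax = ax + (param - ax) * mu
    else:
        ax = param

    # eta and mu depend only on the step count and hyperparameters,
    # so they are small per-parameter scalars rather than bulk data.
    eta = lr / ((1.0 + lambd * lr * step) ** alpha)
    mu = 1.0 / max(1.0, step - t0)
    return param, ax, step, eta, mu


# Usage: assumed initial state is step=0, eta=lr, mu=1, ax=0.
p, ax, step, eta, mu = asgd_step(1.0, 0.5, 0.0, 0, 1e-2, 1.0)
```

Since `eta`, `mu`, and `step` are computed from scalars alone, they only need to sit on the params' device when the step must be capturable; otherwise host-side values would do, which is the speedup the bullet above is pointing at.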

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121264
Approved by: https://github.com/albanD
ghstack dependencies: #121260
2024-03-08 00:00:30 +00:00
_multi_tensor
__init__.py
__init__.pyi
_functional.py
adadelta.py Migrate test_complex_optimizer to OptimizerInfo (#118160) 2024-01-24 21:22:47 +00:00
adadelta.pyi
adagrad.py Migrate test_complex_optimizer to OptimizerInfo (#118160) 2024-01-24 21:22:47 +00:00
adagrad.pyi
adam.py [optim] Rectify capturable testing and fix bugs! (#118326) 2024-02-02 19:13:00 +00:00
adam.pyi
adamax.py Add capturable single tensor Adamax (#121183) 2024-03-07 17:57:02 +00:00
adamax.pyi
adamw.py [optim] Rectify capturable testing and fix bugs! (#118326) 2024-02-02 19:13:00 +00:00
adamw.pyi
asgd.py Add ASGD capturable API for forloop (#121264) 2024-03-08 00:00:30 +00:00
asgd.pyi
lbfgs.py [optim] lbfgs: handle complex params as independent real params (#118184) 2024-01-31 19:24:16 +00:00
lbfgs.pyi
lr_scheduler.py improve the constantLR doc (#120852) 2024-03-04 21:15:27 +00:00
lr_scheduler.pyi
nadam.py [optim] Rectify capturable testing and fix bugs! (#118326) 2024-02-02 19:13:00 +00:00
nadam.pyi
optimizer.py Revert "Remove extra graph breaks (#118987)" 2024-02-05 22:19:37 +00:00
radam.py Add RAdam capturable API for forloop (#121260) 2024-03-08 00:00:30 +00:00
radam.pyi
rmsprop.py Migrate test_complex_optimizer to OptimizerInfo (#118160) 2024-01-24 21:22:47 +00:00
rmsprop.pyi
rprop.py [Optim][Rprop] Replace new().resize_as_() by torch.full_like() (#119978) 2024-02-16 19:54:04 +00:00
rprop.pyi
sgd.py [BE][optim] Simplify _init_group. (#120055) 2024-02-22 22:15:01 +00:00
sgd.pyi
sparse_adam.py Add guardrails preventing complex params in LBFGS & SparseAdam (#118161) 2024-01-24 21:22:47 +00:00
sparse_adam.pyi
swa_utils.py
swa_utils.pyi