Mirror of https://github.com/zebrajr/pytorch.git (synced 2025-12-07 12:21:27 +01:00)
@tfsingh I got to it first; I wanted to land this stack and close the gap ASAP. This PR also fixes a discrepancy between `_init_group` and `__setstate__`, because the constants now always live on the params' device.

There are some next steps though:

- ASGD can be made faster by keeping etas, mus, and steps on CPU when NOT capturable. (I had mistakenly thought foreachifying was faster, so we landed https://github.com/pytorch/pytorch/pull/107857, but it is slower.) No one has complained yet though. ¯\_(ツ)_/¯

Pull Request resolved: https://github.com/pytorch/pytorch/pull/121264
Approved by: https://github.com/albanD
ghstack dependencies: #121260
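For context, here is a minimal sketch of the state-placement idea the commit message describes. The helper `init_asgd_state` is hypothetical (it is not PyTorch's actual `_init_group`); it only illustrates keeping the eta/mu/step constants on the params' device when `capturable`, versus the proposed next step of keeping them on CPU otherwise.

```python
import torch

def init_asgd_state(param: torch.Tensor, lr: float, capturable: bool) -> dict:
    # Illustrative only: when capturable (e.g. for CUDA graphs), the scalar
    # state must live on the param's device; keeping it on CPU in the
    # non-capturable case is the "can be made faster" suggestion above.
    device = param.device if capturable else torch.device("cpu")
    return {
        "step": torch.zeros((), dtype=torch.float32, device=device),
        "eta": torch.as_tensor(lr, dtype=torch.float32, device=device),
        "mu": torch.ones((), dtype=torch.float32, device=device),
        # The averaged-parameter buffer always stays with the param itself.
        "ax": torch.zeros_like(param),
    }
```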
| Name |
|---|
| _multi_tensor |
| __init__.py |
| __init__.pyi |
| _functional.py |
| adadelta.py |
| adadelta.pyi |
| adagrad.py |
| adagrad.pyi |
| adam.py |
| adam.pyi |
| adamax.py |
| adamax.pyi |
| adamw.py |
| adamw.pyi |
| asgd.py |
| asgd.pyi |
| lbfgs.py |
| lbfgs.pyi |
| lr_scheduler.py |
| lr_scheduler.pyi |
| nadam.py |
| nadam.pyi |
| optimizer.py |
| radam.py |
| radam.pyi |
| rmsprop.py |
| rmsprop.pyi |
| rprop.py |
| rprop.pyi |
| sgd.py |
| sgd.pyi |
| sparse_adam.py |
| sparse_adam.pyi |
| swa_utils.py |
| swa_utils.pyi |
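The modules listed above are exposed through the `torch.optim` namespace. A minimal usage sketch (the model and hyperparameters here are arbitrary placeholders):

```python
import torch

# Any of the optimizers defined in the files above can be constructed this way;
# ASGD is used here since it is the one discussed in the commit message.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.ASGD(model.parameters(), lr=1e-2)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```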