Commit Graph

39 Commits

Author SHA1 Message Date
Xuehai Pan
596b418391 [BE][PYFMT] migrate PYFMT for {torch,test}/{nn,optim}/** to ruff format (#144548)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144548
Approved by: https://github.com/ezyang
2025-06-14 11:27:04 +00:00
Aaron Orenstein
0afd335174 PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-21 16:57:27 +00:00
PyTorch MergeBot
5fd881a5b6 Revert "PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)"
This reverts commit 54a00af2c6.

Reverted https://github.com/pytorch/pytorch/pull/145175 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break some trunk tests ([comment](https://github.com/pytorch/pytorch/pull/145175#issuecomment-2603418267))
2025-01-21 00:49:55 +00:00
Aaron Orenstein
54a00af2c6 PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175)
See #145101 for details.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175
Approved by: https://github.com/bobrenjc93
2025-01-20 22:32:59 +00:00
Aaron Orenstein
27f9d3b0a1 Flip default value for mypy disallow_untyped_defs [8/11] (#127845)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127845
Approved by: https://github.com/oulgen
ghstack dependencies: #127842, #127843, #127844
2024-06-08 18:49:56 +00:00
FFFrog
560efaa471 Part 1: UFMT partial files in torch/optim due to the pr-sanity-checks (#124053)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/124053
Approved by: https://github.com/ezyang
ghstack dependencies: #124048
2024-04-16 03:17:18 +00:00
pilot-j
a2552d5521 Fixed docstring errors inside torch/cuda/ and torch/optim/ (Docathon H2) (#112964)
Fixes #112592
1) **File: torch/cuda/random.py**
```
Before:
/content/pytorch/torch/cuda/random.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/cuda/random.py:21 in public function `get_rng_state`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/random.py:43 in public function `get_rng_state_all`:
        D202: No blank lines allowed after function docstring (found 1)
/content/pytorch/torch/cuda/random.py:43 in public function `get_rng_state_all`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/random.py:54 in public function `set_rng_state`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:79 in public function `set_rng_state_all`:
        D208: Docstring is over-indented
/content/pytorch/torch/cuda/random.py:79 in public function `set_rng_state_all`:
        D209: Multi-line docstring closing quotes should be on a separate line
/content/pytorch/torch/cuda/random.py:79 in public function `set_rng_state_all`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:79 in public function `set_rng_state_all`:
        D414: Section has no content ('Args')
/content/pytorch/torch/cuda/random.py:88 in public function `manual_seed`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/random.py:88 in public function `manual_seed`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:110 in public function `manual_seed_all`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/random.py:110 in public function `manual_seed_all`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:128 in public function `seed`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/random.py:128 in public function `seed`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:146 in public function `seed_all`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/random.py:146 in public function `seed_all`:
        D401: First line should be in imperative mood (perhaps 'Set', not 'Sets')
/content/pytorch/torch/cuda/random.py:167 in public function `initial_seed`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
18
```

```
After:
/content/pytorch/torch/cuda/random.py:1 at module level:
        D100: Missing docstring in public module
1

```
2) **File: torch/cuda/amp/autocast_mode.py**
```
Before: /content/pytorch/torch/cuda/amp/autocast_mode.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/cuda/amp/autocast_mode.py:18 in public class `autocast`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/autocast_mode.py:23 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/cuda/amp/autocast_mode.py:38 in public method `__enter__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/autocast_mode.py:44 in public method `__exit__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/autocast_mode.py:49 in public method `__call__`:
        D102: Missing docstring in public method
/content/pytorch/torch/cuda/amp/autocast_mode.py:90 in public function `custom_fwd`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/autocast_mode.py:90 in public function `custom_fwd`:
        D400: First line should end with a period (not 'f')
/content/pytorch/torch/cuda/amp/autocast_mode.py:90 in public function `custom_fwd`:
        D401: First line should be in imperative mood; try rephrasing (found 'Helper')
/content/pytorch/torch/cuda/amp/autocast_mode.py:130 in public function `custom_bwd`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/autocast_mode.py:130 in public function `custom_bwd`:
        D400: First line should end with a period (not 'f')
/content/pytorch/torch/cuda/amp/autocast_mode.py:130 in public function `custom_bwd`:
        D401: First line should be in imperative mood; try rephrasing (found 'Helper')
12
```
```
After:
/content/pytorch/torch/cuda/amp/autocast_mode.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/cuda/amp/autocast_mode.py:23 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/cuda/amp/autocast_mode.py:38 in public method `__enter__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/autocast_mode.py:44 in public method `__exit__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/autocast_mode.py:49 in public method `__call__`:
        D102: Missing docstring in public method
5
```

3)  **File: torch/cuda/amp/grad_scaler.py**
```
Before: /content/pytorch/torch/cuda/amp/grad_scaler.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/cuda/amp/grad_scaler.py:17 in private class `_MultiDeviceReplicator`:
        D200: One-line docstring should fit on one line with quotes (found 3)
/content/pytorch/torch/cuda/amp/grad_scaler.py:39 in public class `OptState`:
        D101: Missing docstring in public class
/content/pytorch/torch/cuda/amp/grad_scaler.py:50 in public class `GradScaler`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/grad_scaler.py:50 in public class `GradScaler`:
        D400: First line should end with a period (not 'g')
/content/pytorch/torch/cuda/amp/grad_scaler.py:115 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/cuda/amp/grad_scaler.py:354 in public method `step`:
        D400: First line should end with a period (not ':')
/content/pytorch/torch/cuda/amp/grad_scaler.py:456 in public method `update`:
        D401: First line should be in imperative mood (perhaps 'Update', not 'Updates')
/content/pytorch/torch/cuda/amp/grad_scaler.py:529 in public method `get_scale`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:544 in public method `get_growth_factor`:
        D200: One-line docstring should fit on one line with quotes (found 3)
/content/pytorch/torch/cuda/amp/grad_scaler.py:544 in public method `get_growth_factor`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:550 in public method `set_growth_factor`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/grad_scaler.py:550 in public method `set_growth_factor`:
        D400: First line should end with a period (not ':')
/content/pytorch/torch/cuda/amp/grad_scaler.py:557 in public method `get_backoff_factor`:
        D200: One-line docstring should fit on one line with quotes (found 3)
/content/pytorch/torch/cuda/amp/grad_scaler.py:557 in public method `get_backoff_factor`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:563 in public method `set_backoff_factor`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/grad_scaler.py:563 in public method `set_backoff_factor`:
        D400: First line should end with a period (not ':')
/content/pytorch/torch/cuda/amp/grad_scaler.py:570 in public method `get_growth_interval`:
        D200: One-line docstring should fit on one line with quotes (found 3)
/content/pytorch/torch/cuda/amp/grad_scaler.py:570 in public method `get_growth_interval`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:576 in public method `set_growth_interval`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/cuda/amp/grad_scaler.py:576 in public method `set_growth_interval`:
        D400: First line should end with a period (not ':')
/content/pytorch/torch/cuda/amp/grad_scaler.py:592 in public method `is_enabled`:
        D200: One-line docstring should fit on one line with quotes (found 3)
/content/pytorch/torch/cuda/amp/grad_scaler.py:592 in public method `is_enabled`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:598 in public method `state_dict`:
        D400: First line should end with a period (not ':')
/content/pytorch/torch/cuda/amp/grad_scaler.py:598 in public method `state_dict`:
        D401: First line should be in imperative mood (perhaps 'Return', not 'Returns')
/content/pytorch/torch/cuda/amp/grad_scaler.py:624 in public method `load_state_dict`:
        D401: First line should be in imperative mood (perhaps 'Load', not 'Loads')
/content/pytorch/torch/cuda/amp/grad_scaler.py:649 in public method `__getstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/grad_scaler.py:665 in public method `__setstate__`:
        D105: Missing docstring in magic method
28
```
```
After:
/content/pytorch/torch/cuda/amp/grad_scaler.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/cuda/amp/grad_scaler.py:40 in public class `OptState`:
        D101: Missing docstring in public class
/content/pytorch/torch/cuda/amp/grad_scaler.py:117 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/cuda/amp/grad_scaler.py:647 in public method `__getstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/cuda/amp/grad_scaler.py:663 in public method `__setstate__`:
        D105: Missing docstring in magic method
5
```
4) **File: torch/optim/_functional.py**
```
Before:
/content/pytorch/torch/optim/_functional.py:1 at module level:
        D400: First line should end with a period (not 'e')
1
```
```
After:
0

```
5) **File: torch/optim/__init__.py**
```
Before:
/content/pytorch/torch/optim/__init__.py:1 at module level:
        D205: 1 blank line required between summary line and description (found 0)
1
```
```
After:
0

```
6) **File: torch/optim/lbfgs.py**
```
Before:
/content/pytorch/torch/optim/lbfgs.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/lbfgs.py:185 in public class `LBFGS`:
        D205: 1 blank line required between summary line and description (found 0)
/content/pytorch/torch/optim/lbfgs.py:185 in public class `LBFGS`:
        D400: First line should end with a period (not 'c')
/content/pytorch/torch/optim/lbfgs.py:215 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/lbfgs.py:285 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
5
```
```
After:
/content/pytorch/torch/optim/lbfgs.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/lbfgs.py:217 in public method `__init__`:
        D107: Missing docstring in __init__
2
```
7)**File: torch/optim/sparse_adam.py**
```
Before: /content/pytorch/torch/optim/sparse_adam.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/sparse_adam.py:7 in public class `SparseAdam`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/sparse_adam.py:8 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/sparse_adam.py:40 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
4
```
```
After:
/content/pytorch/torch/optim/sparse_adam.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/sparse_adam.py:7 in public class `SparseAdam`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/sparse_adam.py:8 in public method `__init__`:
        D107: Missing docstring in __init__
3
```
8) **File:torch/optim/adadelta.py**
```
Before:
/content/pytorch/torch/optim/adadelta.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adadelta.py:11 in public class `Adadelta`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adadelta.py:12 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adadelta.py:44 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adadelta.py:82 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/adadelta.py:193 in public function `adadelta`:
        D202: No blank lines allowed after function docstring (found 1)
6
```
```
After:
/content/pytorch/torch/optim/adadelta.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adadelta.py:11 in public class `Adadelta`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adadelta.py:12 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adadelta.py:44 in public method `__setstate__`:
        D105: Missing docstring in magic method
4
```
9) **File: torch/optim/adagrad.py**
```
Before:
/content/pytorch/torch/optim/adagrad.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adagrad.py:11 in public class `Adagrad`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adagrad.py:12 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adagrad.py:63 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adagrad.py:78 in public method `share_memory`:
        D102: Missing docstring in public method
/content/pytorch/torch/optim/adagrad.py:100 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/adagrad.py:201 in public function `adagrad`:
        D202: No blank lines allowed after function docstring (found 1)
7
```
```
After:
/content/pytorch/torch/optim/adagrad.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adagrad.py:11 in public class `Adagrad`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adagrad.py:12 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adagrad.py:63 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adagrad.py:78 in public method `share_memory`:
        D102: Missing docstring in public method
5
```
10) **File: torch/optim/adam.py**
```
Before:
/content/pytorch/torch/optim/adam.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adam.py:14 in public class `Adam`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adam.py:15 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adam.py:65 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adam.py:135 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/adam.py:281 in public function `adam`:
        D202: No blank lines allowed after function docstring (found 1)
/content/pytorch/torch/optim/adam.py:281 in public function `adam`:
        D205: 1 blank line required between summary line and description (found 0)
7
```
```
After:
/content/pytorch/torch/optim/adam.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adam.py:14 in public class `Adam`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adam.py:15 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adam.py:65 in public method `__setstate__`:
        D105: Missing docstring in magic method
4

```
11) **File: torch/optim/adamax.py**
```
Before:
/content/pytorch/torch/optim/adamax.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adamax.py:12 in public class `Adamax`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adamax.py:13 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adamax.py:47 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adamax.py:91 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/adamax.py:203 in public function `adamax`:
        D202: No blank lines allowed after function docstring (found 1)
6
```
```
After:
/content/pytorch/torch/optim/adamax.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adamax.py:12 in public class `Adamax`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adamax.py:13 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adamax.py:47 in public method `__setstate__`:
        D105: Missing docstring in magic method
4
```
12) **File: torch/optim/adamw.py**
```
Before:
/content/pytorch/torch/optim/adamw.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adamw.py:12 in public class `AdamW`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adamw.py:13 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adamw.py:73 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/adamw.py:153 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/adamw.py:304 in public function `adamw`:
        D202: No blank lines allowed after function docstring (found 1)
6

```
```
After:
/content/pytorch/torch/optim/adamw.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/adamw.py:12 in public class `AdamW`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/adamw.py:13 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/adamw.py:73 in public method `__setstate__`:
        D105: Missing docstring in magic method
4

```
13) **File: torch/optim/asgd.py**
```
Before:
/content/pytorch/torch/optim/asgd.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/asgd.py:17 in public class `ASGD`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/asgd.py:18 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/asgd.py:52 in public method `__setstate__`:
        D105: Missing docstring in magic method
/content/pytorch/torch/optim/asgd.py:107 in public method `step`:
        D401: First line should be in imperative mood (perhaps 'Perform', not 'Performs')
/content/pytorch/torch/optim/asgd.py:195 in public function `asgd`:
        D202: No blank lines allowed after function docstring (found 1)
6
```
```
After:
/content/pytorch/torch/optim/asgd.py:1 at module level:
        D100: Missing docstring in public module
/content/pytorch/torch/optim/asgd.py:17 in public class `ASGD`:
        D101: Missing docstring in public class
/content/pytorch/torch/optim/asgd.py:18 in public method `__init__`:
        D107: Missing docstring in __init__
/content/pytorch/torch/optim/asgd.py:52 in public method `__setstate__`:
        D105: Missing docstring in magic method
4
```
Resolved docstring errors as listed. I initially changed in the main branch of forked repo which caused changes to appear in my PR to other issue. I have fixed that and hope this PR won't have any conflicts.
Kindly review @svekars @jbschlosser.
In case of any other issues please let me know. Thanks!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112964
Approved by: https://github.com/kit1980
2023-11-13 22:16:44 +00:00
kshitij12345
82229d1e33 [optim] fix: empty grad support for SparseAdam (#86459)
Fixes #82486

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86459
Approved by: https://github.com/albanD
2022-10-07 19:24:59 +00:00
Rob Zinkov
f24c94d7ae Adding maximize to SparseAdam (#80336)
Added the maximize flag #68052 to SparseAdam optimizer and updates the respective tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80336
Approved by: https://github.com/albanD
2022-07-08 12:17:27 +00:00
Mikayla Gawarecki
2a5aaf1c49 Optim foreach cleanup for AdamW (#70484)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70484

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767869

Pulled By: mikaylagawarecki

fbshipit-source-id: 2f5273bbfeea3ed502c5d77da4bebe1674243e86
(cherry picked from commit 2dd9b77917)
2022-02-15 18:02:08 +00:00
Mikayla Gawarecki
dff58d519f Optim foreach cleanup for Rprop (#70483)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70483

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767866

Pulled By: mikaylagawarecki

fbshipit-source-id: ffc5ae68eeea8fa09385862b853b731554b77bcb
(cherry picked from commit 3a0fe29580)
2022-02-15 18:02:08 +00:00
Mikayla Gawarecki
ce3094f5f6 Optim foreach cleanup for Rmsprop (#70482)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70482

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767862

Pulled By: mikaylagawarecki

fbshipit-source-id: 8e2e9c986d5a3774093a79755940372945f1b3a9
(cherry picked from commit baea537277)
2022-02-15 18:02:08 +00:00
Mikayla Gawarecki
2cb03e926f Optim foreach cleanup for SGD (#70481)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70481

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767868

Pulled By: mikaylagawarecki

fbshipit-source-id: 89b9227a4ddf99602855973cbc343c58ae3d5328
(cherry picked from commit ffea8ddcfd)
2022-02-15 18:02:08 +00:00
Mikayla Gawarecki
5f9590681d Optim foreach cleanup for Adam (#70295)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70295

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767870

Pulled By: mikaylagawarecki

fbshipit-source-id: f922f15ecb0307458c8ecee737325c42c4f3ce8b
(cherry picked from commit 66233a8a3e)
2022-02-15 18:02:08 +00:00
Mikayla Gawarecki
0972db5b7d Optim foreach cleanup for ASGD (#70231)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70231

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767867

Pulled By: mikaylagawarecki

fbshipit-source-id: 4406824acbb6f427d52c1ced2d8a02a98c943b86
(cherry picked from commit cbd9a4da15)
2022-02-09 16:52:13 +00:00
Mikayla Gawarecki
5948522e9c Optim foreach cleanup for RAdam (#70230)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70230

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767874

Pulled By: mikaylagawarecki

fbshipit-source-id: 9379db24266a7bbcc2c23849f87ae0af2e6729c0
(cherry picked from commit ecf7b31fc3)
2022-02-09 16:52:13 +00:00
Mikayla Gawarecki
3653f07c7c Optim foreach cleanup for NAdam (#70229)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70229

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767873

Pulled By: mikaylagawarecki

fbshipit-source-id: 833ead14c1d1659351ebfbeb41045a3c7eb96dad
(cherry picked from commit 9415df6b5c)
2022-02-09 16:52:13 +00:00
Mikayla Gawarecki
d9acfef831 Optim foreach cleanup for Adamax (#69982)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69982

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767865

Pulled By: mikaylagawarecki

fbshipit-source-id: c5efd351e359825d38b71f57a2c61a2055c3c114
(cherry picked from commit 37bb80c2d7)
2022-02-09 16:52:13 +00:00
Mikayla Gawarecki
dabfea8363 Optim foreach cleanup for Adagrad (#69981)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69981

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767863

Pulled By: mikaylagawarecki

fbshipit-source-id: 1c99abe4ac4eb2a9eb896dff4837b539b94f68e7
(cherry picked from commit 61c28d0645)
2022-02-09 16:52:12 +00:00
Mikayla Gawarecki
8e8d170674 Optim foreach cleanup for Adadelta (#69980)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69980

- Merged `torch/optim/adadelta.py` and `torch/optim/_multitensor/adadelta.py` into `torch/optim/adadelta.py`
- Moved adadelta functional forms from `torch/optim/_functional.py` and `torch/optim/_multi_tensor/_functional.py` to `torch/optim/adadelta.py`
- `torch/optim/_functional.py` just imports from `torch/optim/adadelta.py`
- Added a test `test_optimizers_foreach_flag` which replicates `test_multi_tensor_optimizers` in `test/test_optim.py`
- Add a test `test_adadelta_new` that replicates the behavior of `test_adadelta` but with `foreach` flag instead of using the multitensor adadleta class. If we delete `_multitensor/` we could replace `test_adadelta` with this

Remaining TODO:

- [ ] single_tensor adadelta supports complex but multitensor does not, need to integrate the singletensor logic in multitensor and switch the `test_adadelta_complex` to test for foreach in [True, False]

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin, albanD

Differential Revision: D33413059

Pulled By: mikaylagawarecki

fbshipit-source-id: 92a9fa98705762bb6bd464261671e49aef40070e
(cherry picked from commit a008227d22)
2022-02-09 16:52:12 +00:00
Mikayla Gawarecki
8bb1d06702 [optim] ASGD fold state updates into functional and pass list of vars rather than states (#71335)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71335

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767871

Pulled By: mikaylagawarecki

fbshipit-source-id: 84ebe1fafb1c27572f08c8c8026c882dd7e054c1
(cherry picked from commit 7613ebb391)
2022-02-08 23:58:41 +00:00
Mikayla Gawarecki
ccc1a01dcb [optim] NAdam fold state updates into functional (#71334)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71334

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767864

Pulled By: mikaylagawarecki

fbshipit-source-id: 4d985e9e346f40110bd4231e0f16e5643fbc448d
(cherry picked from commit 58aa77e367)
2022-02-08 23:58:41 +00:00
Mikayla Gawarecki
7176c92687 [optim] update step in functional and pass state_steps instead of state (#71333)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71333

Updated
- Adagrad
- Adamax
- Adam
- AdamW
- RAdam
make multi_tensor functionals take `state_steps: List[Tensor]` instead of taking `states: List[Dict]`
make `state_steps: List[int]s -> state_steps:List[Tensor]` where each is a Singleton tensor so step can be updated within the functional

(NAdam and ASGD) were updated in separate diffs to fold their handling of state into the functionals

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33767872

Pulled By: mikaylagawarecki

fbshipit-source-id: 9baa7cafb6375eab839917df9287c65a437891f2
(cherry picked from commit 831c02b3d0)
2022-02-08 16:51:19 +00:00
Adnios
a9c7d626e1 Add the maximize flag to AdamW (#70146)
Summary:
Related issue: https://github.com/pytorch/pytorch/issues/68052

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70146

Reviewed By: malfet

Differential Revision: D33254561

Pulled By: albanD

fbshipit-source-id: f190c836a4162f936c5953e076747c345df21421
2021-12-23 09:20:29 -08:00
oliver
3d358a7678 Adds a maximize flag to Adam (#68164)
Summary:
Solves the next most important use case in https://github.com/pytorch/pytorch/issues/68052.

I have kept the style as close to that in SGD as seemed reasonable, given the slight differences in their internal implementations.

All feedback welcome!

cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68164

Reviewed By: VitalyFedyunin

Differential Revision: D32994129

Pulled By: albanD

fbshipit-source-id: 65c57c3f3dbbd3e3e5338d51def54482503e8850
2021-12-13 05:53:53 -08:00
oliver
f8297d40fc Adds a maximize flag to SGD. (#67847)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/46480 -- for SGD.

## Notes:
- I have modified the existing tests to take a new `constructor_accepts_maximize` flag. When this is set to true, the ` _test_basic_cases_template` function will test both maximizing and minimizing the sample function.
- This was the clearest way I could think of testing the changes -- I would appreciate feedback on this strategy.

## Work to be done:
[] I need to update the docs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67847

Reviewed By: H-Huang

Differential Revision: D32252631

Pulled By: albanD

fbshipit-source-id: 27915a3cc2d18b7e4d17bfc2d666fe7d2cfdf9a4
2021-11-09 00:43:07 -08:00
Christopher Gray Howard
dfa7225a38 [Pytorch][Bootcamp] Add fix and testing for non-vectorized Adadelta optimizer to handle complex numbers (#66587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66587

Made some changes in the step function of the non-vectorized Adadelta optimizer to handle complex numbers as two real numbers as per 65711 on github
ghstack-source-id: 141484731

Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adadelta_complex'

https://pxl.cl/1R7kJ

Reviewed By: albanD

Differential Revision: D31630069

fbshipit-source-id: 2741177b837960538ce39772897af36bbce7b7d8
2021-10-26 17:35:01 -07:00
Christopher Gray Howard
acb340de75 [Pytorch][Bootcamp] Add fixes and vanilla testing for Adagrad non-vectorized and vectorized optimizers to handle complex numbers (#66671)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66671

Made changes in the step function of the vectorized and non-vectorized adagrad optimizers to handle complex numbers as two real numbers as per 65711 on github
ghstack-source-id: 141442350

Test Plan:
buck test mode/dev caffe2/test:optim -- 'test_adagrad_complex'
https://pxl.cl/1Rd44

Reviewed By: albanD

Differential Revision: D31673503

fbshipit-source-id: 90a0d0c69b556716e2d17c59ce80f09c750fc464
2021-10-25 10:13:21 -07:00
Ilqar Ramazanli
5ed6e4429e To fix variance computation for complex Adam (#62946)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59998

It has been discussed in the issue that the variance term of Adam optimizer currently doesn't compute correctly for complex domain.  As it has been stated in the Generalization to Complex numbers section  in https://en.wikipedia.org/wiki/Variance variance is computed as E[(X - mu)(X-mu)*] (where mu = E[X] and * stands for conjugate) for complex random variable X.

However, currently the computation method in implementation of Adam is via E[(X - mu)(X-mu)] which doesn't return right variance value, in particular it returns complex number. Variance is defined to be real number even though underlying random variable is complex.

We fix this issue here, and testing that resulting variance is indeed real number.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62946

Reviewed By: albanD

Differential Revision: D30196038

Pulled By: iramazanli

fbshipit-source-id: ab0a6f31658aeb56bdcb211ff86eaa29f3f0d718
2021-08-09 17:54:43 -07:00
Ilqar Ramazanli
7c2938bf67 To refactor Sparse Adam algorithm for functional form (#59171)
Summary:
Adds Functional Interface for Sparse Adam Optimizer.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59171

Reviewed By: vincentqb

Differential Revision: D29360582

Pulled By: iramazanli

fbshipit-source-id: 5ceffd7f4b7abd1e0b758a5b8445abdf5555eba0
2021-06-25 06:35:39 -07:00
Ilqar Ramazanli
63219f1f9f To add Rectified Adam Algorithm to Optimizers (#58968)
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/24892

In the paper : https://arxiv.org/pdf/1908.03265.pdf  Liyuan Liu et al. suggested a new optimization algorithm with an essence of similar to Adam Algorithm.

It has been discussed in the paper that, without warmup heuristic, in the early stage of adaptive optimization / learning algorithms sometimes we can get undesirable large variance which can slow overall convergence process.

Authors proposed the idea of rectification of variance of adaptive learning rate when it is expected to be high.

Differing from the paper, we selected variance tractability cut-off as 5 instead of 4. This adjustment is common practice, and could be found in the code-repository and also tensorflow swift optim library as well :

2f03dd1970/radam/radam.py (L156)

f51ee4618d/Sources/TensorFlow/Optimizers/MomentumBased.swift (L638)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58968

Reviewed By: vincentqb

Differential Revision: D29310601

Pulled By: iramazanli

fbshipit-source-id: b7bd487f72f1074f266687fd9c0c6be264a748a9
2021-06-23 18:27:57 -07:00
Ilqar Ramazanli
e8690dacb2 To add Nesterov Adam Algorithm to Optimizers (#59009)
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/5804

In the paper : https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ  Timothy Dozat suggested a new optimization algorithm with an essence of combination of NAG and Adam algorithms.

It is known that the idea of momentum can be improved with the Nesterov acceleration in optimization algorithms, and Dozat is investigating to apply this idea to momentum component of Adam algorithm. Author provided experiment evidence in their work to show excellence of the idea.

In this PR we are implementing the proposed algorithm NAdam in the mentioned paper. Author has a preliminary work http://cs229.stanford.edu/proj2015/054_report.pdf  where he shows the decay base constant should be taken as 0.96 which we also followed the same phenomenon here in this implementation similar to Keras. Moreover, implementation / coding practice have been followed similar to Keras in some other places as well:

f9d3868495/tensorflow/python/keras/optimizer_v2/nadam.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59009

Reviewed By: gchanan, vincentqb

Differential Revision: D29220375

Pulled By: iramazanli

fbshipit-source-id: 4b4bb4b15f7e16f7527f368bbf4207ed345751aa
2021-06-23 08:21:43 -07:00
Sam Estep
1abf45e37f Revert D29241736: [pytorch][PR] To add Rectified Adam Algorithm to Optimizers
Test Plan: revert-hammer

Differential Revision:
D29241736 (0d2a936176)

Original commit changeset: 288b9b1f3125

fbshipit-source-id: 56c4ec98647c6f1822b130726741a1c9ca193670
2021-06-22 12:08:31 -07:00
Ilqar Ramazanli
0d2a936176 To add Rectified Adam Algorithm to Optimizers (#58968)
Summary:
Fixes : https://github.com/pytorch/pytorch/issues/24892

In the paper : https://arxiv.org/pdf/1908.03265.pdf  Liyuan Liu et al. suggested a new optimization algorithm with an essence of similar to Adam Algorithm.

It has been discussed in the paper that, without warmup heuristic, in the early stage of adaptive optimization / learning algorithms sometimes we can get undesirable large variance which can slow overall convergence process.

Authors proposed the idea of rectification of variance of adaptive learning rate when it is expected to be high.

Differing from the paper, we selected variance tractability cut-off as 5 instead of 4. This adjustment is common practice, and could be found in the code-repository and also tensorflow swift optim library as well :

2f03dd1970/radam/radam.py (L156)

f51ee4618d/Sources/TensorFlow/Optimizers/MomentumBased.swift (L638)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58968

Reviewed By: gchanan

Differential Revision: D29241736

Pulled By: iramazanli

fbshipit-source-id: 288b9b1f3125fdc6c7a7bb23fde1ea5c201c0448
2021-06-22 10:38:41 -07:00
Ilqar Ramazanli
9a622f4cd9 refactor ASGD to use functional API (#58410)
Summary:
Functional API is used in large scale distributed training to enable multithreaded training instead of multiprocess, as it gives more optimal resource utilization and efficiency.

In this PR, we provide code migration and refactoring for functional API for ASGD algorithm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58410

Reviewed By: ailzhang

Differential Revision: D28546702

Pulled By: iramazanli

fbshipit-source-id: 4f62b6037d53f35b19f98340e88af2ebb6243a4f
2021-05-19 18:55:52 -07:00
Wanchao Liang
4611387608 [optim] take kw-only argument for functional optim APIs (#56185)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56185

ghstack-source-id: 126670123

Reviewed By: albanD

Differential Revision: D27802169

fbshipit-source-id: f5e1cb2046dcdeecf5f6b0f70892828bf0adb22f
2021-04-15 20:08:04 -07:00
Wanchao Liang
8ef13cf976 [optim] refactor rprop to use functional API (#55832)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55832

ghstack-source-id: 126325541

Reviewed By: driazati

Differential Revision: D27703877

fbshipit-source-id: 34d4ce7b7d124c0cd75e2f6d0bc8f836713b7301
2021-04-15 15:19:41 -07:00
Wanchao Liang
bb245b6444 [optim] refactor adamax to use functional API (#55830)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55830

ghstack-source-id: 126325537

Reviewed By: driazati

Differential Revision: D26561017

fbshipit-source-id: 41273d200e546d4ac08d39b57865d63c624f143a
2021-04-15 15:19:39 -07:00
Vincent Quenneville-Belair
50d903f19f [optim] make functional api be private (#51316) (#51665)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51665

This reverts commit 896f82aa92.

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D26232608

Pulled By: vincentqb

fbshipit-source-id: ca006baf4fb672c11c1bb003c39a29cbadb63dd3
2021-02-03 17:59:05 -08:00