Commit Graph

8 Commits

Michael Lazos
b1d2028eb0 Add compiled optimizer test for nadam (#109548)
Fixes #ISSUE_NUMBER
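
A minimal sketch of what such a test might check (hedged; the real test lives in test/inductor/test_compiled_optimizers.py, and this eager-vs-compiled pairing is an illustration, not the exact test code): step one NAdam instance eagerly and a twin under torch.compile, then compare the parameters.

```python
import torch

# Two identical parameter sets: one stepped eagerly, one under torch.compile.
params_eager = [torch.rand(2, 3, device="cuda", requires_grad=True)]
params_compiled = [p.detach().clone().requires_grad_(True) for p in params_eager]
for p, q in zip(params_eager, params_compiled):
    p.grad = torch.ones_like(p)
    q.grad = torch.ones_like(q)

opt_eager = torch.optim.NAdam(params_eager, lr=0.01)
opt_compiled = torch.optim.NAdam(params_compiled, lr=0.01)

opt_eager.step()
torch.compile(opt_compiled.step)()  # compile the optimizer step itself

for p, q in zip(params_eager, params_compiled):
    torch.testing.assert_close(p, q)  # both paths must produce the same update
```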

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109548
Approved by: https://github.com/janeyx99
2023-09-19 22:54:36 +00:00
Michael Lazos
b193f295b6 Add capturable ASGD impl (#107857)
Add capturable ASGD impl + test
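
A hedged usage sketch, assuming a PyTorch build that includes this PR: capturable=True keeps the optimizer state (including the step count) as device tensors rather than Python/CPU scalars, which is what CUDA graph capture requires.

```python
import torch

params = [torch.rand(8, device="cuda", requires_grad=True)]
for p in params:
    p.grad = torch.ones_like(p)

# capturable=True keeps ASGD state ("step", "eta", "mu", "ax") on the GPU.
opt = torch.optim.ASGD(params, lr=0.01, capturable=True)
opt.step()
print(opt.state[params[0]]["step"].device)  # cuda:0
```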

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107857
Approved by: https://github.com/janeyx99
2023-09-07 06:30:30 +00:00
Michael Lazos
49df1de383 Cudagraphs support for compiled optimizers (#107504)
Marks all params/optimizer state as static addresses and adds a finalizer that cleans up the graph attributes when the optimizer goes out of scope.

**Note:** this does not mark grads as static, because doing so would increase memory usage significantly.

There are two cases:
1. The upstream graph is cudagraphed - this case works fine out of the box.
2. The upstream graph is not cudagraphed - in this case, many copies will be introduced from upstream (to copy the grads) into cudagraph-owned memory, unless the user explicitly marks the grads as static. Doing so also requires that zero_grad() (either the module or optimizer version) not deallocate the grads, i.e. set them to zero rather than None, as sketched below. There is a PR (https://github.com/pytorch/pytorch/pull/107853) in flight to throw an error if zero_grad attempts to set static grads to None.
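
A hedged sketch of the second case, assuming torch._dynamo.mark_static_address and the "reduce-overhead" compile mode are available (as in recent PyTorch builds):

```python
import torch

model = torch.nn.Linear(16, 16, device="cuda")
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# "reduce-overhead" enables cudagraphs in inductor.
compiled_step = torch.compile(opt.step, mode="reduce-overhead")

model(torch.randn(8, 16, device="cuda")).sum().backward()

# Mark grads static so they are not repeatedly copied into
# cudagraph-owned memory on every step.
for p in model.parameters():
    if p.grad is not None:
        torch._dynamo.mark_static_address(p.grad)

compiled_step()
# Static grads must not be freed: zero them instead of setting them to None.
opt.zero_grad(set_to_none=False)
```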

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107504
Approved by: https://github.com/eellison
2023-08-31 20:47:18 +00:00
Michael Lazos
5cbd3fc412 [Inductor] Fuse non-foreach ops with foreach ops without iterating over all subnodes (#106008)
Previously, when fusing a single node into a foreach op, the scheduler would iterate over each subnode and check whether it could be fused. This PR adds a mapping so that the node to fuse with can be found more quickly by checking dependencies.
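
An illustrative sketch of the idea, not the actual scheduler code (Node, ForeachNode, and the helper names here are hypothetical): build a one-time map from each buffer written inside a foreach op to that op, so a fusion candidate is found by a dependency lookup rather than a scan over every subnode.

```python
from typing import Dict, List, Optional

class Node:
    """A scheduler node (hypothetical stand-in) with read/write buffer names."""
    def __init__(self, name: str, reads: List[str], writes: List[str]):
        self.name, self.reads, self.writes = name, reads, writes

class ForeachNode:
    """A horizontally fused (foreach) group of subnodes."""
    def __init__(self, subnodes: List[Node]):
        self.subnodes = subnodes

def build_write_map(foreach_nodes: List[ForeachNode]) -> Dict[str, ForeachNode]:
    # One-time pass: buffer name -> the foreach node that produces it.
    write_map: Dict[str, ForeachNode] = {}
    for fe in foreach_nodes:
        for sub in fe.subnodes:
            for buf in sub.writes:
                write_map[buf] = fe
    return write_map

def find_fusion_candidate(node: Node,
                          write_map: Dict[str, ForeachNode]) -> Optional[ForeachNode]:
    # O(#reads) dict lookups instead of iterating every subnode
    # of every foreach op.
    for buf in node.reads:
        if buf in write_map:
            return write_map[buf]
    return None
```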

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106008
Approved by: https://github.com/jansel
2023-07-27 21:40:24 +00:00
Jane Xu
5fec1f93dc Add meta registration for foreach_maximum_.List (#105864)
Fixes compilation issues when amsgrad is True for Adam(W); see related failures in https://github.com/pytorch/benchmark/actions/runs/5628705163/job/15252867793

Also did some refactoring to deduplicate common registrations.

Test plan:
python test/inductor/test_compiled_optimizers.py -k test_adam
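
A hedged way to see what the registration enables: with a meta kernel in place, the op runs on meta tensors (shapes/dtypes only, no data), which is what the compiler needs to trace it.

```python
import torch

# Meta tensors carry only shapes/dtypes; ops on them need a meta kernel.
a = [torch.empty(4, device="meta") for _ in range(3)]
b = [torch.empty(4, device="meta") for _ in range(3)]
torch._foreach_maximum_(a, b)  # in-place; succeeds once a meta registration exists
```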

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105864
Approved by: https://github.com/albanD, https://github.com/mlazos
2023-07-25 00:39:13 +00:00
Michael Lazos
690ea933ca Enable more e2e foreach optimizer compilation tests (#105438)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105438
Approved by: https://github.com/jansel
2023-07-20 02:41:19 +00:00
Michael Lazos
4063158df9 Enable running compiled optimizers in CI (#104888)
As the title says.

for reference: this is a followup to https://github.com/pytorch/pytorch/pull/104121

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104888
Approved by: https://github.com/janeyx99
2023-07-10 23:45:41 +00:00
Michael Lazos
a290cbf32b Enable fused foreach Adam compilation (#104121)
Fixes #ISSUE_NUMBER
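
A hedged sketch of what this enables, using the public torch.optim.Adam flags: compile the step of an Adam instance that uses the multi-tensor (foreach) implementation, so inductor can trace and fuse it.

```python
import torch

model = torch.nn.Linear(32, 32, device="cuda")
# foreach=True selects the multi-tensor Adam implementation.
opt = torch.optim.Adam(model.parameters(), lr=1e-3, foreach=True)

model(torch.randn(4, 32, device="cuda")).sum().backward()
torch.compile(opt.step)()  # the foreach step is traced and lowered by inductor
```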

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104121
Approved by: https://github.com/janeyx99
2023-07-05 23:40:03 +00:00