Commit Graph

22 Commits

Author SHA1 Message Date
Sam Estep
8c798e0622 Forbid trailing whitespace (#53406)
Summary:
Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857

These are the only hand-written parts of this diff:
- the addition to `.github/workflows/lint.yml`
- the file endings changed in these four files (to appease FB-internal land-blocking lints):
  - `GLOSSARY.md`
  - `aten/src/ATen/core/op_registration/README.md`
  - `scripts/README.md`
  - `torch/csrc/jit/codegen/fuser/README.md`

The rest was generated by running this command (on macOS):
```
git grep -I -l ' $' -- . ':(exclude)**/contrib/**' ':(exclude)third_party' | xargs gsed -i 's/ *$//'
```

I looked over the auto-generated changes and didn't see anything that looked problematic.
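For illustration only, a rough Python sketch of the same cleanup is below; it is not part of this PR, and it strips all trailing whitespace (including tabs) rather than only spaces:

```python
import os

# Hypothetical stand-in for the shell pipeline above: strip trailing
# whitespace from text files, skipping .git plus the directories the
# git grep command excludes.
EXCLUDED = {".git", "third_party", "contrib"}

def strip_trailing_whitespace(root="."):
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    lines = f.readlines()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files, like grep -I
            # note: rstrip() also normalizes a missing final newline
            cleaned = [line.rstrip() + "\n" for line in lines]
            if cleaned != lines:
                with open(path, "w", encoding="utf-8") as f:
                    f.writelines(cleaned)

if __name__ == "__main__":
    strip_trailing_whitespace()
```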

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406

Test Plan:
This run (after adding the lint but before removing existing trailing spaces) failed:
- https://github.com/pytorch/pytorch/runs/2043032377

This run (on the tip of this PR) succeeded:
- https://github.com/pytorch/pytorch/runs/2043296348

Reviewed By: walterddr, seemethere

Differential Revision: D26856620

Pulled By: samestep

fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97
2021-03-05 17:22:55 -08:00
senius
e7dbaa252e Update optim.rst for better understanding (#45944)
Summary:
The loop variable `i` on line 272 may be ambiguous; I think it should be renamed to `epoch`.
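For context, the scheduler example in question follows this shape, with the loop variable serving as the epoch counter (a minimal sketch, not the exact snippet from `optim.rst`):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.9)

for epoch in range(20):  # renamed from `i` so its role is unambiguous
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the learning rate schedule once per epoch
```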

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45944

Reviewed By: agolynski

Differential Revision: D24219486

Pulled By: vincentqb

fbshipit-source-id: 2af0408594613e82a1a1b63971650cabde2b576e
2020-10-14 09:36:06 -07:00
mattip
8c653e05ff DOC: fail to build if there are warnings (#41335)
Summary:
Merge after gh-41334 and gh-41321 (EDIT: both are merged).
Closes gh-38011

This is the last in a series of PRs to build documentation without warnings. It adds `-W -T --keep-going` to the Sphinx build, which will [fail the build if there are warnings](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-W), print a [traceback on error](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-T), and [finish the build](https://www.sphinx-doc.org/en/master/man/sphinx-build.html#cmdoption-sphinx-build-keep-going) even when there are warnings.
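As a rough illustration (not the actual build-system change in this PR; the builder and paths are assumptions), the strict build amounts to an invocation like:

```python
import subprocess
import sys

# Hypothetical strict docs build:
#   -W            turn warnings into errors, failing the build
#   -T            print a full traceback on error
#   --keep-going  with -W, finish the build and report every warning
cmd = [
    "sphinx-build", "-b", "html",
    "-W", "-T", "--keep-going",
    "docs/source",      # assumed source directory
    "docs/build/html",  # assumed output directory
]
sys.exit(subprocess.run(cmd).returncode)
```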

It should fail now, but pass once the PRs mentioned at the top are merged.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41335

Reviewed By: pbelevich

Differential Revision: D22794425

Pulled By: mruberry

fbshipit-source-id: eb2903e50759d1d4f66346ee2ceebeecfac7b094
2020-07-28 22:33:44 -07:00
Pavel Izmailov
509c18a096 Documentation for torch.optim.swa_utils (#41228)
Summary:
This PR adds a description of `torch.optim.swa_utils` (added in https://github.com/pytorch/pytorch/pull/35032) to the docs at `docs/source/optim.rst`. Please let me know what you think!
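For readers of this log, the documented workflow looks roughly like the following (a minimal sketch of the `swa_utils` API with illustrative values, not the exact snippet added to the docs):

```python
import torch
from torch import nn, optim
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
swa_model = AveragedModel(model)             # running average of the weights
swa_scheduler = SWALR(optimizer, swa_lr=0.05)

loader = [(torch.randn(4, 10), torch.randn(4, 2)) for _ in range(8)]
swa_start = 5
for epoch in range(10):
    for x, y in loader:
        optimizer.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        optimizer.step()
    if epoch >= swa_start:
        swa_model.update_parameters(model)   # fold current weights into the average
        swa_scheduler.step()

update_bn(loader, swa_model)  # recompute BatchNorm statistics for the averaged model
```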

vincentqb andrewgordonwilson

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41228

Reviewed By: ngimel

Differential Revision: D22609451

Pulled By: vincentqb

fbshipit-source-id: 8dd98102c865ae4a074a601b047072de8cc5a5e3
2020-07-27 17:52:16 -07:00
Tongzhou Wang
d0af07ca4c Fix capitalization inconsistency in optim.rst
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30608

Differential Revision: D18808516

Pulled By: ezyang

fbshipit-source-id: 4be68be9a8c8c3da7a0b98162bc1050b588fab43
2019-12-04 08:17:03 -08:00
Pritam Damania
5d69bc1eda Add docs for distributed optimizer. (#29971)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29971

ghstack-source-id: 94132160

Test Plan: waitforbuildbot

Differential Revision: D18554631

fbshipit-source-id: c4485f7cff5159f423d0f35d1caf71074b62dc28
2019-11-18 18:51:26 -08:00
Vincent Quenneville-Belair
e4f40bf3b2 Add multiplicative lr. (#27254)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27254

`MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.
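A minimal usage sketch (the factor and learning rate are illustrative, not from this diff):

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiplicativeLR

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Like LambdaLR, but the function returns a factor that the current lr is
# multiplied by at each step, rather than a factor applied to the initial lr.
scheduler = MultiplicativeLR(optimizer, lr_lambda=lambda epoch: 0.95)

for epoch in range(5):
    optimizer.step()   # training step would go here
    scheduler.step()   # lr <- lr * 0.95
```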

Test Plan: Imported from OSS

Differential Revision: D17728088

Pulled By: vincentqb

fbshipit-source-id: 1c4a8e19a4f24c87b5efccda01630c8a970dc5c9
2019-10-23 11:38:45 -07:00
TortillasAlfred
38e4766349 Add CosineAnnealingWarmRestarts to optim documentation (#25421)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/20028.
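For reference, the scheduler being documented is used roughly like this (values are illustrative):

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
# Cosine-anneal the lr, restarting every T_0 epochs and doubling the period
# after each restart (T_mult=2).
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

for epoch in range(30):
    optimizer.step()   # training step would go here
    scheduler.step()
```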
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25421

Differential Revision: D17221542

Pulled By: soumith

fbshipit-source-id: 9c83c9ad6bf34ba59713c61485e4ef4b782a2792
2019-09-05 19:06:18 -07:00
Vincent Quenneville-Belair
05f1fed693 Add OneCycleLR (#25324)
Summary:
Squash rebase of https://github.com/pytorch/pytorch/issues/21258
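For reference, `OneCycleLR` is stepped once per batch rather than per epoch; a minimal sketch with illustrative values:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import OneCycleLR

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01)
loader = [torch.randn(4, 10) for _ in range(25)]

# One cycle over the whole run: the lr ramps up to max_lr, then anneals back down.
scheduler = OneCycleLR(optimizer, max_lr=0.1,
                       steps_per_epoch=len(loader), epochs=3)

for epoch in range(3):
    for batch in loader:
        optimizer.zero_grad()
        model(batch).sum().backward()
        optimizer.step()
        scheduler.step()  # stepped after every batch
```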

ghstack-source-id: 7d3ce522ac4dd3050bc6c6bbda1eaaeb8bc4b2c1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25324
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25325

Differential Revision: D17095722

Pulled By: vincentqb

fbshipit-source-id: 7fe69b210924ee3b39223dd78122aea61267234a
2019-08-28 16:59:40 -07:00
Michael Acar
a4b2f3e213 Implement AdamW optimizer (#21250)
Summary:
# What is this?
This is an implementation of the AdamW optimizer as implemented in [the fastai library](803894051b/fastai/callback.py) and as initially introduced in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101). It decouples the weight decay regularization step from the optimization step during training.

There have already been several abortive attempts to push this into pytorch in some form or fashion: https://github.com/pytorch/pytorch/pull/17468, https://github.com/pytorch/pytorch/pull/10866, https://github.com/pytorch/pytorch/pull/3740, https://github.com/pytorch/pytorch/pull/4429. Hopefully this one goes through.
# Why is this important?
Via a simple reparameterization, it can be shown that L2 regularization has a weight decay effect in the case of SGD optimization. Because of this, L2 regularization became synonymous with the concept of weight decay. However, it can be shown that the equivalence of L2 regularization and weight decay breaks down for more complex adaptive optimization schemes. It was shown in the paper [Decoupled Weight Decay Regularization](https://arxiv.org/abs/1711.05101) that this is the reason why models trained with SGD achieve better generalization than those trained with Adam. Weight decay is a very effective regularizer. L2 regularization, in and of itself, is much less effective. By explicitly decaying the weights, we can achieve state-of-the-art results while also taking advantage of the quick convergence properties that adaptive optimization schemes have.
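At the call site the switch is a one-liner; a minimal sketch (hyperparameter values are illustrative):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 2)

# With optim.Adam, weight_decay is folded into the gradient (L2 regularization);
# optim.AdamW decays the weights directly, as described above.
optimizer = optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

for _ in range(5):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
```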
# How was this tested?
There were test cases added to `test_optim.py` and I also ran a [little experiment](https://gist.github.com/mjacar/0c9809b96513daff84fe3d9938f08638) to validate that this implementation is equivalent to the fastai implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21250

Differential Revision: D16060339

Pulled By: vincentqb

fbshipit-source-id: ded7cc9cfd3fde81f655b9ffb3e3d6b3543a4709
2019-07-02 09:09:10 -07:00
vfdev
449a2c3555 Fixes #20124 (#20203)
Summary:
Fixes #20124

Description:
The code wraps the `optimizer.step()` method to detect whether the user is following the new pattern or the old one. If the old pattern is detected, a `UserWarning` is raised. The documentation is also updated to reflect the change:

![Screen Shot 2019-05-07 at 11 05 17](https://user-images.githubusercontent.com/2459423/57287527-04e63580-70b8-11e9-9ddd-5d159ef0ed2f.png)
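The pattern being enforced is to call `optimizer.step()` before `lr_scheduler.step()`; a minimal sketch of the new ordering (scheduler choice and values are illustrative):

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)

for epoch in range(10):
    optimizer.zero_grad()
    model(torch.randn(4, 10)).sum().backward()
    optimizer.step()   # new pattern: optimizer first ...
    scheduler.step()   # ... then the scheduler; the reverse order now warns
```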

cc SsnL, bado-lee
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20203

Differential Revision: D15543060

Pulled By: ezyang

fbshipit-source-id: 3605e1afdb6ffc2dfd5e75e92e01b967c4d065b5
2019-05-29 14:15:01 -07:00
Tongzhou Wang
4f5e72600e fix lint in optim doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18883

Differential Revision: D14793365

Pulled By: ezyang

fbshipit-source-id: c1b46c98e3319badec3e0e772d0ddea24cbf9c89
2019-04-04 19:08:13 -07:00
Sam Pepose
8635078d9e Adds Cyclical Learning Rate and Momentum (#18001)
Summary:
This implements a cyclical learning rate (CLR) schedule with an optional inverse cyclical momentum. More info about CLR: https://github.com/bckenstler/CLR
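A minimal sketch of the resulting API (values are illustrative); note that momentum cycling needs an optimizer that uses momentum, such as SGD:

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CyclicLR

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# The lr cycles between base_lr and max_lr; momentum cycles inversely when
# cycle_momentum=True (the default).
scheduler = CyclicLR(optimizer, base_lr=0.001, max_lr=0.01,
                     step_size_up=20, cycle_momentum=True)

for batch_idx in range(100):
    optimizer.zero_grad()
    model(torch.randn(4, 10)).sum().backward()
    optimizer.step()
    scheduler.step()  # CyclicLR is stepped per batch
```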

This finishes what #2016 started. Resolves #1909.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18001

Differential Revision: D14451845

Pulled By: sampepose

fbshipit-source-id: 8f682e0c3dee3a73bd2b14cc93fcf5f0e836b8c9
2019-03-27 19:56:04 -07:00
livc
ecc5e623a2 fix punctuation
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17973

Differential Revision: D14438725

Pulled By: zou3519

fbshipit-source-id: 30a5485b508b4ae028057e0b66a8abb2b163d66b
2019-03-13 08:14:30 -07:00
Kai Arulkumaran
e9ef20eab5 Add Cosine Annealing LR Scheduler (#3311)
* Add Cosine Annealing LR Scheduler

* Update eta_min in tests to prevent numerical mistakes

* Use non-zero min_eta in test_cos_anneal_lr
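For reference, the scheduler added here is used roughly like this (`T_max` and `eta_min` values are illustrative):

```python
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR

model = nn.Linear(10, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1)
# Anneal the lr from 0.1 down to eta_min over T_max epochs along a cosine curve.
scheduler = CosineAnnealingLR(optimizer, T_max=50, eta_min=1e-5)

for epoch in range(50):
    optimizer.step()   # training step would go here
    scheduler.step()
```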
2017-12-18 02:43:08 -05:00
SsnL
e2f33eb6a2 add doc for sparse_adam (#3519) 2017-11-06 18:37:15 -05:00
SsnL
9260f0e5ee Fix a typo in optim.rst (#3069) 2017-10-11 16:47:14 +02:00
SsnL
828048f578 Add document on how Module.cuda() and optims should work together (#3056) 2017-10-10 22:55:23 -04:00
Jiaming Liu
630af4d7d8 add learning rate schedulers (#1370) 2017-05-25 16:21:43 -04:00
Adam Paszke
f8ae34706e Port L-BFGS from Lua optim 2017-01-22 18:02:40 -05:00
Adam Paszke
604e13775f Add optim docs 2017-01-16 12:59:47 -05:00
Sam Gross
126a1cc398 Add Sphinx docs 2016-12-28 00:03:39 +01:00