38 Commits

Author SHA1 Message Date
Gregory Chanan
fc82ec298b Update CosineAnnealingWarmRestarts to follow PyTorch 1.1+ Step Order. (#23833)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/23480.

I only verified that the schedule reaches the restart at the expected step, as specified in the issue; it would be good to have someone else verify correctness here.

Script:
```
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.5), T_0=1, T_mult=2)
for i in range(9):
    print(i)
    print(scheduler.get_lr())
    scheduler.step()
```
Output:
```
0
[0.5]
1
[0.5]
2
[0.25]
3
[0.5]
4
[0.42677669529663687]
5
[0.25]
6
[0.07322330470336313]
7
[0.5]
8
[0.4809698831278217]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23833

Differential Revision: D16657251

Pulled By: gchanan

fbshipit-source-id: 713973cb7cbfc85dc333641cbe9feaf917718eb9
2019-08-07 07:15:50 -07:00
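The output listed in this commit is consistent with evaluating the SGDR cosine formula in closed form at each epoch. The sketch below is a torch-free reconstruction under that assumption; `sgdr_lr` is a hypothetical helper written for illustration, not a function from `torch.optim.lr_scheduler`.

```python
import math


def sgdr_lr(epoch, base_lr, T_0, T_mult=1, eta_min=0.0):
    """Hypothetical helper: cosine-annealed lr with warm restarts at `epoch`."""
    T_i, T_cur = T_0, epoch
    # Walk through the completed cycles; each cycle is T_mult times longer.
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * T_cur / T_i)) / 2


# Same settings as the script in the commit message: lr=0.5, T_0=1, T_mult=2.
schedule = [sgdr_lr(i, base_lr=0.5, T_0=1, T_mult=2) for i in range(9)]
for i, lr in enumerate(schedule):
    print(i, lr)
```

With these settings the restarts (lr back at 0.5) fall on epochs 0, 1, 3 and 7, matching the listing above.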
Ejaaz Merali
fb9fbc009c Fix momentum bug in CyclicLR (#20401)
Summary:
Resolves issue https://github.com/pytorch/pytorch/issues/19003

The author of this issue also asked that `cycle_momentum` default to `False` if the optimizer does not have a momentum parameter, but I'm not sure what the best way to do this would be. Silently changing the value based on the optimizer may confuse the user in some cases (say the user explicitly set `cycle_momentum=True` but doesn't know that the Adam optimizer doesn't use momentum).

Maybe printing a warning when switching this argument's value would suffice?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20401

Differential Revision: D15765463

Pulled By: ezyang

fbshipit-source-id: 88ddabd9e960c46f3471f37ea46013e6b4137eaf
2019-06-11 15:10:28 -07:00
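The warning suggested in this commit message could look roughly like the sketch below. It is only an illustration of the idea: `resolve_cycle_momentum` is a hypothetical helper, and the dict passed in is assumed to resemble an optimizer's `defaults` mapping. It does not describe what CyclicLR actually does.

```python
import warnings


def resolve_cycle_momentum(optimizer_defaults, cycle_momentum=True):
    """Hypothetical helper: keep momentum cycling only if the optimizer has momentum."""
    if cycle_momentum and "momentum" not in optimizer_defaults:
        # Tell the user explicitly instead of silently changing the value.
        warnings.warn(
            "cycle_momentum=True was requested, but the optimizer has no "
            "'momentum' option; momentum cycling is disabled.",
            UserWarning,
        )
        return False
    return cycle_momentum


# SGD-like defaults keep the requested value; Adam-like defaults trigger the warning.
sgd_like = {"lr": 0.1, "momentum": 0.9}
adam_like = {"lr": 0.001, "betas": (0.9, 0.999)}
print(resolve_cycle_momentum(sgd_like))
```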
Edward Yang
3889855a5b Revert "Redefine scheduler to set learning rate using recursive formula" #14010 (#21463)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21463
ghimport-source-id: 1b0ea4a282b41388d5c6f6a5d18d37c14ae874ad

Differential Revision: D15747426

Pulled By: ezyang

fbshipit-source-id: 0708394f907b98a9f45bcfa26e5cc450fda8cf76
2019-06-10 15:26:25 -07:00
vfn
8ece538a79 Addresses bad behavior with overridden optimizer.step by #20124 (#21460)
Summary:
This PR addresses the problem described in this comment: https://github.com/pytorch/pytorch/pull/20203#issuecomment-499231276
and the bad behaviour that was coded previously:
- a warning was raised every time an lr scheduler was initialized

Now the code checks that:
- on the second call of `lr_scheduler.step`, `optimizer.step` has already been called; otherwise a warning is raised (as was done in #20203)
- if the optimizer's step is overridden, a different warning is raised once to make the user aware of the new pattern,
`opt.step()` -> `lrs.step()`, since the call order cannot be checked in that case.

Now the tests check that:
- at initialization (`lrs = StepLR(...)`) there are no warnings
- if we replace `optimizer.step` with something else (similarly to the [code of nvidia/apex](https://github.com/NVIDIA/apex/blob/master/apex/amp/_process_optimizer.py#L287)), another warning is raised.

cc ezyang

PS. Honestly, I would say that a lot of overhead is introduced for simple warnings. I hope all these checks will be removed in a future `1.2.0` or other version...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21460

Differential Revision: D15701776

Pulled By: ezyang

fbshipit-source-id: eac5712b9146d9d3392a30f6339cd33d90c497c7
2019-06-06 13:54:42 -07:00
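The call-order check described in this commit can be sketched without torch. The classes below are toy stand-ins written for illustration (`ToyOptimizer` and `OrderCheckingScheduler` are hypothetical names), and the sketch covers only the first check: the scheduler wraps the optimizer's `step` with a counter and warns on its second `step` call if the optimizer has not stepped yet. The overridden-`step` case is not modelled.

```python
import warnings


class ToyOptimizer:
    """Stand-in for an optimizer; step() does nothing here."""

    def step(self):
        pass


class OrderCheckingScheduler:
    """Hypothetical scheduler that only checks the opt.step() -> lrs.step() order."""

    def __init__(self, optimizer):
        self.optimizer = optimizer
        self._step_count = 0
        optimizer._step_count = 0
        original_step = optimizer.step

        def counted_step(*args, **kwargs):
            # Count optimizer steps so the scheduler can see whether one happened.
            optimizer._step_count += 1
            return original_step(*args, **kwargs)

        optimizer.step = counted_step
        self.step()  # initial step taken at construction; must not warn

    def step(self):
        self._step_count += 1
        # The second call is the first explicit call made by the user.
        if self._step_count == 2 and self.optimizer._step_count < 1:
            warnings.warn(
                "lr_scheduler.step() was called before optimizer.step()",
                UserWarning,
            )


opt = ToyOptimizer()
lrs = OrderCheckingScheduler(opt)
opt.step()  # new pattern: optimizer first ...
lrs.step()  # ... then scheduler, so no warning is raised
```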
vfdev
449a2c3555 Fixes #20124 (#20203)
Summary:
Fixes #20124

Description:
The code wraps the `optimizer.step()` method to detect whether the user is following the new pattern or the old one. If the old pattern is detected, a UserWarning is raised. The documentation is also updated to reflect the change:

![Screen Shot 2019-05-07 at 11 05 17](https://user-images.githubusercontent.com/2459423/57287527-04e63580-70b8-11e9-9ddd-5d159ef0ed2f.png)

cc SsnL, bado-lee
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20203

Differential Revision: D15543060

Pulled By: ezyang

fbshipit-source-id: 3605e1afdb6ffc2dfd5e75e92e01b967c4d065b5
2019-05-29 14:15:01 -07:00
njdalton
d190450a35 Fix typo in CyclicLR docs (#21021)
Summary:
Fixes a typo in the CyclicLR docs by adding the lr_scheduler directory, and adds the other required arguments.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21021

Differential Revision: D15530109

Pulled By: soumith

fbshipit-source-id: 98781bdab8d82465257229e50fa3bd0015da1286
2019-05-28 21:18:50 -07:00
Sam Pepose
082936f033 Clarify cycliclr param docs (#20880)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20880

This clarifies how the momentum parameters should be used.

Reviewed By: soumith

Differential Revision: D15482450

fbshipit-source-id: e3649a38876c5912cb101d8e404abca7c3431766
2019-05-28 12:07:47 -07:00
Edward Yang
74bdcd44c4 Remove tab. (#20715)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20715
ghimport-source-id: 600e244581b37152d86614cca6c9fb5fee6cdcde

Differential Revision: D15417984

Pulled By: ezyang

fbshipit-source-id: 939425de1a95ecc3798384e121b12faaba3a27b8
2019-05-20 11:57:18 -07:00
kirayue
d0c742134d #20028 (#20696)
Summary:
Hi, ezyang
Sorry to trouble you.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20696

Differential Revision: D15413694

Pulled By: ezyang

fbshipit-source-id: 1c19d18e00c3a66a52bb9230aa25d7530f6e659c
2019-05-20 07:51:55 -07:00
Edward Yang
839a69f587 Revert D15393514: [pytorch][PR] Refine CosineAnnealingWarmRestarts doc for issue #20028
Differential Revision:
D15393514

Original commit changeset: 03f270a577fc

fbshipit-source-id: 3633f4e9916bdadf018288a64df89078b14af563
2019-05-17 09:55:56 -07:00
kirayue
3c69c9a7fe Refine CosineAnnealingWarmRestarts doc for issue #20028 (#20267)
Summary:
Fixes #20028
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20267

Differential Revision: D15393514

Pulled By: ezyang

fbshipit-source-id: 03f270a577fc3e0414d3f07d97512a409b08f7cd
2019-05-17 09:02:28 -07:00
vfdev
61f1242b7f Formula typo fix (#20110)
Summary:
T_{cur + 1} -> T_{cur} + 1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20110

Differential Revision: D15218135

Pulled By: ezyang

fbshipit-source-id: fb914d977cac447867921510bf57b59e62e4f68c
2019-05-06 08:08:37 -07:00
Ricky Chen
57948414ac Fix small typo T_mul->T_mult
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20148

Differential Revision: D15217485

Pulled By: ezyang

fbshipit-source-id: cb183cdc2eb3e42c685ef024742a18745923d283
2019-05-06 06:32:33 -07:00
barrh
767c82e151 Initialize last_epoch in _LRScheduler.__init__() (#20059)
Summary:
Class attributes should preferably be explicitly initialized within
the __init__() call. Otherwise, overriding step() is
prone to bugs.

This patch partially reverts #7889
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20059

Differential Revision: D15195747

Pulled By: soumith

fbshipit-source-id: 3d1a51d8c725d6f14e3e91ee94c7bc7a7d6c1713
2019-05-02 22:38:12 -07:00
kirayue
af06d6342c Add SGDR(Stochastic Gradient Descent with Warm Restarts) scheduler (#17226)
Summary:
Because of a merge error with master in #15042, I opened a new PR for ezyang.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17226

Differential Revision: D14418145

Pulled By: mrshenli

fbshipit-source-id: 099ba225b28e6aba71760b81b2153ad1c40fbaae
2019-04-25 09:26:31 -07:00
Chandler Zuo
e3f1504621 Fix the Division by Zero Bug of CosineAnnealingLR (#19180)
Summary:
Added the formula for the corner case. Updated unit tests.

Fixes #17913
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19180

Differential Revision: D14942023

Pulled By: ezyang

fbshipit-source-id: 167c109b97a7830d5b24541dc91e4788d531feec
2019-04-23 09:54:28 -07:00
Bado Lee
36084908e4 Fix lr_scheduler's last_epoch value at the time of initialization (BC BREAKING!) (#7889)
Summary:
Hello everyone :) !!

I've found that lr_scheduler was initialized with last_epoch as -1.
As a result, even after the first step (the explicit scheduler step, not the one in init),
the learning rate of the scheduler's optimizer stays at its previous value.
```python
>>> import torch
>>> cc = torch.nn.Conv2d(10,10,3)
>>> myinitial_lr = 0.1
>>> myoptimizer = torch.optim.Adam(cc.parameters(), lr=myinitial_lr)
>>> mylrdecay = 0.5
>>> myscheduler = torch.optim.lr_scheduler.ExponentialLR(myoptimizer,mylrdecay)

>>> myscheduler.get_lr()
[0.2]    # this is because get_lr calculates lr as 0.1 * 0.5^-1
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1    # this is not consistent with get_lr value
>>> myscheduler.last_epoch
-1

>>> myscheduler.step()
>>> myscheduler.get_lr()
[0.1]    # this should be the value right after the init, not after first step
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1    # since this is after the first step, it should have decayed to 0.05
>>> myscheduler.last_epoch
0

>>> myscheduler.step()
>>> myscheduler.last_epoch
1
>>> myscheduler.get_lr()
[0.05]
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.05
>>> myscheduler.last_epoch
1
```

The first problem is that, even right after the init of lr_scheduler, you get inconsistent parameter values.

The second problem is that you are stuck with the same learning rate for the first 2 epochs if the step function of lr_scheduler is not called at the beginning of the epoch loop.
Of course, you can avoid this by calling lr_scheduler's step at the beginning,
but I don't think this is proper use since, in the case of the optimizer, step is called at the end of the iteration loop.

I've simply avoided all of the above issues by setting last_epoch to 0 after the initialization.

This also makes sense when you init with some value of last_epoch other than -1.
For example, if you want to init with last epoch 10,
lr should not be set as if it had decayed 1 step further, which is what
the previous code did because last_epoch gets +1 in
base_lr * self.gamma ** self.last_epoch

Instead, it should be set to the exact value for step 10.

I hope this fix finds its way in with all your help :)
I'm really excited and looking forward to becoming a contributor to pytorch!
Pytorch Rocks!!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/7889

Differential Revision: D15012769

Pulled By: ezyang

fbshipit-source-id: 258fc3009ea7b7390a3cf2e8a3682eafb506b08b
2019-04-23 08:54:09 -07:00
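The numbers in the session above follow directly from the closed form quoted in the message, `base_lr * self.gamma ** self.last_epoch`. A minimal sketch of that arithmetic (the function name `exponential_lr` is made up for illustration):

```python
def exponential_lr(base_lr, gamma, last_epoch):
    """Closed form quoted in the commit message."""
    return base_lr * gamma ** last_epoch


base_lr, gamma = 0.1, 0.5

# Before the fix, last_epoch is -1 right after init, so get_lr() reports 0.1 * 0.5 ** -1.
lr_at_minus_one = exponential_lr(base_lr, gamma, -1)
# With last_epoch == 0 the reported value equals the optimizer's actual lr.
lr_at_zero = exponential_lr(base_lr, gamma, 0)
# One explicit step later the lr has decayed once.
lr_at_one = exponential_lr(base_lr, gamma, 1)

print(lr_at_minus_one, lr_at_zero, lr_at_one)
```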
Edward Yang
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
Søren Rasmussen
95d3825e48 ReduceLrOnPlateau: best=current -> best=copy(current) (#16364) (#16697)
Summary:
Fixes #16364
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16697

Differential Revision: D14680879

Pulled By: soumith

fbshipit-source-id: c50c22f3eacea4474fb3a04fe85fbf11d5a177c9
2019-03-29 06:56:51 -07:00
Sam Pepose
8635078d9e Adds Cyclical Learning Rate and Momentum (#18001)
Summary:
This implements a cyclical learning rate (CLR) schedule with an optional inverse cyclical momentum. More info about CLR: https://github.com/bckenstler/CLR

This is finishing what #2016 started. Resolves #1909.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18001

Differential Revision: D14451845

Pulled By: sampepose

fbshipit-source-id: 8f682e0c3dee3a73bd2b14cc93fcf5f0e836b8c9
2019-03-27 19:56:04 -07:00
Chandler Zuo
096ee8467c Redefine scheduler to set learning rate using recursive formula (#14010)
Summary:
Modified step_lr for StepLR, MultiStepLR, ExponentialLR and CosineAnnealingLR. In this way, multiple schedulers can be used simultaneously to modify the learning rates.

Related issue: https://github.com/pytorch/pytorch/issues/13022

Added unit tests combining multiple schedulers.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14010

Reviewed By: ezyang

Differential Revision: D13494941

Pulled By: chandlerzuo

fbshipit-source-id: 7561270245639ba1f2c00748f8e4a5f7dec7160c
2018-12-18 16:44:31 -08:00
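One way to read the motivation: if each scheduler multiplies the current learning rate by a per-epoch factor, instead of overwriting it with a closed-form value, then several schedulers can act on the same learning rate and their effects compose. The sketch below illustrates that idea only; the helper names are hypothetical and it makes no claim about how the PR implemented it.

```python
def step_factor(epoch, step_size, gamma):
    """Factor a StepLR-like rule applies when entering `epoch`."""
    return gamma if epoch > 0 and epoch % step_size == 0 else 1.0


def exponential_factor(epoch, gamma):
    """Factor an ExponentialLR-like rule applies when entering `epoch`."""
    return gamma if epoch > 0 else 1.0


lr = 1.0
history = [lr]
for epoch in range(1, 5):
    # Each rule updates the current lr in place, so both decays are kept.
    lr *= step_factor(epoch, step_size=2, gamma=0.1)
    lr *= exponential_factor(epoch, gamma=0.5)
    history.append(lr)

print(history)
```

The resulting sequence equals the product of the two closed forms, `0.1 ** (epoch // 2) * 0.5 ** epoch`, which a scheduler that overwrites the lr with its own closed form could not produce.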
0phoff
294c065384 Changed serialization mechanism of LambdaLR scheduler (#9927)
Summary:
I opened an issue explaining some of my frustrations with the current state of schedulers.
While most points that I raised in [that issue](https://github.com/pytorch/pytorch/issues/8741#issuecomment-404449697) need to be discussed more thoroughly before being implemented, there are some that are not so difficult to fix.

This PR changes the way the LambdaLR scheduler gets serialized:
> The lr_lambda functions are only saved if they are callable objects (which can be stateful).
> There is no point in saving functions/lambdas as you need their definition before unpickling and they are stateless.

This has the big advantage that the scheduler is serializable, even if you use lambda functions or locally defined functions (aka a function in a function).

Does this functionality need any unit tests?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9927

Differential Revision: D9055505

Pulled By: soumith

fbshipit-source-id: 6c1cec588beedd098ec7d2bce6a9add27f29e48f
2018-07-31 19:39:06 -07:00
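The rule quoted above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `lambdas_state` and `WarmupFactor` are hypothetical names, and "callable object" is taken to mean anything that is not a plain function or lambda.

```python
import types


class WarmupFactor:
    """Stateful callable object: its attributes are worth saving."""

    def __init__(self, warmup_epochs):
        self.warmup_epochs = warmup_epochs

    def __call__(self, epoch):
        return min(1.0, (epoch + 1) / self.warmup_epochs)


def lambdas_state(lr_lambdas):
    """Hypothetical helper: save state only for callable objects."""
    saved = []
    for fn in lr_lambdas:
        if isinstance(fn, types.FunctionType):
            # Plain functions and lambdas are stateless; store a placeholder.
            saved.append(None)
        else:
            saved.append(dict(fn.__dict__))
    return saved


state = lambdas_state([lambda epoch: 0.95 ** epoch, WarmupFactor(5)])
print(state)
```

Because the lambda itself never enters the saved state, the result contains only picklable data even when the schedule was defined with a lambda or a local function.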
Tongzhou Wang
27455e9c78 Use _six for inf and nan (#9500)
Summary:
Things like `float('inf')` are actually quite expensive.
```py
In [1]: import math

In [2]: %timeit -n 200 math.inf
49.3 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 200 loops each)

In [3]: %timeit -n 200 float('inf')
194 ns ± 39.1 ns per loop (mean ± std. dev. of 7 runs, 200 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9500

Reviewed By: soumith

Differential Revision: D8876229

Pulled By: SsnL

fbshipit-source-id: 78602b76bb53d5588910b58270930c0bd413d2d7
2018-07-18 10:40:29 -07:00
Ailing
5a3f7810f8 _LRSchedulers getstate include optimizer info (#7757)
* getstate should include optimizer

* remove getstate/setstate functions
2018-05-23 11:43:42 -04:00
Changhan Wang
a257bd19a2 added state_dict/load_state_dict for ReduceLROnPlateau (#7201)
2018-05-10 12:02:28 +02:00
Richard Zou
3369828bfa Clarify patience in ReduceLROnPlateau docs (#7242)
* Clarify patience in ReduceLROnPlateau docs

It's unclear which definition of patience we have. The two ways to
interpret it are:
- How many bad epochs can you see before you start considering changing the learning rate.
- How many bad epochs can you see before you change the learning rate.

This PR clarifies the docs with an example. If `patience = 2`, then
after 2 bad epochs, we begin considering changing the learning rate.
After seeing one more epoch (the 3rd epoch), if that epoch is also bad,
then we change the learning rate after it.

* address comments
2018-05-04 16:39:26 -04:00
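The `patience = 2` example can be traced with a small counter. The sketch below assumes the simplest form of the rule (lower is better, no threshold or cooldown) and uses a hypothetical helper, `reduction_epochs`, that returns the 1-based epochs after which the learning rate would be reduced.

```python
def reduction_epochs(losses, patience):
    """Hypothetical helper: epochs (1-based) after which the lr is reduced."""
    best = float("inf")
    num_bad_epochs = 0
    reduced_at = []
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            num_bad_epochs = 0
        else:
            num_bad_epochs += 1
        # Tolerate `patience` bad epochs; reduce on the one after that.
        if num_bad_epochs > patience:
            reduced_at.append(epoch)
            num_bad_epochs = 0
    return reduced_at


# Epoch 1 sets the best value; epochs 2, 3 and 4 are bad.
result = reduction_epochs([1.0, 1.2, 1.2, 1.2, 1.2], patience=2)
print(result)
```

Here the reduction lands after epoch 4, which is the third bad epoch, in line with the second interpretation plus one extra epoch as described above.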
Armen
e44f901b55 added functionality for state_dict/load_state_dict for lr_scheduler ( Fixes: #3026 ) (#6342)
* added functionality for state_dict/load_state_dict for lr_scheduler

* fixed linting issues/removed unused import

* refactor lr_scheduler state_dicts/state_dict holds everything __dict__ but optimizer

* changed documentation in lr_scheduler

* Update lr_scheduler.py
2018-04-19 07:09:03 -04:00
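The third bullet, a state dict that holds everything in `__dict__` except the optimizer, can be sketched on a toy class. `ToyScheduler` is a hypothetical stand-in used only to show the pattern.

```python
class ToyScheduler:
    """Stand-in scheduler used only to illustrate the state_dict pattern."""

    def __init__(self, optimizer, gamma):
        self.optimizer = optimizer
        self.gamma = gamma
        self.last_epoch = 0

    def state_dict(self):
        # Everything in __dict__ except the optimizer reference.
        return {k: v for k, v in self.__dict__.items() if k != "optimizer"}

    def load_state_dict(self, state_dict):
        self.__dict__.update(state_dict)


saved = ToyScheduler(optimizer=object(), gamma=0.5)
saved.last_epoch = 7
checkpoint = saved.state_dict()

# A fresh scheduler keeps its own optimizer and restores the rest.
restored = ToyScheduler(optimizer=object(), gamma=0.9)
restored.load_state_dict(checkpoint)
print(checkpoint)
```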
Marcin Elantkowski
d2ff733cb1 Make ReduceLROnPlateau serializable. (#5300)
* replace lambdas with partial

* flake8
2018-02-20 00:59:14 -05:00
Martin Drawitsch
1fdb3929c9 Fixes for docstrings/sphinx rendering of CosineAnnealingLR and Local Response Normalization (#5254)
* Fix LaTeX rendering in CosineAnnealingLR

Backslashes were interpreted by Python as escapes in the string, so \frac
was mangled (Python reads \f as a form feed) and no longer formed a valid LaTeX command.
This could be fixed with double backslashes, but the easiest solution is to
just use a raw (r) docstring.

* Fix sphinx warnings for LRN doc headings

* Move LRN docstring from __init__ to class level

The docstring was not rendered by sphinx at
http://pytorch.org/docs/master/nn.html#torch.nn.LocalResponseNorm
because it was in the constructor.

* Remove superfluous backticks from LRN formula
2018-02-15 10:29:02 -05:00
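The escaping problem behind the first bullet is easy to reproduce in isolation. The short sketch below compares a normal string literal with a raw one; the variable names are illustrative.

```python
plain = "\frac{1}{2}"  # "\f" is parsed as the form-feed escape
raw = r"\frac{1}{2}"   # a raw string keeps the backslash

print(repr(plain))
print(repr(raw))
```

The raw literal is one character longer because it still contains the backslash followed by `f`, which is what the math renderer needs to see.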
nguyen-binh-minh
188ee3ff0b Fix wrong learning rate evaluation in CosineAnnealingLR in Python 2 (#4656)
2018-01-14 13:10:41 +01:00
Jon Crall
f94f5723e7 fixed spelling (#4598)
2018-01-10 18:48:14 -05:00
Richard Zou
fe70823f8e Fix StepLR docs (#4478)
2018-01-04 12:37:26 -05:00
Kai Arulkumaran
e9ef20eab5 Add Cosine Annealing LR Scheduler (#3311)
* Add Cosine Annealing LR Scheduler

* Update eta_min in tests to prevent numerical mistakes

* Use non-zero min_eta in test_cos_anneal_lr
2017-12-18 02:43:08 -05:00
Ozan Çağlayan
dd6d04ddf2 doc: Normalize all true/false in docstrings to `True|False` (#3593)
* doc: Normalize all true/false in docstrings to ``True|False``

This makes them more apparent in the documentation.

* doc: fix flake8
2017-11-09 08:12:29 -05:00
Tzu-Wei Huang
6bcbecfb97 fix doc of lr_scheduler (#2280)
* resolves #1991

* fix typo
2017-08-24 17:04:53 -04:00
Quan Vuong
c5a9aa027b fix wrong path to ReduceLROnPlateau in docstring
2017-08-19 10:27:58 -04:00
Leonid Vlasenkov
46a868dab7 [Ready] Limit docs line length (#1900)
* some docs are ready

* docs

* docs

* fix some more

* fix some more
2017-07-10 10:24:54 -04:00
Jiaming Liu
630af4d7d8 add learning rate schedulers (#1370)
2017-05-25 16:21:43 -04:00