pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
joncrall	ad782ff7df	Enable xdoctest runner in CI for real this time (#83816 ) Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816 Approved by: https://github.com/ezyang, https://github.com/malfet	2022-12-29 05:32:42 +00:00
Adrian Wälchli	f5e20d6060	Make the state dict of CyclicLR scheduler pickleable (#91400 ) Fixes #90414 This PR drops the unpicklable `weakref.WeakMethod` object from CyclicLR scheduler from the state dict, and re-inits the object again once the state dict gets loaded. This makes the state picklable so you can include it in your checkpoint. Also fixes https://github.com/Lightning-AI/lightning/issues/15901 A simple test was added that `pickle.dumps(state)` the state. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91400 Approved by: https://github.com/albanD	2022-12-28 18:05:24 +00:00
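The approach described in this commit (leave the weak reference out of the saved state, then rebuild it on load) can be sketched without PyTorch. The class `ToyCyclicScheduler` and its attribute names are hypothetical stand-ins, not the actual `CyclicLR` code; the sketch only illustrates the pattern.

```python
import pickle
import weakref


class ToyCyclicScheduler:
    """Hypothetical stand-in for a scheduler that keeps a weak reference to its own method."""

    def __init__(self, base_lr=0.1):
        self.base_lr = base_lr
        self.last_epoch = 0
        self._init_scale_fn()

    def _init_scale_fn(self):
        # A weak reference avoids a reference cycle, but it is not meant to be pickled.
        self._scale_fn_ref = weakref.WeakMethod(self._triangular_scale)

    def _triangular_scale(self, x):
        return 1.0

    def state_dict(self):
        # Copy every attribute except the weak reference.
        return {k: v for k, v in self.__dict__.items() if k != "_scale_fn_ref"}

    def load_state_dict(self, state):
        self.__dict__.update(state)
        # Re-create the weak reference that was left out of the saved state.
        self._init_scale_fn()


scheduler = ToyCyclicScheduler(base_lr=0.05)
scheduler.last_epoch = 7
blob = pickle.dumps(scheduler.state_dict())

restored = ToyCyclicScheduler()
restored.load_state_dict(pickle.loads(blob))
```

After the round trip, `restored` carries the saved `base_lr` and `last_epoch`, and its weak reference resolves to a working bound method again.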
Jane Xu	0a69c50a46	Publicly expose _LRScheduler to LRScheduler (#88503 ) Fixes #61232 Pull Request resolved: https://github.com/pytorch/pytorch/pull/88503 Approved by: https://github.com/soulitzer	2022-11-07 21:15:10 +00:00
mikael10j	7dcfbedce0	Fix LinearLR scheduler start_factor (#86695 ) Fixes #86454 The `start_factor` must lie in the half-open interval (0, 1] rather than [0, 1] to avoid division by 0. This PR changes the lower-bound check on the parameter. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86695 Approved by: https://github.com/albanD	2022-10-13 17:31:36 +00:00
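A minimal sketch of why a zero `start_factor` is a problem, assuming the scheduler updates the rate by the ratio of two consecutive linear factors. The function names and the exact form of the ratio are assumptions for illustration, not the library's code.

```python
def linear_factor_ratio(step, start_factor, end_factor=1.0, total_iters=5):
    """Ratio between the multiplicative factors at `step` and `step - 1`."""
    delta = end_factor - start_factor
    return 1.0 + delta / (total_iters * start_factor + (step - 1) * delta)


def check_start_factor(start_factor):
    # The lower bound is exclusive: 0 is rejected, 1 is allowed.
    if start_factor <= 0 or start_factor > 1.0:
        raise ValueError("start_factor must be in (0, 1]")
    return start_factor


try:
    # With start_factor == 0 the denominator is 0 at the first step.
    linear_factor_ratio(step=1, start_factor=0.0)
    zero_start_failed = False
except ZeroDivisionError:
    zero_start_failed = True
```

Rejecting the value up front turns a late `ZeroDivisionError` into an immediate, descriptive `ValueError`.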
Check Deng	b3fdb02fb2	Fix memory leak in _LRScheduler.step() (#85602 ) Fixes #85410 This diff removed the cyclic references in `_LRScheduler.step()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85602 Approved by: https://github.com/albanD	2022-10-07 15:55:55 +00:00
PyTorch MergeBot	233d6f195a	Revert "Fix memory leak in _LRScheduler.step() (#85602 )" This reverts commit `eb32330d6b`. Reverted https://github.com/pytorch/pytorch/pull/85602 on behalf of https://github.com/albanD due to newly added test is flaky	2022-10-06 22:02:02 +00:00
Chengqi Deng	eb32330d6b	Fix memory leak in _LRScheduler.step() (#85602 ) Fixes #85410 This diff removed the cyclic references in `_LRScheduler.step()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85602 Approved by: https://github.com/albanD	2022-10-06 17:07:36 +00:00
Peter Jung	9f1468ae6c	CyclicLR memory leak fix (#85462 ) Hi, we noticed in our team that by using CyclicLR, there is a problem with memory clearance on GPU (probably it will be the case without the GPU as well, but that was our use case) After initializing CyclicLR, GPU memory is not cleared even after the model, optimizer and scheduler are out of scope (e.g. reference count is zero). This is because `__init__` method inside `CyclicLR` creates reference to its own methods and it will not get removed until `gc.collect()` is called manually. This is a problem if people want to test multiple models in one run of a script, after testing the first model, second one will fail on `CUDA out of memory error` because the first one is not cleared from the memory. I propose a simple fix by using `weakref`, similarly as in `_LRScheduler` base class, but if you have any comments I am happy to change it. Here is the code to reproduce the bug: ``` import torch import weakref from transformers import DetrForObjectDetection class X: def __init__(self, optimizer): self.optimizer = optimizer # Will cause cyclic reference. self.func = self.dummy # Will work as expected, memory cleared after instance count is zero. # self.func = weakref.WeakMethod(self.dummy) def dummy(self, x): return 1. def test(): model = DetrForObjectDetection.from_pretrained('facebook/detr-resnet-50') model.to('cuda') optimizer = torch.optim.Adam(model.parameters()) x = X(optimizer) test() print(f'{torch.cuda.memory_reserved()}, {torch.cuda.memory_allocated()}') # Should print (<some memory>, 0), but with cyclic reference, it will print (<some memory>, <some memory>). ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/85462 Approved by: https://github.com/albanD	2022-09-27 17:41:58 +00:00
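The reference cycle described above can be reproduced without a GPU. The sketch below uses a hypothetical `Payload` class in place of the model and optimizer, and relies on CPython's reference counting; other interpreters may free objects at different times.

```python
import gc
import weakref


class Payload:
    """Stands in for a large resource such as a model held by an optimizer."""


class StrongHolder:
    def __init__(self, payload):
        self.payload = payload
        self.func = self.dummy  # the bound method references self: a cycle

    def dummy(self, x):
        return 1.0


class WeakHolder:
    def __init__(self, payload):
        self.payload = payload
        self.func = weakref.WeakMethod(self.dummy)  # no cycle

    def dummy(self, x):
        return 1.0


def payload_freed_without_gc(holder_cls):
    """Return True if the payload is freed by reference counting alone."""
    gc.disable()  # keep the cycle collector out of the picture
    try:
        payload = Payload()
        probe = weakref.ref(payload)
        holder = holder_cls(payload)
        del holder, payload
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up the cycle left by StrongHolder
```

With `StrongHolder` the payload survives until the cycle collector runs; with `WeakHolder` it is released as soon as the last name is deleted.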
F-G Fernandez	7243264c61	fix: Allowed optimizers with more than 2 betas (#84486 ) Hello there 👋 As discussed in #84485, this PR enables more flexibility on the optimizers that are wrapped by LR schedulers in PyTorch. Currently, it is incompatible with optimizers that have a number of betas different than 2. This PR fixes that with minimal modifications. Fixes #84485 Any feedback is welcome! Pull Request resolved: https://github.com/pytorch/pytorch/pull/84486 Approved by: https://github.com/Lezcano, https://github.com/soulitzer	2022-09-06 19:24:10 +00:00
joncrall	b136f3f310	More doctest refinements. (#83317 ) Follow up to #82797 Now that the doctests themselves are in a better state, we should be able to enable xdoctest on the CI so they stay that way. @ezyang @vadimkantorov Pull Request resolved: https://github.com/pytorch/pytorch/pull/83317 Approved by: https://github.com/ezyang	2022-08-22 20:07:26 +00:00
joncrall	4618371da5	Integrate xdoctest - Rebased (#82797 ) This is a new version of #15648 based on the latest master branch. Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR. In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.) Fixes https://github.com/pytorch/pytorch/issues/71105 @ezyang Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797 Approved by: https://github.com/ezyang	2022-08-12 02:08:01 +00:00
Federico Pozzi	f8a10a7f79	feat: add PolynomialLR scheduler (#82769 ) ### Description Add PolynomialLR scheduler. ### Issue Closes #79511. ### Testing I added tests for PolynomialLR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82769 Approved by: https://github.com/datumbox	2022-08-10 18:21:00 +00:00
anjali411	bda04e9f5e	Add __all__ for torch.optim and torch.nn.modules modules (#80237 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80237 Approved by: https://github.com/albanD	2022-06-24 21:34:10 +00:00
Antonio Kim	765b6a8fab	Fix SequentialLR initialization (#72856 ) What was happening is that when we have multiple learning rate schedulers, the order in which they are being initialized is not being taken into account. This is a problem if they were being initialized in sequential order (as one might intuitively do). Each scheduler calls `step()` on initialization and sets the `lr` in its optimizer's `params_groups`. However, this means that step 0 will be using the `lr` that was set by the very last scheduler (in the case of initializing schedulers sequentially) instead of the first scheduler. The fix in this PR, addresses the above bug by performing a call to the appropriate scheduler on initialization after decrementing the `last_epoch` values in order to keep them the same post-step. This will ensure that the correct scheduler is the one setting the `lr` values for the optimizer's `param_groups` Pull Request resolved: https://github.com/pytorch/pytorch/pull/72856 Approved by: https://github.com/jbschlosser	2022-06-21 20:21:13 +00:00
Madhushan B	9acbaaaf05	Fix typo in ChainedScheduler docstring (#79775 ) ### Goal Fixes https://github.com/pytorch/pytorch/issues/79720 ### Approach replace `Chains list of learning rate schedulers. It takes a list of chainable learning rate schedulers and performs consecutive step() functions` `belong` `to them by just one call.` with `Chains list of learning rate schedulers. It takes a list of chainable learning rate schedulers and performs consecutive step() functions` `belonging` `to them by just one call.` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79775 Approved by: https://github.com/albanD	2022-06-17 14:18:42 +00:00
Emilio Castillo	e5ee6f5cf7	Fix `CosineAnnealingLR` on restart Fixes #60265 The initial LR for this scheduler is not consistent when a new instance is created with `last_epoch != -1` Maybe we can refactor the testing code to test `last_epoch != -1` in schedulers that can recreate their state from the current epoch? Pull Request resolved: https://github.com/pytorch/pytorch/pull/60339 Approved by: https://github.com/albanD	2022-04-20 13:35:01 +00:00
Jake Tae	dd1121435b	SequentialLR update _last_lr on step (#70558 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68956. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70558 Reviewed By: dagitses Differential Revision: D33430213 Pulled By: albanD fbshipit-source-id: 446f182610de32db224d55b244d76c3076e8080f	2022-01-07 10:36:35 -08:00
Rohit Gupta	5f3f327a9d	update `SequentialLR` signature (#69817 ) Summary: - ~optimizer isn't required for `SequentialLR` since it's already present in the schedulers. Trying to match the signature of it with `ChainedScheduler`.~ - ~`verbose` isn't really used anywhere so removed it.~ updated missing docs and added a small check Pull Request resolved: https://github.com/pytorch/pytorch/pull/69817 Reviewed By: ngimel Differential Revision: D33069589 Pulled By: albanD fbshipit-source-id: f015105a35a2ca39fe94c70acdfd55cdf5601419	2021-12-16 12:58:00 -08:00
John Muradeli	fdcb78df38	`print` fix in `lr_scheduler` (#68338 ) Summary: `{:5d}` fails for `CosineAnnealingWarmRestarts` which has float `epoch` Pull Request resolved: https://github.com/pytorch/pytorch/pull/68338 Reviewed By: jbschlosser Differential Revision: D33063970 Pulled By: albanD fbshipit-source-id: 992e987f8d5f6f8f5067924df4671e9725b6d884	2021-12-14 09:05:19 -08:00
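The failure is easy to reproduce in plain Python: the `d` format code only accepts integers. The helper `format_epoch` below is a hypothetical sketch of one way to handle both cases; the float format it uses is an assumption, not necessarily what the PR chose.

```python
def format_epoch(epoch):
    # Integers keep the fixed-width integer format; floats get a float format.
    if isinstance(epoch, float):
        return "{:.4f}".format(epoch)
    return "{:5d}".format(epoch)


try:
    "{:5d}".format(1.5)  # a float epoch with an integer format code
    float_with_d_failed = False
except ValueError:
    float_with_d_failed = True
```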
Kurt Mohler	52219b1017	Fix `ChainedScheduler.get_last_lr()` (#69112 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/68820 cc vincentqb jbschlosser albanD Pull Request resolved: https://github.com/pytorch/pytorch/pull/69112 Reviewed By: zou3519 Differential Revision: D32796626 Pulled By: albanD fbshipit-source-id: bde9d4e473527be4c0a7f21cb57f795a67a99eaa	2021-12-02 13:44:12 -08:00
oliver	94b6fa6f8b	Adds an optimizer instance variable to ChainedScheduler (#68010 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/67601. As simple a fix as I could make it. I even managed to delete some testing code! I checked calling `super()` and, as I had feared, it doesn't work out the box, so perhaps that ought to be revisited later. As it stands, https://github.com/pytorch/pytorch/issues/20124, still applies to the chained scheduler, but I think this change is still an improvement. Pull Request resolved: https://github.com/pytorch/pytorch/pull/68010 Reviewed By: zou3519 Differential Revision: D32278139 Pulled By: albanD fbshipit-source-id: 4c6f9f1b2822affdf63a6d22ddfdbcb1c6afd579	2021-11-10 01:31:47 -08:00
hesom	07a08fb95f	Fix typo in LinearLR docs (#67840 ) Summary: The final learning rate should be 0.05 like the lr used as the argument for the optimizer and not 0.005. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67840 Reviewed By: jbschlosser Differential Revision: D32187091 Pulled By: albanD fbshipit-source-id: 8aff691bba3896a847d7b9d9d669a65f67a6f066	2021-11-05 07:16:15 -07:00
Yiwen Song	6696c59af4	Adding `optimizer` attribute to SequentialLR (#67406 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/67318 :) cc albanD, datumbox Pull Request resolved: https://github.com/pytorch/pytorch/pull/67406 Reviewed By: jbschlosser Differential Revision: D31997873 Pulled By: albanD fbshipit-source-id: f579fb886d049a545673fd92ef5892fcf501bcc6	2021-10-28 14:43:40 -07:00
Balaji	32f0387ee8	Bug in CosineAnnealingWarmRestarts in optim/lr_scheduler.py (#64758 ) Summary: ## 🐛 Bug 'CosineAnnealingWarmRestarts' object has no attribute 'T_cur'. In the Constructor of the CosineAnnealingWarmRestarts, we're calling the constructor of the Parent class (_LRScheduler) which in turn calls the step method of the CosineAnnealingWarmRestarts. The called method tries to update the object's attribute 'T_cur' which is not defined yet. So it raises the error. This only holds, when we give the value for last_epoch argument as 0 or greater than 0 to the 'CosineAnnealingWarmRestarts', while initializing the object. ![Bug_in_CosineAnnealingWarmRestarts](https://user-images.githubusercontent.com/77477328/132552212-70abc8b5-0357-4c35-90a9-832648bac607.png) ## To Reproduce Steps to reproduce the behavior: 1. Give the value for the last_epoch argument as zero OR 1. Give the value for the last_epoch argument as a Positive integer. ## Expected behavior I only expected the 'CosineAnnealingWarmRestarts' object to be initialized. ## Environment PyTorch version: 1.9.0+cpu Is debug build: False CUDA used to build PyTorch: None ROCM used to build PyTorch: N/A OS: Ubuntu 20.04.2 LTS (x86_64) GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Clang version: Could not collect CMake version: version 3.21.2 Libc version: glibc-2.31 Python version: 3.8.10 [GCC 9.4.0] (64-bit runtime) Python platform: Linux-5.8.0-59-generic-x86_64-with-glibc2.29 Is CUDA available: False CUDA runtime version: No CUDA ## Additional context We can solve this bug by moving the line 'self.T_cur = self.last_epoch' above the 'super(CosineAnnealingWarmRestarts,self).__init__()' line. Since we've initialized the "self.T_cur" to the object. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64758 Reviewed By: ezyang Differential Revision: D31113694 Pulled By: jbschlosser fbshipit-source-id: 98c0e292291775895dc3566fda011f2d6696f721	2021-09-22 16:55:14 -07:00
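The ordering problem can be shown with two toy classes. `Base`, `BrokenRestarts` and `FixedRestarts` are hypothetical names; the sketch only mirrors the situation described above, where the parent constructor calls `step()` before the subclass has set `T_cur`.

```python
class Base:
    def __init__(self):
        self.last_epoch = 0
        self.step()  # the parent constructor already calls step()

    def step(self):
        self.last_epoch += 1


class BrokenRestarts(Base):
    def __init__(self):
        super().__init__()  # step() runs here, before T_cur exists
        self.T_cur = self.last_epoch

    def step(self):
        self.T_cur = self.T_cur + 1
        super().step()


class FixedRestarts(Base):
    def __init__(self, last_epoch=0):
        self.T_cur = last_epoch  # set first, so step() can use it
        super().__init__()

    def step(self):
        self.T_cur = self.T_cur + 1
        super().step()


try:
    BrokenRestarts()
    broken_failed = False
except AttributeError:
    broken_failed = True
```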
Ilqar Ramazanli	df3d649380	To add state dict and load_dict for Chained Scheduler (#65034 ) Summary: Adding state_dict() and load_state_dict() methods for Chained Scheduler Pull Request resolved: https://github.com/pytorch/pytorch/pull/65034 Reviewed By: prabhat00155, nateanl Differential Revision: D30958207 Pulled By: datumbox fbshipit-source-id: 1a587a330d34e0548e891a39f8fb5a3d251b71fa	2021-09-15 13:11:41 -07:00
Ilqar Ramazanli	211ad231dc	To add state_dict and load_state_dict to SequentialLR (#65035 ) Summary: To add state_dict() and load_state_dict() methods to SequentialLR Pull Request resolved: https://github.com/pytorch/pytorch/pull/65035 Reviewed By: prabhat00155, nateanl Differential Revision: D30958204 Pulled By: datumbox fbshipit-source-id: 65114e1b07146526ae2680233f5cd42b2534d67a	2021-09-15 12:01:51 -07:00
Ilqar Ramazanli	2b41bf40c5	To add SequentialLR to PyTorch Core Schedulers (#64037 ) Summary: Partially resolves https://github.com/pytorch/vision/issues/4281 In this PR we are proposing a new scheduler --SequentialLR-- which enables list of different schedulers called in different periods of the training process. The main motivation of this scheduler is recently gained popularity of warming up phase in the training time. It has been shown that having a small steps in initial stages of training can help convergence procedure get faster. With the help of SequentialLR we mainly enable to call a small constant (or linearly increasing) learning rate followed by actual target learning rate scheduler. ```PyThon scheduler1 = ConstantLR(optimizer, factor=0.1, total_iters=2) scheduler2 = ExponentialLR(optimizer, gamma=0.9) scheduler = SequentialLR(optimizer, schedulers=[scheduler1, scheduler2], milestones=[5]) for epoch in range(100): train(...) validate(...) scheduler.step() ``` which this code snippet will call `ConstantLR` in the first 5 epochs and will follow up with `ExponentialLR` in the following epochs. This scheduler could be used to provide call of any group of schedulers next to each other. The main consideration we should make is every time we switch to a new scheduler we assume that new scheduler starts from the beginning- zeroth epoch. We also add Chained Scheduler to `optim.rst` and `lr_scheduler.pyi` files here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64037 Reviewed By: albanD Differential Revision: D30841099 Pulled By: iramazanli fbshipit-source-id: 94f7d352066ee108eef8cda5f0dcb07f4d371751	2021-09-09 09:36:32 -07:00
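One way to picture the milestone logic is as a lookup from the global epoch to the scheduler in charge, plus the epoch that scheduler sees after restarting from zero. The helpers below are a hypothetical sketch; using `bisect_right` for the lookup is an assumption for illustration.

```python
from bisect import bisect_right


def active_scheduler_index(epoch, milestones):
    """Index of the scheduler in charge at `epoch`, given sorted milestones."""
    return bisect_right(milestones, epoch)


def local_epoch(epoch, milestones):
    """Epoch as seen by the active scheduler, which restarts from zero at its milestone."""
    idx = active_scheduler_index(epoch, milestones)
    return epoch if idx == 0 else epoch - milestones[idx - 1]
```

With `milestones=[5]`, epochs 0 to 4 map to the first scheduler, and epoch 5 maps to the second scheduler at its own epoch 0.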
Ilqar Ramazanli	f767cf6683	To change WarmUp Scheduler with ConstantLR and LinearLR (#64395 ) Summary: Partially unblocks https://github.com/pytorch/vision/issues/4281 Previously we have added WarmUp Schedulers to PyTorch Core in the PR : https://github.com/pytorch/pytorch/pull/60836 which had two mode of execution - linear and constant depending on warming up function. In this PR we are changing this interface to more direct form, as separating linear and constant modes to separate Schedulers. In particular ```Python scheduler1 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant") scheduler2 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="linear") ``` will look like ```Python scheduler1 = ConstantLR(optimizer, warmup_factor=0.1, warmup_iters=5) scheduler2 = LinearLR(optimizer, warmup_factor=0.1, warmup_iters=5) ``` correspondingly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/64395 Reviewed By: datumbox Differential Revision: D30753688 Pulled By: iramazanli fbshipit-source-id: e47f86d12033f80982ddf1faf5b46873adb4f324	2021-09-07 08:42:31 -07:00
Ilqar Ramazanli	5a12cb611f	To add Chained Scheduler to the list of PyTorch schedulers. (#63491 ) Summary: In this PR we are introducing ChainedScheduler which initially proposed in the discussion https://github.com/pytorch/pytorch/pull/26423#discussion_r329976246 . The idea is to provide a user friendly chaining method for schedulers, especially for the cases many of them are involved and we want to have a clean and easy to read interface for schedulers. This method will be even more crucial once CompositeSchedulers and Schedulers for different type of parameters are involved. The immediate application of Chained Scheduler is expected to happen in TorchVision Library to combine WarmUpLR and MultiStepLR https://github.com/pytorch/vision/blob/master/references/video_classification/scheduler.py#L5 . However, it can be expected that in many other use cases also this method could be applied. ### Example The usage is as simple as below: ```python sched=ChainedScheduler([ExponentialLR(self.opt, gamma=0.9), WarmUpLR(self.opt, warmup_factor=0.2, warmup_iters=4, warmup_method="constant"), StepLR(self.opt, gamma=0.1, step_size=3)]) ``` Then calling ```python sched.step() ``` would trigger step function for all three schedulers consecutively Partially resolves https://github.com/pytorch/vision/issues/4281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/63491 Reviewed By: datumbox, mruberry Differential Revision: D30576180 Pulled By: iramazanli fbshipit-source-id: b43f0749f55faab25079641b7d91c21a891a87e4	2021-08-26 13:30:21 -07:00
Ilqar Ramazanli	e7c4988b52	To fix the chainability at epoch zero for some schedulers (#63457 ) Summary: It has been discussed in the https://github.com/pytorch/pytorch/pull/60836#issuecomment-899084092 that we have observed an obstacle to chain some type of learning rate schedulers. In particular we observed * some of the learning rate schedulers returns initial learning rates at epoch 0 as ``` return self.base_lrs ``` * This can be a problem when two schedulers called as chained as ``` scheduler1.step() scheduler2.step() ``` in particular, we completely ignore the effect of scheduler1 at epoch 0. This could not be an issue if at epoch 0, scheduler1 was ineffective as in many schedulers, however for schedulers as WarmUp Schedulers, where at epoch 0 schedulers multiplicative value is smaller than 1 this could lead to undesired behaviors. The following code snippet illustrates the problem better ## Reproducing the bug ```python import torch from torch.nn import Parameter from torch.optim import SGD from torch.optim.lr_scheduler import WarmUpLR, ExponentialLR model = [Parameter(torch.randn(2, 2, requires_grad=True))] optimizer = SGD(model, 1.0) scheduler1 = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant") scheduler2 = ExponentialLR(optimizer, gamma=0.9) for epoch in range(10): print(epoch, scheduler2.get_last_lr()[0]) optimizer.step() scheduler1.step() scheduler2.step() ``` ### Current Result ``` 0 1.0 1 0.9 2 0.81 3 0.7290000000000001 4 0.6561000000000001 5 5.904900000000001 6 5.314410000000001 7 4.782969000000001 8 4.304672100000001 9 3.874204890000001 ``` ### Expected Result ``` 0 1.0 1 0.9 2 0.81 3 0.7290000000000001 4 0.6561000000000001 5 0.5904900000000001 6 0.5314410000000001 7 0.4782969000000001 8 0.4304672100000001 9 0.3874204890000001 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/63457 Reviewed By: datumbox Differential Revision: D30424160 Pulled By: iramazanli fbshipit-source-id: 3e15af8d278c872cd6f53406b55f4d3ce5002867	2021-08-19 07:17:03 -07:00
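The effect can be modelled in a few lines of plain Python. This is a toy model with hypothetical function names, not the scheduler code: each scheduler is a function that maps the current rate to the next one, and the "buggy" exponential scheduler resets the rate to the base value at epoch 0, discarding whatever the warm-up scheduler did before it.

```python
def warmup_constant(lr, epoch, factor=0.1, iters=5):
    if epoch == 0:
        return lr * factor  # scale down at the start
    if epoch == iters:
        return lr / factor  # undo the scaling once warm-up ends
    return lr


def exponential_buggy(lr, epoch, base_lr=1.0, gamma=0.9):
    if epoch == 0:
        return base_lr  # ignores the rate set by earlier schedulers
    return lr * gamma


def exponential_fixed(lr, epoch, gamma=0.9):
    if epoch == 0:
        return lr  # keeps the rate set by earlier schedulers
    return lr * gamma


def run(exponential, epochs=10):
    lr = 1.0
    # Construction: each scheduler takes its initial step at epoch 0.
    lr = exponential(warmup_constant(lr, 0), 0)
    history = [lr]
    for epoch in range(1, epochs):
        lr = exponential(warmup_constant(lr, epoch), epoch)
        history.append(lr)
    return history


buggy = run(exponential_buggy)
fixed = run(exponential_fixed)
```

In this model the buggy chain jumps to about 5.9049 at epoch 5, the value listed under "Current Result", because the warm-up scaling is undone without ever having been applied. The fixed chain reaches about 0.59049 at epoch 5, the value listed under "Expected Result".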
Ilqar Ramazanli	cec08e7032	To add warm-up scheduler to optim (#60836 ) Summary: Warm up of learning rate scheduling has initially been discussed by Priya Goyal et al. in the paper: https://arxiv.org/pdf/1706.02677.pdf . In section 2.2 of the paper they discussed and proposed the idea of warming up learning schedulers in order to prevent big variance / noise in the learning rate. The idea has then been further discussed in the following papers: * Akilesh Gotmare et al. https://arxiv.org/abs/1810.13243 * Bernstein et al. http://proceedings.mlr.press/v80/bernstein18a/bernstein18a.pdf * Liyuan Liu et al. https://arxiv.org/pdf/1908.03265.pdf There are two types of popularly used learning rate warm up ideas * Constant warmup (start with very small constant learning rate) * Linear warmup (start with small learning rate and gradually increase) In this PR we are adding warm up as a learning rate scheduler. Note that learning rate schedulers are chainable, which means that we can merge the warmup scheduler with any other learning rate scheduler to make a more sophisticated learning rate scheduler. ## Linear Warmup Linear warmup multiplies the learning rate by a pre-defined constant, warmup_factor, in the first epoch (epoch 0), then increases this multiplication constant to one over warmup_iters epochs. Hence the multiplication constant at the i-th step is: warmup_factor + (1-warmup_factor) * i / warmup_iters Moreover, the ratio of this quantity at point i to point i-1 is 1 + (1.0 - warmup_factor) / [warmup_iters * warmup_factor + (i-1) * (1-warmup_factor)] which is used in the get_lr() method in our implementation. Below we provide an example of how to use the linear warmup scheduler and show how it works. 
```python import torch from torch.nn import Parameter from torch.optim import SGD from torch.optim.lr_scheduler import WarmUpLR model = [Parameter(torch.randn(2, 2, requires_grad=True))] optimizer = SGD(model, 0.1) scheduler = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=10, warmup_method="linear") for epoch in range(15): print(epoch, scheduler.get_last_lr()[0]) optimizer.step() scheduler.step() ``` ``` 0 0.010000000000000002 1 0.019000000000000003 2 0.028000000000000008 3 0.03700000000000001 4 0.04600000000000001 5 0.055000000000000014 6 0.06400000000000002 7 0.07300000000000002 8 0.08200000000000003 9 0.09100000000000004 10 0.10000000000000005 11 0.10000000000000005 12 0.10000000000000005 13 0.10000000000000005 14 0.10000000000000005 ``` ## Constant Warmup Constant warmup is a straightforward idea: multiply the learning rate by warmup_factor until we reach epoch warmup_iters, then do nothing for the following epochs ```python import torch from torch.nn import Parameter from torch.optim import SGD from torch.optim.lr_scheduler import WarmUpLR model = [Parameter(torch.randn(2, 2, requires_grad=True))] optimizer = SGD(model, 0.1) scheduler = WarmUpLR(optimizer, warmup_factor=0.1, warmup_iters=5, warmup_method="constant") for epoch in range(10): print(epoch, scheduler.get_last_lr()[0]) optimizer.step() scheduler.step() ``` ``` 0 0.010000000000000002 1 0.010000000000000002 2 0.010000000000000002 3 0.010000000000000002 4 0.010000000000000002 5 0.10000000000000002 6 0.10000000000000002 7 0.10000000000000002 8 0.10000000000000002 9 0.10000000000000002 ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/60836 Reviewed By: saketh-are Differential Revision: D29537615 Pulled By: iramazanli fbshipit-source-id: d910946027acc52663b301f9c56ade686e62cb69	2021-08-15 12:31:45 -07:00
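The linear warm-up formulas in this commit message can be checked against each other in plain Python. The sketch below is not the scheduler implementation; the function names are hypothetical. It builds the schedule step by step from the ratio of consecutive factors, so it can be compared with the closed form.

```python
def closed_form_factor(i, warmup_factor, warmup_iters):
    """Multiplication constant at step i: linear up to 1, then constant."""
    if i >= warmup_iters:
        return 1.0
    return warmup_factor + (1.0 - warmup_factor) * i / warmup_iters


def step_ratio(i, warmup_factor, warmup_iters):
    """factor(i) / factor(i - 1) for 1 <= i <= warmup_iters."""
    return 1.0 + (1.0 - warmup_factor) / (
        warmup_iters * warmup_factor + (i - 1) * (1.0 - warmup_factor)
    )


def linear_warmup_schedule(base_lr, warmup_factor, warmup_iters, epochs):
    """Learning rates obtained by applying the ratio at every warm-up step."""
    lr = base_lr * warmup_factor
    rates = [lr]
    for i in range(1, epochs):
        if i <= warmup_iters:
            lr *= step_ratio(i, warmup_factor, warmup_iters)
        rates.append(lr)
    return rates


rates = linear_warmup_schedule(base_lr=0.1, warmup_factor=0.1, warmup_iters=10, epochs=15)
```

With these arguments the schedule starts near 0.01, grows by about 0.009 per epoch, and stays near 0.1 from epoch 10 onward, in line with the linear warm-up listing above up to floating-point rounding.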
Samuel Marks	e6779d4357	[*.py] Rename "Arguments:" to "Args:" (#49736 ) Summary: I've written custom parsers and emitters for everything from docstrings to classes and functions. However, I recently came across an issue when I was parsing/generating from the TensorFlow codebase: inconsistent use of `Args:` and `Arguments:` in its docstrings. ```sh (pytorch#c348fae)$ for name in 'Args:' 'Arguments:'; do printf '%-10s %04d\n' "$name" "$(rg -IFtpy --count-matches "$name" | paste -s -d+ -- | bc)"; done Args: 1095 Arguments: 0336 ``` It is easy enough to extend my parsers to support both variants, however it looks like `Arguments:` is wrong anyway, as per: - https://google.github.io/styleguide/pyguide.html#doc-function-args @ [`ddccc0f`](https://github.com/google/styleguide/blob/ddccc0f/pyguide.md) - https://chromium.googlesource.com/chromiumos/docs/+/master/styleguide/python.md#describing-arguments-in-docstrings @ [`9fc0fc0`](https://chromium.googlesource.com/chromiumos/docs/+/9fc0fc0/styleguide/python.md) - https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html @ [`c0ae8e3`](https://github.com/sphinx-contrib/napoleon/blob/c0ae8e3/docs/source/example_google.rst) Therefore, only `Args:` is valid. This PR replaces them throughout the codebase. PS: For related PRs, see tensorflow/tensorflow/pull/45420 PPS: The trackbacks automatically appearing below are sending the same changes to other repositories in the [PyTorch](https://github.com/pytorch) organisation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/49736 Reviewed By: albanD Differential Revision: D25710534 Pulled By: soumith fbshipit-source-id: 61e8ff01abb433e9f78185c2d1d0cbd7c22c1619	2020-12-28 09:34:47 -08:00
jsrozner	42e6951e62	Remove save_state_warning in LambdaLR (#46813 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/46405, https://github.com/pytorch/pytorch/issues/43352 I updated the docstring in the local file (function level comments). Do I also need to edit somewhere else or recompile docstrings? Also, though I didn't change any types here, how is typing (for IDE type checking) documentation generated / used)? Pull Request resolved: https://github.com/pytorch/pytorch/pull/46813 Reviewed By: ezyang Differential Revision: D24923112 Pulled By: vincentqb fbshipit-source-id: be7818e0d4593bfc5d74023b9c361ac2a538589a	2020-12-04 13:19:59 -08:00
Alexander Grund	5b0f400488	Replace list(map(...)) constructs by list comprehensions (#46461 ) Summary: As discussed in https://github.com/pytorch/pytorch/issues/46392 this makes the code more readable and possibly more performant. It also fixes a bug detected by this where the argument order of `map` was confused: `030a24906e (diff-5bb26bd3a23ee3bb540aeadcc0385df2a4e48de39f87ed9ea76b21990738fe98L1537-R1537)` Fixes https://github.com/pytorch/pytorch/issues/46392 Pull Request resolved: https://github.com/pytorch/pytorch/pull/46461 Reviewed By: ailzhang Differential Revision: D24367015 Pulled By: ezyang fbshipit-source-id: d55a67933cc22346b00544c9671f09982ad920e7	2020-10-19 18:42:49 -07:00
Aiden Nibali	2bc6caa9e4	Add three-phase option to OneCycleLR (#42715 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/40362 The new `three_phase` option provides a way of constructing schedules according to the scheme recommended in [Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates](https://arxiv.org/abs/1708.07120). Note that this change maintains backwards compatibility, and as a result the default behaviour of OneCycleLR remains quite counter-intuitive. vincentqb Pull Request resolved: https://github.com/pytorch/pytorch/pull/42715 Reviewed By: heitorschueroff Differential Revision: D24289744 Pulled By: vincentqb fbshipit-source-id: e4aad87880716bb14613c0aa8631e43b04a93e5c	2020-10-14 15:05:14 -07:00
Kent Gauen	2efc618f19	lr_schedule.py redundant code (#44613 ) Summary: The subclass sets "self.last_epoch" when this is set in the parent class's init function. Why would we need to set last_epoch twice? I think calling "super" resets last_epoch anyway, so I am not sure why we would want to include this in the subclass. Am I missing something? For the record, I am just a PyTorch enthusiast. I hope my question isn't totally silly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44613 Reviewed By: albanD Differential Revision: D23691770 Pulled By: mrshenli fbshipit-source-id: 080d9acda86e1a2bfaafe2c6fcb8fc1544f8cf8a	2020-09-15 20:28:39 -07:00
NTT123	103887892c	Fix "non-negative integer" error messages (#42734 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42662 Use "positive integer" error message for consistency with: `17f76f9a78/torch/optim/lr_scheduler.py (L958-L959)` `ad7133d3c1/torch/utils/data/sampler.py (L102-L104)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/42734 Reviewed By: zdevito Differential Revision: D23039575 Pulled By: smessmer fbshipit-source-id: 1be1e0caa868891540ecdbe6f471a6cd51c40ede	2020-08-10 19:39:37 -07:00
guol-fnst	17f76f9a78	Verbose param for schedulers that don't have it #38726 (#41580 ) Summary: Verbose param for schedulers that don't have it https://github.com/pytorch/pytorch/issues/38726 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41580 Reviewed By: izdeby Differential Revision: D22671163 Pulled By: vincentqb fbshipit-source-id: 53a6c9e929141d411b6846bc25f3fe7f46fdf3be	2020-07-23 09:57:33 -07:00
vfdev	a6a2dd14ea	Fix typo in warning message (#39854 ) Summary: Fix typo Pull Request resolved: https://github.com/pytorch/pytorch/pull/39854 Reviewed By: ezyang Differential Revision: D22193544 Pulled By: zou3519 fbshipit-source-id: 04b9f59da7b6ba0649fc6d315adcf20685e10930	2020-06-23 16:47:35 -07:00
lordeddard	2de4f245c6	Fix typo in documentation (#34581 ) Summary: Update the parameter description of `total_steps` in `OneCycleLR`. References https://github.com/pytorch/pytorch/issues/34531 Pull Request resolved: https://github.com/pytorch/pytorch/pull/34581 Differential Revision: D20386306 Pulled By: albanD fbshipit-source-id: f8b424a01760e8f5d4de5367b6c60fb342019689	2020-03-11 13:57:10 -07:00
Vincent Quenneville-Belair	be3bc1deb1	convert counter back to list #33229 (#33356 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/33229 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33356 Differential Revision: D20003196 Pulled By: vincentqb fbshipit-source-id: 96f9e0fc7e99a7c2e202f932d1a2ffa158afad92	2020-03-10 15:46:24 -07:00
HearyShen	edd5c009f7	fix docs mistakes in lr_scheduler.MultiplicativeLR (#33805 ) Summary: This PR is referenced to an issue: [The docs of `MultiplicativeLR` use `LambdaLR` as example](https://github.com/pytorch/pytorch/issues/33752#issue-570374087) https://github.com/pytorch/pytorch/issues/33752 Pull Request resolved: https://github.com/pytorch/pytorch/pull/33805 Differential Revision: D20121314 Pulled By: mruberry fbshipit-source-id: 5afa63bbe83d35ce4e55705b8cbd96326a907651	2020-02-27 14:11:57 -08:00
Hong Xu	a6a72ac68f	Fix all occurrences of C416. (#33429 ) Summary: C416: Unnecessary (list/set) comprehension - rewrite using list/set(). See https://pypi.org/project/flake8-comprehensions/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/33429 Differential Revision: D19972858 Pulled By: ezyang fbshipit-source-id: faac042a94c59d737bd5ae983121a0a029346e23	2020-02-21 08:32:22 -08:00
Edgar Andrés Margffoy Tuay	cdf381c967	Fix LambdaLR scheduler side effects (#32848 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/32756 Pull Request resolved: https://github.com/pytorch/pytorch/pull/32848 Differential Revision: D19859736 Pulled By: vincentqb fbshipit-source-id: 43b3cbb2b6bed208c75aad37aebc2a8a9565fe0d	2020-02-20 11:09:56 -08:00
Vincent Quenneville-Belair	e7f0b15473	Remove return value for __exit__ (#32997 ) Summary: When an error is raised and `__exit__` in a context manager returns `True`, the error is suppressed; otherwise the error is raised. No return value should be given to maintain the default behavior of context manager. Fixes https://github.com/pytorch/pytorch/issues/32639. The `get_lr` function was overridden with a function taking an epoch parameter, which is not allowed. However, the relevant error was not being raised.
```python
import torch

class MultiStepLR(torch.optim.lr_scheduler._LRScheduler):
    def __init__(self, optimizer, gamma, milestones, last_epoch = -1):
        self.init_lr = [group['lr'] for group in optimizer.param_groups]
        self.gamma = gamma
        self.milestones = milestones
        super().__init__(optimizer, last_epoch)

    def get_lr(self, step):
        global_step = self.last_epoch #iteration number in pytorch
        gamma_power = ([0] + [i + 1 for i, m in enumerate(self.milestones) if global_step >= m])[-1]
        return [init_lr * (self.gamma ** gamma_power) for init_lr in self.init_lr]

optimizer = torch.optim.SGD([torch.rand(1)], lr = 1)
scheduler = MultiStepLR(optimizer, gamma = 1, milestones = [10, 20])
```
```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-7fad6ba050b0> in <module>
     14
     15 optimizer = torch.optim.SGD([torch.rand(1)], lr = 1)
---> 16 scheduler = MultiStepLR(optimizer, gamma = 1, milestones = [10, 20])

<ipython-input-1-7fad6ba050b0> in __init__(self, optimizer, gamma, milestones, last_epoch)
      6 self.gamma = gamma
      7 self.milestones = milestones
----> 8 super().__init__(optimizer, last_epoch)
      9
     10 def get_lr(self, step):

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/optim/lr_scheduler.py in __init__(self, optimizer, last_epoch)
     75 self._step_count = 0
     76
---> 77 self.step()
     78
     79 def state_dict(self):

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/optim/lr_scheduler.py in step(self, epoch)
    141 print("1a")
    142 # try:
--> 143 values = self.get_lr()
    144 # except TypeError:
    145 # raise RuntimeError

TypeError: get_lr() missing 1 required positional argument: 'step'
```
May be related to https://github.com/pytorch/pytorch/issues/32898. Pull Request resolved: https://github.com/pytorch/pytorch/pull/32997 Differential Revision: D19737731 Pulled By: vincentqb fbshipit-source-id: 5cf84beada69b91f91e36b20c3278e9920343655	2020-02-11 09:27:29 -08:00
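The suppression rule described in the commit above can be illustrated with a minimal, torch-free sketch. The class and function names (`Suppressing`, `Propagating`, `raises_type_error`) are hypothetical and are not taken from the PyTorch source; the sketch only shows the general Python rule that a truthy return value from `__exit__` swallows the exception.

```python
class Suppressing:
    """Hypothetical context manager whose __exit__ returns True."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return True  # a truthy return value suppresses the exception


class Propagating:
    """Hypothetical context manager whose __exit__ returns nothing."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        pass  # implicit None: the exception keeps propagating


def raises_type_error(manager):
    """Return True if a TypeError raised inside `manager` reaches the caller."""
    try:
        with manager:
            raise TypeError("get_lr() missing 1 required positional argument: 'step'")
    except TypeError:
        return True
    return False


print(raises_type_error(Suppressing()))   # the error is swallowed
print(raises_type_error(Propagating()))   # the error reaches the caller
```

With a suppressing `__exit__`, a wrong `get_lr` signature would go unnoticed, which is the symptom the commit describes.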
Enealor	e085c55e53	Fix `\\` warnings/errors when building optim documentation (#32911 ) Summary: This PR fixes the warnings and errors attributed to the use of `\\` outside of a proper environment. While rendered correctly in the documentation, it produces the warning ``` LaTeX-incompatible input and strict mode is set to 'warn': In LaTeX, \\ or \newline does nothing in display mode [newLineInDisplayMode] ``` on the CI tools and errors with ``` ParseError: KaTeX parse error: Expected 'EOF', got '\\' at position (x): ... ``` when not set to warn. This PR also makes minor formatting adjustments. The `CosineAnnealingLR` documentation has been adjusted to remove an unnecessarily large fraction and to improve spacing. The `SGD` documentation has been adjusted so that variables are consistently typeset and so that it follows the convention of punctuating equations. I attached images of the current documentation, the new documentation and a marked version to highlight differences. * SGD: New: ![new_sgd](https://user-images.githubusercontent.com/53704971/73596383-98795500-44d6-11ea-97ce-bac02a0a1638.png) Current: ![current_sgd](https://user-images.githubusercontent.com/53704971/73596384-98795500-44d6-11ea-86d3-b407cebbb513.png) Marked new: ![marked_sgd](https://user-images.githubusercontent.com/53704971/73596385-98795500-44d6-11ea-9e06-9ac5e5e27270.png) * CosineAnnealingLR: New: ![new_calr](https://user-images.githubusercontent.com/53704971/73596382-98795500-44d6-11ea-9c90-02406d297bae.png) Current: ![current_calr](https://user-images.githubusercontent.com/53704971/73596387-9911eb80-44d6-11ea-93fb-ee72d695312a.png) Marked new: ![marked_calr](https://user-images.githubusercontent.com/53704971/73596386-9911eb80-44d6-11ea-91a6-ed7a62b4e255.png) Pull Request resolved: https://github.com/pytorch/pytorch/pull/32911 Differential Revision: D19697114 Pulled By: ezyang fbshipit-source-id: 567304bd4adcfa4086eae497cb818cf74375fe5d	2020-02-03 09:54:38 -08:00
Kirayue	9e9bfbfd8d	Update old scheduler example usage (#31358 ) Summary: Update the old example usage in CosineAnnealingWarm, `scheduler.step()` should be called after `optimizer.step()`. https://github.com/pytorch/pytorch/issues/20028#issuecomment-566061580 Pull Request resolved: https://github.com/pytorch/pytorch/pull/31358 Differential Revision: D19199311 Pulled By: vincentqb fbshipit-source-id: cb29b95f8277d2dfa75ec2a83c1af03a5c9c9a69	2020-01-02 09:15:04 -08:00
Vincent Quenneville-Belair	9459db86bf	Raise warning for schedulers following chainable schedulers (#31125 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/29697. Raise warning for schedulers following chainable schedulers in https://github.com/pytorch/pytorch/issues/26423. See explanation for * [new warning when load/save](https://github.com/pytorch/pytorch/issues/29697#issuecomment-564655802) * [change from deprecation to user warning](https://github.com/pytorch/pytorch/issues/29697#issuecomment-564659775). gchanan -- This should go in the upcoming release following https://github.com/pytorch/pytorch/issues/26423. Pull Request resolved: https://github.com/pytorch/pytorch/pull/31125 Differential Revision: D19143740 Pulled By: vincentqb fbshipit-source-id: 35b55fe6c5b39ca5a68b1a6e19f14eb95b9a784e	2019-12-23 08:24:22 -08:00
Adam J. Stewart	23483406aa	Fix missing space in lr_scheduler warning msg Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29527 Differential Revision: D18422662 Pulled By: ngimel fbshipit-source-id: 80191232ee0b639274ba3561e0d89ddcb40434e7	2019-11-10 22:51:35 -08:00
Vincent Quenneville-Belair	cbddc77ac5	fix docs for lr (#28026 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28026 Documentation for learning rate does not render well. #27730. Test Plan: Imported from OSS Differential Revision: D17953395 Pulled By: vincentqb fbshipit-source-id: 9e84df3e7de43f11399a67bc99c76ef241b1120f	2019-10-23 13:49:34 -07:00
Timothy Man	1c53a74e26	Fixed behavior of div_factor parameter in optim.lr_scheduler.OneCycleLR (#28217 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/28216 Pull Request resolved: https://github.com/pytorch/pytorch/pull/28217 Differential Revision: D18070759 Pulled By: vincentqb fbshipit-source-id: ed032190c0e3eab834fc9a8f408b75b56f0f35ec	2019-10-23 13:39:05 -07:00
Vincent Quenneville-Belair	e4f40bf3b2	Add multiplicative lr. (#27254 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27254 `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax. Test Plan: Imported from OSS Differential Revision: D17728088 Pulled By: vincentqb fbshipit-source-id: 1c4a8e19a4f24c87b5efccda01630c8a970dc5c9	2019-10-23 11:38:45 -07:00
Vincent Quenneville-Belair	d1d2358d31	Correct math formatting for lr scheduler (#28467 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28467 Correcting formatting error from #27874. Also making size of parenthesis more natural. ![Screen Shot 2019-10-22 at 5 38 22 PM](https://user-images.githubusercontent.com/3047868/67336492-76ddfa00-f4f3-11e9-9d79-70a0aa4f6d29.png) Closes #27874 Test Plan: Imported from OSS Differential Revision: D18076085 Pulled By: vincentqb fbshipit-source-id: cb7c52b347d6d11ea4a2d3c94d00a42f849c0a83	2019-10-23 11:11:25 -07:00
zou3519	e5d6b75319	Bag of documentation fixes; fix more sphinx warnings (#27850 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/27850 Many of these are real problems in the documentation (i.e., link or bullet point doesn't display correctly). Test Plan: - built and viewed the documentation for each change locally. Differential Revision: D17908123 Pulled By: zou3519 fbshipit-source-id: 65c92a352c89b90fb6b508c388b0874233a3817a	2019-10-15 07:31:14 -07:00
Vincent Quenneville-Belair	28b1f586f6	Change schedulers to chainable form (#26423 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/26423 Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513370208).
* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form whenever epoch is different from None until the next release with a deprecation warning
* Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513940729) for new syntax)
* Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](https://github.com/pytorch/pytorch/pull/21800#discussion_r294305485)) referring to `get_computed_values`, and deprecating it in the next release.
* `CosineAnnealingWarmRestarts` still takes an epoch parameter as it is the only one with a mechanic relying on fractional epoch
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.

# #20527
### Before
The user calls scheduler with a constant epoch either across loops or in the same loop.
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

# Scheduler with sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```
### After
If the user wants to step
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check if epoch number has changed manually
    if epoch-last_epoch > 0:
        lr_scheduler.step()
    last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```
# #22107
### Before
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Scheduler computes and returns new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```
### After
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Returns last computed learning rate by scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```
# ghstack
This contains the changes from #24352. Opening again since they were reverted. This reverts commit `1c477b7e1f`. Test Plan: Imported from OSS Differential Revision: D17460427 Pulled By: vincentqb fbshipit-source-id: 8c10f4e7246d6756ac91df734e8bed65bdef63c9	2019-10-04 08:53:14 -07:00
Zecong Hu	b8ae4d0f1c	Resolve #25605 cyclic reference in _LRScheduler (#25776 ) Summary: Cyclic reference was introduced in a previous version due to runtime overwriting of the bound method `optimizer.step`. This is now avoided by keeping a weak reference to the optimizer instance. Credit: https://stackoverflow.com/questions/26157952/why-set-a-bound-method-to-python-object-create-a-circular-reference Pull Request resolved: https://github.com/pytorch/pytorch/pull/25776 Differential Revision: D17420770 Pulled By: ezyang fbshipit-source-id: 546ec94cf725ebfddb310b24e6a2e146ddecd1f6	2019-09-18 06:08:35 -07:00
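The cycle described in the commit above, and the weak-reference remedy, can be sketched without torch. This is an illustration under stated assumptions, not the actual patch: `Optimizer`, `wrap_strong`, `wrap_weak` and `is_freed_without_gc` are hypothetical names, and the check relies on CPython's reference counting (the garbage collector is switched off temporarily so that only reference counts decide whether the object is freed).

```python
import gc
import weakref


class Optimizer:
    """Hypothetical stand-in for an optimizer; only `step` matters here."""

    def step(self):
        return "stepped"


def wrap_strong(opt):
    """Overwrite opt.step with a wrapper that holds the bound method.

    opt -> wrapper -> bound method -> opt forms a reference cycle.
    """
    bound = opt.step

    def wrapper(*args, **kwargs):
        return bound(*args, **kwargs)

    opt.step = wrapper


def wrap_weak(opt):
    """Overwrite opt.step with a wrapper that only holds a weak reference."""
    ref = weakref.ref(opt)
    func = opt.step.__func__  # plain function taken from the class

    def wrapper(*args, **kwargs):
        instance = ref()  # None once the optimizer has been freed
        return func(instance, *args, **kwargs)

    opt.step = wrapper


def is_freed_without_gc(wrap):
    """True if the wrapped object dies by reference counting alone (CPython)."""
    gc.disable()
    try:
        opt = Optimizer()
        wrap(opt)
        probe = weakref.ref(opt)
        del opt
        return probe() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind


print(is_freed_without_gc(wrap_strong))  # cycle keeps the object alive
print(is_freed_without_gc(wrap_weak))    # no cycle, freed immediately
```

The weak variant still forwards calls to the original method, so `opt.step()` behaves the same from the caller's point of view.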
Vincent Quenneville-Belair	a3f0d988d9	Revert D17349760: Change schedulers to chainable form Test Plan: revert-hammer Differential Revision: D17349760 Original commit changeset: 0a6ac01e2a6b fbshipit-source-id: 41c2c136215dabc26cad5098a08eff2a2a29b715	2019-09-13 12:54:59 -07:00
Vincent Quenneville-Belair	939ae80de1	Change schedulers to chainable form (#24352 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/24352 Enable chainable schedulers as requested in #13022 by implementing the changes mentioned below from [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513370208).
* Changing the behavior of schedulers to the chainable formula when available
* Using the closed form whenever epoch is different from None until the next release with a deprecation warning
* Making `get_computed_values` the supported way of obtaining the last computed learning rate by the scheduler (see [comment](https://github.com/pytorch/pytorch/pull/21800#issuecomment-513940729) for new syntax)
* Returning a deprecation warning when invoking the undocumented get_lr function (see [comment](https://github.com/pytorch/pytorch/pull/21800#discussion_r294305485)) referring to `get_computed_values`, and deprecating it in the next release.
* `CosineAnnealingWarmRestarts` still takes an epoch parameter as it is the only one with a mechanic relying on fractional epoch
* `MultiplicativeLR` consumes a function providing the multiplicative factor at each epoch. It mimics `LambdaLR` in its syntax.

# #20527
### Before
The user calls scheduler with a constant epoch either across loops or in the same loop.
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

# Scheduler with sometimes-constant epoch number
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    lr_scheduler.step(epoch)
    print(optimizer.param_groups[0]['lr'])
```
### After
If the user wants to step
```
import torch.optim as optim
from torch import nn

conv = nn.Conv2d(3,3,3)
optimizer = optim.Adam(conv.parameters())
lr_scheduler = optim.lr_scheduler.StepLR(optimizer, 2)

last_epoch = -1
for epoch in [0, 0, 1, 1, 2, 2, 3, 3]:
    # Check if epoch number has changed manually
    if epoch-last_epoch > 0:
        lr_scheduler.step()
    last_epoch = epoch
    print(epoch, lr_scheduler.get_computed_values())
```
# #22107
### Before
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Scheduler computes and returns new learning rate, leading to unexpected behavior
    print(i, scheduler.get_lr())
    scheduler.step()
```
### After
```
import torch
from torchvision.models import resnet18

net = resnet18()
optimizer = torch.optim.SGD(net.parameters(), 0.1)
lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[3, 6, 9], gamma=0.1)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, 3, gamma=0.1)

for i in range(10):
    # Returns last computed learning rate by scheduler
    print(i, lr_scheduler.get_computed_values())
    lr_scheduler.step()
```
Test Plan: Imported from OSS Differential Revision: D17349760 Pulled By: vincentqb fbshipit-source-id: 0a6ac01e2a6b45000bc6f9df732033dd81f0d89f	2019-09-13 07:36:05 -07:00
Vincent Quenneville-Belair	135bbc261d	fix base_lr overridden in cyclic lr (#26105 ) Summary: base_lr parameter was being overridden by super `__init__`, see https://github.com/pytorch/pytorch/issues/21965. Pull Request resolved: https://github.com/pytorch/pytorch/pull/26105 Reviewed By: yf225 Differential Revision: D17346724 Pulled By: vincentqb fbshipit-source-id: 4b146bd64f4f385c0a9c4f4df8eb8991312fb15c	2019-09-12 15:53:03 -07:00
Vincent Quenneville-Belair	05f1fed693	Add OneCycleLR (#25324 ) Summary: Squash rebase of https://github.com/pytorch/pytorch/issues/21258 ghstack-source-id: 7d3ce522ac4dd3050bc6c6bbda1eaaeb8bc4b2c1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25324 Pull Request resolved: https://github.com/pytorch/pytorch/pull/25325 Differential Revision: D17095722 Pulled By: vincentqb fbshipit-source-id: 7fe69b210924ee3b39223dd78122aea61267234a	2019-08-28 16:59:40 -07:00
Gregory Chanan	fc82ec298b	Update CosineAnnealingWarmRestarts to follow PyTorch 1.1+ Step Order. (#23833 ) Summary: Fixes: https://github.com/pytorch/pytorch/issues/23480. I only verified that the schedule reaches the restart at the expected step as specified in the issue; it would be good to have someone else verify correctness here. Script:
```
import torch

scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.5), T_0=1, T_mult=2)
for i in range(9):
    print(i)
    print(scheduler.get_lr())
    scheduler.step()
```
Output:
```
0
[0.5]
1
[0.5]
2
[0.25]
3
[0.5]
4
[0.42677669529663687]
5
[0.25]
6
[0.07322330470336313]
7
[0.5]
8
[0.4809698831278217]
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23833 Differential Revision: D16657251 Pulled By: gchanan fbshipit-source-id: 713973cb7cbfc85dc333641cbe9feaf917718eb9	2019-08-07 07:15:50 -07:00
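The output listed in the commit above can be cross-checked with a small torch-free sketch. It assumes the standard SGDR cosine formula, `eta_min + (base_lr - eta_min) * (1 + cos(pi * T_cur / T_i)) / 2`, with cycle lengths growing by `T_mult` after each restart; the formula is not quoted in the commit itself, and `warm_restart_lr` is a hypothetical helper rather than a PyTorch function.

```python
import math


def warm_restart_lr(epoch, base_lr, t_0, t_mult, eta_min=0.0):
    """Closed-form learning rate for a cosine schedule with warm restarts.

    Assumes integer epochs and an integer t_mult >= 1.
    """
    t_i, t_cur = t_0, epoch
    # Walk through completed cycles to find the position inside the current one.
    while t_cur >= t_i:
        t_cur -= t_i
        t_i *= t_mult
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_i)) / 2


# Same settings as the script in the commit: lr=0.5, T_0=1, T_mult=2.
schedule = [warm_restart_lr(epoch, 0.5, 1, 2) for epoch in range(9)]
for epoch, lr in enumerate(schedule):
    print(epoch, lr)
```

Under these assumptions the restarts fall at epochs 1, 3 and 7, matching the steps at which the commit's output returns to 0.5.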
Ejaaz Merali	fb9fbc009c	Fix momentum bug in CyclicLR (#20401 ) Summary: Resolves issue https://github.com/pytorch/pytorch/issues/19003 The author of this issue also asked that `cycle_momentum` default to `False` if the optimizer does not have a momentum parameter, but I'm not sure what the best way to do this would be. Silently changing the value based on the optimizer may confuse the user in some cases (say the user explicitly set `cycle_momentum=True` but doesn't know that the Adam optimizer doesn't use momentum). Maybe printing a warning when switching this argument's value would suffice? Pull Request resolved: https://github.com/pytorch/pytorch/pull/20401 Differential Revision: D15765463 Pulled By: ezyang fbshipit-source-id: 88ddabd9e960c46f3471f37ea46013e6b4137eaf	2019-06-11 15:10:28 -07:00
Edward Yang	3889855a5b	Revert "Redefine scheduler to set learning rate using recursive formula" #14010 (#21463 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/21463 ghimport-source-id: 1b0ea4a282b41388d5c6f6a5d18d37c14ae874ad Differential Revision: D15747426 Pulled By: ezyang fbshipit-source-id: 0708394f907b98a9f45bcfa26e5cc450fda8cf76	2019-06-10 15:26:25 -07:00
vfn	8ece538a79	Addresses bad behavior with overridden optimizer.step by #20124 (#21460 ) Summary: This PR addresses the problem described in the comment: https://github.com/pytorch/pytorch/pull/20203#issuecomment-499231276 and the previously coded bad behaviour:
- a warning was raised every time an lr scheduler was initialized

Now the code checks that:
- on the second call of `lr_scheduler.step`, `optimizer.step` has already been called; otherwise a warning is raised (as was done in #20203 )
- if the optimizer's step is overridden, another warning is raised once to make the user aware of the new pattern, `opt.step()` -> `lrs.step()`, as we cannot check this.

Now the tests check that:
- at initialization (`lrs = StepLR(...)`) there are no warnings
- if we replace `optimizer.step` by something else (similarly to the [code of nvidia/apex](https://github.com/NVIDIA/apex/blob/master/apex/amp/_process_optimizer.py#L287)), another warning is raised.

cc ezyang PS. Honestly, I would say that there is a lot of overhead introduced for simple warnings. I hope all these checks will be removed in future `1.2.0` or other versions... Pull Request resolved: https://github.com/pytorch/pytorch/pull/21460 Differential Revision: D15701776 Pulled By: ezyang fbshipit-source-id: eac5712b9146d9d3392a30f6339cd33d90c497c7	2019-06-06 13:54:42 -07:00
vfdev	449a2c3555	Fixes #20124 (#20203 ) Summary: Fixes #20124 Description: Code wraps `optimizer.step()` method to detect whether user is following new pattern or old pattern. In case of old pattern detected, a UserWarning is raised. Documentation is also updated to reflect the change: ![Screen Shot 2019-05-07 at 11 05 17](https://user-images.githubusercontent.com/2459423/57287527-04e63580-70b8-11e9-9ddd-5d159ef0ed2f.png) cc SsnL, bado-lee Pull Request resolved: https://github.com/pytorch/pytorch/pull/20203 Differential Revision: D15543060 Pulled By: ezyang fbshipit-source-id: 3605e1afdb6ffc2dfd5e75e92e01b967c4d065b5	2019-05-29 14:15:01 -07:00
njdalton	d190450a35	Fix typo in CyclicLR docs (#21021 ) Summary: Fixes a typo in the CyclicLR docs by adding the lr_scheduler directory and puts in other required arguments. Pull Request resolved: https://github.com/pytorch/pytorch/pull/21021 Differential Revision: D15530109 Pulled By: soumith fbshipit-source-id: 98781bdab8d82465257229e50fa3bd0015da1286	2019-05-28 21:18:50 -07:00
Sam Pepose	082936f033	Clarify cycliclr param docs (#20880 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20880 This clarifies how the momentum parameters should be used. Reviewed By: soumith Differential Revision: D15482450 fbshipit-source-id: e3649a38876c5912cb101d8e404abca7c3431766	2019-05-28 12:07:47 -07:00
Edward Yang	74bdcd44c4	Remove tab. (#20715 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20715 ghimport-source-id: 600e244581b37152d86614cca6c9fb5fee6cdcde Differential Revision: D15417984 Pulled By: ezyang fbshipit-source-id: 939425de1a95ecc3798384e121b12faaba3a27b8	2019-05-20 11:57:18 -07:00
kirayue	d0c742134d	#20028 (#20696 ) Summary: Hi, ezyang Sorry to trouble you. Pull Request resolved: https://github.com/pytorch/pytorch/pull/20696 Differential Revision: D15413694 Pulled By: ezyang fbshipit-source-id: 1c19d18e00c3a66a52bb9230aa25d7530f6e659c	2019-05-20 07:51:55 -07:00
Edward Yang	839a69f587	Revert D15393514: [pytorch][PR] Refine CosineAnnealingWarmRestarts doc for issue #20028 Differential Revision: D15393514 Original commit changeset: 03f270a577fc fbshipit-source-id: 3633f4e9916bdadf018288a64df89078b14af563	2019-05-17 09:55:56 -07:00
kirayue	3c69c9a7fe	Refine CosineAnnealingWarmRestarts doc for issue #20028 (#20267 ) Summary: Fixes #20028 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20267 Differential Revision: D15393514 Pulled By: ezyang fbshipit-source-id: 03f270a577fc3e0414d3f07d97512a409b08f7cd	2019-05-17 09:02:28 -07:00
vfdev	61f1242b7f	Formula typo fix (#20110 ) Summary: T_{cur + 1} -> T_{cur} + 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20110 Differential Revision: D15218135 Pulled By: ezyang fbshipit-source-id: fb914d977cac447867921510bf57b59e62e4f68c	2019-05-06 08:08:37 -07:00
Ricky Chen	57948414ac	Fix small typo T_mul->T_mult Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/20148 Differential Revision: D15217485 Pulled By: ezyang fbshipit-source-id: cb183cdc2eb3e42c685ef024742a18745923d283	2019-05-06 06:32:33 -07:00
barrh	767c82e151	Initialize last_epoch in _LRScheduler.__init__() (#20059 ) Summary: Class attributes should preferably be initialized explicitly within the `__init__()` call; otherwise, overriding `step()` is prone to bugs. This patch partially reverts #7889 Pull Request resolved: https://github.com/pytorch/pytorch/pull/20059 Differential Revision: D15195747 Pulled By: soumith fbshipit-source-id: 3d1a51d8c725d6f14e3e91ee94c7bc7a7d6c1713	2019-05-02 22:38:12 -07:00
kirayue	af06d6342c	Add SGDR(Stochastic Gradient Descent with Warm Restarts) scheduler (#17226 ) Summary: Because of merge error with master in #15042, open a new PR for ezyang. Pull Request resolved: https://github.com/pytorch/pytorch/pull/17226 Differential Revision: D14418145 Pulled By: mrshenli fbshipit-source-id: 099ba225b28e6aba71760b81b2153ad1c40fbaae	2019-04-25 09:26:31 -07:00
Chandler Zuo	e3f1504621	Fix the Division by Zero Bug of CosineAnnealingLR (#19180 ) Summary: Added the formula for the corner case. Updated unit tests. Fixes #17913 Pull Request resolved: https://github.com/pytorch/pytorch/pull/19180 Differential Revision: D14942023 Pulled By: ezyang fbshipit-source-id: 167c109b97a7830d5b24541dc91e4788d531feec	2019-04-23 09:54:28 -07:00
Bado Lee	36084908e4	Fix lr_scheduler's last_epoch value at the time of initialization (BC BREAKING!) (#7889 ) Summary: Hello everyone :) !! I've found that lr_scheduler was initialized with last_epoch as -1. As a result, even after the first step (the explicit scheduler step, not the one in init), the learning rate of the scheduler's optimizer remains at its previous value.
```python
>>> import torch
>>> cc = torch.nn.Conv2d(10,10,3)
>>> myinitial_lr = 0.1
>>> myoptimizer = torch.optim.Adam(cc.parameters(), lr=myinitial_lr)
>>> mylrdecay = 0.5
>>> myscheduler = torch.optim.lr_scheduler.ExponentialLR(myoptimizer,mylrdecay)
>>> myscheduler.get_lr()
[0.2]    # this is because get_lr calculates lr by 0.1 * 0.5^-1
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1    # this is not consistent with get_lr value
>>> myscheduler.last_epoch
-1
>>> myscheduler.step()
>>> myscheduler.get_lr()
[0.1]    # this should be the value right after the init, not after first step
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.1    # since this is after first step, it should have been decayed as 0.05
>>> myscheduler.last_epoch
0
>>> myscheduler.step()
>>> myscheduler.last_epoch
1
>>> myscheduler.get_lr()
[0.05]
>>> myscheduler.optimizer.param_groups[0]["lr"]
0.05
>>> myscheduler.last_epoch
1
```
The first problem is that, even right after the init of lr_scheduler, you get inconsistent parameter values. The second problem is that you are stuck with the same learning rate for the first 2 epochs if the step function of lr_scheduler is not called at the beginning of the epoch loop. Of course, you can avoid this by calling lr_scheduler's step at the beginning, but I don't think this is proper use since, in the case of the optimizer, step is called at the end of the iteration loop. I've simply avoided all of the above issues by setting last_epoch to 0 after the initialization. This also makes sense when you init with some value of last_epoch which is not -1.
For example, if you want to init with last epoch 10, lr should not be set to the value decayed 1 step further, which is what happens in the previous code, where last_epoch gets +1: `base_lr * self.gamma ** self.last_epoch`. Instead, it should be set to the exact step-10 value. I hope this fix finds its way with all your help :) I'm really looking forward & excited to become a contributor for pytorch! Pytorch Rocks!! Pull Request resolved: https://github.com/pytorch/pytorch/pull/7889 Differential Revision: D15012769 Pulled By: ezyang fbshipit-source-id: 258fc3009ea7b7390a3cf2e8a3682eafb506b08b	2019-04-23 08:54:09 -07:00
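The inconsistency reported in the commit above follows directly from the closed form `base_lr * gamma ** last_epoch` quoted in its message. A minimal arithmetic sketch, with `exponential_lr` as a hypothetical helper (not a PyTorch function), reproduces the three values seen in the transcript:

```python
def exponential_lr(base_lr, gamma, last_epoch):
    """Closed form quoted in the commit: base_lr * gamma ** last_epoch."""
    return base_lr * gamma ** last_epoch


base_lr, gamma = 0.1, 0.5  # values used in the commit's transcript

# last_epoch = -1 right after init: the formula overshoots the initial lr.
print(exponential_lr(base_lr, gamma, -1))
# last_epoch = 0: the value that should be reported right after init.
print(exponential_lr(base_lr, gamma, 0))
# last_epoch = 1: one decay step applied.
print(exponential_lr(base_lr, gamma, 1))
```

Starting from `last_epoch = 0` instead of `-1` is what keeps the reported value and the optimizer's actual learning rate in agreement after initialization.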
Edward Yang	173f224570	Turn on F401: Unused import warning. (#18598 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598 ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a Stack from [ghstack](https://github.com/ezyang/ghstack): * #18598 Turn on F401: Unused import warning. This was requested by someone at Facebook; this lint is turned on for Facebook by default. "Sure, why not." I had to noqa a number of imports in __init__. Hypothetically we're supposed to use __all__ in this case, but I was too lazy to fix it. Left for future work. Be careful! flake8-2 and flake8-3 behave differently with respect to import resolution for # type: comments. flake8-3 will report an import unused; flake8-2 will not. For now, I just noqa'd all these sites. All the changes were done by hand. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Differential Revision: D14687478 fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3	2019-03-30 09:01:17 -07:00
Søren Rasmussen	95d3825e48	ReduceLrOnPlateau: best=current -> best=copy(current) (#16364 ) (#16697 ) Summary: Fixes #16364 Pull Request resolved: https://github.com/pytorch/pytorch/pull/16697 Differential Revision: D14680879 Pulled By: soumith fbshipit-source-id: c50c22f3eacea4474fb3a04fe85fbf11d5a177c9	2019-03-29 06:56:51 -07:00
Sam Pepose	8635078d9e	Adds Cyclical Learning Rate and Momentum (#18001 ) Summary: This implements a cyclical learning rate (CLR) schedule with an optional inverse cyclical momentum. More info about CLR: https://github.com/bckenstler/CLR This is finishing what #2016 started. Resolves #1909. Pull Request resolved: https://github.com/pytorch/pytorch/pull/18001 Differential Revision: D14451845 Pulled By: sampepose fbshipit-source-id: 8f682e0c3dee3a73bd2b14cc93fcf5f0e836b8c9	2019-03-27 19:56:04 -07:00
Chandler Zuo	096ee8467c	Redefine scheduler to set learning rate using recursive formula (#14010 ) Summary: Modified step_lr for StepLR, MultiStepLR, ExponentialLR and CosineAnnealingLR. In this way, multiple schedulers can be used simultaneously to modify the learning rates. Related issue: https://github.com/pytorch/pytorch/issues/13022 Added unit tests combining multiple schedulers. Pull Request resolved: https://github.com/pytorch/pytorch/pull/14010 Reviewed By: ezyang Differential Revision: D13494941 Pulled By: chandlerzuo fbshipit-source-id: 7561270245639ba1f2c00748f8e4a5f7dec7160c	2018-12-18 16:44:31 -08:00
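The idea behind the recursive formulation in the commit above can be sketched in plain Python. This is an illustration under assumptions, not the PyTorch implementation: each scheduler is modeled as a hypothetical per-epoch multiplicative factor (`step_factor`, `exp_factor`), so that several of them can act on the same learning rate in turn, and `closed_form_step` is the non-recursive StepLR-style formula used for comparison.

```python
def step_factor(epoch, step_size, gamma):
    """Factor applied on entering `epoch` for a StepLR-like rule (assumed form)."""
    return gamma if epoch > 0 and epoch % step_size == 0 else 1.0


def exp_factor(epoch, gamma):
    """Factor applied on entering `epoch` for an ExponentialLR-like rule."""
    return gamma if epoch > 0 else 1.0


def chained_schedule(base_lr, epochs, factor_fns):
    """Apply every factor function to the current lr, epoch after epoch."""
    lr, history = base_lr, []
    for epoch in range(epochs):
        for fn in factor_fns:
            lr *= fn(epoch)
        history.append(lr)
    return history


def closed_form_step(base_lr, epoch, step_size, gamma):
    """Closed form of the StepLR-like rule on its own."""
    return base_lr * gamma ** (epoch // step_size)


# A single recursive scheduler agrees with its closed form.
single = chained_schedule(0.1, 6, [lambda e: step_factor(e, 2, 0.1)])
# Two schedulers chained: their factors simply multiply.
combined = chained_schedule(
    0.1, 6, [lambda e: step_factor(e, 2, 0.1), lambda e: exp_factor(e, 0.9)]
)
print(single)
print(combined)
```

Because each rule only rescales the current value, the combined schedule equals the product of the individual closed forms, which is what makes simultaneous use of several schedulers well defined.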
0phoff	294c065384	Changed serialization mechanism of LambdaLR scheduler (#9927 ) Summary: I opened an issue explaining some of my frustrations with the current state of schedulers. While most points that I raised in [that issue](https://github.com/pytorch/pytorch/issues/8741#issuecomment-404449697) need to be discussed more thoroughly before being implemented, there are some that are not so difficult to fix. This PR changes the way the LambdaLR scheduler gets serialized: > The lr_lambda functions are only saved if they are callable objects (which can be stateful). > There is no point in saving functions/lambdas as you need their definition before unpickling and they are stateless. This has the big advantage that the scheduler is serializable, even if you use lambda functions or locally defined functions (aka a function in a function). Does this functionality need any unit tests? Pull Request resolved: https://github.com/pytorch/pytorch/pull/9927 Differential Revision: D9055505 Pulled By: soumith fbshipit-source-id: 6c1cec588beedd098ec7d2bce6a9add27f29e48f	2018-07-31 19:39:06 -07:00
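The serialization rule quoted in the commit above (save callable objects, skip plain functions and lambdas) can be sketched as follows. The names `WarmupFactor` and `lambdas_state` are hypothetical and only illustrate the filtering idea; they are not the scheduler's actual `state_dict` code.

```python
import types


class WarmupFactor:
    """Hypothetical stateful callable: its attributes are worth saving."""

    def __init__(self, warmup_epochs):
        self.warmup_epochs = warmup_epochs

    def __call__(self, epoch):
        return min(1.0, (epoch + 1) / self.warmup_epochs)


def lambdas_state(lr_lambdas):
    """Keep state only for callable objects; functions and lambdas become None."""
    return [
        None if isinstance(fn, types.FunctionType) else dict(vars(fn))
        for fn in lr_lambdas
    ]


state = lambdas_state([lambda epoch: 0.95 ** epoch, WarmupFactor(5)])
print(state)
```

The resulting list contains only plain data, so it can be pickled even when the scheduler was built from lambdas or locally defined functions; on loading, the caller is expected to supply those functions again.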
Tongzhou Wang	27455e9c78	Use _six for inf and nan (#9500 ) Summary: Things like `float('inf')` are actually quite expensive.
```py
In [1]: import math

In [2]: %timeit -n 200 math.inf
49.3 ns ± 1.42 ns per loop (mean ± std. dev. of 7 runs, 200 loops each)

In [3]: %timeit -n 200 float('inf')
194 ns ± 39.1 ns per loop (mean ± std. dev. of 7 runs, 200 loops each)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9500 Reviewed By: soumith Differential Revision: D8876229 Pulled By: SsnL fbshipit-source-id: 78602b76bb53d5588910b58270930c0bd413d2d7	2018-07-18 10:40:29 -07:00
Ailing	5a3f7810f8	_LRSchedulers getstate include optimizer info (#7757 ) * getstate should include optimizer * remove getstate/setstate functions	2018-05-23 11:43:42 -04:00
Changhan Wang	a257bd19a2	added state_dict/load_state_dict for ReduceLROnPlateau (#7201 )	2018-05-10 12:02:28 +02:00
Richard Zou	3369828bfa	Clarify patience in ReduceLROnPlateau docs (#7242 ) * Clarify patience in ReduceLROnPlateau docs It's unclear which definition of patience we have. The two ways to interpret it are: - How many bad epochs can you see before you start considering changing the learning rate. - How many bad epochs can you see before you change the learning rate. This PR clarifies the docs with an example. If `patience = 2`, then after 2 bad epochs, we begin considering changing the learning rate. After seeing one more epoch (the 3rd epoch), if that epoch is also bad, then we change the learning rate after it. * address comments	2018-05-04 16:39:26 -04:00
Armen	e44f901b55	added functionality for state_dict/load_state_dict for lr_scheduler ( Fixes: #3026 ) (#6342 ) * added functionality for state_dict/load_state_dict for lr_scheduler * fixed linting issues/removed unused import * refactor lr_scheduler state_dicts/state_dict holds everything __dict__ but optimizer * changed documentation in lr_scheduler * Update lr_scheduler.py	2018-04-19 07:09:03 -04:00
Marcin Elantkowski	d2ff733cb1	Make ReduceLROnPlateau serializable. (#5300 ) * replace lambdas with partial * flake8	2018-02-20 00:59:14 -05:00
Martin Drawitsch	1fdb3929c9	Fixes for docstrings/sphinx rendering of CosineAnnealingLR and Local Response Normalization (#5254 ) * Fix LaTeX rendering in CosineAnnealingLR Backslashes were interpreted by Python as escapes in the string, so \frac turned into frac, which is not a valid LaTeX command. This could be fixed with double backslashes, but the easiest solution is to just use a raw (r) docstring. * Fix sphinx warnings for LRN doc headings * Move LRN docstring from __init__ to class level The docstring was not rendered by sphinx at http://pytorch.org/docs/master/nn.html#torch.nn.LocalResponseNorm because it was in the constructor. * Remove superfluous backticks from LRN formula	2018-02-15 10:29:02 -05:00
nguyen-binh-minh	188ee3ff0b	Fix wrong learning rate evaluation in CosineAnnealingLR in Python 2 (#4656 )	2018-01-14 13:10:41 +01:00
Jon Crall	f94f5723e7	fixed spelling (#4598 )	2018-01-10 18:48:14 -05:00
Richard Zou	fe70823f8e	Fix StepLR docs (#4478 )	2018-01-04 12:37:26 -05:00
Kai Arulkumaran	e9ef20eab5	Add Cosine Annealing LR Scheduler (#3311 ) * Add Cosine Annealing LR Scheduler * Update eta_min in tests to prevent numerical mistakes * Use non-zero min_eta in test_cos_anneal_lr	2017-12-18 02:43:08 -05:00
Ozan Çağlayan	dd6d04ddf2	doc: Normalize all true/false in docstrings to ``True|False`` (#3593 ) * doc: Normalize all true/false in docstrings to ``True|False`` This makes them more apparent in the documentation. * doc: fix flake8	2017-11-09 08:12:29 -05:00
Tzu-Wei Huang	6bcbecfb97	fix doc of lr_scheduler (#2280 ) * resolves #1991 * fix typo	2017-08-24 17:04:53 -04:00
Quan Vuong	c5a9aa027b	fix wrong path to ReduceLROnPlateau in docstring	2017-08-19 10:27:58 -04:00
Leonid Vlasenkov	46a868dab7	[Ready] Limit docs line length (#1900 ) * some docs are ready * docs * docs * fix some more * fix some more	2017-07-10 10:24:54 -04:00
Jiaming Liu	630af4d7d8	add learning rate schedulers (#1370 )	2017-05-25 16:21:43 -04:00
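
The patience rule spelled out in the ReduceLROnPlateau docs commit (#7242) above can be illustrated with a small plain-Python sketch. This is not PyTorch's implementation: the function name `simulate_patience`, its parameters, and the strict "loss must drop below the best seen" comparison are assumptions made for illustration only.

```python
def simulate_patience(losses, patience, factor=0.1, lr=1.0):
    """Return the learning rate in effect after each epoch.

    Hypothetical helper, not part of torch.optim.lr_scheduler.
    An epoch is "bad" when its loss does not improve on the best loss so far.
    """
    best = float("inf")
    num_bad_epochs = 0
    history = []
    for loss in losses:
        if loss < best:
            # Improvement: remember it and reset the bad-epoch counter.
            best = loss
            num_bad_epochs = 0
        else:
            num_bad_epochs += 1
        # Reduce only once the number of bad epochs exceeds patience.
        if num_bad_epochs > patience:
            lr *= factor
            num_bad_epochs = 0
        history.append(lr)
    return history


# One good epoch followed by four bad ones, with patience = 2.
rates = simulate_patience([0.5, 0.6, 0.6, 0.6, 0.6], patience=2)
print(rates)
```

With `patience=2`, the first two bad epochs leave the rate unchanged and the third bad epoch triggers the reduction, which is the reading the commit message describes.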

148 Commits