pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
zeshengzong	82dc3457e0	Add `load_state_dict` hint doc about invoke order work with lr_scheduler (#149942 ) Fixes #119168 ## Test Result ![image](https://github.com/user-attachments/assets/edb8124c-f103-475a-b903-20fbc71fdea6) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149942 Approved by: https://github.com/janeyx99 Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>	2025-05-15 01:07:36 +00:00
Aaron Gokaslan	f05b38aa26	[BE]: Improve decorator typing for Optimizer subclasses (#153374 ) Improves typing so that all the optimizer subclasses (which all of them that subtype step) do not erase their type signature when this decorator is used. Now *kwarg values and returns will propogate This complements @tsunghsienlee PR #153367 as the type signature of step() was being erased on all the optimizer subclasses by this untyped decorator Pull Request resolved: https://github.com/pytorch/pytorch/pull/153374 Approved by: https://github.com/janeyx99, https://github.com/tsunghsienlee	2025-05-12 22:55:25 +00:00
Tsung-Hsien Lee	ea4b65ab60	Fix the type hint of `step()` with default value (#153367 ) Summary: Because the default value of `closure` is `None`, this fixes the situation when `step()`. The previous typing (https://github.com/pytorch/pytorch/pull/102593) could only be used as `step(closure=None)` and `step(None)`. Test Plan: contbuild & OSS CI Differential Revision: D74560785 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153367 Approved by: https://github.com/cyyever, https://github.com/Skylion007, https://github.com/janeyx99	2025-05-12 15:52:59 +00:00
Jacobgoss30	6a8006472e	Fix doc cosineannealinglr 152081 (#152936 ) ## Summary This PR updates the docstring for `CosineAnnealingLR` to accurately reflect its recursive learning rate schedule. The previous docstring displayed only the SGDR closed-form expression, which doesn't match the actual recursive implementation in code. Changes: - Added the recursive update formula used in `get_lr()` - Retained the original closed-form SGDR expression for reference - Clarified that warm restarts are not implemented in this scheduler This addresses confusion raised in issue #152081. ## Related issue [#152081](https://github.com/pytorch/pytorch/issues/152081) ## Testing Doc-only change. Ran pre-commit to verify formatting. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152936 Approved by: https://github.com/janeyx99	2025-05-08 17:25:30 +00:00
Jane Xu	3bc69cc08d	Document that dampening is skipped in SGD momentum first step (#152833 ) Pointed out by https://x.com/hi_tysam/status/1917318692276174977/photo/2. It would be BC breaking to change this behavior 7 years after it has been decided, so we are documenting it first at the very least. <img width="642" alt="image" src="https://github.com/user-attachments/assets/3febcb07-e0ed-44a1-bd3b-a8e685711cb4" /> Pull Request resolved: https://github.com/pytorch/pytorch/pull/152833 Approved by: https://github.com/albanD	2025-05-05 20:07:23 +00:00
kyo	a21090a38c	Fix incorrect citation of authors in documentation (#145209 ) This PR corrects the citation of Adafactor authors "Noam Shazeer" and "Mitchell Stern" in the documentation. The current text incorrectly lists them as "Shazeer, Noam, and Mitchell Stern," which seems to be a result of a data parsing issue of some reference manager(s) [as you can find many papers with the same issue](https://www.google.com/search?q=%22Shazeer%2C+Noam%2C+and+Mitchell+Stern%22). The updated citation follows standard conventions for author names. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145209 Approved by: https://github.com/janeyx99	2025-05-05 17:45:05 +00:00
dscamiss	9a9cc48c65	Update SGD documentation to match implementation (#149884 ) Fixes #149476 This PR updates the pseudocode description of the SGD optimizer to better match the implementation. Updated pseudocode: ![image](https://github.com/user-attachments/assets/2d7bc618-0408-4909-b835-af6465736918) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149884 Approved by: https://github.com/janeyx99	2025-05-05 16:06:17 +00:00
zeshengzong	eb69f4e609	Add lr_lambda type check in MultiplicativeLR (#151973 ) Fixes #81554 ## TestResult ### Before ```python In [3]: import torch ...: class SimpleLinearModel(torch.nn.Module): ...: def __init__(self): ...: super(SimpleLinearModel, self).__init__() ...: self.linear = torch.nn.Linear(10, 1) ...: ...: def forward(self, x): ...: return self.linear(x) ...: ...: net = SimpleLinearModel() ...: optimizer = torch.optim.Adam(net.parameters(), lr=0.01) ...: scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer, 0.95) ...: for i in range(10): ...: print(i, scheduler.get_last_lr()) ...: scheduler.step() TypeError: 'float' object is not callable ``` ### After ```python ...: scheduler = torch.optim.lr_scheduler.MultiplicativeLR(optimizer, 0.95) TypeError: lr_lambda should be a function, but got float ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/151973 Approved by: https://github.com/janeyx99	2025-04-29 08:21:41 +00:00
zeshengzong	c81d8c231c	Fix CosineAnnealingWarmRestarts reset T_cur (#151289 ) Fixes #88791 ## Test Result ```python pytest test/optim/test_lrscheduler.py -k test_CosineAnnealingWarmRestarts ``` ![image](https://github.com/user-attachments/assets/75ad238c-f319-47dc-bf2d-da05b0879b84) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151289 Approved by: https://github.com/janeyx99	2025-04-28 23:02:55 +00:00
Anthony Shoumikhin	7cae7902a2	Add scripts to check xrefs and urls (#151844 ) Traverses the docs and code to find any broken links Pull Request resolved: https://github.com/pytorch/pytorch/pull/151844 Approved by: https://github.com/huydhn	2025-04-28 09:30:07 +00:00
Jane Xu	dccc41581a	Include other accelerators in capturable docstr for optimizers (#149770 ) Fixes #149722 @ILCSFNO is this better? Pull Request resolved: https://github.com/pytorch/pytorch/pull/149770 Approved by: https://github.com/albanD	2025-04-24 20:38:42 +00:00
zeshengzong	25803d3a22	Optimize typing in `lr_scheduler.py` (#151219 ) ## Changes - Add typing annotation in `lr_scheduler.py` ## Test Result ```bash pytest test/optim/test_lrscheduler.py -vv ``` ![image](https://github.com/user-attachments/assets/34a91965-ff3a-462a-9ab0-b46ad4b290e9) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151219 Approved by: https://github.com/janeyx99	2025-04-15 01:00:13 +00:00
zeshengzong	5eebcb991a	Add scripts to generate plots of LRSchedulers (#149189 ) Fixes #92007 ## Changes - Add script to generate plots for `lr_scheduler` - Add plots to `lr_scheduler` docs - Add example section if it missing in `lr_scheduler` docs ## Test Result ### LambdaLR ![image](https://github.com/user-attachments/assets/37fc0894-e2ec-48f2-a2d6-3514e51e1ea2) ### MultiplicativeLR ![image](https://github.com/user-attachments/assets/2122b3a0-a4ce-42c7-bb45-559c1fc73e0f) ### StepLR ![image](https://github.com/user-attachments/assets/47bc9d96-4b60-4586-a000-f213583bbe8f) ### MultiStepLR ![image](https://github.com/user-attachments/assets/c822b849-d5be-4b94-aa7a-0017a2c9ff15) ### ConstantLR ![image](https://github.com/user-attachments/assets/83107cdd-7b00-44a6-b09d-e8ee849b4a12) ### LinearLR ![image](https://github.com/user-attachments/assets/60190105-691a-4101-8966-5b0c396093a4) ### ExponentialLR ![image](https://github.com/user-attachments/assets/dfcbcbca-89e5-4a2f-b1bd-33e25d2405ec) ### PolynomialLR ![image](https://github.com/user-attachments/assets/7c3d4fce-c846-40a0-b62e-f3e81c7e08bd) ### CosineAnnealingLR ![image](https://github.com/user-attachments/assets/26712769-dde9-4faa-b61b-e23c51daef50) ### ChainedScheduler ![image](https://github.com/user-attachments/assets/20734a8b-e939-424f-b45a-773f86f020b1) ### SequentialLR ![image](https://github.com/user-attachments/assets/2cd3ed67-2a0a-4c42-9ad2-e0be090d3751) ### ReduceLROnPlateau ![image](https://github.com/user-attachments/assets/b77f641e-4810-450d-b2cd-8b3f134ea188) ### CyclicLR ![image](https://github.com/user-attachments/assets/29b8666f-41b3-45e4-9159-6929074e6108) ### OneCycleLR ![image](https://github.com/user-attachments/assets/d5b683ef-41e8-4ca8-9fe8-0f1e6b433866) ### CosineAnnealingWarmRestarts ![image](https://github.com/user-attachments/assets/1d45ea80-dea8-494d-a8ab-e9cfc94c55d6) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149189 Approved by: https://github.com/janeyx99	2025-04-14 09:53:38 +00:00
zeshengzong	304633152c	Clean up duplicated code in lr_scheduler (#150984 ) ## Changes - Remove duplicated code in `ReduceLROnPlateau` - Remove redundant `noqa` comment ## Test Result ```bash pytest test/optim/test_lrscheduler.py ``` ![image](https://github.com/user-attachments/assets/37f91f31-0e77-4abf-9dd1-75538c0f0792) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150984 Approved by: https://github.com/janeyx99	2025-04-13 09:18:50 +00:00
Isalia20	49f6cce736	[MPS] grad scaler (#150255 ) Fixes #142397 Basic implementation is done. What's left: - [x] Different dtype/device tensors in the TensorList - [x] fast path for grouping the foreach kernel - [x] Tests Regarding tests, I found some tests in `test/test_torch.py` for GradScaler but I couldn't figure out what is the best way to enable the test for MPS device. By removing `@onlyNativeDeviceTypes`, one enables the tests for MPS but also enables tests for all other devices which are not included in the native device types. If I put: `instantiate_device_type_tests(TestTorchDeviceType, globals(), allow_mps=True)` This enables lots of tests in that class for MPS which were not(?) being tested before? This part needs some clarification Pull Request resolved: https://github.com/pytorch/pytorch/pull/150255 Approved by: https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>	2025-04-06 17:06:55 +00:00
Tony-Y	78715a181f	Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674 ) Fixes #145461 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145674 Approved by: https://github.com/janeyx99 Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>	2025-03-17 23:07:05 +00:00
zeshengzong	fb1b7ec173	Remove deprecate method and attirbute in `LRScheduler` (#147301 ) Following [#99270 suggestion](https://github.com/pytorch/pytorch/issues/99270#issuecomment-1511656408), remove deprecate method `LRScheduler.print_lr` Pull Request resolved: https://github.com/pytorch/pytorch/pull/147301 Approved by: https://github.com/janeyx99	2025-03-05 05:30:19 +00:00
zeshengzong	c0ee62573a	[Easy][optim] Add LBFGS params optional desc (#147579 ) [LBFGS docs](https://pytorch.org/docs/stable/generated/torch.optim.LBFGS.html#torch.optim.LBFGS) missing `optional` description for params in compare with other optimizer docs, like [Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html) ## Test Result ### Before ![image](https://github.com/user-attachments/assets/34877490-16b4-4c68-bf6c-405bae563352) ### After ![image](https://github.com/user-attachments/assets/7fba94c8-7091-47b8-bdf1-ca7d779a027f) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147579 Approved by: https://github.com/janeyx99	2025-02-21 19:38:10 +00:00
PyTorch MergeBot	302f56a1f2	Revert "Fix non-bitwise type annotations for Tensor operators (see #145838 ) (#146845 )" This reverts commit `59b7e52ad8`. Reverted https://github.com/pytorch/pytorch/pull/146845 on behalf of https://github.com/jeanschmidt due to Seems to break a few code dependencies in multiple places ([comment](https://github.com/pytorch/pytorch/pull/146845#issuecomment-2666656834))	2025-02-18 19:01:27 +00:00
Tom Ritchford	59b7e52ad8	Fix non-bitwise type annotations for Tensor operators (see #145838 ) (#146845 ) Fix https://github.com/pytorch/pytorch/issues/145838 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146845 Approved by: https://github.com/Skylion007	2025-02-17 22:42:16 +00:00
Aaron Gokaslan	292af3cc89	[BE][Ez]: ISC001 Auto concatenate implicit one line strings (#146408 ) Apply ruff rule about implicit string concatenation, this autofixes strings that are all the same type and on the same line. These lines are broken up likely as the result of autoformatters in the past. All fixes are automated using the autofixes in ISC001. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146408 Approved by: https://github.com/justinchuby, https://github.com/janeyx99	2025-02-04 19:07:04 +00:00
Aaron Orenstein	7178b827d7	PEP585: Missed conversions (#145342 ) Differential Revision: [D68785969](https://our.internmc.facebook.com/intern/diff/D68785969) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145342 Approved by: https://github.com/bobrenjc93	2025-01-29 05:24:36 +00:00
Aaron Orenstein	0afd335174	PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175 Approved by: https://github.com/bobrenjc93	2025-01-21 16:57:27 +00:00
PyTorch MergeBot	5fd881a5b6	Revert "PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175 )" This reverts commit `54a00af2c6`. Reverted https://github.com/pytorch/pytorch/pull/145175 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to break some trunk tests ([comment](https://github.com/pytorch/pytorch/pull/145175#issuecomment-2603418267))	2025-01-21 00:49:55 +00:00
Aaron Orenstein	54a00af2c6	PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145175 Approved by: https://github.com/bobrenjc93	2025-01-20 22:32:59 +00:00
Jane Xu	3908be676c	Fix loading older state_dict into AdamW after refactor (#144972 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144972 Approved by: https://github.com/albanD	2025-01-16 19:50:31 +00:00
Jane Xu	e32d2bf853	Document decoupled_weight_decay for Adam for consistency with N/RAdam (#144984 ) Followup from #144972 and #143710 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144984 Approved by: https://github.com/albanD	2025-01-16 18:58:29 +00:00
PyTorch MergeBot	154185dcd0	Revert "Removed unused _RequiredParameter (#144771 )" This reverts commit `6a5f895e54`. Reverted https://github.com/pytorch/pytorch/pull/144771 on behalf of https://github.com/malfet due to It broke number of cpuinductor tests ([comment](https://github.com/pytorch/pytorch/pull/144771#issuecomment-2593293542))	2025-01-15 15:51:33 +00:00
Piergiacomo De Marchi	6a5f895e54	Removed unused _RequiredParameter (#144771 ) As per this [discussion](https://discuss.pytorch.org/t/a-question-about-requiredparameter/137977), I figured that `_RequiredParameter` is no longer used. The `required` object was initially introduced in this [PR](`4db6667923`) as the `SGD` optimizer did not offer a default value for the learning rate. However there isn't a single place in the code base using `_RequiredParameter`, nor `required`. I am therefore removing unused `_RequiredParameter` and `required`. Everything not included in this PR is Not a Contribution. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144771 Approved by: https://github.com/janeyx99	2025-01-15 04:11:17 +00:00
Aaron Orenstein	45ef3309e3	[BE] typing for decorators (#144161 ) Summary: Untyped decorators strip annotations from the decorated items. - _compile - _inductor/fx_passes/post_grad - _inductor/lowering - _library/custom_ops - _meta_registrations - _ops - _refs/nn/functional - ao/quantization/quantizer/xnnpack_quantizer_utils - distributed/_composable/contract - fx/experimental/graph_gradual_typechecker - fx/experimental/migrate_gradual_types/constraint_generator - optim/optimizer - signal/windows/windows - testing/_internal/common_device_type - torch/_inductor/decomposition - utils/flop_counter Test Plan: unit tests Differential Revision: D62302684 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144161 Approved by: https://github.com/Skylion007, https://github.com/albanD	2025-01-04 16:40:09 +00:00
Jane Xu	7b69f7b449	Clarify what we mean by decoupled weight decay in the *AdamWs (#144101 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144101 Approved by: https://github.com/albanD	2025-01-03 19:06:00 +00:00
emmettbicker	92d8965082	Adding support for differentiable lr, weight_decay, and betas in Adam/AdamW (#143726 ) Third PR in a series of PRs to broaden differentiable optimizer support w/ @janeyx99 (sorry for pinging over the holidays! I just wanted to put this one out but I am definitely not asking for review or anything like that rn) This is also going to probably be my last PR before the holidays! Note: This is a branch of #143710 -- I've never worked on a branch of a branch before so I wasn't sure about the protocol so I thought I'd just made the PR and wait until that one gets merged. This is adding support for differentiable lr, weight_decay, and betas to Adam and AdamW (but after refactoring AdamW into an Adam subclass, it's really just changing code in torch/optim/adam.py) I had one main thing I was wondering about, which is that adam already has a differentiable flag built in, so I have code like this ```py if differentiable and isinstance(beta2, Tensor): if beta2.requires_grad: exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj().mul(1 - beta2)) else: exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj(), value=1 - beta2) else: exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj(), value=1 - beta2) ``` That I could definitely simplify to just ```py if differentiable and isinstance(beta2, Tensor): exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj().mul(1 - beta2)) else: exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj(), value=1 - beta2) ``` It would definitely be a little slower in the case that it's differentiable but doesn't need a grad for beta2, but the code would also be a lot more clear and I'm debating speed vs future code usability. Also the line in the above example: ```py exp_avg_sq.mul_(beta2).addcmul_(grad, grad.conj().mul(1 - beta2)) ``` was concerning to me because it is considerably more expensive than `value=1 - beta2`, but I couldn't think of a better way to do it. Further work on #141832 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143726 Approved by: https://github.com/janeyx99	2024-12-30 01:11:57 +00:00
Emmett Bicker	0de661dc27	Add support for differentiable weight decay (#143679 ) (Actual) second PR in a larger project to broaden support for differentiable optimizers with @janeyx99! In this PR, I did a lot of pattern matching from the previous PR to add support for differentiable weight_decay. And also added a single new line on line 359 (previously line 352) to make the code from the last PR a little easier to read Continuation of progress on #141832 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143679 Approved by: https://github.com/janeyx99 Co-authored-by: Jane (Yuan) Xu <31798555+janeyx99@users.noreply.github.com>	2024-12-27 23:14:43 +00:00
emmettbicker	6ccb8ed186	Refactor AdamW into Adam (heavily inspired by tfsingh) (#143710 ) Fixes #104899 Refactors AdamW into Adam by making AdamW a subclass of Adam. Additionally adds a test to assert that the added parameter `decoupled_weight_decay` is True in AdamW and also updates test_defaults_changed_to_foreach to account for the differences in module location for AdamW. Heavily heavily inspired by #118857 by @tfsingh Pull Request resolved: https://github.com/pytorch/pytorch/pull/143710 Approved by: https://github.com/janeyx99	2024-12-23 23:27:28 +00:00
emmettbicker	0b2c47962c	Add support for differentiable LR in SGD + test v2.0 (#143510 ) Second PR in a larger project to broader support for differentiable optimizers with @janeyx99 ! The first one had an issue near the end so this is the second PR on that subject. See #143122 for the development up until this point. Pull Request resolved: https://github.com/pytorch/pytorch/pull/143510 Approved by: https://github.com/janeyx99	2024-12-19 21:04:44 +00:00
Tony-Y	61a835ec53	Corrected description of AMSGrad algorithm (#142351 ) Fixes #142323 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142351 Approved by: https://github.com/janeyx99	2024-12-19 16:24:19 +00:00
Fabian Keller	5e8e1d725a	Remove some unused type ignores (round 1) (#142325 ) Over time, a large number of the existing type ignores have become irrelevant/unused/dead as a result of improvements in annotations and type checking. Having these `# type: ignore` linger around is not ideal for two reasons: - They are verbose/ugly syntatically. - They could hide genuine bugs in the future, if a refactoring would actually introduce a bug but it gets hidden by the ignore. I'm counting over 1500 unused ignores already. This is a first PR that removes some of them. Note that I haven't touched type ignores that looked "conditional" like the import challenge mentioned in https://github.com/pytorch/pytorch/pull/60006#issuecomment-2480604728. I will address these at a later point, and eventually would enable `warn_unused_ignores = True` in the mypy configuration as discussed in that comment to prevent accumulating more dead ignores going forward. This PR should have no effect on runtime at all. Pull Request resolved: https://github.com/pytorch/pytorch/pull/142325 Approved by: https://github.com/Skylion007, https://github.com/janeyx99	2024-12-09 18:23:46 +00:00
Xuehai Pan	e1196dfe51	Deprecate `torch._utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007, https://github.com/malfet	2024-12-08 22:55:36 +00:00
UV	7597ab6370	Corrected AMSGrad max equation in Adam and AdamW (#142051 ) Fixes #142041 Pull Request resolved: https://github.com/pytorch/pytorch/pull/142051 Approved by: https://github.com/janeyx99	2024-12-06 21:55:26 +00:00
Aaron Gokaslan	08db735629	[BE]: Update mypy to 1.13.0 (#140808 ) Update mypy to 1.13.0 . Should hopefully reduce linting time. Has support for orjson cache serialization which should improve mypy cache perf if orjson is installed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808 Approved by: https://github.com/ezyang, https://github.com/malfet	2024-12-03 02:50:10 +00:00
PyTorch MergeBot	daa77f3d9f	Revert "[BE]: Update mypy to 1.13.0 (#140808 )" This reverts commit `00134d68af`. Reverted https://github.com/pytorch/pytorch/pull/140808 on behalf of https://github.com/huydhn due to This is failing a distributed test in trunk, target determination missed this test and did not run it on PR ([comment](https://github.com/pytorch/pytorch/pull/140808#issuecomment-2512788426))	2024-12-02 20:47:43 +00:00
Aaron Gokaslan	00134d68af	[BE]: Update mypy to 1.13.0 (#140808 ) Update mypy to 1.13.0 . Should hopefully reduce linting time. Has support for orjson cache serialization which should improve mypy cache perf if orjson is installed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/140808 Approved by: https://github.com/ezyang, https://github.com/malfet	2024-12-02 18:47:54 +00:00
Michael Lazos	1fd4757fdc	Support tensor betas in Adam and AdamW (#134171 ) Adds support for beta1 and beta2 to be wrapped in tensor for Adam and AdamW. Fixes https://github.com/pytorch/pytorch/issues/133898 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134171 Approved by: https://github.com/janeyx99	2024-11-15 21:55:55 +00:00
Masaki Kozuki	6a368b3fc5	Add ScalarList overload to `_foreach_lerp` (#134482 ) Related: - https://github.com/pytorch/pytorch/issues/133367 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134482 Approved by: https://github.com/janeyx99	2024-11-12 19:03:41 +00:00
Masaki Kozuki	71d8bb7ede	implement `torch._foreach_rsqrt` (#134574 ) Related: - #133367 c Pull Request resolved: https://github.com/pytorch/pytorch/pull/134574 Approved by: https://github.com/eqy, https://github.com/janeyx99	2024-11-12 15:34:35 +00:00
PyTorch MergeBot	1d28b8b6d5	Revert "Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 )" This reverts commit `e84d1121ad`. Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally. More details in D65483292 ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2458381056))	2024-11-05 23:10:38 +00:00
Xuehai Pan	e84d1121ad	Deprecate `torch._utils.is_compiling()` and `torch._dynamo.external_utils.is_compiling()` (#127690 ) This PR is split from PR #126898. - #126898 ------ Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690 Approved by: https://github.com/Skylion007, https://github.com/malfet	2024-11-05 10:44:56 +00:00
bskrlj	8e27833e30	Ensure SWA boundary conditions w.r.t. definition (#133773 ) According to the documentation, decay is a number in [0,1] range,[ i.e.](https://pytorch.org/docs/stable/optim.html) ``` Decay is a parameter between 0 and 1 that controls how fast the averaged parameters are decayed. If not provided to get_ema_multi_avg_fn, the default is 0.999. ``` An inspection of `swa_utils.py` indicates there are no checks for invalid values of `decay`. Adding asserts as suggested in this PR ensures valid compute range (one way to enforce correct behavior, there are perhaps more suitable ones). Papers `torch` cites for reference idea/implementation also consider exclusively this range (e.g., https://arxiv.org/pdf/2310.04415). Fixes https://github.com/pytorch/pytorch/issues/133772 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133773 Approved by: https://github.com/janeyx99	2024-10-31 18:24:08 +00:00
Tom Ritchford	c0582fd0f8	Remove unused Python variables in torch/[b-z]* (#136963 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136963 Approved by: https://github.com/ezyang	2024-10-19 16:45:22 +00:00
Matt Pitkin	8a5dd7f59b	Allow SequentialLR to include ChainedScheduler (#133450 ) This fixes #132745 and allows a `SequentialLR` to include schedulers that are compound scheduler types (i.e., a `ChainedScheduler`), which contain a list of schedulers in a `_schedulers` attribute. Pull Request resolved: https://github.com/pytorch/pytorch/pull/133450 Approved by: https://github.com/janeyx99	2024-10-18 02:29:38 +00:00
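
Note on 3bc69cc08d (#152833): the first-step behaviour described in that commit message can be illustrated with a scalar sketch. This is not the `torch.optim.SGD` implementation. `sgd_momentum_buffers` and its arguments are hypothetical names, and weight decay, Nesterov momentum and maximization are left out. It assumes, as the commit message states, that dampening is skipped when the momentum buffer is first created.

```python
def sgd_momentum_buffers(grads, lr=0.5, momentum=0.5, dampening=0.75, param=0.0):
    """Scalar sketch of SGD with momentum and dampening (hypothetical helper).

    Returns the final parameter value and the momentum buffer after each step.
    """
    buf = None
    history = []
    for grad in grads:
        if buf is None:
            # First step: the buffer is seeded with the raw gradient,
            # so the (1 - dampening) factor is NOT applied here.
            buf = grad
        else:
            # Later steps: the gradient is scaled by (1 - dampening).
            buf = momentum * buf + (1 - dampening) * grad
        param -= lr * buf
        history.append(buf)
    return param, history


final_param, buffers = sgd_momentum_buffers([1.0, 1.0])
print(buffers)      # buffer after step 1 and after step 2
print(final_param)  # parameter after both updates
```

With a constant gradient of 1.0, the first buffer equals the gradient itself rather than `(1 - dampening) * grad`, which is the asymmetry the commit documents.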

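Note on 8e27833e30 (#133773): the range check on `decay` can be sketched in plain Python. `make_ema_update` is a hypothetical helper, not the `get_ema_multi_avg_fn` function from `swa_utils.py`. The commit message mentions asserts; raising `ValueError` here is a substitution made for this sketch. The update rule assumed is the usual exponential moving average, `decay * avg + (1 - decay) * new`.

```python
def make_ema_update(decay=0.999):
    """Return a scalar EMA update function (hypothetical helper).

    Rejects decay values outside the closed interval [0, 1].
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError(f"decay must be in [0, 1], got {decay}")

    def update(avg, new):
        # decay = 1 keeps the old average; decay = 0 replaces it with `new`.
        return decay * avg + (1.0 - decay) * new

    return update


ema = make_ema_update(decay=0.5)
print(ema(2.0, 4.0))  # average of the old value and the new value

try:
    make_ema_update(decay=1.5)
except ValueError as err:
    print(err)
```

Validating at construction time means an invalid `decay` fails before any averaging happens, rather than silently producing diverging averages.
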
720 Commits