pytorch/torch/optim
Latest commit: `49f6cce736` by Isalia20, [MPS] grad scaler (#150255)
Fixes #142397

Basic implementation is done. What's left:
- [x] Different dtype/device tensors in the TensorList
- [x] fast path for grouping the foreach kernel
- [x] Tests
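
As a minimal usage sketch of what this enables (the model, optimizer, and shapes here are illustrative; it assumes an MPS-capable build where `torch.backends.mps.is_available()` is true):

```python
import torch

# Illustrative mixed-precision training step using the grad scaler on MPS.
device = "mps"
model = torch.nn.Linear(8, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.amp.GradScaler(device)

for _ in range(3):
    optimizer.zero_grad()
    with torch.autocast(device_type=device, dtype=torch.float16):
        loss = model(torch.randn(4, 8, device=device)).sum()
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
    scaler.step(optimizer)         # unscales grads; skips the step if infs/NaNs are found
    scaler.update()                # adjust the scale factor for the next iteration
```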

Regarding tests, I found existing GradScaler tests in `test/test_torch.py`, but I couldn't figure out the best way to enable them for the MPS device.

Removing `@onlyNativeDeviceTypes` enables the tests for MPS, but it also enables them for every other device type outside the native set. Alternatively, adding:
`instantiate_device_type_tests(TestTorchDeviceType, globals(), allow_mps=True)`

enables many tests in that class for MPS which, as far as I can tell, were not being run before. This part needs some clarification.
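
For concreteness, a sketch of the two options above, using the helpers from `torch.testing._internal.common_device_type` (the test name and body here are placeholders):

```python
from torch.testing._internal.common_device_type import (
    instantiate_device_type_tests,
    onlyNativeDeviceTypes,
)
from torch.testing._internal.common_utils import TestCase


class TestTorchDeviceType(TestCase):
    # Option 1: dropping @onlyNativeDeviceTypes from this test would
    # instantiate it for MPS, but also for every other registered
    # device type outside the native set.
    @onlyNativeDeviceTypes
    def test_grad_scaling_placeholder(self, device):
        ...


# Option 2: allow_mps=True opts the whole class into MPS, which also
# instantiates every other test in the class for the MPS device.
instantiate_device_type_tests(TestTorchDeviceType, globals(), allow_mps=True)
```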

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150255
Approved by: https://github.com/malfet

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Committed: 2025-04-06 17:06:55 +00:00
| File | Last commit | Date |
|------|-------------|------|
| `_multi_tensor` | Optim package docstring fix (#129086) | 2024-06-21 14:30:53 +00:00 |
| `__init__.py` | [BE][optim] Make pyright recognize exported symbols (#135043) | 2024-09-04 21:53:46 +00:00 |
| `_adafactor.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `_functional.py` | PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175) | 2025-01-21 16:57:27 +00:00 |
| `adadelta.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `adagrad.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `adam.py` | [MPS] grad scaler (#150255) | 2025-04-06 17:06:55 +00:00 |
| `adamax.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `adamw.py` | PEP585 update - torch/nn torch/optim torch/package torch/profiler torch/serialization torch/sparse torch/xpu (#145175) | 2025-01-21 16:57:27 +00:00 |
| `asgd.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `lbfgs.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `lr_scheduler.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `nadam.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `optimizer.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `radam.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `rmsprop.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `rprop.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `sgd.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `sparse_adam.py` | Convert Tensor lr to 0-dim as needed for the optimizer to normally work (#145674) | 2025-03-17 23:07:05 +00:00 |
| `swa_utils.py` | Revert "Fix non-bitwise type annotations for Tensor operators (see #145838) (#146845)" | 2025-02-18 19:01:27 +00:00 |