mirror of
https://github.com/zebrajr/pytorch.git
synced 2025-12-07 12:21:27 +01:00
Summary: ### 🚀 The feature, motivation and pitch Following the discussion in https://github.com/pytorch/pytorch/issues/65813, I added QR factorization to powerSGD_hook.py. Gram-Schmidt orthogonalization can't be fully replaced because _torch.linalg.qr_ doesn't work with half-precision. Moreover, in my tests, Gram-Schmidt is faster when the rank is less than 3. A sample experiment timed powerSGD_hook on ResNext101 with the two methods (figure not included). ### Alternatives Use _torch.orgqr(*torch.geqrf(matrix))_. In my tests, its performance is similar to that of _torch.linalg.qr_. Pull Request resolved: https://github.com/pytorch/pytorch/pull/72043 Reviewed By: albanD Differential Revision: D34042781 Pulled By: cbalioglu fbshipit-source-id: e331179d3b7ac40d445b651fc473b16ae4ead462 |
||
|---|---|---|
| _checkpoint | ||
| _optimizer_overlap | ||
| ddp_comm_hooks | ||
| model_averaging | ||
| quantization | ||
| __init__.py | ||
| join.py | ||
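
The commit summary describes choosing between Gram-Schmidt orthogonalization and QR factorization depending on dtype and rank. A minimal sketch of that choice, assuming a single 2-D tensor, is shown below. The function names `gram_schmidt_` and `orthogonalize_` are hypothetical and are not taken from powerSGD_hook.py; treating `bfloat16` like half-precision is also an assumption. Only the rank threshold (less than 3) and the half-precision limitation come from the summary.

```python
import torch


def gram_schmidt_(matrix: torch.Tensor, epsilon: float = 0.0) -> None:
    """Orthonormalize the columns of a 2-D tensor in place (hypothetical helper)."""
    num_cols = matrix.shape[1]
    for i in range(num_cols):
        col = matrix[:, i : i + 1]  # view of the i-th column, shape (n, 1)
        # Normalize the column; epsilon guards against division by zero.
        col /= torch.norm(col) + epsilon
        if i + 1 < num_cols:
            rest = matrix[:, i + 1 :]  # view of the remaining columns
            # Subtract from each remaining column its projection onto `col`.
            rest -= torch.sum(col * rest, dim=0) * col


def orthogonalize_(matrix: torch.Tensor, epsilon: float = 0.0) -> None:
    """Pick an orthogonalization method by dtype and rank (illustrative only)."""
    rank = matrix.shape[1]
    # Assumption: bfloat16 is handled like float16 here.
    low_precision = matrix.dtype in (torch.float16, torch.bfloat16)
    if low_precision or rank < 3:
        # Per the summary: QR is unavailable in half precision, and
        # Gram-Schmidt was measured to be faster for ranks below 3.
        gram_schmidt_(matrix, epsilon)
    else:
        q, _ = torch.linalg.qr(matrix)  # reduced QR by default
        matrix.copy_(q)
```

Calling `orthogonalize_(m)` on a tall float32 tensor `m` overwrites it with a matrix whose columns are orthonormal, whichever branch is taken.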