pytorch/torch/distributed/algorithms
Omar 25f9fe22a9 [PowerSGD] Add orthogonalization with QR factorization (#72043)
Summary:
### 🚀 The feature, motivation and pitch
Following the discussion in https://github.com/pytorch/pytorch/issues/65813, I added QR factorization to powerSGD_hook.py.
Gram-Schmidt orthogonalization can't be fully replaced because _torch.linalg.qr_ doesn't work with half precision. Moreover, in my tests, Gram-Schmidt is faster when the rank is lower than 3.
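
The selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in powerSGD_hook.py: the function names `orthogonalize` and `orthogonalize_gram_schmidt`, the `epsilon` parameter, and the exact threshold (`<= 2` columns, read from "rank lower than 3") are assumptions made for this example.

```python
import torch


def orthogonalize_gram_schmidt(matrix: torch.Tensor, epsilon: float = 0.0) -> torch.Tensor:
    """Orthonormalize the columns of a 2-D matrix in place (modified Gram-Schmidt)."""
    num_cols = matrix.shape[-1]
    for i in range(num_cols):
        # `col` is a view, so the in-place division updates `matrix` itself.
        col = matrix[:, i : i + 1]
        col /= torch.norm(col) + epsilon
        if i + 1 < num_cols:
            # Remove the component along `col` from all remaining columns.
            rest = matrix[:, i + 1 :]
            rest -= torch.sum(col * rest, dim=0) * col
    return matrix


def orthogonalize(matrix: torch.Tensor, epsilon: float = 0.0) -> torch.Tensor:
    """Pick an orthogonalization method (hypothetical dispatch for illustration)."""
    half_precision = matrix.dtype in (torch.float16, torch.bfloat16)
    # Assumption: "rank lower than 3" is taken to mean at most 2 columns.
    if matrix.shape[-1] <= 2 or half_precision:
        return orthogonalize_gram_schmidt(matrix.clone(), epsilon)
    # torch.linalg.qr defaults to the reduced mode, so q has the input's shape
    # when the matrix is tall (rows >= columns).
    q, _ = torch.linalg.qr(matrix)
    return q
```

For a tall float32 matrix, `orthogonalize(torch.randn(1024, 4))` would take the QR path, while `orthogonalize(torch.randn(1024, 2))` would fall back to Gram-Schmidt.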

This sample experiment times powerSGD_hook on ResNeXt101 with the two methods:
![Screenshot from 2022-01-31 18-14-00](https://user-images.githubusercontent.com/42100908/151840929-270c67dd-9fe7-4f11-8e70-8bf2d0ba678d.png)

### Alternatives
Use _torch.orgqr(*torch.geqrf(matrix))_. In my tests, its performance is similar to that of _torch.linalg.qr_.
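
A small sketch of this alternative, for comparison with _torch.linalg.qr_. The helper name `qr_via_geqrf` and the matrix shape are assumptions chosen for illustration; no timing is measured here.

```python
import torch


def qr_via_geqrf(matrix: torch.Tensor) -> torch.Tensor:
    """Return the orthonormal factor Q built from the Householder form of QR."""
    # torch.geqrf returns the Householder reflectors and their scaling factors;
    # torch.orgqr assembles the explicit Q matrix from them.
    return torch.orgqr(*torch.geqrf(matrix))


torch.manual_seed(0)
matrix = torch.randn(64, 4)

q_geqrf = qr_via_geqrf(matrix)
q_linalg, _ = torch.linalg.qr(matrix)

# Both factors have orthonormal columns and span the same column space,
# so their orthogonal projectors should agree up to rounding error.
same_span = torch.allclose(q_geqrf @ q_geqrf.T, q_linalg @ q_linalg.T, atol=1e-5)
```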

### Additional context
_No response_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72043

Reviewed By: albanD

Differential Revision: D34042781

Pulled By: cbalioglu

fbshipit-source-id: e331179d3b7ac40d445b651fc473b16ae4ead462
(cherry picked from commit f64bf3839a)
2022-02-07 21:15:40 +00:00
..
_checkpoint [FSDP/Checkpoint] Activation offload support in checkpoint_wrapper (#70165) 2021-12-21 10:08:18 -08:00
_optimizer_overlap make fsdp folder to be public (#72084) 2022-02-02 15:50:14 +00:00
ddp_comm_hooks [PowerSGD] Add orthogonalization with QR factorization (#72043) 2022-02-07 21:15:40 +00:00
model_averaging [LocalSGD] Move feature to Beta, clean up some docs (#71621) 2022-01-21 21:10:42 +00:00
quantization [BE] minor improvement to dist quantization (#67401) 2021-11-21 23:31:59 -08:00
__init__.py Make _Join, _Joinable, _JoinHook public (#62605) 2021-08-03 12:20:11 -07:00
join.py Make _Join, _Joinable, _JoinHook public (#62605) 2021-08-03 12:20:11 -07:00