pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00


Omar 25f9fe22a9 [PowerSGD] Add orthogonalization with QR factorization (#72043)	2022-02-07 21:15:40 +00:00

Summary:

### 🚀 The feature, motivation and pitch

Following the discussion in https://github.com/pytorch/pytorch/issues/65813, I added QR factorization to powerSGD_hook.py.

Gram-Schmidt orthogonalization can't be fully replaced because _torch.linalg.qr_ doesn't work with half-precision. Moreover, in my tests, Gram-Schmidt is faster when the rank is lower than 3.

This is one sample experiment timing powerSGD_hook on ResNext101 with the two different methods:

![Screenshot from 2022-01-31 18-14-00](https://user-images.githubusercontent.com/42100908/151840929-270c67dd-9fe7-4f11-8e70-8bf2d0ba678d.png)

### Alternatives

Use _torch.orgqr(*torch.geqrf(matrix))_. In my tests, its performance is similar to that of _torch.linalg.qr_.

### Additional context

_No response_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72043
Reviewed By: albanD
Differential Revision: D34042781
Pulled By: cbalioglu
fbshipit-source-id: e331179d3b7ac40d445b651fc473b16ae4ead462
(cherry picked from commit `f64bf3839a`)
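As a rough illustration of the Gram-Schmidt step the commit message refers to, here is a minimal, dependency-free sketch in plain Python. It is not the code in powerSGD_hook.py: the names `gram_schmidt` and `pick_method` are hypothetical, and the selection rule only restates the commit message's observations (QR unavailable in half precision, Gram-Schmidt faster for ranks below 3) as an assumption.

```python
import math


def gram_schmidt(columns, eps=1e-8):
    """Orthonormalize a list of column vectors (each a list of floats).

    Hypothetical helper for illustration; uses the modified Gram-Schmidt
    update, subtracting projections from the running vector one at a time.
    """
    basis = []
    for col in columns:
        v = list(col)
        # Remove the component of v along every vector already in the basis.
        for q in basis:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm < eps:
            raise ValueError("columns are (numerically) linearly dependent")
        basis.append([a / norm for a in v])
    return basis


def pick_method(rank, half_precision):
    """Assumed selection rule based only on the commit message:
    fall back to Gram-Schmidt for half precision or for rank < 3."""
    if half_precision or rank < 3:
        return "gram_schmidt"
    return "qr"


# Two columns of a 3x2 matrix, stored column by column.
cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
q = gram_schmidt(cols)
```

With a real tensor library the `"qr"` branch would call a QR routine such as the _torch.linalg.qr_ mentioned in the commit message; the sketch leaves that branch as a label so it stays runnable without extra dependencies.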
..
_checkpoint	[FSDP/Checkpoint] Activation offload support in checkpoint_wrapper (#70165)	2021-12-21 10:08:18 -08:00
_optimizer_overlap	make fsdp folder to be public (#72084)	2022-02-02 15:50:14 +00:00
ddp_comm_hooks	[PowerSGD] Add orthogonalization with QR factorization (#72043)	2022-02-07 21:15:40 +00:00
model_averaging	[LocalSGD] Move feature to Beta, clean up some docs (#71621)	2022-01-21 21:10:42 +00:00
quantization	[BE] minor improvement to dist quantization (#67401)	2021-11-21 23:31:59 -08:00
__init__.py	Make _Join, _Joinable, _JoinHook public (#62605)	2021-08-03 12:20:11 -07:00
join.py	Make _Join, _Joinable, _JoinHook public (#62605)	2021-08-03 12:20:11 -07:00