Scott Wolchok
7e4730d017
[PyTorch] Round T up to next multiple of 8 in NestedTensor case
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77903
Code comment should explain why; in brief, it lets us use Tensor cores.
Differential Revision: [D36527773](https://our.internmc.facebook.com/intern/diff/D36527773/)
Approved by: https://github.com/ngimel, https://github.com/cpuhrsch
2022-05-25 20:34:19 +00:00
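The round-up described in the commit above can be sketched as follows. This is a minimal illustration, not the actual PyTorch code; the helper name is hypothetical, and the multiple-of-8 requirement refers to the alignment NVIDIA Tensor Cores need (e.g. for fp16 matmuls) to be engaged.

```python
def round_up_to_multiple_of_8(t: int) -> int:
    # Pad the sequence-length dimension T up to the next multiple of 8.
    # Dimensions aligned to 8 let cuBLAS dispatch Tensor-Core kernels,
    # trading a small amount of padding for much faster matmuls.
    # (t + 7) clears the remainder; & ~7 masks off the low 3 bits.
    return (t + 7) & ~7
```

For example, a sequence length of 13 is padded to 16, while 8 stays at 8.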
Scott Wolchok
e816e17655
[PyTorch] Add native fast path for transformer encoder inference ( #76333 )
...
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76333
The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
ghstack-source-id: 154737857
(Note: this ignores all push blocking failures!)
Test Plan: CI
Reviewed By: cpuhrsch
Differential Revision: D35239925
fbshipit-source-id: 5a7eb8ff79bc6afb4b7d45075ddb2a24a6e2df28
2022-04-26 12:58:03 -04:00
Jon Janzen
2387efd356
Revert "[PyTorch] Add native fast path for transformer encoder inference"
...
This reverts commit b369b89f23.
This has internal changes and should not have been landed via mergebot.
Ref: https://github.com/pytorch/pytorch/pull/75809#issuecomment-1108717166
2022-04-25 11:40:02 -04:00
Scott Wolchok
b369b89f23
[PyTorch] Add native fast path for transformer encoder inference
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75809
The current PyTorch multi-head attention and transformer
implementations are slow. This should speed them up for inference.
Differential Revision: [D35239925](https://our.internmc.facebook.com/intern/diff/D35239925/)
**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D35239925/)!
Approved by: https://github.com/ezyang
2022-04-25 06:11:36 +00:00