pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	3d1fa40ae1	Revert "[BC-Breaking] Remove long-deprecated casting functions from native_functions.yaml (#164641 )" This reverts commit `64108bdbed`. Reverted https://github.com/pytorch/pytorch/pull/164641 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/164641#issuecomment-3386346474))	2025-10-09 15:42:51 +00:00
Yuanyuan Chen	64108bdbed	[BC-Breaking] Remove long-deprecated casting functions from native_functions.yaml (#164641 ) This PR removes `torch._cast_XXX` from generated OPs. They were deprecated in PyTorch 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/164641 Approved by: https://github.com/albanD, https://github.com/justinchuby	2025-10-08 08:27:58 +00:00
cyy	ab5467897a	Fix NOLINTNEXTLINE (#141794 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141794 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-12-02 19:22:00 +00:00
PyTorch MergeBot	eb7deb2db5	Revert "Fix NOLINTNEXTLINE (#141794 )" This reverts commit `7dd9b5fc43`. Reverted https://github.com/pytorch/pytorch/pull/141794 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/12087979418/job/33711943084) [HUD commit link](`7dd9b5fc43`) ([comment](https://github.com/pytorch/pytorch/pull/141794#issuecomment-2511789484))	2024-12-02 15:07:50 +00:00
cyyever	7dd9b5fc43	Fix NOLINTNEXTLINE (#141794 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/141794 Approved by: https://github.com/Skylion007 Co-authored-by: Aaron Gokaslan <aaronGokaslan@gmail.com>	2024-11-29 16:23:59 +00:00
Richard Barnes	fddabc6e0b	C10_UNUSED to [[maybe_unused]] (#6357 ) (#138364 ) Summary: Pull Request resolved: https://github.com/pytorch/executorch/pull/6357 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138364 Approved by: https://github.com/Skylion007, https://github.com/eqy	2024-10-19 13:17:43 +00:00
cyy	07fe1dd58f	[13/N] Fix clang-tidy warnings in jit (#132411 ) Follows #132209 Pull Request resolved: https://github.com/pytorch/pytorch/pull/132411 Approved by: https://github.com/Skylion007	2024-08-02 03:14:09 +00:00
Richard Barnes	ed327876f5	[codemod] `c10:optional` -> `std::optional` (#126135 ) Generated by running the following from PyTorch root: ``` find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/' ``` `c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi	2024-05-14 19:35:51 +00:00
cyy	dee100945e	[2/N] Move c10::variant to std::variant (#109723 ) This PR moves most of c10::variant calls to std::variant. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109723 Approved by: https://github.com/ezyang	2023-09-24 02:47:43 +00:00
PyTorch MergeBot	380ccfd442	Revert "Added round_with_scale_factor arg to ATen (#97868 )" This reverts commit `aa99c5b4ed`. Reverted https://github.com/pytorch/pytorch/pull/97868 on behalf of https://github.com/osalpekar due to Caused breakages in the glow compiler - see [D45374622](https://www.internalfb.com/diff/D45374622) for more details	2023-04-28 20:47:00 +00:00
vfdev-5	aa99c5b4ed	Added round_with_scale_factor arg to ATen (#97868 ) Addresses #62396 following the strategy described in https://github.com/pytorch/pytorch/pull/64983#issuecomment-1026177629. Fixing output size to match opencv, scikit-image, scipy if scale factor is specified on ATen side only due to JIT FC. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97868 Approved by: https://github.com/lezcano, https://github.com/mikaylagawarecki	2023-04-26 18:48:37 +00:00
Aaron Gokaslan	0247ed27cc	Apply Clang-Tidy readability-container-size-empty (#93236 ) Not only is this change usually shorter and more readable, it also can yield better performance. size() is not always a constant time operation (such as on LinkedLists), but empty() always is. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236 Approved by: https://github.com/malfet	2023-01-29 23:28:19 +00:00
Nikita Shulga	8f1c3c68d3	[BE] Use nested namespaces in .cpp/.cu files (#92100 ) As we live in C++17 world This is a functional no-op, just - `s/namespace at { namespace native {/namespace at::native {/` - `s/namespace torch { namespace jit {/namespace torch::jit {/` Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100 Approved by: https://github.com/izaitsevfb	2023-01-13 16:32:34 +00:00
Aaron Gokaslan	3916d7a575	Apply modernize-use-emplace to aten, c10, torch (#91077 ) Apply clang-tidy check modernize-use-emplace. This is slightly more efficient by using an inplace constructor and is the recommended style in parts of the codebase covered by clang-tidy. This just manually applies the check to rest of the codebase. Pinging @ezyang as this is related to my other PRs he reviewed like #89000 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91077 Approved by: https://github.com/ezyang	2022-12-19 07:49:56 +00:00
Wu, Chunyuan	ca419c3338	[NNC] add eltwise OPs: mish and elu (#80586 ) Enable more eltwise OPs in NNC: - mish - elu Pull Request resolved: https://github.com/pytorch/pytorch/pull/80586 Approved by: https://github.com/ZolotukhinM, https://github.com/malfet	2022-09-17 01:44:34 +00:00
chunyuan-w	693a8dd04c	[NNC] enable fusion of conv with elementwise OP (#77157 )
## Pitch
Enable Conv-Eltwise fusion in NNC.
## Description
This PR adds a `FuseConvWithEltwise` pass to fuse convolution with elementwise OP for TE subgraph. This pass will insert prepack and packed run ops for conv2d and enable fusion of conv2d with elementwise OPs. The fused packed run ops is implemented via external call in NNC.
## Code structure
Graph rewrite pass related code is placed in:
```
torch/csrc/jit/passes/mkldnn_rewrite.h
torch/csrc/jit/passes/mkldnn_rewrite.cpp
```
NNC integration of fused conv-eltwise OP via external call is located in:
```
torch/csrc/jit/tensorexpr/kernel.cpp
torch/csrc/jit/tensorexpr/operators/conv2d.h
torch/csrc/jit/tensorexpr/operators/conv2d.cpp
torch/csrc/jit/tensorexpr/lowerings.cpp
torch/csrc/jit/tensorexpr/external_functions.cpp
```
Fused prepack OP context is in:
```
aten/src/ATen/native/mkldnn/Common.h
aten/src/ATen/native/mkldnn/RegisterMkldnnOpContextClass.cpp
aten/src/ATen/native/mkldnn/OpContext.h
aten/src/ATen/native/mkldnn/OpContext.cpp
```
Fused OP implementation is done in:
```
aten/src/ATen/native/mkldnn/ConvPrepack.h
aten/src/ATen/native/mkldnn/ConvPrepack.cpp
```
## OP benchmark for conv-relu
The below performance is measured on top of these two PRs to support NHWC: https://github.com/pytorch/pytorch/pull/76948 and https://github.com/pytorch/pytorch/pull/78238.
- Measured on Cascade Lake 8280
- Jemalloc enabled
- batch_size = 1
- Channels Last format
### Single thread:
| shape | time (us)_no_fusion | time (us)_fusion | Gain |
| -- | -- | -- | -- |
| kernel=3, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=1, dilates=1, g=1 | 1706.22 | 1371.97 | 19.59% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=512, stride=2, pad=0, dilates=1, g=1 | 2499.28 | 1571.52 | 37.12% |
| kernel=3, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=1, dilates=1, g=32 | 4169.52 | 2738.53 | 34.32% |
| kernel=3, N=1, iC=512, H=56, W=56, oC=512, stride=2, pad=1, dilates=1, g=32 | 3998.77 | 3085.85 | 22.83% |
| kernel=1, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 673.73 | 430.81 | 36.06% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 1101.87 | 801.07 | 27.30% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=0, dilates=1, g=1 | 4692.91 | 3116.13 | 33.60% |
| kernel=1, N=1, iC=512, H=28, W=28, oC=512, stride=1, pad=0, dilates=1, g=1 | 3310.64 | 2503.39 | 24.38% |
### 4 threads:
| shape | time (us)_no_fusion | time (us)_fusion | Gain |
| -- | -- | -- | -- |
| kernel=3, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=1, dilates=1, g=1 | 360.07 | 321.21 | 10.79% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=512, stride=2, pad=0, dilates=1, g=1 | 391.49 | 323.17 | 17.45% |
| kernel=3, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=1, dilates=1, g=32 | 536.4 | 465.97 | 13.13% |
| kernel=3, N=1, iC=512, H=56, W=56, oC=512, stride=2, pad=1, dilates=1, g=32 | 674.98 | 616.32 | 8.69% |
| kernel=1, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 160.97 | 70.05 | 56.48% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 215.81 | 182.6 | 15.39% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=0, dilates=1, g=1 | 658.45 | 576.97 | 12.37% |
| kernel=1, N=1, iC=512, H=28, W=28, oC=512, stride=1, pad=0, dilates=1, g=1 | 702.18 | 566.39 | 19.34% |
### 1 socket (28 cores):
| shape | time (us)_no_fusion | time (us)_fusion | Gain |
| -- | -- | -- | -- |
| kernel=3, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=1, dilates=1, g=1 | 149.92 | 103.78 | 30.78% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=512, stride=2, pad=0, dilates=1, g=1 | 192.76 | 110.87 | 42.48% |
| kernel=3, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=1, dilates=1, g=32 | 160.67 | 127.24 | 20.81% |
| kernel=3, N=1, iC=512, H=56, W=56, oC=512, stride=2, pad=1, dilates=1, g=32 | 212.45 | 180.55 | 15.02% |
| kernel=1, N=1, iC=64, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 114.57 | 50.58 | 55.85% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=64, stride=1, pad=0, dilates=1, g=1 | 198.64 | 70.6 | 64.46% |
| kernel=1, N=1, iC=256, H=56, W=56, oC=256, stride=1, pad=0, dilates=1, g=1 | 281.35 | 155.8 | 44.62% |
| kernel=1, N=1, iC=512, H=28, W=28, oC=512, stride=1, pad=0, dilates=1, g=1 | 262.15 | 162.94 | 37.84% |
## UT
```
test/test_mkldnn_fusion.py
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77157 Approved by: https://github.com/ZolotukhinM	2022-08-10 21:46:51 +00:00
richard	0918154967	Supports symbolic diff for silu (#81724 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/81724 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-08-09 01:18:10 +00:00
Kurt Mohler	2bfae07a79	Enable `dim=None` for `torch.mean` (#81286 ) Part of #79525 This will require coordination with XLA before merging, just like #79881 Pull Request resolved: https://github.com/pytorch/pytorch/pull/81286 Approved by: https://github.com/albanD	2022-07-28 22:34:56 +00:00
Kurt Mohler	23bdb570cf	Reland: Enable `dim=None` for `torch.sum` (#79881 ) Part of #29137 Reland of #75845 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79881 Approved by: https://github.com/albanD, https://github.com/kulinseth	2022-07-09 00:54:42 +00:00
PyTorch MergeBot	ee6ebfc06b	Revert "Enable `dim=None` for `torch.sum` (#75845 )" This reverts commit `e79a51f7db`. Reverted https://github.com/pytorch/pytorch/pull/75845 on behalf of https://github.com/malfet due to Breaks MacOS builds, see `e79a51f7db`	2022-06-16 22:01:41 +00:00
Kurt Mohler	e79a51f7db	Enable `dim=None` for `torch.sum` (#75845 ) Part of #29137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75845 Approved by: https://github.com/ezyang	2022-06-16 20:17:07 +00:00
Wang, Eikan	429a80dded	[NNC] Lowering function generates the output buffer with the specified stride (#76529 ) Summary: Pass stride information to lowering function to generate the output buffer with proper memory layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76529 Reviewed By: ZolotukhinM Differential Revision: D36116712 Pulled By: IvanKobzarev fbshipit-source-id: d3901f756b3710ecce172d6db3ecb0b7c12fb929 (cherry picked from commit b6cd53c91c01db36ea0e99167dc0ce0ae1d3aa23)	2022-05-04 20:04:22 +00:00
zengk95	1d55518198	Revert "[nnc] Strides to Tensor (#72962 )" This reverts commit `939060925f`. Fixes https://github.com/pytorch/vision/issues/5873 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76332 Approved by: https://github.com/seemethere	2022-04-25 19:50:00 +00:00
Ivan Kobzarev	939060925f	[nnc] Strides to Tensor (#72962 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72962 Test Plan: Imported from OSS Reviewed By: ZolotukhinM, cpuhrsch Differential Revision: D34589306 Pulled By: IvanKobzarev fbshipit-source-id: ecee5249760ecc0c8b2edb1842b90218899bc944 (cherry picked from commit 9e310c4c67389da30da89126d838ffe3864aba6f)	2022-04-23 19:35:15 +00:00
Nikita Shulga	f6c275f55d	Remove `-Wno-unused-variable` from `utils.cmake` (take 2) (#75538 ) Summary: [Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there. Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block. Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538 Reviewed By: anjali411 Differential Revision: D35747333 Pulled By: malfet fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626 (cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)	2022-04-20 17:41:59 +00:00
PyTorch MergeBot	5c56b2286b	Revert "Remove `-Wno-unused-variable` from utils.cmake" This reverts commit `018cbe1f5c`. Reverted https://github.com/pytorch/pytorch/pull/75538 on behalf of https://github.com/seemethere	2022-04-19 17:19:09 +00:00
Nikita Shulga	018cbe1f5c	Remove `-Wno-unused-variable` from utils.cmake [Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there. Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block. Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538 Approved by: https://github.com/cpuhrsch	2022-04-19 15:26:55 +00:00
Nikita Shulga	43313cbde3	Revert D34647822: [tensorexpr] Add support for aten::stack Test Plan: revert-hammer Differential Revision: D34647822 (`954c7e2a77`) Original commit changeset: 3b863c71886c Original Phabricator Diff: D34647822 (`954c7e2a77`) fbshipit-source-id: e9ce06c9c8d7caf0fbb2565f0d99035bad685793 (cherry picked from commit b2ff355e9dbaa4e940fb221254223984c3c8a215)	2022-03-31 04:25:43 +00:00
Hui Guo	954c7e2a77	[tensorexpr] Add support for aten::stack (#73801 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73801 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D34647822 Pulled By: huiguoo fbshipit-source-id: 3b863c71886c7c6616b16f5d3313079714c8b82a (cherry picked from commit c71778cf6a5724d26b671bf3ee0478add24990e8)	2022-03-30 21:25:15 +00:00
Ivan Kobzarev	519e226b66	[tensorexp] ExternalCall2 without memcpy (#72225 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72225 Test Plan: Imported from OSS Reviewed By: dagitses Differential Revision: D33960933 Pulled By: IvanKobzarev fbshipit-source-id: fc73a3de9e5150919e3806516065b4a6c8316000 (cherry picked from commit f637842c341e0ba94906a0c8a1efc81691dc512c)	2022-03-09 21:19:26 +00:00
Ryan Spring	4f8b986e28	Implement Tanh Gelu Approximation (#61439 ) Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: VitalyFedyunin Differential Revision: D33894937 Pulled By: jbschlosser fbshipit-source-id: b65e8fb6ea66168af8f34f45ed50e92737a33851 (cherry picked from commit `6e986f91a9`)	2022-02-14 03:40:32 +00:00
Mikhail Zolotukhin	1855b14922	[TensorExpr] Delet `DimArg` class. (#72390 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72390 This class didn't add much value and only caused more boilerplate code. This change removes the class and updates all the use cases with uses of `ExprHandle`. A side effect of this change is different names in loop variables, which caused massive mechanical changes in our tests. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D34030296 Pulled By: ZolotukhinM fbshipit-source-id: 2ba4e313506a43ab129a10d99e72b638b7d40108 (cherry picked from commit `c2ec46a058`)	2022-02-11 01:21:59 +00:00
David Berard	2e04295790	[tensorexpr] support for fusing autocasting ops (#72478 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72478 `aten::_autocast_to_reduced_precision` and `aten::_autocast_to_full_precision` are essentially just `aten::to` operations, so they can be fused the same way `aten::to` is fused. Test Plan: Imported from OSS Reviewed By: bdhirsh Differential Revision: D34057522 Pulled By: davidberard98 fbshipit-source-id: f3b53641415702a4ac56460587801b9c76d81b3c (cherry picked from commit `838ce5542e`)	2022-02-10 18:12:36 +00:00
Ivan Kobzarev	9e8334e3ae	[tensorexpr][quant] Enable tensorexpr for quant,dequant (#71243 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71243 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D33554981 Pulled By: IvanKobzarev fbshipit-source-id: 461f1cbece3bc8be6a3e9cf16bdbcc4fc5dd2593 (cherry picked from commit `d2f9aac2c6`)	2022-02-01 19:48:53 +00:00
Nikita Shulga	74c44ba9d6	Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation Test Plan: revert-hammer Differential Revision: D33850228 (`23d03025dc`) Original commit changeset: 3cc33fb298e4 Original Phabricator Diff: D33850228 (`23d03025dc`) fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692 (cherry picked from commit `c9efb58223`)	2022-01-31 17:44:19 +00:00
Ryan Spring	23d03025dc	Implement Tanh Gelu Approximation (#61439 ) Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: cpuhrsch Differential Revision: D33850228 Pulled By: jbschlosser fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33 (cherry picked from commit `3a53b3e94f`)	2022-01-31 17:07:45 +00:00
Joel Schlosser	cb823d9f07	Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation Test Plan: revert-hammer Differential Revision: D33744717 (`f499ab9cef`) Original commit changeset: d64532a562ed Original Phabricator Diff: D33744717 (`f499ab9cef`) fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93 (cherry picked from commit `e9fb2d1db1`)	2022-01-28 18:35:01 +00:00
Ryan Spring	f499ab9cef	Implement Tanh Gelu Approximation (#61439 ) Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser
```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
Linking XLA PR - https://github.com/pytorch/xla/pull/3039 Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439 Reviewed By: mikaylagawarecki Differential Revision: D33744717 Pulled By: jbschlosser fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187 (cherry picked from commit `4713dd9cca`)	2022-01-28 16:59:09 +00:00
Mikhail Zolotukhin	bd6ec4efb4	[TensorExpr] Add lowerings for scalar binary ops (+,-,*,/,&,|,^,<<,>>,cmp). (#71298 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71298 Differential Revision: D33576534 Test Plan: Imported from OSS Reviewed By: anjali411 Pulled By: ZolotukhinM fbshipit-source-id: 93787b6f11180fcbfbacbb55e1bfb79700320a0e (cherry picked from commit `b2a8e83f97`)	2022-01-26 06:32:51 +00:00
Mikhail Zolotukhin	1dbcde2ade	[TensorExpr] Support scalar intermediate and output values. (#71186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71186 So far we've only supported scalar inputs, but couldn't handle scalar outputs or intermediates. This PR adds it. Scalar outputs are returned as 0-dim tensors. If the kernel is invoked on a stack of IValues, we correctly convert the results to scalar IValues when needed. If the kernel is invoked with a vector of void* pointers, everything works out of the box without any conversions. Lowerings for scalar operators are a bit tricky. Usual lowerings return a pair <Buf, Stmt> (aka Tensor), but for scalar operators we also want to have the corresponding Var that the lowering function supposedly creates (in theory we could just use Loads and Stores, but I'm worried it can affect performance as there is no guarantee this will be optimized by LLVM). So, what we do here to work around this is we return a fake buf + stmt that sets the corresponding var. Then outside of the lowering we create a real buffer and generate a Store to it with the value from the variable we passed as the base handle of the fake buf. This real buffer is then treated as usual by the rest of the system and we can use it if we need to return this scalar value as a kernel output. If we do not need to return it, then the Store will be deleted by the DCE pass. Differential Revision: D33539324 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: ab4524b9820ce204f106effcf6232ed33d4ee223 (cherry picked from commit `7faa0939f0`)	2022-01-26 06:32:51 +00:00
Kimish Patel	b37de0a4bb	Update flags in nnc lowering (#70306 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70306 USE_XNNPACK is the right one to enable lowering to prepacked xnnpack based ops Test Plan: CI Reviewed By: ZolotukhinM, priyaramani Differential Revision: D33279375 fbshipit-source-id: d19ded5643f487f7b58c54a860ad39c8d484ed05	2021-12-22 12:25:35 -08:00
Ivan Kobzarev	7503ec58b2	[nnc][fix] xnnpack ifdef (#69870 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69870 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D33075061 Pulled By: IvanKobzarev fbshipit-source-id: dd53ad8b7d0ff36a68f0864540d6f7dd2284f0e0	2021-12-14 09:50:24 -08:00
Mikhail Zolotukhin	791d5087ed	[TensorExpr] Add lowerings for quantized ops: cat, mul, conv1d, relu. (#69055 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69055 Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D32710325 Pulled By: ZolotukhinM fbshipit-source-id: 4a7f0ac059ea238463317b6a45a822b8d05610dd	2021-12-02 14:34:21 -08:00
Ivan Kobzarev	7802953dd5	[nnc][quantization] quantized ops for BI bytedoc via aten (#68790 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68790 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32609427 Pulled By: IvanKobzarev fbshipit-source-id: de8f4209befe2509f5033888c739554470768290	2021-11-24 08:59:44 -08:00
Ivan Kobzarev	39747dc456	[nnc] Loweings for flatten, xnnpack prepack op (#68470 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68470 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D32545261 Pulled By: IvanKobzarev fbshipit-source-id: b2bf5b3260002bcc40a351a9c56d786b16b69287	2021-11-18 20:14:42 -08:00
Mikhail Zolotukhin	66b52d5b49	[TensorExpr] Convert linear_clamp_run to using schema in NNC lowerings. (#66523 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66523 Differential Revision: D31590857 Test Plan: Imported from OSS Reviewed By: bdhirsh Pulled By: ZolotukhinM fbshipit-source-id: da8a7d68c8a4cf74c3f622b8a3af54d00ffb14a6	2021-11-12 12:26:06 -08:00
Ivan Kobzarev	362c6069b9	[nnc] Lazy lowerings registration; custom classes network params (#67623 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67623 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32065076 Pulled By: IvanKobzarev fbshipit-source-id: 4945ac6483938d428c539ed1ce4fcd6988b34250	2021-11-11 09:00:23 -08:00
Mikhail Zolotukhin	ff5c61a74e	[TensorExpr] Add lowering for aten::max (reduction). (#66519 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66519 Differential Revision: D31590853 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: a702621621f681d7f5392912e8a77ca124e14170	2021-11-03 09:44:09 -07:00
Mikhail Zolotukhin	00afe9ba7b	[TensorExpr] Add lowering for aten::embedding. (#66518 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66518 Differential Revision: D31590855 Test Plan: Imported from OSS Reviewed By: pbelevich Pulled By: ZolotukhinM fbshipit-source-id: aace0a87b1649330dae44182f7873aca27160d64	2021-11-03 09:44:07 -07:00
Mikhail Zolotukhin	008a58d226	[TensorExpr] Add lowering for aten::conv1d. (#66517 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66517 Differential Revision: D31590856 Test Plan: Imported from OSS Reviewed By: pbelevich Pulled By: ZolotukhinM fbshipit-source-id: c05a37d8741acd0606c2adb8d6cfeb1f57bc8aa0	2021-11-03 09:44:05 -07:00
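
The `gelu` reference function quoted in the "Implement Tanh Gelu Approximation" commits (#61439) relies on `torch` and an undefined `normcdf`. A minimal scalar sketch of the same two branches, using only the Python standard library, might look like the following. It assumes `normcdf` means the standard normal CDF, and the name `SQRT_2_OVER_PI` is introduced here for illustration only; this is not the ATen implementation.

```python
import math

# sqrt(2/pi); the commit message quotes this constant as 0.7978845608028654.
SQRT_2_OVER_PI = math.sqrt(2.0 / math.pi)


def normcdf(x: float) -> float:
    """Standard normal CDF via erf (assumed meaning of `normcdf` in the commit)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu(x: float, approximate: str = "none") -> float:
    """Scalar GELU; approximate="tanh" selects the tanh approximation."""
    if approximate == "tanh":
        inner = SQRT_2_OVER_PI * (x + 0.044715 * x ** 3)
        return 0.5 * x * (1.0 + math.tanh(inner))
    return x * normcdf(x)


if __name__ == "__main__":
    # Print the exact and approximate values side by side for a few inputs.
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, gelu(value), gelu(value, "tanh"))
```

The two branches agree closely for moderate inputs, which is why the tanh form is usable as a cheaper stand-in for the erf-based definition.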

58 Commits
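
The "Gain" column in the conv-relu benchmark tables of commit `693a8dd04c` is not defined in the commit message. The reported figures are consistent with a relative reduction in time, (no_fusion - fusion) / no_fusion. The sketch below checks that reading against a few rows; the function name `gain_percent` is hypothetical and the formula is inferred from the data, not stated by the author.

```python
def gain_percent(time_no_fusion_us: float, time_fusion_us: float) -> float:
    """Relative time reduction in percent (formula inferred from the tables)."""
    return (time_no_fusion_us - time_fusion_us) / time_no_fusion_us * 100.0


# (time without fusion, time with fusion, reported gain) copied from the tables.
SAMPLE_ROWS = [
    (1706.22, 1371.97, 19.59),  # single thread, first row
    (160.97, 70.05, 56.48),     # 4 threads, fifth row
    (198.64, 70.6, 64.46),      # 1 socket, sixth row
]

if __name__ == "__main__":
    for no_fusion, fusion, reported in SAMPLE_ROWS:
        print(f"computed {gain_percent(no_fusion, fusion):.2f}%  reported {reported:.2f}%")
```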