Not only is this change usually shorter and more readable, it can also yield better performance: size() is not always a constant-time operation (for example, on linked lists), whereas empty() always is.
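For illustration only (a generic snippet, not a call site from this PR), the pattern being replaced looks like:
```
#include <list>

bool hasItems(const std::list<int>& l) {
  // Before: `return l.size() > 0;`
  // After: `empty()` expresses the intent directly and is always constant time.
  return !l.empty();
}
```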
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
As we now live in a C++17 world, this is a functional no-op, just:
- `s/namespace at { namespace native {/namespace at::native {/`
- `s/namespace torch { namespace jit {/namespace torch::jit {/`
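For illustration, the kind of change those substitutions produce (a generic example, not a specific file from this PR):
```
// Before (pre-C++17 nested namespaces):
namespace at { namespace native {
void foo();
}} // namespace at::native

// After (C++17 nested namespace definition):
namespace at::native {
void foo();
} // namespace at::native
```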
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100
Approved by: https://github.com/izaitsevfb
Apply clang-tidy fixups to prefer member initializers and modernize-pass-by-value. This is mostly a no-op, but it should make a few ctors slightly more readable and more efficient. It also adds some missing moves that prevent a lot of unnecessary copying.
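A generic sketch of the two patterns these checks rewrite (hypothetical class, not code from this PR):
```
#include <string>
#include <utility>

// Before: assignment in the ctor body plus pass-by-const-ref, which forces a
// copy even when the caller passes a temporary.
struct OptionsBefore {
  explicit OptionsBefore(const std::string& name) { name_ = name; }
  std::string name_;
};

// After: member initializer list plus pass-by-value and std::move, so a
// temporary argument is moved rather than copied.
struct OptionsAfter {
  explicit OptionsAfter(std::string name) : name_(std::move(name)) {}
  std::string name_;
};
```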
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91538
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64887
BufHandle has exactly the same functionality and should be used instead.
Differential Revision: D30889483
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: 365fe8e396731b88920535a3de96bd3301aaa3f3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64077
We were assuming kernel dimensions fit in 32 bits (the old fuser made this assumption too), but we should be able to support 64-bit sizes.
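To illustrate the limitation (generic arithmetic, not code from this PR): once the number of elements exceeds INT32_MAX, index math done in 32 bits silently wraps, so dimension and index computations need 64-bit types.
```
#include <cstdint>
#include <iostream>

int main() {
  const int64_t numel = int64_t(70000) * 70000;         // ~4.9e9 elements
  const int32_t wrapped = static_cast<int32_t>(numel);  // no longer fits in 32 bits
  std::cout << numel << " vs " << wrapped << "\n";
  return 0;
}
```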
ghstack-source-id: 136933272
Test Plan: unit tests; new IR level test with huge sizes
Reviewed By: ZolotukhinM
Differential Revision: D30596689
fbshipit-source-id: 23b7e393a2ebaecb0c391a6b1f0c4b05a98bcc94
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63195
This helps us to later switch from using KernelArena with raw pointers
to shared pointers without having to change all our source files at
once.
The changes are mechanical and should not affect any functionality.
With this PR, we're changing the following:
* `Add*` --> `AddPtr`
* `new Add(...)` --> `alloc<Add>(...)`
* `dynamic_cast<Add*>` --> `to<Add>`
* `static_cast<Add*>` --> `static_to<Add>`
Due to some complications with args forwarding, some places became more
verbose, e.g.:
* `new Block({})` --> `new Block(std::vector<ExprPtr>())`
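A rough sketch of what the aliases and helpers behind this mapping might look like (simplified and hypothetical; the actual definitions differ, and at the time of this PR the `*Ptr` aliases may still wrap raw KernelArena pointers rather than `std::shared_ptr`):
```
#include <memory>
#include <utility>

class Expr { public: virtual ~Expr() = default; };
class Add : public Expr {};

// Node pointers become aliases so the ownership model can be changed in one place.
using ExprPtr = std::shared_ptr<Expr>;
using AddPtr = std::shared_ptr<Add>;

// `new Add(...)` --> `alloc<Add>(...)`
template <class T, class... Args>
std::shared_ptr<T> alloc(Args&&... args) {
  return std::make_shared<T>(std::forward<Args>(args)...);
}

// `dynamic_cast<Add*>(e)` --> `to<Add>(e)`
template <class T, class U>
std::shared_ptr<T> to(const std::shared_ptr<U>& e) {
  return std::dynamic_pointer_cast<T>(e);
}

// `static_cast<Add*>(e)` --> `static_to<Add>(e)`
template <class T, class U>
std::shared_ptr<T> static_to(const std::shared_ptr<U>& e) {
  return std::static_pointer_cast<T>(e);
}
```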
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D30292779
Pulled By: ZolotukhinM
fbshipit-source-id: 150301c7d2df56b608b035827b6a9a87f5e2d9e9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62336
This PR was generated by removing `const` for all types of nodes in NNC IR and fixing the compilation errors that resulted from this change.
This is the first step in making all NNC mutations in-place.
Test Plan: Imported from OSS
Reviewed By: iramazanli
Differential Revision: D30049829
Pulled By: navahgar
fbshipit-source-id: ed14e2d2ca0559ffc0b92ac371f405579c85dd63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55825
The mask has never been used (in vectorization we generate an explicit
`IfThenElse` construct when we need to mask out some elements). The PR
removes it and cleans up all its traces from tests.
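For context, a hedged sketch of the explicit masking pattern mentioned above, written against the NNC expression API (signatures are from memory and may differ between versions; assumes `a` holds floats):
```
#include <torch/csrc/jit/tensorexpr/ir.h>

using namespace torch::jit::tensorexpr;

// Guard a tail element with an explicit select instead of a per-Load mask:
// value = (i < n) ? a[i] : 0.f
ExprHandle guardedLoad(const BufHandle& a, const VarHandle& i, const ExprHandle& n) {
  return IfThenElse::make(
      CompareSelect::make(i, n, kLT), // 1 if i < n, else 0
      Load::make(a, {i}),             // in-bounds: load the element
      FloatImm::make(0.f));           // out of bounds: a neutral value
}
```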
Differential Revision: D27717776
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: 41d1feeea4322da75b3999d661801c2a7f82b9db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53137
Also, add casting to Int for Load and Store indices.
Fixes #52773.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26760256
Pulled By: ZolotukhinM
fbshipit-source-id: a2d3141b17584724a5feabcabec25d0577b83a30
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52901
This PR implements an IR Verifier and adds a call to it in `LoopNest`
constructors. Checks that previously lived in expr/stmt constructors are now
moved to the corresponding `::make` functions or to the verifier. They didn't
really help in the constructors anyway: because of the way our memory
management works, an exception thrown from a constructor led to a segfault
(the object was not fully created but was already registered in the kernel
arena for destruction).
Fixes #52778.
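A hedged sketch of the kind of check that moves from a constructor into a `::make` helper or the verifier (hypothetical helper; the real verifier covers many more invariants):
```
#include <torch/csrc/jit/tensorexpr/exceptions.h>
#include <torch/csrc/jit/tensorexpr/expr.h>

using namespace torch::jit::tensorexpr;

// Validating inputs before any node is allocated means a failure throws a
// regular exception instead of leaving a half-constructed object registered
// in the kernel arena.
ExprHandle makeCheckedAdd(const ExprHandle& lhs, const ExprHandle& rhs) {
  if (lhs.dtype() != rhs.dtype()) {
    throw malformed_input("dtype mismatch in Add operands");
  }
  return lhs + rhs;
}
```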
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D26682928
Pulled By: ZolotukhinM
fbshipit-source-id: c56524015cdffb1ed8bce4394509961a4071dcfa
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51594
ExternalCall nodes represent opaque calls to external functions that fill a
tensor (buffer) with values. They can be used to include nodes that are
otherwise not representable as TE, or whose TE representation is currently too
slow.
To make an external function available in NNC as ExternalCall, one needs to
implement a "bridge" function that would take raw (void*) pointers to the data
along with the arrays containing dimension info. This function would then
internally call the desired external function and make sure the results of the
call are correctly placed in the provided raw data buffers.
The reason the PR was previously reverted was that the LLVM-generated
calls to bridge functions were breaking unwind tables. This is now fixed
by requiring bridge functions to never throw and by setting the
corresponding attribute in the LLVM-generated code.
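A hedged sketch of the shape of such a bridge function (hypothetical name and parameter layout; the actual NNC external-functions API may differ in details):
```
#include <cstdint>

// Raw pointers plus dimension info in; results are written back into the
// provided output buffer. Declared noexcept so LLVM-generated call sites do
// not need unwind tables, per the fix described above.
extern "C" void nnc_my_external_op(
    int64_t bufs_num,     // number of buffers passed in
    void** buf_data,      // raw data pointer for each buffer
    int64_t* buf_ranks,   // rank of each buffer
    int64_t* buf_dims,    // concatenated dimension sizes of all buffers
    int8_t* buf_dtypes,   // dtype tag for each buffer
    int64_t args_num,     // number of extra scalar arguments
    int64_t* extra_args) noexcept {
  // Wrap the raw buffers, call the desired external function, and copy the
  // result into the output buffer's storage (e.g. buf_data[0]).
}
```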
Differential Revision: D26213882
Test Plan: Imported from OSS
Reviewed By: pbelevich, ngimel
Pulled By: ZolotukhinM
fbshipit-source-id: db954d8338e2d750c2bf0a41e88e38bd494f2945
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51475
ExternalCall nodes represent opaque calls to external functions that fill a
tensor (buffer) with values. They can be used to include nodes that are
otherwise not representable as TE, or whose TE representation is currently too
slow.
To make an external function available in NNC as ExternalCall, one needs to
implement a "bridge" function that would take raw (void*) pointers to the data
along with the arrays containing dimension info. This function would then
internally call the desired external function and make sure the results of the
call are correctly placed in the provided raw data buffers.
Test Plan: Imported from OSS
Reviewed By: pbelevich, Chillee
Differential Revision: D26179083
Pulled By: ZolotukhinM
fbshipit-source-id: 9e44de098ae94d25772cf5e2659d539fa6f3f659
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/49357
This is a follow-up fix for PR #48679, which added support for integer
inputs to aten::abs by promoting integers to float and then demoting the
result back to integers. This PR supports integer inputs to aten::abs more
efficiently in the SimpleIREvaluator by implementing integer support for
kAbs (renamed from kFabs).
- Rename kFabs to kAbs
- Add support for integer inputs to kAbs in the SimpleIREvaluator (note that
llvm_codegen and cuda_codegen already support integer inputs to kAbs)
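A hedged sketch of the dtype-aware dispatch this enables (generic code, not the actual SimpleIREvaluator implementation):
```
#include <cmath>
#include <type_traits>

// kAbs on an integer operand can now stay in the integer domain instead of
// taking a float round-trip; floating-point operands keep using std::fabs.
template <typename T>
T evalAbs(T v) {
  if constexpr (std::is_integral_v<T>) {
    return v < 0 ? -v : v;  // integer abs, no promotion to float
  } else {
    return std::fabs(v);    // floating-point abs
  }
}
```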
Test Plan:
- `PYTORCH_TENSOREXPR_DONT_USE_LLVM=1 python test/test_jit_fuser_te.py
TestTEFuser.test_unary_ops`
- `python test/test_jit_fuser_te.py TestTEFuser.test_unary_ops`
Imported from OSS
Reviewed By: eellison
Differential Revision: D25545791
fbshipit-source-id: e52f51a352d149f66ce8341fb3beb479be08a230
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45520
With this change `Load`s and `Store`s no longer accept `Placeholder`s in
their constructor and `::make` functions and can only be built with
`Buf`.
`Placeholder` gets its own `store`, `load`, `storeWithMask`, and
`loadWithMask` methods for more convenient construction.
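A hedged usage sketch of the new convenience methods (made-up names and sizes; exact signatures may differ between versions):
```
// Assumes the relevant NNC tensorexpr headers are included.
using namespace torch::jit::tensorexpr;

void example() {
  VarHandle i("i", kInt);
  Placeholder a("a", kFloat, {ExprHandle(64)});

  ExprHandle v = a.load(i);   // builds a Load against a's underlying Buf
  auto st = a.store({i}, v);  // builds a Store against a's underlying Buf
  (void)st;                   // suppress unused-variable warning in this sketch
}
```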
Test Plan: Imported from OSS
Reviewed By: glaringlee
Differential Revision: D23998789
Pulled By: ZolotukhinM
fbshipit-source-id: 3fe018e00c1529a563553b2b215f403b34aea912
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45388
Classes defined in these files are closely related, so it is reasonable
to have them all in one file. The change is purely a code move.
Differential Revision: D23952867
Test Plan: Imported from OSS
Reviewed By: nickgg
Pulled By: ZolotukhinM
fbshipit-source-id: 12cfaa968bdfc4dff00509e34310a497c7b59155
Summary:
Adds a new optimization pass, the Registerizer, which looks for common Stores and Loads to a single item in a buffer and replaces them with a local temporary scalar, which is much cheaper to access.
For example, it can replace:
```
A[0] = 0;
for (int x = 0; x < 10; x++) {
  A[0] = (A[0]) + x;
}
```
with:
```
int A_ = 0;
for (int x = 0; x < 10; x++) {
  A_ = x + A_;
}
A[0] = A_;
```
This is particularly useful on GPUs when parallelizing, since after replacing loops with metavars we have a lot of accesses like this. Early tests of simple reductions on a V100 indicate this can speed them up by ~5x.
This diff got a bit unwieldy with the integration code, so that will come in a follow-up.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42606
Reviewed By: bertmaher
Differential Revision: D22970969
Pulled By: nickgg
fbshipit-source-id: 831fd213f486968624b9a4899a331ea9aeb40180
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36611
Buf represents the underlying storage, but until now it didn't have a dtype.
That resulted in dtypes being specified in different places, with no
mechanism to enforce consistency: e.g. one could have created a kFloat
expression and used a kInt buffer to store its result. Now we're
centralizing where the storage-related logic lives, and we can start
enforcing semantic rules.
Follow-ups: we can merge the Buffer and BufHandle classes, as the former is
now a mere wrapper over the latter.
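A generic sketch (not NNC code) of the invariant this change makes enforceable: the storage object now owns its dtype, so a mismatched store can be rejected in one place.
```
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

enum class Dtype { kInt, kFloat };

struct Buf {
  std::string name;
  std::vector<int64_t> dims;
  Dtype dtype;  // the dtype now lives on the buffer itself
};

struct Expr {
  Dtype dtype;
};

// With the dtype attached to Buf, storing a kFloat expression into a kInt
// buffer can be detected at construction time.
void checkStore(const Buf& buf, const Expr& value) {
  if (buf.dtype != value.dtype) {
    throw std::runtime_error("store dtype does not match buffer dtype");
  }
}
```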
Test Plan: Imported from OSS
Differential Revision: D21027356
Pulled By: ZolotukhinM
fbshipit-source-id: c06aa2c4077fdcde3bb4ca622d324aece79b5a9c
Summary:
Second attempt at the reduction frontend for the TensorExpr compiler. It has two APIs: a simple version for common reduction types, and a customizable Reducer frontend which allows specifying the initializer, the reduction interaction, and the body via lambdas.
The simple API looks like this:
```
Buffer b(BufHandle("b", {10}), kInt);
Tensor* c = Reduce("sum", {}, Sum(b), {{10, "m"}});
```
An example of specializing a Sum to do Matmul:
```
Buffer tA(BufHandle("tA", {M, K}), kFloat);
Buffer tB(BufHandle("tB", {K, N}), kFloat);
Sum matmul([&](ParameterList& v) {
  ExprHandle m = v[0];
  ExprHandle n = v[1];
  ExprHandle k = v[2];
  return tA(m, k) * tB(k, n);
});
Tensor* mm = Reduce("mm", {{M, "m"}, {N, "n"}}, matmul, {{K, "k"}});
```
A fully specialized Reduction:
```
VarHandle searchValue("searchValue", kInt);
Buffer b(BufHandle("b", {4, 10}), kInt);
Reducer anyEqSV(
    ExprHandle(0),
    [](ExprHandle a, ExprHandle b) {
      return CompareSelect::make(a, 1, 1, b, kEQ);
    },
    [&](ParameterList& v) {
      return CompareSelect::make(b.call(v), searchValue, kEQ);
    });
Tensor* any = Reduce("anyEqual", {{4, "i"}}, anyEqSV, {{10, "j"}});
```
---
Until lowering, Reductions are held in a compound form for easier optimization:
```
VarHandle m("m", kInt);
Buffer b(BufHandle("b", {2, 3, m}), kFloat);
Tensor* c = Reduce("sum", {{2, "l"}, {3, "n"}}, Sum(b), {{m, "m"}});
LoopNest loop({c});
std::cout << *loop.root_stmt() << "\n";
```
```
for (int l = 0; l < 2; l++) {
  for (int n = 0; n < 3; n++) {
    for (int m = 0; m < m_1; m++) {
      sum[l, n] = ReduceOp(sum[l, n] = float(0);, (sum[l, n]) + (b[l, n, m]), {m});
    }
  }
}
```
```
loop.prepareForCodegen();
std::cout << *loop.root_stmt() << "\n";
```
```
for (int l = 0; l < 2; l++) {
  for (int n = 0; n < 3; n++) {
    sum[(0 + l * (1 * 3)) + n * 1] = float(0);
    for (int m = 0; m < m_1; m++) {
      sum[(0 + l * (1 * 3)) + n * 1] = (sum[(0 + l * (1 * 3)) + n * 1]) + (b[((0 + l * ((1 * m_1) * 3)) + n * (1 * m_1)) + m * 1]);
    }
  }
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35866
Differential Revision: D20965577
Pulled By: nickgg
fbshipit-source-id: afe506c90db794447180056417013bcaf0e2c049
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35800
This PR includes the following changes:
* Introduce a new `Expr` type `Buf`: it plays a role similar to `Var`, but also has dimensions.
* Use the new `Buf` class in `Store` and `Load` instead of `Var` for specifying where to store to or load from. `Buf` contains the dimensions info of the buffer we're loading/storing to and hence we are able to keep N-d indexes without flattening them into a 1-d index ([x,y] vs [x+y*W]).
* Flattening of the indexes is now a separate pass that is executed in `LoopNest::prepareForCodegen` - backends still expect indexes to be flattened, and this PR preserves that.
* `Tensor` now contains a `Buf` instead of `Var`, and thus Tensor now has the dimensions info (previously it was a property of a `Function`, not a `Tensor`). This brings us closer to Tensor being a combination of Buffer + Function, where Buffer specifies iteration domain and the Function defines a computation.
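As a hedged aside on the flattening mentioned above (generic arithmetic, not NNC code), the separate pass turns an N-d index into the equivalent 1-d offset, e.g.:
```
#include <cstdint>

// Matches the [x, y] vs [x + y * W] example from the description: the 2-d
// index (x, y) over a buffer of width W becomes the flat offset x + y * W.
int64_t flattenIndex2d(int64_t x, int64_t y, int64_t W) {
  return x + y * W;
}
```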
TODOs:
* Consider merging `Buffer` with `Buf` or `BufHandle`. It seems that we don't need all of them.
* Harden the logic of how we create buffers in the fuser pass. Currently it seems that sometimes we don't set dimensions.
* Use `Buf` in `Allocate` and `Free`.
* Make it clearer that `Function` doesn't "own" dimensions info and that dimensions are a property of a Tensor, not a Function.
Differential Revision: D20789005
Test Plan: Imported from OSS
Reviewed By: zheng-xq
Pulled By: ZolotukhinM
fbshipit-source-id: e04188d1d297f195f1c46669c614557d6bb6cde4