pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	e288c258f7	Revert "Remove tensorexpr tests (#158928 )" This reverts commit `d742a2896c`. Reverted https://github.com/pytorch/pytorch/pull/158928 on behalf of https://github.com/yangw-dev due to this breaks bunch of internal dependency since some tests are still using the deleted test files from this pr, the internal reviewer please help fix this using codev ([comment](https://github.com/pytorch/pytorch/pull/158928#issuecomment-3134378616))	2025-07-29 23:32:07 +00:00
cyy	d742a2896c	Remove tensorexpr tests (#158928 ) The tests are not maintained. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158928 Approved by: https://github.com/albanD, https://github.com/malfet	2025-07-27 07:13:27 +00:00
PyTorch MergeBot	f62772f365	Revert "Remove tensorexpr tests (#158928 )" This reverts commit `517eebc1dd`. Reverted https://github.com/pytorch/pytorch/pull/158928 on behalf of https://github.com/ZainRizvi due to Sorry but this breaks trunk test_jit_fuser_te.py::TestNNCOpInfoCPU::test_nnc_correctness_frac_cpu_bfloat16 [GH job link](https://github.com/pytorch/pytorch/actions/runs/16534544469/job/46768022799) [HUD commit link](`517eebc1dd`) ([comment](https://github.com/pytorch/pytorch/pull/158928#issuecomment-3122158944))	2025-07-26 17:01:54 +00:00
cyy	517eebc1dd	Remove tensorexpr tests (#158928 ) The tests are not maintained. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158928 Approved by: https://github.com/albanD, https://github.com/malfet	2025-07-26 01:21:01 +00:00
PyTorch MergeBot	9535995bbc	Revert "Remove tensorexpr tests (#158928 )" This reverts commit `a0bc865123`. Reverted https://github.com/pytorch/pytorch/pull/158928 on behalf of https://github.com/clee2000 due to broke cpp static runtime test? [GH job link](https://github.com/pytorch/pytorch/actions/runs/16517697273/job/46715871457) [HUD commit link](`a0bc865123`) ([comment](https://github.com/pytorch/pytorch/pull/158928#issuecomment-3118554478))	2025-07-25 15:22:51 +00:00
cyy	a0bc865123	Remove tensorexpr tests (#158928 ) The tests are not maintained. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158928 Approved by: https://github.com/albanD	2025-07-25 08:37:51 +00:00
PyTorch MergeBot	13398dab79	Revert "Remove tensorexpr tests (#158928 )" This reverts commit `a3f9f79f59`. Reverted https://github.com/pytorch/pytorch/pull/158928 on behalf of https://github.com/clee2000 due to Theres still some references to the things removed in this PR in test.sh, the jobs on this PR are failing because of that but log classifier is probably pointing to a wrong line, should be an easy fix tho ([comment](https://github.com/pytorch/pytorch/pull/158928#issuecomment-3114873706))	2025-07-24 20:45:30 +00:00
cyy	a3f9f79f59	Remove tensorexpr tests (#158928 ) The tests are not maintained. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158928 Approved by: https://github.com/albanD	2025-07-24 15:38:36 +00:00
cyy	3639d29ea1	Fix warnings of unused-variable (#158627 ) Fixes ``` /var/lib/jenkins/workspace/test/cpp/tensorexpr/test_kernel.cpp:42:22: error: unused variable 'verification_pattern' [-Werror,-Wunused-variable] ``` and also extra semicolons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158627 Approved by: https://github.com/albanD	2025-07-22 02:49:06 +00:00
Xuehai Pan	c8d43cbc6e	[BE][3/6] fix typos in test/ (#157637 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157637 Approved by: https://github.com/yewentao256, https://github.com/albanD ghstack dependencies: #156605	2025-07-17 12:08:33 +00:00
Richard Barnes	ed327876f5	[codemod] `c10::optional` -> `std::optional` (#126135 ) Generated by running the following from PyTorch root: ``` find . -regex ".*\.\(cpp\|h\|cu\|hpp\|cc\|cxx\)$" | grep -v "build/" | xargs -n 50 -P 4 perl -pi -e 's/c10::optional/std::optional/' ``` `c10::optional` is just an alias for `std::optional`. This removes usages of that alias in preparation for eliminating it entirely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126135 Approved by: https://github.com/Skylion007, https://github.com/malfet, https://github.com/albanD, https://github.com/aaronenyeshi	2024-05-14 19:35:51 +00:00
Kazuaki Ishizaki	deb800ee81	Fix typo under test directory (#111304 ) This PR fixes typo in comments under `test` directory. Pull Request resolved: https://github.com/pytorch/pytorch/pull/111304 Approved by: https://github.com/Skylion007	2023-10-16 23:06:06 +00:00
cyy	dee100945e	[2/N] Move c10::variant to std::variant (#109723 ) This PR moves most of c10::variant calls to std::variant. Pull Request resolved: https://github.com/pytorch/pytorch/pull/109723 Approved by: https://github.com/ezyang	2023-09-24 02:47:43 +00:00
mikey dagitses	322e4b4c8a	set -Wsuggest-override for builds (#89852 ) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/pytorch/pytorch/pull/89852). * __->__ #89852 * #89851 set -Wsuggest-override for builds Summary: This was flagged by a Meta internal build. Test Plan: Rely on CI. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89852 Approved by: https://github.com/malfet	2022-12-19 22:08:47 +00:00
Wang, Eikan	70c6a988d6	Fix the performance issue that the for-loop before ExternalCall could not be parallelized. (#85056 ) Currently, NNC only parallelizes the loop statement of the graph outputs. The logic could bypass some loop statements that could be parallelized. Take the following example and suppose the output of `ExternalCall` is also the output of the NNC fusion group. The current [parallel logic](https://github.com/pytorch/pytorch/pull/85056/files#diff-9a11174c26e4b57ab73e819520122bc314467c72962f3a5b79e7400ea3c4bbe5L781-L785) only tries to parallelize the `ExternalCall` and bypasses `stmt1` and `stmt2`. ```c++ stmt1: For: stmt2: For: stmt3: ExternalCall ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/85056 Approved by: https://github.com/frank-wei, https://github.com/bertmaher	2022-10-07 07:36:28 +00:00
Wang, Eikan	45be74cc63	Optimize to if the datatype of the source tensor is the same as the dest datatype (#85140 ) The AMP inserts `_autocast_to_reduced_precision` and `_autocast_to_full_precision` automatically. The aten implementation provides a fast path to bypass the conversion if the tensor data type is already the reduced/full precision. But NNC always does the conversion, which could bring >5% E2E performance regression. This PR addresses the performance issue the same way aten does. We will not pull `_autocast_to_reduced_precision` and `_autocast_to_full_precision` into the NNC fusion group and will fall back to aten to trigger its fast path if the tensor data type is already the reduced/full precision. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85140 Approved by: https://github.com/frank-wei	2022-09-27 04:40:42 +00:00
Will Constable	4f34cd6d1e	Replace all CHECK_ and DCHECK_ with TORCH_* macros (#82032 ) Avoid exposing defines that conflict with google logging, since this blocks external usage of libtorch in certain cases. All the 'interesting' changes should be in these two files, and the rest should just be mechanical changes via sed. c10/util/logging_is_not_google_glog.h c10/util/logging_is_google_glog.h Fixes https://github.com/pytorch/pytorch/issues/81415 cc @miladm @malfet Pull Request resolved: https://github.com/pytorch/pytorch/pull/82032 Approved by: https://github.com/soumith, https://github.com/miladm	2022-07-26 01:20:44 +00:00
Michael Andreas Dagitses	ab2ca95dd1	turn on -Werror=unused-variable in our Bazel CPU build Summary: We also fix any existing issues. Note that we only do this for the CPU build because nvcc is considered a C++ toolchain but it does not have the same flag support. Adding flags to the GPU build will cause nvcc errors. Test Plan: Built locally, rely on CI to confirm. Reviewers: malfet Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/79156 Approved by: https://github.com/seemethere, https://github.com/osalpekar, https://github.com/albanD	2022-06-11 02:46:34 +00:00
Wang, Eikan	429a80dded	[NNC] Lowering function generates the output buffer with the specified stride (#76529 ) Summary: Pass stride information to the lowering function to generate the output buffer with proper memory layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76529 Reviewed By: ZolotukhinM Differential Revision: D36116712 Pulled By: IvanKobzarev fbshipit-source-id: d3901f756b3710ecce172d6db3ecb0b7c12fb929 (cherry picked from commit b6cd53c91c01db36ea0e99167dc0ce0ae1d3aa23)	2022-05-04 20:04:22 +00:00
zengk95	1d55518198	Revert "[nnc] Strides to Tensor (#72962 )" This reverts commit `939060925f`. Fixes https://github.com/pytorch/vision/issues/5873 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76332 Approved by: https://github.com/seemethere	2022-04-25 19:50:00 +00:00
Ivan Kobzarev	939060925f	[nnc] Strides to Tensor (#72962 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72962 Test Plan: Imported from OSS Reviewed By: ZolotukhinM, cpuhrsch Differential Revision: D34589306 Pulled By: IvanKobzarev fbshipit-source-id: ecee5249760ecc0c8b2edb1842b90218899bc944 (cherry picked from commit 9e310c4c67389da30da89126d838ffe3864aba6f)	2022-04-23 19:35:15 +00:00
Raghavan Raman	c2d5f6a5a4	[nnc] Update bounds overlap analysis to identify non-overlaps even with symbolic bounds Pull Request resolved: https://github.com/pytorch/pytorch/pull/74658 Approved by: https://github.com/ZolotukhinM	2022-04-14 20:24:03 +00:00
Raghavan Raman	d8ad1a579f	[nnc] Fuse loops that have variable bounds Pull Request resolved: https://github.com/pytorch/pytorch/pull/74346 Approved by: https://github.com/ZolotukhinM	2022-04-14 20:24:03 +00:00
Nikita Shulga	43313cbde3	Revert D34647822: [tensorexpr] Add support for aten::stack Test Plan: revert-hammer Differential Revision: D34647822 (`954c7e2a77`) Original commit changeset: 3b863c71886c Original Phabricator Diff: D34647822 (`954c7e2a77`) fbshipit-source-id: e9ce06c9c8d7caf0fbb2565f0d99035bad685793 (cherry picked from commit b2ff355e9dbaa4e940fb221254223984c3c8a215)	2022-03-31 04:25:43 +00:00
Hui Guo	954c7e2a77	[tensorexpr] Add support for aten::stack (#73801 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/73801 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D34647822 Pulled By: huiguoo fbshipit-source-id: 3b863c71886c7c6616b16f5d3313079714c8b82a (cherry picked from commit c71778cf6a5724d26b671bf3ee0478add24990e8)	2022-03-30 21:25:15 +00:00
Mikhail Zolotukhin	1855b14922	[TensorExpr] Delete `DimArg` class. (#72390 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72390 This class didn't add much value and only caused more boilerplate code. This change removes the class and updates all the use cases with uses of `ExprHandle`. A side effect of this change is different names in loop variables, which caused massive mechanical changes in our tests. Test Plan: Imported from OSS Reviewed By: navahgar Differential Revision: D34030296 Pulled By: ZolotukhinM fbshipit-source-id: 2ba4e313506a43ab129a10d99e72b638b7d40108 (cherry picked from commit `c2ec46a058`)	2022-02-11 01:21:59 +00:00
Mikhail Zolotukhin	1dbcde2ade	[TensorExpr] Support scalar intermediate and output values. (#71186 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71186 So far we've only supported scalar inputs, but couldn't handle scalar outputs or intermediates. This PR adds it. Scalar outputs are returned as 0-dim tensors. If the kernel is invoked on a stack of IValues, we correctly convert the results to scalar IValues when needed. If the kernel is invoked with a vector of void* pointers, everything works out of the box without any conversions. Lowerings for scalar operators are a bit tricky. Usual lowerings return a pair <Buf, Stmt> (aka Tensor), but for scalar operators we also want to have the corresponding Var that the lowering function supposedly creates (in theory we could just use Loads and Stores, but I'm worried it can affect performance as there is no guarantee this will be optimized by LLVM). So, what we do here to work around this is we return a fake buf + stmt that sets the corresponding var. Then outside of the lowering we create a real buffer and generate a Store to it with the value from the variable we passed as the base handle of the fake buf. This real buffer is then treated as usual by the rest of the system and we can use it if we need to return this scalar value as a kernel output. If we do not need to return it, then the Store will be deleted by the DCE pass. Differential Revision: D33539324 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: ab4524b9820ce204f106effcf6232ed33d4ee223 (cherry picked from commit `7faa0939f0`)	2022-01-26 06:32:51 +00:00
CodemodService FBSourceClangFormatLinterBot	88012c7daf	[AutoAccept][Codemod][FBSourceClangFormatLinter] Daily `arc lint --take CLANGFORMAT` Reviewed By: zertosh Differential Revision: D33577744 fbshipit-source-id: 7ecc8367998ee1dffde54c2f4dd3cfafe19a53c9	2022-01-14 06:10:57 -08:00
Mike Ruberry	3a0c680a14	Jiterates exp2, erfc, erfinv and entr and refactors code_template.h to ATen (#71295 ) Summary: Per title. cc pietern mrshenli pritamdamania87 zhaojuanmao satgera rohan-varma gqchen aazzolini osalpekar jiayisuse SciPioneer H-Huang Pull Request resolved: https://github.com/pytorch/pytorch/pull/71295 Reviewed By: ngimel Differential Revision: D33575885 Pulled By: mruberry fbshipit-source-id: bc841b46fc0b5458a26a4d4465b18a7a54cd5a5b	2022-01-13 23:58:51 -08:00
Elias Ellison	fb66f561b1	Add copy out to the fallback path in SR invocation of composed op (#70871 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70871 We had previously handled reusing memory in the optimized kernel execution path, but not yet handled it if we hit the unoptimized fallback. Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D33458652 Pulled By: eellison fbshipit-source-id: 4eb62181ed02c95813a99638f5e2d0f9347b5c08	2022-01-10 12:16:38 -08:00
Mikhail Zolotukhin	8223ef1cd8	[TensorExpr] Clean-up logic for copying input tensors and remove some dead code. (#70535 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70535 This also fixes handling of inputs that happen to be outputs (they require copy). Test Plan: Imported from OSS Reviewed By: pbelevich Differential Revision: D33399116 Pulled By: ZolotukhinM fbshipit-source-id: 9845838eb653b82ae47b527631b51893990d5319	2022-01-07 01:03:56 -08:00
Raghavan Raman	4dec15e6d8	[nnc] Add a run method to TensorExprKernel that takes in output tensors (#69477 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69477 This diff adds a new run method to `TensorExprKernel` which takes in output tensors as inputs and stores the output in those given tensors. ghstack-source-id: 146107009 Test Plan: buck test mode/dev-nosan //caffe2/test/cpp/tensorexpr:tensorexpr -- --exact 'caffe2/test/cpp/tensorexpr:tensorexpr - Kernel.RunWithAllocatedOutputs' Reviewed By: ZolotukhinM Differential Revision: D32823890 fbshipit-source-id: edc1f4839785124048b034060feb71cb8c1be34f	2021-12-22 00:30:15 -08:00
David Berard	8c7f4a0d0b	[tensorexpr] check for index out of bounds in ir_eval (#68858 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/68858 when executing with ir_eval, check for index out of bounds. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32657881 Pulled By: davidberard98 fbshipit-source-id: 62dd0f85bb182b34e9c9f795ff761081290f6922	2021-12-16 09:27:45 -08:00
Raghavan Raman	e7a3bbce89	[nnc] Add support for dynamic shapes in TensorExprKernel (#67861 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67861 Previously submitted as https://github.com/pytorch/pytorch/pull/67197. This got reverted because its failures were hidden by the failures of another PR. Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D32178196 Pulled By: navahgar fbshipit-source-id: cc8a5c68aed360d06289e69645461cfa773e1300	2021-11-05 11:18:19 -07:00
Natalia Gimelshein	ca445645f9	Revert D31902471: [nnc] Add support for dynamic shapes in TensorExprKernel Test Plan: revert-hammer Differential Revision: D31902471 (`15a3c374e2`) Original commit changeset: d2729a38ba1a fbshipit-source-id: 4c05de82e626bbf744df84fd2b914b66fd165a19	2021-11-03 14:48:12 -07:00
Raghavan Raman	15a3c374e2	[nnc] Add support for dynamic shapes in TensorExprKernel (#67197 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67197 Test Plan: Imported from OSS Reviewed By: eellison, ZolotukhinM Differential Revision: D31902471 Pulled By: navahgar fbshipit-source-id: d2729a38ba1ac607ff07f516ed56fbd9085715dc	2021-11-03 11:24:17 -07:00
Raghavan Raman	383c1f51b1	[nnc] Fixed handling of 0-sized tensors in cat (#67734 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/67734 The implementation of `aten::cat` op in NNC has to ignore tensors that have 0-size in any dimension. Test Plan: `buck test mode/dev-nosan //caffe2/test/cpp/tensorexpr:tensorexpr -- --exact 'caffe2/test/cpp/tensorexpr:tensorexpr - Kernel.CatWithEmptyInputs'` Reviewed By: ZolotukhinM Differential Revision: D32122171 fbshipit-source-id: 90c697813bc504664673cdc262df6e7ce419c655	2021-11-03 10:16:16 -07:00
Mikhail Zolotukhin	d58ef2bbff	[TensorExpr] Fix lowering for aten::softmax for the case when dtype parameter is None. (#66516 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66516 Differential Revision: D31590858 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: 0aeee7a5be64b3b9c8fa00aacb1a94031a7e25d1	2021-11-03 09:42:48 -07:00
Richard Barnes	e0643fa3fc	use irange for loops 5 (#66744 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66744 Modified loops in files under fbsource/fbcode/caffe2/ from the format `for(TYPE var=x0;var<x_max;x++)` to the format `for(const auto var: irange(xmax))` This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand. Test Plan: Sandcastle Reviewed By: ngimel Differential Revision: D31705358 fbshipit-source-id: d6ea350cbaa8f452fc78f238160e5374be637a48	2021-10-18 21:59:50 -07:00
Xue Li	2f099c7555	Revert D30652629: use irange for loops Test Plan: revert-hammer Differential Revision: D30652629 (`687c2267d4`) Original commit changeset: 0ae6c4bbbb55 fbshipit-source-id: 5c4f067b584a021c8c9656454d1ee60999600fb3	2021-10-15 15:23:10 -07:00
Richard Barnes	687c2267d4	use irange for loops (#66234 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/66234 Modified loops in files under fbsource/fbcode/caffe2/ from the format `for(TYPE var=x0;var<x_max;x++)` to the format `for(const auto var: irange(xmax))` This was achieved by running r-barnes's loop upgrader script (D28874212) with some modification to exclude all files under /torch/jit and a number of reversions or unused variable suppression warnings added by hand. bypass_size_limit allow-large-files Test Plan: Sandcastle Reviewed By: ngimel Differential Revision: D30652629 fbshipit-source-id: 0ae6c4bbbb554bad42e372792a6430e1acf15e3e	2021-10-15 13:50:33 -07:00
Mikhail Zolotukhin	7e9c599784	[TensorExpr] Add a method for sanitizing Var and Buf names in Stmt. (#65010 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/65010 This pass ensures all names are legal and not-duplicated. Fixes #52727. Test Plan: Imported from OSS Reviewed By: bertmaher, navahgar Differential Revision: D30939717 Pulled By: ZolotukhinM fbshipit-source-id: 7dbe7f937de41f22ad49137a5e067d698443ed63	2021-09-15 17:15:06 -07:00
Raghavan Raman	cad7a4b0ea	[nnc] Added an implementation of sign op (#64033 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64033 Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D30579197 Pulled By: navahgar fbshipit-source-id: f9f7fa7f2ffa109cf4e441eb1af821b8b891d4d3	2021-09-10 16:49:04 -07:00
Mikhail Zolotukhin	a17d6c7f80	[TensorExpr] Simplify TE IR before applying any transformations. (#64717 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64717 This also exposed several bugs, which are fixed in this PR. Differential Revision: D30826408 Test Plan: Imported from OSS Reviewed By: navahgar Pulled By: ZolotukhinM fbshipit-source-id: a67ec5739aceed9ffdf0d24f77eb3787cefe4560	2021-09-09 18:50:51 -07:00
Raghavan Raman	652a8bf7d0	[nnc] Updated indices during broadcast to use int64_t (#64627 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64627 This fixes the root cause of S242719 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D30801686 Pulled By: navahgar fbshipit-source-id: b6d3ebdc7eb57116eaced53c2f35c7798bb17e80	2021-09-09 08:29:37 -07:00
Hui Guo	5c27a580ec	[tensorexpr] Allocate intermediate buffers at compile time (#64227 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64227 Test Plan: Imported from OSS Reviewed By: ZolotukhinM Differential Revision: D30652220 Pulled By: huiguoo fbshipit-source-id: cd75005cdfa42751318de7174b44e14a3a01634e	2021-09-08 15:34:44 -07:00
Mikhail Zolotukhin	72274e2a2f	[TensorExpr] Don't rely on exceptions in Vectorizer. (#64609 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64609 We've been using exceptions to indicate whether vectorization succeeded or not, but that posed some problems (e.g. we spent too much time symbolizing these exceptions). This change converts this mechanism to a standard error return code. Test Plan: Imported from OSS Reviewed By: bertmaher Differential Revision: D30795342 Pulled By: ZolotukhinM fbshipit-source-id: 16e38b37bcdd78ceb438ac814cc377f35b058e17	2021-09-08 00:25:34 -07:00
Bert Maher	2e6221a232	[nnc] Make 64-bit dimensions work (#64077 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/64077 We were assuming kernel dimensions fit in 32 bits (the old fuser made this assumption too), but we should be able to support 64. ghstack-source-id: 136933272 Test Plan: unit tests; new IR level test with huge sizes Reviewed By: ZolotukhinM Differential Revision: D30596689 fbshipit-source-id: 23b7e393a2ebaecb0c391a6b1f0c4b05a98bcc94	2021-08-28 19:59:47 -07:00
Raghavan Raman	6d31ba6ddc	[nnc] Sanitized the names of constants in the input graph. (#63990 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/63923 The input graph can contain constants whose names contain special characters. So, all names of constants in the input graph need to be sanitized. Pull Request resolved: https://github.com/pytorch/pytorch/pull/63990 Reviewed By: ZolotukhinM Differential Revision: D30558432 Pulled By: navahgar fbshipit-source-id: de5b0c23d50ee8997f40f2c0fc605dda3719186f	2021-08-26 09:52:02 -07:00
Bert Maher	8dda299d96	Re-apply: [nnc] Support thread level parallelism in fused kernels (#63776 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63776 I reverted this out of an abundance of caution because some test failures occurred, but they were all due to precision issues fixed lower in this stack. Let's try again. I've rolled the elimination of the allow-parallelism-in-fusions toggle into this diff since they're pretty tightly coupled. ghstack-source-id: 136529847 Test Plan: CI Reviewed By: huiguoo Differential Revision: D30484555 fbshipit-source-id: 38fd33520f710585d1130c365a8c60c9ce794a59	2021-08-24 18:56:55 -07:00

100 Commits