Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59157
Currently `view` is represented as a copy since we don't support in-place
operations in NNC (similar to `aten::reshape`). The lowering for
`aten::expand_as` is exactly the same as for `aten::expand`, since
we build the TE expression based on the output shape anyway.
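For reference, a minimal eager-mode sketch (not the NNC lowering itself) of why the same lowering works for both ops: `expand_as(y)` produces the same result as `expand(y.shape)`, so an expression built from the output shape covers both.
```
import torch

# expand_as(y) gives the same result as expand(y.shape), which is why the
# expand lowering can be reused verbatim for expand_as.
x = torch.randn(3, 1)
y = torch.randn(3, 4)
assert torch.equal(x.expand_as(y), x.expand(y.shape))
```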
Differential Revision: D28774224
Test Plan: Imported from OSS
Reviewed By: Chillee
Pulled By: ZolotukhinM
fbshipit-source-id: 0a1593c4c6500dcc5a374213adb734180ae1f72e
Summary:
The `triangular_solve` lowering only computes the first output, since the second output is just a clone of the input coefficient matrix. Why does that output even exist?
Also, I fixed the permute lowering: I was previously applying the inverse of the permutation.
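A quick eager-mode sketch (illustrative only, not the lowering code) of the two behaviors mentioned above: the second output of `triangular_solve` is just a clone of the coefficient matrix, and applying a permutation is generally not the same as applying its inverse.
```
import torch

# The second output of triangular_solve is just a clone of A.
A = torch.triu(torch.randn(3, 3))
b = torch.randn(3, 1)
solution, cloned = torch.triangular_solve(b, A)
assert torch.equal(cloned, A)

# A permutation vs. its inverse gives different layouts.
x = torch.randn(2, 3, 4)
perm = [2, 0, 1]
inv = [perm.index(i) for i in range(3)]  # [1, 2, 0]
assert x.permute(perm).shape != x.permute(inv).shape
```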
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59131
Reviewed By: ansley
Differential Revision: D28768169
Pulled By: Chillee
fbshipit-source-id: 8e78611c6145fb2257cb409ba98c14ac55cdbccf
Summary:
Finds a few bugs:
1. permute needs to wrap dimensions
2. slice needs to wrap dimensions
3. frac doesn't work correctly for negative values
4. permute has some other failures
This PR also fixes bugs 1 and 2; the snippet below illustrates the expected dimension wrapping and `frac` behavior.
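A small eager-mode sketch (not the fuser code) of the reference behavior for bugs 1-3: negative dimensions should wrap, and `frac` keeps the sign of its input.
```
import torch

x = torch.randn(2, 3, 4)
# Negative dims must be wrapped: -1 -> 2, -2 -> 1 for a 3-d tensor.
assert x.permute(0, -1, -2).shape == x.permute(0, 2, 1).shape
# slice/narrow with a negative dim should also wrap.
assert torch.equal(x[:, :, 0:2], x.narrow(-1, 0, 2))
# frac keeps the sign of the input: frac(-1.5) == -0.5, not 0.5.
assert torch.frac(torch.tensor(-1.5)).item() == -0.5
```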
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58719
Reviewed By: SplitInfinity
Differential Revision: D28590457
Pulled By: Chillee
fbshipit-source-id: a67fce67799602f9396bfeef615e652364918fbd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58676
We only generate asm for small matmuls, but we were computing the # of
flops using an int32, which is too small.
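For context, a back-of-the-envelope sketch (sizes are illustrative, not from this PR) of how quickly a flop count overflows an int32:
```
# The flop count of an MxK @ KxN matmul is roughly 2*M*N*K, which exceeds
# INT32_MAX (2**31 - 1) already for fairly modest sizes.
M = N = K = 1100
flops = 2 * M * N * K       # ~2.7e9
assert flops > 2**31 - 1    # needs a 64-bit counter
```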
Test Plan:
```
buck test mode/dev //caffe2/test:static_runtime -- --exact 'caffe2/test:static_runtime - test_mlp (test_static_runtime.TestStaticModule)'
```
Reviewed By: navahgar
Differential Revision: D28562157
fbshipit-source-id: a07ceba5209ef6022ead09140380c116994755cf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58256
Size-1 dims mess up our output restriding logic, because they're
technically "dense" no matter what stride the dimension has. In this example a
size-1 dim has stride 1, which causes all the indices to be taken mod 1 (i.e.,
all indices become 0). We work around this peculiar case by skipping size-1 dims
in our layout logic, since they have no impact on the rest of the tensor's indexing.
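A minimal eager-mode illustration (assuming nothing beyond standard striding semantics) of why size-1 dims can be skipped: the stride of a size-1 dimension never affects which element is addressed, because the only valid index along it is 0.
```
import torch

base = torch.arange(6.0)
a = base.as_strided((1, 6), (1, 1))
b = base.as_strided((1, 6), (6, 1))  # different stride on the size-1 dim
assert torch.equal(a, b)             # indexing is unaffected
```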
ghstack-source-id: 128932739
Test Plan:
new unit test, plus
```
buck test mode/dev //langtech/mobile/audio_stream_processor:audio_stream_processor_test -- --exact 'langtech/mobile/audio_stream_processor:audio_stream_processor_test - AudioStreamProcessorTest.DemucsReadWriteFloat'
```
Reviewed By: eellison
Differential Revision: D28424388
fbshipit-source-id: e33e39eef2a5bf2797bee78a5987558308b6d110
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57749
Add to an FX test.
Test Plan: Imported from OSS
Reviewed By: huiguoo
Differential Revision: D28425974
fbshipit-source-id: 195c7a1944decb7a2a99c2831cab38485f32be17
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58028
We were trying to translate the device argument and thus throwing an
'unsupported dtype' error.
ghstack-source-id: 128748658
Test Plan: predictor models
Reviewed By: navahgar
Differential Revision: D28347704
fbshipit-source-id: 331a5786339e01f9df1b1878970b0c5983a92980
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58026
Cat-without-conditionals is a valuable optimization on CPU, but on GPU
it can generate invalid code since it may introduce allocations (i.e., extra
kernel launches).
ghstack-source-id: 128748630
Test Plan: predictor
Reviewed By: navahgar
Differential Revision: D28347703
fbshipit-source-id: f9e68cd7bcf5d316082ce8378ddf99f2d33fcc07
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57906
I think it was accidentally flipped in #56875.
Test Plan: Imported from OSS
Reviewed By: Chillee
Differential Revision: D28312947
Pulled By: ZolotukhinM
fbshipit-source-id: 8d0f45e540f47daefbc270f5a2ade87f2171b958
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57552
This method uses `CodeGen::call_raw` instead of `CodeGen::call`.
Relanding #57328 (the entire stack) which was reverted because I forgot
to guard a new test with `ifdef LLVM`.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28195047
Pulled By: ZolotukhinM
fbshipit-source-id: bcfd3cb5b4f33a149b7549515ffd705e2c4f208f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57383
Notes: I picked up an activation from https://github.com/pytorch/pytorch/issues/56969. You can look at the [activations.cpp](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/Activation.cpp#L429) file which has both forward and backward kernel code to help you write the NNC lowering and the symbolic gradient.
I added a test in test_jit_fuser_te for the fusion, and I added an OpInfo and asserted that we expect to see autodiffable nodes to test the symbolic gradient.
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D28197820
Pulled By: eellison
fbshipit-source-id: 05305d85c5bb0847c8f911b95ba47b137dca7e90
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57169
The pass is planned to be used in the AOT pipeline, where we expect input
graphs to be functional. As such, these graphs should not use the 'self'
argument even if it is present, and thus it can be removed safely.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28128328
Pulled By: ZolotukhinM
fbshipit-source-id: a7dfbf7776682826100c8eb0fef982a2e81c2554
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57076
This pass is intended to be used in conjunction with the shape propagation
pass: first we use sample inputs to specify shape info for the graph inputs,
and then we run shape-prop to infer the shapes of intermediate values in the
graph.
Differential Revision: D28048290
Test Plan: Imported from OSS
Reviewed By: astaff
Pulled By: ZolotukhinM
fbshipit-source-id: 778d772e873d59d77af9f669f45dc44b9ee5e443
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56679
moved lowerings out of the TensorExprKernel and into independent functions
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D28082921
Pulled By: Chillee
fbshipit-source-id: af530510957ed4aa8b64dcc77ca36b69866d8000
Summary:
In my last PR I missed the CUDA and distributed folders; fixing this now.
This change is autogenerated by `python tools/clang_tidy.py -s`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57235
Reviewed By: janeyx99
Differential Revision: D28084444
Pulled By: malfet
fbshipit-source-id: bf222f69ee90c7872c3cb0931e8cdb84f0cb3cda
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os


def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files


def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname, "-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit", "--all", "-m", f"NOLINT stubs for {fname}"])


def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)


if __name__ == "__main__":
    main()
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892
Reviewed By: H-Huang
Differential Revision: D27991944
Pulled By: malfet
fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56324
Inlining is great if LLVM's CSE kicks in; but if a kernel has multiple outputs
(and thus multiple loops), CSE has no chance.
So, this pass "horizontally" fuses the output loops together so that CSE can go
to town. Essentially we want to turn
```
for (...) {
  output_1[] = some_complicated_expr...
}
for (...) {
  output_2[] = some_complicated_expr...
}
```
Into:
```
for (...) {
  output_1[] = complicated_expr
  output_2[] = complicated_expr  // LLVM CSE should take care of this
}
```
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D27841194
Pulled By: bertmaher
fbshipit-source-id: 54153bb59786be87183c636d64f05963c4b1624a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56319
With this change the TorchScript graph can have constant tensors in it
and we will still be able to lower it to TE. The constants are
registered (or bound) within the `TensorExprKernel` object, and when the
codegen is called they are passed along with the usual inputs and outputs.
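A sketch of the kind of graph this enables (a hypothetical module, using freezing to turn a parameter into an in-graph constant tensor):
```
import torch

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.randn(4))

    def forward(self, x):
        return x * self.w + 1.0

# After freezing, self.w shows up as a constant tensor node inside the graph
# rather than as a graph input; the kernel now binds such constants internally.
frozen = torch.jit.freeze(torch.jit.script(M().eval()))
print(frozen.graph)
```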
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D27838747
Pulled By: ZolotukhinM
fbshipit-source-id: 4a519d66fcc07fe5fa53f5cf9af28d25611f8437
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56289
While there's no reason to think non-float32 conv2d's *don't* work,
they're only tested in float32 now. Since that's the most important use case,
I'd rather restrict the dtypes than spend time testing all the weird dtype
combinations that could possibly happen.
ghstack-source-id: 126755549
Test Plan: unit tests
Reviewed By: navahgar
Differential Revision: D27828495
fbshipit-source-id: fcf179207f2c9b20e0e86eb2b85687517d87063c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54605
For small sizes we generate a naive 3-layer loop nest; for bigger sizes
we generate an external call.
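For reference, a sketch of what the naive 3-layer loop nest computes (plain Python for illustration; the actual lowering emits NNC IR):
```
import torch

def naive_matmul(a, b):
    M, K = a.shape
    K2, N = b.shape
    assert K == K2
    out = torch.zeros(M, N, dtype=a.dtype)
    for i in range(M):          # rows of the output
        for j in range(N):      # columns of the output
            for k in range(K):  # reduction over the shared dim
                out[i, j] += a[i, k] * b[k, j]
    return out

a, b = torch.randn(4, 5), torch.randn(5, 3)
assert torch.allclose(naive_matmul(a, b), a @ b, atol=1e-5)
```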
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D27298364
Pulled By: ZolotukhinM
fbshipit-source-id: 2ddf275ff68d6fca16a3befca5ce5c26aef462b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55970
LLVM's support for float16 is not great, and we were seeing assertion
failures trying to generate code for vectorized uses. I note that clang
doesn't even try to vectorize operations involving half:
https://gcc.godbolt.org/z/86MW4xr17, so that's a good sign we shouldn't either.
Fixes #55905
ghstack-source-id: 126511474
Test Plan: pytest test_jit_fuser_te.py -k test_isnan
Reviewed By: asuhan
Differential Revision: D27752279
Pulled By: bertmaher
fbshipit-source-id: ac115080bf2a4a73d52b396d64a5bce0cf13abfe
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55825
The mask has never been used (in vectorization we generate an explicit
`IfThenElse` construct when we need to mask out some elements). The PR
removes it and cleans up all its traces from tests.
Differential Revision: D27717776
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: 41d1feeea4322da75b3999d661801c2a7f82b9db
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55213
Adds the integration of conv2d with the TE fuser. A few things of interest:
- I'm *super* selective about which convs get lowered. Only 3x3 depthwise (see
the sketch below for what that means), because I've benchmarked those to death
and I'm pretty sure it's a good change.
- I'm allowing single-node "fusion" groups for supported convs. (Maybe this is
a sign that conv2d codegen should go through a different path entirely, but
it seems to basically work.)
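A quick sketch of what "3x3 depthwise" means here (illustrative shapes, not taken from the benchmarks): a Conv2d whose groups equal its input channels, so each channel is convolved with its own 3x3 filter.
```
import torch

depthwise = torch.nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32)
x = torch.randn(1, 32, 56, 56)
assert depthwise(x).shape == (1, 32, 56, 56)
```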
I'll share full benchmark results once I clean them up a little. To
summarize, I tested the following torchvision models containing depthwise
convolutions. Results are single-core on a skylake-avx512:
mobilenet_v2: 8% improvement
mobilenet_v3: 9% improvement
mnasnet: 10% improvement
shufflenet: 18% improvement
Note these are comparing against a baseline with a fast-but-buggy grouped
convolution implementation in MKLDNN. So perf results will be better if
compared on master, but I'm going to assume the MKLDNN bug will be fixed and
re-enabled.
Perf results are more complicated when comparing to freezing plus conversion to
mkldnn layout; mobilenet v2/v3 are still faster, but mnasnet and shufflenet are
not. Landing this doesn't prevent MKLDNN freezing from kicking in though, so
there's no harm (although landing mkldnn freezing will regress mobilenet, but
c'est la vie).
ghstack-source-id: 126076112
Test Plan: New unit test, plus torchvision
Reviewed By: ZolotukhinM
Differential Revision: D27530272
fbshipit-source-id: 92153fad234bc9f1eaa4f7624c543168d1294a87