This PR ...
Makes the following testing changes:
- Updates stride testing in test_python_reference_consistency to only check strides of dimensions with length > 1
- Creates reference inputs for reshape
- Creates reference inputs for chunk
- Extends the sample inputs for unsqueeze
- Extends the sample inputs for stack -- test_conj_view and test_neg_view are now xfailed
- https://github.com/pytorch/pytorch/issues/77046
Makes the following architecture changes:
- Adds the refs.special (sub)module
- Adds the refs.nn.functional (sub)module
Adds the following prims:
- expand_dims
- view_of
- rev
- clone
Adds the following references:
- flatten
- squeeze
- unsqueeze
- special.i0e
- special.i1e
- logical_or
- logical_and
- isclose
- flip
- stack
- nn.functional.elu
- chunk
- clone
- narrow
Identifies the following bugs in PyTorch today:
- https://github.com/pytorch/pytorch/issues/77054
- https://github.com/pytorch/pytorch/issues/77055
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77043
Approved by: https://github.com/ngimel
Fixes https://github.com/pytorch/pytorch/issues/75464
Adds a context manager that will throw if the ops in the context are not fused.
The API is:
```
with torch.jit.strict_fusion():
...
```
A few TODOs:
[+] Figure out how to compose with autodiff - right now it will run on autodiff as well
[+] Support all of the nvfuser operators that are added in guarding
[+] Figure out what to do with control flow that isn't taken - right now it will just error, which is probably a source of the original issue
[+] (After those are figured out) add to docs
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75777
Approved by: https://github.com/davidberard98
Previously, jit opinfos would only run the traced function once. This is a problem for NNC and NVFuser, where the fused implementation only runs on the second invocation.
This caches the traced function and calls the cached implementation, so that subsequent calls actually perform fusion and use the fused implementation.
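The caching idea can be sketched in plain Python (hypothetical helper names; the real change lives in the JIT OpInfo test harness):

```python
# Trace once and memoize the traced callable, so every later call goes
# through the same compiled artifact instead of a fresh trace. In the real
# tests, the second call through the cached function is what triggers fusion.
_trace_cache = {}

def get_traced(fn, trace):
    """Return a cached traced version of fn, tracing only on first use."""
    if fn not in _trace_cache:
        _trace_cache[fn] = trace(fn)  # expensive step: done once per function
    return _trace_cache[fn]

calls = []
def fake_trace(f):
    calls.append(f)  # record that tracing happened
    return f

traced_first = get_traced(abs, fake_trace)
traced_second = get_traced(abs, fake_trace)
print(traced_first is traced_second)  # True -- same cached object
print(len(calls))                     # 1 -- traced only once
```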
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76000
Approved by: https://github.com/eellison
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73875
Previously we had a few settings:
- getExecutor - which toggled between Profiling Executor and Legacy
- getGraphOptimize - if true, overrides PE/Legacy to run with simple executor (no optimizations)
and then...
- getProfilingMode - which would set PE to 0 specializations.
The last mode is redundant with getGraphOptimize, so we should just remove it and use getGraphOptimize in these cases. Keeping it allows potentially invalid combinations of logic - what does it mean if getProfilingMode is true but getExecutor is set to false? This would lead to a bug in specialize_autograd_zero in this case, see: https://github.com/pytorch/pytorch/blob/master/torch%2Fcsrc%2Fjit%2Fpasses%2Fspecialize_autogradzero.cpp#L93.
The tests here are failing but get fixed with the PR above it, so I'll squash for landing.
Test Plan: Imported from OSS
Reviewed By: cpuhrsch
Differential Revision: D34938130
Pulled By: eellison
fbshipit-source-id: 1a9c0ae7f6d1cfddc2ed3499a5af611053ae5e1b
(cherry picked from commit cf69ce3d155ba7d334022c42fb2cee54bb088c23)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73762
TestCase.setUp() controls slowTest behavior, so calling super().setUp() will prevent fast tests from running in the slow test CI jobs.
example: https://github.com/pytorch/pytorch/runs/5413135014?check_suite_focus=true: despite PYTORCH_TEST_SKIP_FAST=1, TestTEFuserStatic tests are still running
Test Plan: Imported from OSS
Reviewed By: mruberry
Differential Revision: D34628769
Pulled By: davidberard98
fbshipit-source-id: 84311ec1db2ac60fcafb7b77f377e9ae2ef792e3
(cherry picked from commit 67fdba7fb9b73ce2b9119f4c4bc84e5b38041e21)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72478
`aten::_autocast_to_reduced_precision` and `aten::_autocast_to_full_precision` are essentially just `aten::to` operations, so they can be fused the same way `aten::to` is fused.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D34057522
Pulled By: davidberard98
fbshipit-source-id: f3b53641415702a4ac56460587801b9c76d81b3c
(cherry picked from commit 838ce5542e)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70465
These tests check that:
(a) the result after NNC fusion (of a single op) is the same as the unfused op
(b) for ops where fusion is expected to occur, fusion does actually occur
Test Plan: Imported from OSS
Reviewed By: wenleix
Differential Revision: D33595240
Pulled By: davidberard98
fbshipit-source-id: e2e17a921bc30c313e92e8e5bbc6c1b5fcd14bc1
(cherry picked from commit b1ba221acc)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72266
Within the kernel, we may manipulate `Value *` in `OptimizeCat`, which would invalidate the input `Value *` -> Stride mapping.
Fix for https://github.com/pytorch/pytorch/issues/72173
Test Plan: Imported from OSS
Reviewed By: dagitses, davidberard98
Differential Revision: D33986306
Pulled By: eellison
fbshipit-source-id: dc33cd2b545e49e90d1e46b9fcf1e6dbb4b829db
(cherry picked from commit 5e4555968a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72032
This contains a few channels last changes from benchmarking:
- don't permute back to channels last on dynamic shapes on CPU: perf is not good, and use cases for it are exotic at the moment
- remove the conditional-one handling in permuting a channels-last symbolic tensor on CUDA; it's not needed in the permutation case, as tests show
- remove logic in torch/csrc/jit/tensorexpr/loopnest.cpp that prevented inlining; the condition it checks is always valid given valid construction of the IR
I can split up as needed.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D33864652
Pulled By: eellison
fbshipit-source-id: f16674fb02dfff22670d8a2f856c5a317fd15717
(cherry picked from commit a9a0697839)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71651
The only tests that regress do so because `chunk` is NYI; the other tests I touched were passing only because `assertAllFused` wasn't working correctly. That, and we're no longer compiling conv/matmul with dynamic shapes.
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D33801500
Pulled By: eellison
fbshipit-source-id: 074118ab4a975b7db876a4fcdfb9483afb879e79
(cherry picked from commit abaa7948c1)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71650
Refactors PE so there is a current fusion strategy set, which will take in a vector of e.g. [(STATIC, 2), (DYNAMIC, 10)] which means fuse two static invocations then fuse 10 dynamic ones, then stop specializing.
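The semantics can be roughly sketched as follows (hypothetical class, not the actual C++ implementation): each recompilation consumes one unit of the current entry's budget, falls through to the next entry when that budget is exhausted, and finally stops specializing altogether.

```python
# Sketch of a depth-limited fusion strategy: a list of (kind, count) pairs
# is consumed left to right; None means "stop specializing".
class FusionStrategy:
    def __init__(self, strategy):
        self.strategy = list(strategy)  # e.g. [("STATIC", 2), ("DYNAMIC", 10)]

    def next_specialization(self):
        while self.strategy:
            kind, remaining = self.strategy[0]
            if remaining > 0:
                self.strategy[0] = (kind, remaining - 1)
                return kind
            self.strategy.pop(0)  # budget spent: move to the next entry
        return None  # all budgets exhausted: stop specializing

s = FusionStrategy([("STATIC", 2), ("DYNAMIC", 1)])
print([s.next_specialization() for _ in range(4)])
# ['STATIC', 'STATIC', 'DYNAMIC', None]
```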
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D33801501
Pulled By: eellison
fbshipit-source-id: ebc7ac3c57e35a3b9bb15ab751f0aa1d25cc9bd5
(cherry picked from commit 8dd89088d3)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71642
Missing comma was causing string concatenation in a list of strings
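The pitfall here is Python's implicit concatenation of adjacent string literals, which turns a forgotten comma into a silently merged list entry rather than a syntax error:

```python
# Adjacent string literals in Python are implicitly concatenated, so a
# missing comma fuses two list entries into one instead of raising an error.
ops_with_comma = ["aten::add", "aten::mul"]
ops_missing_comma = ["aten::add" "aten::mul"]  # comma omitted between literals

print(len(ops_with_comma))     # 2
print(len(ops_missing_comma))  # 1 -- the two strings fused into one entry
print(ops_missing_comma[0])    # aten::addaten::mul
```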
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D33713185
Pulled By: davidberard98
fbshipit-source-id: a2458629d78202713a5bb2f8c720ff9b81939c31
(cherry picked from commit b077598f1d)
Summary:
The block and thread extent calculations in `cuda_codegen` should be using `int64_t` instead of `int`. The updated test, `test_dynamic_shapes`, fails without this change.
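The failure mode is ordinary 32-bit wraparound on the extent product. Since Python ints never overflow, the truncation is modeled explicitly below (the real computation is C++ in `cuda_codegen`):

```python
# Multiplying large block/thread extents overflows a 32-bit int; int64_t holds
# the product comfortably.
def to_int32(x):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x >= (1 << 31) else x

extent_a, extent_b = 1 << 20, 1 << 12  # product is 2**32, just past int32 range
product = extent_a * extent_b
print(to_int32(product))  # 0 -- silently wrapped around
print(product)            # 4294967296 -- fits in int64_t
```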
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71428
Reviewed By: samdow
Differential Revision: D33640374
Pulled By: navahgar
fbshipit-source-id: 64c340ad2a9a1fa1fe066cf1c5dfc3b546b7be6d
(cherry picked from commit 6ea546ce11)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70464
Add handling of strided input tensors to dynamic fusion. This is done with the same set of input striding specializations as https://github.com/pytorch/pytorch/pull/60684/:
```
S_ONE, // STRIDE_ONE: packed
S_CONT, // STRIDE_CONTIGUOUS: stride[i + 1] * sizes[i + 1]
S_TRAN_CONT, // STRIDE_TRANSPOSED_CONTIGUOUS: stride[i-1] * sizes[i-1]
S_AS_ARG, // STRIDE_AS_ARG: stride passed in as runtime value
```
and then two additional specializations for (a) a contiguous tensor and (b) a channels-last tensor. Channels-last is a common case and we should optimize for it. Additionally, tensors natively store whether they are contiguous/channels-last contiguous, which makes it faster to check whether tensors follow this pattern.
Output striding will be done in a follow up.
The striding is stored on both the TensorGroup node and on the guard node. The striding descriptors are stored as a vector of strings on the node for debugability and to make use of storing ivalues as attributes on nodes.
As an example:
```
%8 : Double(10, 11, 12, 13, strides=[1716, 1, 143, 11], requires_grad=0, device=cpu) = prim::TensorExprGroup_0[symbolic_shape_inputs=[-37, -36, -35, -34], striding_inputs_desc=[["TENSOR_CONT_CHANNELS_LAST"]]](%x, %24, %23, %22, %21)
```
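For reference, the strides in the example above are exactly the channels-last strides for a `10x11x12x13` NCHW tensor; both layouts can be computed by hand (helper functions are illustrative only):

```python
# Contiguous vs channels-last strides for an NCHW tensor, computed by hand.
def contiguous_strides(sizes):
    """Row-major strides: innermost dim has stride 1."""
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides

def channels_last_strides(sizes):
    """NHWC memory order for NCHW sizes: the channel dim becomes innermost."""
    n, c, h, w = sizes
    return [c * h * w, 1, w * c, c]

print(contiguous_strides([10, 11, 12, 13]))     # [1716, 156, 13, 1]
print(channels_last_strides([10, 11, 12, 13]))  # [1716, 1, 143, 11]
```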
Test Plan: Imported from OSS
Reviewed By: navahgar
Differential Revision: D33458649
Pulled By: eellison
fbshipit-source-id: c42616d3c683d70f6258180d23d3841a31a6030d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70463
Fix for https://github.com/pytorch/pytorch/issues/52940
When we call inlining on a fallback function, insert the runtime optimized version of its graph.
Test Plan: Imported from OSS
Reviewed By: jbschlosser, davidberard98
Differential Revision: D33458651
Pulled By: eellison
fbshipit-source-id: fd7e5e2b5273a1677014ba1a766538c3ee9cad76
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70410
Trying again after #70174 was reverted. Earlier the env variable was read into a static variable in C++, causing state to be retained across reads and causing test failures. The static qualifier is removed in this PR.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D33321435
fbshipit-source-id: 6d108eb00cac9150a142ccc3c9a65a1867dd7de4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67368
This PR adds an additional test variant for the tensor conversion
functions (bfloat16, char, long, ...) that tests channels_last. This is
because some backends (mostly just functorch right now) don't have
channels last handling and may want to test that separately from the
more general case of these operations.
Test Plan: - wait for tests
Reviewed By: mruberry
Differential Revision: D31972959
Pulled By: zou3519
fbshipit-source-id: 68fea46908b2cdfeb0607908898bb8f9ef25b264
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66990
NNC fusion groups currently show up as "TensorExpr" in the profiler,
which is true but not super useful since it obscures what's actually happening
in the fusion group. This change will log them as `fused_XXX` where XXX is a
(length-limited) series of ops describing the subgraph, for instance
`fused_mul_add` to represent a group containing `aten::mul`, `aten::add`.
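The naming scheme can be sketched in a few lines of Python (the `max_ops` cutoff is a hypothetical stand-in for the length limit mentioned above):

```python
# Derive a "fused_XXX" profiler label from the ops inside a fusion group,
# stripping the "aten::" namespace and truncating to keep the name short.
def fusion_group_name(ops, max_ops=3):
    short = [op.split("::")[-1] for op in ops[:max_ops]]
    return "fused_" + "_".join(short)

print(fusion_group_name(["aten::mul", "aten::add"]))  # fused_mul_add
print(fusion_group_name(["aten::mul", "aten::add", "aten::relu", "aten::tanh"]))
# fused_mul_add_relu -- truncated at max_ops
```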
Test Plan: New unit test to check the output of autograd profiler.
Reviewed By: dzhulgakov
Differential Revision: D31762087
fbshipit-source-id: 3fadbdc67b054faa01aa42e5b6ea2c4a6bc3481f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64282
OpInfos for:
- Tensor.bfloat16, Tensor.bool, Tensor.byte, Tensor.char
- Tensor.double, Tensor.float, Tensor.half, Tensor.int
- Tensor.short, Tensor.long
None of these are supported by TorchScript. Also, the OpInfo autograd
test runner assumes that the operation is not allowed to change the
dtype of the argument, so only Tensor.double has
`supports_autograd=True` (in theory Tensor.bfloat16, Tensor.float,
Tensor.half should be differentiable).
Test Plan: - run tests
Reviewed By: dagitses
Differential Revision: D31452627
Pulled By: zou3519
fbshipit-source-id: b7f272e558558412c47aefe947af7f060dfb45c5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65014
ghstack-source-id: 138656948
Test Plan:
```
(pytorch) [maxren@devvm3115.atn0 ~/pytorch] python3 test/test_jit.py TestPeephole
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
........s......................
----------------------------------------------------------------------
Ran 31 tests in 0.393s
OK (skipped=1)
(pytorch) [maxren@devvm3115.atn0 ~/pytorch] python3 test/test_jit.py TestPeephole.test_normalized_rsub
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
.
----------------------------------------------------------------------
Ran 1 test in 0.015s
OK
```
Reviewed By: eellison
Differential Revision: D30941389
fbshipit-source-id: 03f0416d99090845c9bfb1e5fcf771d5f1d7a050
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64589
Adding softplus operator lowering for NNC. Enabling element wise fusion as well.
Test Plan: Added a test in test_jit_fuser.py
Reviewed By: bertmaher
Differential Revision: D30736449
fbshipit-source-id: 6c5fc3bceb5cef2322ecd4449f827e4af018ea93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63516
How to review: pretty much just check that the inputs generated are a good representation of the op semantics; that should be sufficient for correctness. As a bonus, you can also double-check the op size semantics by going to https://codebrowser.bddppq.com/pytorch/pytorch/, typing in native::{op_name}, and looking at the op implementation.
Test Plan: Imported from OSS
Reviewed By: driazati
Differential Revision: D30738143
Pulled By: eellison
fbshipit-source-id: c7cd01cb2c8a13cb2664415f3d98aedec19a8e07
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63776
I reverted this out of an abundance of caution because some test
failures occurred, but they were all due to precision issues fixed lower in
this stack. Let's try again.
I've rolled the elimination of the allow-parallelism-in-fusions toggle into
this diff since they're pretty tightly coupled.
ghstack-source-id: 136529847
Test Plan: CI
Reviewed By: huiguoo
Differential Revision: D30484555
fbshipit-source-id: 38fd33520f710585d1130c365a8c60c9ce794a59
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63775
These introduce small accuracy differences that cause some internal
tests to fail, and it's not worth fixing the tests right now because they're
slower than the ATen ops anyways.
ghstack-source-id: 136526229
Test Plan:
```
buck test mode/dev //aml/eccv/mcm/training:tests -- --exact 'aml/eccv/mcm/training:tests - test_build_torch_script_model (aml.eccv.mcm.training.tests.publish_helper_tests.TransformerPredictorPublishHelperTests)'
```
Reviewed By: navahgar
Differential Revision: D30484557
fbshipit-source-id: 095a9c810539a499105b76e1d96843dbc61b0079
Summary:
As proof of concept, this PR uses the new `BinaryUfuncOpInfo` in broadcasting tests for `add`, `sub`, `mul`, `div`, `floor_div`, and `true_div`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61964
Reviewed By: ngimel
Differential Revision: D30407734
Pulled By: mruberry
fbshipit-source-id: ada28994f43b0635f279f45a02ecba18bc8ee033
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57334
Here's a possibly controversial PR. These counters got in the way of
generalizing the fuser tests to handle arbitrary devices, and I guess I'm just
generally skeptical that they provide much value. While true that they let us
observe whether fusion groups were created, we already have assertions based on
the shape of the graph, and I'm not sure that I trust those any less than these
counters.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D29471484
Pulled By: bertmaher
fbshipit-source-id: f6d76f6e72dbfb581acff1d834b0c74500941b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60510
We encountered a situation where loop unrolling caused us to duplicate
profiled tensor types in a manner that wasn't logically consistent (see the
attached test case). When applying this profiling information, we need to
merge the profiled types so that we use a conservative (unspecialized) type.
ghstack-source-id: 132160002
Test Plan: new unit test, plus local predictor using P424983338
Reviewed By: Krovatkin
Differential Revision: D29322487
fbshipit-source-id: 4c18ee69c71bb0622c2e6f6aa361ab5613cbaca4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59347
We had external call wrappers for them, but they were not used in NNC.
This PR adds lowerings using these ext calls and fixes some bugs in
them.
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D28853832
Pulled By: ZolotukhinM
fbshipit-source-id: 1718400368e1a9cf3f19180ee2290a4ed9c99d41
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60043
And add a unit test
Test Plan: new unit test
Reviewed By: navahgar
Differential Revision: D29146547
fbshipit-source-id: 31532926032dbef70d163930f3d8be160f5eacc3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59508
An assert that was triggering in a previous version is now relaxed to
take 0-dim tensors into account.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28918342
Pulled By: ZolotukhinM
fbshipit-source-id: c09b62c9725d1603b0ec11fcc051e7c932af06ae
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59430
With constant support added, we can now have fusion groups with only
scalar inputs. So, we need to get the device type from the nodes in the graph
rather than just the inputs.
ghstack-source-id: 130613871
Test Plan: new unit test; also see test_tracer test_trace_of_script
Reviewed By: navahgar
Differential Revision: D28891989
fbshipit-source-id: f9e824acbd4856216b85a135c8cb60a2eac3c628
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987
Based off of ezyang (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh (https://github.com/pytorch/pytorch/pull/43702) 's prototype:
Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make conjugate operation a view and leverage the specialized library functions that fast path with the hermitian operation (conj + transpose).
1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API:
a) `.conj()` -- now returning a view.
b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
c) `.conj_physical_()`, and `out=` variant
d) `.resolve_conj()` -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
e) `.resolve_conj_()` in-place version of (d)
f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
b) This fallback is well equipped to handle the following cases:
- functional operation e.g., `torch.sin(input)`
- Mutable inputs and in-place operations e.g., `tensor.add_(2)`
- out-of-place operation e.g., `torch.sin(input, out=out)`
- Tensorlist input args
- NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
a) `resolve_conj()` is an identity function w.r.t. autograd
b) Everything else works as expected.
5. Testing:
a) All method_tests run with conjugate view tensors.
b) OpInfo tests that run with conjugate views
- test_variant_consistency_eager/jit
- gradcheck, gradgradcheck
- test_conj_views (that only run for `torch.cfloat` dtype)
NOTE: functions like `empty_like`, `zero_like`, `randn_like`, `clone` don't propagate the conjugate bit.
Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.
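A toy model of the conj-bit mechanics described in (1), (2), and (4) (purely illustrative Python; not the dispatcher implementation):

```python
# A lazily-conjugated "tensor": .conj() is an O(1) view that flips a bit on
# shared storage, and resolve_conj() materializes the values only when needed.
class LazyConjTensor:
    def __init__(self, data, conj_bit=False):
        self.data = data          # shared storage: a list of complex numbers
        self.conj_bit = conj_bit

    def conj(self):
        return LazyConjTensor(self.data, not self.conj_bit)  # view, no copy

    def is_conj(self):
        return self.conj_bit

    def resolve_conj(self):
        if not self.conj_bit:
            return self           # bit unset: return self, as in (d) above
        return LazyConjTensor([z.conjugate() for z in self.data])

t = LazyConjTensor([1 + 2j, 3 - 4j])
v = t.conj()
print(v.is_conj())            # True
print(v.data is t.data)       # True -- still just a view on shared storage
print(v.resolve_conj().data)  # [(1-2j), (3+4j)]
```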
Test Plan: Imported from OSS
Reviewed By: VitalyFedyunin
Differential Revision: D28227315
Pulled By: anjali411
fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59279
There were some issues with how we handle 0-dim cases in lowerings and
also in how we generate reductions in that special case. This PR fixes
those issues and reenables a bunch of tests.
Differential Revision: D28819780
Test Plan: Imported from OSS
Reviewed By: navahgar
Pulled By: ZolotukhinM
fbshipit-source-id: f3feff35a1ce11821ada2f8d04ae9d4be10dc736
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59157
Currently view is represented as a copy since we don't support inplace
operations in NNC (similar to `aten::reshape`). Lowering for
`aten::expand_as` is exactly the same as for the `aten::expand`, since
we're building the TE expression basing on the output shape anyway.
Differential Revision: D28774224
Test Plan: Imported from OSS
Reviewed By: Chillee
Pulled By: ZolotukhinM
fbshipit-source-id: 0a1593c4c6500dcc5a374213adb734180ae1f72e
Summary:
The triangular_solve lowering now only returns the first output, since the second output is just a copy of the input. Why does that exist?
Also, I fixed the permute lowering - I was previously doing the inverse application of the permute.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59131
Reviewed By: ansley
Differential Revision: D28768169
Pulled By: Chillee
fbshipit-source-id: 8e78611c6145fb2257cb409ba98c14ac55cdbccf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58974
I don't know how we overlooked this for so long...
ghstack-source-id: 129932134
Test Plan:
Predictor test of model 184778294_0 using multiple request replay
threads. It's not clear to me why multithreading matters, except that perhaps
it makes it easier to get an unknown shape in the profile.
Reviewed By: navahgar
Differential Revision: D28702660
fbshipit-source-id: 565550b1d2e571d62d0c8b21150193f2a7ace334
Summary:
This gets rid of a lot of the try/else rigamarole.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58788
Reviewed By: ZolotukhinM
Differential Revision: D28621054
Pulled By: Chillee
fbshipit-source-id: d0d8a1b6466eb318d939a1ed172b78f492ee0d5b
Summary:
Finds a couple of bugs:
1. permute needs to wrap dimensions
2. slice needs to wrap dimensions
3. frac doesn't work correctly for negative values
4. Permute has some other failures.
This PR also fixes 1 + 2.
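For reference, the expected `frac` semantics keep the sign of the input, i.e. the fractional part is defined via truncation (a plain-Python model of bug 3 above):

```python
# frac(x) = x - trunc(x): the fractional part carries the sign of the input,
# so frac(-1.5) must be -0.5, not 0.5.
import math

def frac(x):
    return x - math.trunc(x)

print(frac(1.5))   # 0.5
print(frac(-1.5))  # -0.5
print(frac(-0.25)) # -0.25
```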
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58719
Reviewed By: SplitInfinity
Differential Revision: D28590457
Pulled By: Chillee
fbshipit-source-id: a67fce67799602f9396bfeef615e652364918fbd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58346
If `dim` is a variable, NNC doesn't know how to translate the result,
since the shape is unknown. This issue manifested as a `bad_variant_access`
when we try to pull an int constant out of that arg.
Note that, while the PE will pick up the resultant shape, it won't set guards accordingly.
ghstack-source-id: 129078971
Test Plan: new fuser test
Reviewed By: navahgar
Differential Revision: D28460956
fbshipit-source-id: 57ef918ef309ee57bfdf86717b910b6549750454
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58256
Size-1 dims mess up our output restriding logic, because they're
technically "dense" no matter what stride the dimension has. In this example a
size-1 dim has stride 1, which causes all the indices to be taken mod 1 (i.e.,
all indices become 0). We work around this peculiar case by skipping size-1 in
our layout logic, since it has no impact on the rest of the tensor's indexing.
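Why size-1 dims are harmless to skip falls out of the flat-offset arithmetic (illustrative helper): the index along such a dimension is always 0, so its stride never reaches an address.

```python
# A size-1 dimension contributes index 0 to every flat offset, so whatever
# stride it carries never changes the addressing.
def flat_offset(index, strides):
    return sum(i * s for i, s in zip(index, strides))

results = []
for size1_stride in (1, 7, 100):        # arbitrary strides on the size-1 dim
    strides = (3, size1_stride, 1)      # logical sizes are (2, 1, 3)
    offsets = [flat_offset((i, 0, k), strides)
               for i in range(2) for k in range(3)]
    results.append(offsets)

print(results[0])                              # [0, 1, 2, 3, 4, 5]
print(results[0] == results[1] == results[2])  # True -- stride is irrelevant
```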
ghstack-source-id: 128932739
Test Plan:
new unit test, plus
```
buck test mode/dev //langtech/mobile/audio_stream_processor:audio_stream_processor_test -- --exact 'langtech/mobile/audio_stream_processor:audio_stream_processor_test - AudioStreamProcessorTest.DemucsReadWriteFloat'
```
Reviewed By: eellison
Differential Revision: D28424388
fbshipit-source-id: e33e39eef2a5bf2797bee78a5987558308b6d110
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57749
Also adds an FX test.
Test Plan: Imported from OSS
Reviewed By: huiguoo
Differential Revision: D28425974
fbshipit-source-id: 195c7a1944decb7a2a99c2831cab38485f32be17
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58207
We probably don't even know what these tests check and there are no
plans on re-enabling them - let's just nuke them to keep the code clean.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28403251
Pulled By: ZolotukhinM
fbshipit-source-id: fe12e978636a74f309f57e3408ab78d459fe4d29
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58206
Tested on CUDA with and without `PYTORCH_TENSOREXPR_DONT_USE_LLVM=1`.
Closes #48053.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D28403250
Pulled By: ZolotukhinM
fbshipit-source-id: 1ae1cfed691e0077a37db646937e580fbd32b23f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58028
We were trying to translate the device argument and thus throwing an
unsupported dtype.
ghstack-source-id: 128748658
Test Plan: predictor models
Reviewed By: navahgar
Differential Revision: D28347704
fbshipit-source-id: 331a5786339e01f9df1b1878970b0c5983a92980
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57798
Our instruction sequence was just plain wrong, instead of `fcmp une %x, +0.0`
(unordered equal 0.0) we were doing `fcmp uno`, which is just an unordered check
(i.e., is either side NaN).
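The difference between the two predicates, modeled in Python: `uno` is true iff either operand is NaN (unordered), while `une` is true iff the operands are unordered *or* unequal.

```python
# LLVM fcmp predicate semantics: "uno" asks only "is either side NaN?",
# whereas "une" is the NaN-tolerant not-equal test we actually wanted.
import math

def fcmp_uno(a, b):
    return math.isnan(a) or math.isnan(b)

def fcmp_une(a, b):
    return fcmp_uno(a, b) or a != b

x = 2.5
print(fcmp_uno(x, 0.0))             # False -- wrong answer for an "x != 0" test
print(fcmp_une(x, 0.0))             # True  -- the intended comparison
print(fcmp_une(float("nan"), 0.0))  # True  -- NaN compares unequal to anything
```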
ghstack-source-id: 128586464
Test Plan: New unit test against the full cross-product of dtypes.
Reviewed By: navahgar
Differential Revision: D28276269
fbshipit-source-id: ba5e59778e07770fb78ef02309f10edde333a800
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57383
Notes: I picked up an activation from https://github.com/pytorch/pytorch/issues/56969. You can look at the [activations.cpp](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/cpu/Activation.cpp#L429) file which has both forward and backward kernel code to help you write the NNC lowering and the symbolic gradient.
I added a test in test_jit_fuser_te for the fusion, and I added an OpInfo and asserted that we expect to see autodiffable nodes to test the symbolic gradient.
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D28197820
Pulled By: eellison
fbshipit-source-id: 05305d85c5bb0847c8f911b95ba47b137dca7e90
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56308
But only for float tensors. Even on CUDA, int tensors just have weird
behavior with pow, and I bet FP is so much more common that it's just not worth
trying to fuse ints here.
ghstack-source-id: 126769637
Test Plan: `pytest test_jit_fuser_te.py -k test_binary_pow`
Reviewed By: navahgar
Differential Revision: D27834694
fbshipit-source-id: 7274d72cf02ab95d63574b6c17995b8f34560810
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54605
For small sizes we generate a naive 3-layer loopnest, for bigger sizes
we generate an external call.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D27298364
Pulled By: ZolotukhinM
fbshipit-source-id: 2ddf275ff68d6fca16a3befca5ce5c26aef462b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56120
This reverts commit ad17fadbfc (D27786457).
The big annoyance here is that depending on the threading mode you may not be
able to toggle num_threads at will, so the fusion tests won't fail.
I hate this solution, but I'm adding a secondary override for the TE fuser.
Now you need to both turn on fusion (_jit_override_can_fuse_on_cpu), and you're
OK if you're running with 1 thread, or you can add
`_jit_set_texpr_parallel_cpu_enabled` to enable it anyways.
This is (a) mainly for tests, since a real user probably won't fiddle aimlessly
with the thread count, and (b) will go away once NNC's threading support is
fully baked.
Test Plan: Imported from OSS
Reviewed By: Krovatkin
Differential Revision: D27788199
Pulled By: bertmaher
fbshipit-source-id: 070d04474f15e9689dbdf8cc1fde43050c6506b1