Commit Graph

76 Commits

Bert Maher
6da26fe79b [te] Fix pow (#48213)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48213

Pow was completely broken unless the rhs was a constant.
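
For reference, a minimal repro sketch (hypothetical, mirroring the kind of case the new unit test covers): pow with a non-constant rhs, checked against eager.

```python
import torch

# Hypothetical repro sketch: pow where the exponent is a runtime tensor,
# not a constant -- the case that was broken.
def f(a, b):
    return torch.pow(a, b)

scripted = torch.jit.script(f)
a, b = torch.rand(8), torch.rand(8)
scripted(a, b)  # warm-up run so the TE fuser can kick in
assert torch.allclose(scripted(a, b), f(a, b))
```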

Test Plan: new unit test in test_jit_fuser_te.py

Reviewed By: eellison

Differential Revision: D25071639

fbshipit-source-id: ef1010a9fd551db646b83adfaa961648a5c388ae
2020-11-18 22:44:16 -08:00
Bert Maher
736deefc1f [torch][te] aten::type_as is unary, not binary (#48085)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48085

We were treating it as a binary operator, which implies shape
broadcasting, even though the second arg is used only for its type and
is otherwise thrown away. Treating it as a unary op is the proper approach.
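
A small eager-mode illustration of why: only the second argument's dtype matters, never its shape.

```python
import torch

a = torch.rand(4, 4)
b = torch.zeros(7, dtype=torch.double)  # shape deliberately incompatible
c = a.type_as(b)
# No broadcasting against b's shape -- only its dtype is used.
assert c.dtype == torch.double and c.shape == a.shape
```
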
ghstack-source-id: 116873680

Test Plan: new unit test

Reviewed By: ZolotukhinM

Differential Revision: D25017585

fbshipit-source-id: 0cfa89683c9bfd4fbb132617c74b47b268d7f368
2020-11-17 12:17:19 -08:00
Bert Maher
bbee0ecbd1 [pytorch][te] Handle negative axis in chunk (#48084)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48084

As per the title.
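
For reference, the eager semantics the lowering must match: a negative axis counts back from the last dimension.

```python
import torch

x = torch.rand(2, 6)
# dim=-1 normalizes to dim=1 for a 2-D tensor.
chunks = torch.chunk(x, 3, dim=-1)
assert len(chunks) == 3 and all(c.shape == (2, 2) for c in chunks)
```
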
ghstack-source-id: 116870328

Test Plan: new unit test

Reviewed By: Krovatkin

Differential Revision: D25017489

fbshipit-source-id: 0d1998fccad6f509db04b6c67a4e4e4093d96751
2020-11-17 12:12:49 -08:00
kshitij12345
3649a2c170 [numpy] torch.sqrt : promote integer inputs to float (#47293)
Summary:
Reference https://github.com/pytorch/pytorch/issues/42515
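
The NumPy-style behaviour being implemented, sketched in eager mode:

```python
import torch

x = torch.tensor([1, 4, 9])  # integer input
y = torch.sqrt(x)            # promoted to the default float dtype
assert y.dtype == torch.get_default_dtype()
assert torch.equal(y, torch.tensor([1.0, 2.0, 3.0]))
```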

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47293

Reviewed By: malfet

Differential Revision: D24855994

Pulled By: mruberry

fbshipit-source-id: 1e6752f2eeba6d638dea0bdea0c650cf722718c9
2020-11-12 16:16:09 -08:00
Nick Gibson
b1a4170ab3 [NNC] Fix lowering of aten::pow (#47795)
Summary:
The NNC lowering of aten::pow assumes that the type of the exponent is either float or int cast to float, which doesn't work well with double (or half, for that matter).

Fixes https://github.com/pytorch/pytorch/issues/47304
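
A sketch of the eager behaviour the fixed lowering has to match for wider dtypes:

```python
import torch

base = torch.rand(4, dtype=torch.double)
exp = torch.rand(4, dtype=torch.double)
# The double dtype must survive the lowering of the exponent.
assert torch.pow(base, exp).dtype == torch.double
```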

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47795

Reviewed By: ZolotukhinM

Differential Revision: D24904201

Pulled By: nickgg

fbshipit-source-id: 43c3ea704399ebb36c33cd222db16c60e5b7ada5
2020-11-12 12:33:07 -08:00
Elias Ellison
664d2f48cf [NNC] Enable unary op cpu testing (#47374)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47374

A few small fixes were needed to enable unary op CPU testing. If reviewers would prefer that I split them up, let me know.

Test Plan: Imported from OSS

Reviewed By: ansley

Differential Revision: D24805248

Pulled By: eellison

fbshipit-source-id: c2cfe2e3319a633e64da3366e68f5bf21d390cb7
2020-11-12 11:14:03 -08:00
Elias Ellison
f221a19a7f Force LLVM Compilation for CPU Tests (#46949)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46949

Test Plan: Imported from OSS

Reviewed By: ansley

Differential Revision: D24805247

Pulled By: eellison

fbshipit-source-id: 4fcaf02d8a78cc5cbcbde36940d0a2c85fba3fc5
2020-11-12 11:12:08 -08:00
Nick Gibson
f2eac5df18 [NNC] Fix lowering of aten::remainder (#47611)
Summary:
Fix an issue with the TensorExpr lowering of aten::remainder with integral inputs. We were always lowering to fmod and never to Mod.
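
The distinction matters for negative integral inputs; in eager mode:

```python
import torch

a = torch.tensor([-7, 7])
b = torch.tensor([3, -3])
# remainder follows Python %: the result takes the sign of the divisor.
assert torch.equal(torch.remainder(a, b), torch.tensor([2, -2]))
# fmod follows C fmod: the result takes the sign of the dividend.
assert torch.equal(torch.fmod(a, b), torch.tensor([-1, 1]))
```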

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47611

Reviewed By: bertmaher, heitorschueroff

Differential Revision: D24846929

Pulled By: nickgg

fbshipit-source-id: adac4322ced5761a11a8e914debc9abe09cf5637
2020-11-09 21:45:42 -08:00
Raghavan Raman
8eb228a7f3 Add support for log_softmax (#47409)
Summary:
This diff adds support for the `log_softmax` op in NNC.
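
The op's defining relation, in eager mode:

```python
import torch
import torch.nn.functional as F

x = torch.rand(2, 3)
# log_softmax is log(softmax(x)), computed in a numerically stabler form.
assert torch.allclose(F.log_softmax(x, dim=1),
                      torch.log(F.softmax(x, dim=1)))
```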

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47409

Reviewed By: ejguan

Differential Revision: D24750203

Pulled By: navahgar

fbshipit-source-id: c4dacc7f62f9df65ae467f0d578ea03d3698273d
2020-11-06 13:29:27 -08:00
Raghavan Raman
2caa3bd453 Inlining all non-output buffers, including intermediate buffers. (#47258)
Summary:
This diff enables inlining for all non-output buffers, including the intermediate buffers that are created as part of an op. However, the buffers that correspond to reductions will not be inlined.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47258

Reviewed By: anjali411

Differential Revision: D24707015

Pulled By: navahgar

fbshipit-source-id: ad8b03e38497600cd69980424db6d586bf93db74
2020-11-03 17:00:32 -08:00
Mikhail Zolotukhin
9b168a1fed [TensorExpr] Pick meaningful names for functions in TE codegen. (#47255)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47255

As a result of this change, the generated CUDA code for the following fusion group:
```
graph(%0 : Float(32, 32, 1, 1, strides=[32, 1, 1, 1], requires_grad=0, device=cuda:0),
      %1 : Float(32, 32, strides=[32, 1], requires_grad=0, device=cuda:0),
      %2 : Float(32, 32, 1, strides=[32, 1, 1], requires_grad=0, device=cuda:0)):
  %3 : int = prim::Constant[value=1]()
  %v1.1 : Float(32, 32, 32, strides=[1024, 32, 1], requires_grad=0, device=cuda:0) = aten::add(%1, %2, %3) # test/test_tensorexpr.py:155:0
  %5 : int = prim::Constant[value=1]()
  %6 : Float(32, 32, 32, 32, strides=[32768, 1024, 32, 1], requires_grad=0, device=cuda:0) = aten::add(%v1.1, %0, %5) # test/test_tensorexpr.py:156:0
  return (%6)
```

would look like the following:
```
extern "C" __global__
void fused_add_add(float* t0, float* t1, float* t2, float* aten_add) {
{
  float v = __ldg(t1 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32) + (512 * blockIdx.x + threadIdx.x) % 32);
  float v_1 = __ldg(t2 + ((512 * blockIdx.x + threadIdx.x) / 32) % 32 + 32 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32));
  float v_2 = __ldg(t0 + ((512 * blockIdx.x + threadIdx.x) / 1024) % 32 + 32 * ((512 * blockIdx.x + threadIdx.x) / 32768));
  aten_add[((((512 * blockIdx.x + threadIdx.x) / 32768) * 32768 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32)) + 1024 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32)) + (512 * blockIdx.x + threadIdx.x) % 32] = (v + v_1) + v_2;
}
}
```

Previously we generated:
```
extern "C" __global__
void func(float* t0, float* t1, float* t2, float* aten_add) {
{
  float v = __ldg(t1 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32) + (512 * blockIdx.x + threadIdx.x) % 32);
  float v_1 = __ldg(t2 + ((512 * blockIdx.x + threadIdx.x) / 32) % 32 + 32 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32));
  float v_2 = __ldg(t0 + ((512 * blockIdx.x + threadIdx.x) / 1024) % 32 + 32 * ((512 * blockIdx.x + threadIdx.x) / 32768));
  aten_add[((((512 * blockIdx.x + threadIdx.x) / 32768) * 32768 + 32 * (((512 * blockIdx.x + threadIdx.x) / 32) % 32)) + 1024 * (((512 * blockIdx.x + threadIdx.x) / 1024) % 32)) + (512 * blockIdx.x + threadIdx.x) % 32] = (v + v_1) + v_2;
}
}
```

Differential Revision: D24698273

Test Plan: Imported from OSS

Reviewed By: bertmaher

Pulled By: ZolotukhinM

fbshipit-source-id: 6da95c6ac3d5155ebfaaab4f84f55a24deb6d10d
2020-11-03 16:41:22 -08:00
kshitij12345
c424d9389e [numpy] torch.a{cos, tan} : promote integer inputs to float (#47005)
Summary:
Reference https://github.com/pytorch/pytorch/issues/42515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47005

Reviewed By: mrshenli

Differential Revision: D24681097

Pulled By: mruberry

fbshipit-source-id: 2f29655a5f3871ee96c2bfd35c93f4d721730e37
2020-11-03 13:00:24 -08:00
kshitij12345
0d00724e36 [numpy] torch.{a}tanh : promote integer inputs to float (#47064)
Summary:
Reference https://github.com/pytorch/pytorch/issues/42515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47064

Reviewed By: mrshenli

Differential Revision: D24681107

Pulled By: mruberry

fbshipit-source-id: 1818206c854dbce7074363bf6f1949daa7bf6052
2020-11-03 12:56:58 -08:00
Raghavan Raman
f58842c214 Enable inlining into reductions (#47020)
Summary:
This diff enables inlining producers into reductions. It also guards against inlining reductions themselves.

Prior to this diff, no inlining happened at all if there was a reduction in the loop nest. After this change, we inline all non-output buffers that do not correspond to a reduction.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47020

Reviewed By: albanD

Differential Revision: D24644346

Pulled By: navahgar

fbshipit-source-id: ad234a6877b65be2457b734cbb7f3a1800baa6a5
2020-11-02 15:33:38 -08:00
kshitij12345
21e60643c0 [numpy] torch.log{2,10} : promote integer inputs to float (#46810)
Summary:
References https://github.com/pytorch/pytorch/issues/42515

cc: mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46810

Reviewed By: izdeby

Differential Revision: D24536187

Pulled By: mruberry

fbshipit-source-id: b7dd7678d4e996f3dea0245c65055654e02be459
2020-10-27 13:07:44 -07:00
Raghavan Raman
4b6e307191 Replace flatten tensors with flatten loops. (#46737)
Summary:
This is the second attempt at replacing flattened tensors with flattened loops in `TensorExprKernel::generateStmt`. The first attempt (https://github.com/pytorch/pytorch/pull/46539) resulted in a build failure due to an exception thrown during inlining.

The build failure happened because there was an inline step that was supposed to operate on the unflattened tensors. That step was necessary earlier because for every flattened tensor there was an unflattened tensor that had to be inlined. It is no longer necessary since we no longer have two tensors (flattened and unflattened), so this inline step has been removed.

Checked Python and C++ tests on CPU as well as CUDA.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46737

Reviewed By: anjali411, izdeby

Differential Revision: D24534529

Pulled By: navahgar

fbshipit-source-id: 8b131a6be076fe94ed369550d9f54d3879fdfefd
2020-10-27 00:01:20 -07:00
Bert Maher
c4892c8efe [pytorch][tensorexpr] Promote integer arguments to sin/cos/tan to float (#46776)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46776

Following NumPy and (now) eager mode.

Fixes #46458

Test Plan: test_jit_fuser_te

Reviewed By: navahgar

Differential Revision: D24509884

fbshipit-source-id: c063030fc609ba4aefcd9abd25b50f082fef1548
2020-10-23 17:32:54 -07:00
Ashkan Aliabadi
4f5b55f722 Revert D24395956: [pytorch][PR] Replace flatten tensors with flatten loops.
Test Plan: revert-hammer

Differential Revision:
D24395956 (2f51ddb81f)

Original commit changeset: f3792903f206

fbshipit-source-id: ef70713f0f67f577b09674219631d22440ceec31
2020-10-20 15:42:23 -07:00
Raghavan Raman
2f51ddb81f Replace flatten tensors with flatten loops. (#46539)
Summary:
This diff changes `TensorExprKernel::generateStmt` to use flattened loops instead of flattened tensors.

Checked all tests on CPU as well as CUDA.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46539

Reviewed By: nickgg

Differential Revision: D24395956

Pulled By: navahgar

fbshipit-source-id: f3792903f2069bda37b571c9f0a840e6fb02f189
2020-10-20 12:16:18 -07:00
Mikhail Zolotukhin
d6de9d573a [TensorExpr] Properly handle input types promotion and special case of empty inputs for aten::cat. (#46500)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46500
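
A sketch of the eager behaviour presumably being matched here: mixed input dtypes promote, and size-zero inputs contribute nothing.

```python
import torch

a = torch.rand(2, 2)                      # float32
b = torch.ones(2, 2, dtype=torch.double)  # float64
e = torch.empty(0, 2)                     # empty along the cat dim
out = torch.cat([a, b, e], dim=0)
assert out.dtype == torch.double and out.shape == (4, 2)
```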

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24373671

Pulled By: ZolotukhinM

fbshipit-source-id: b3be73a89a9ab6654212cb7094f32bf1c445e876
2020-10-16 20:26:46 -07:00
Mikhail Zolotukhin
0f668d95b6 [TensorExpr] Fix shape inference logic for aten::cat. (#46482)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46482
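
A minimal sketch (not the NNC code itself) of the inference rule for aten::cat: every output dim matches the inputs except the cat dim, which is the sum.

```python
def cat_shape(shapes, dim):
    # All inputs must match outside `dim`; the cat-dim sizes add up.
    out = list(shapes[0])
    out[dim] = sum(s[dim] for s in shapes)
    return tuple(out)

assert cat_shape([(2, 3), (4, 3)], dim=0) == (6, 3)
assert cat_shape([(2, 3), (2, 5)], dim=1) == (2, 8)
```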

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24366778

Pulled By: ZolotukhinM

fbshipit-source-id: 000ff363b11599ba3827cdf2db3d4793878b84ab
2020-10-16 20:24:30 -07:00
Mikhail Zolotukhin
4359c5e036 [TensorExpr] Correctly handle negative dimensions in aten::cat when lowering to tensor expressions. (#46446)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46446

Fixes #46440.
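
The normalization rule, in eager terms: a negative dim means `dim + ndim`.

```python
import torch

a, b = torch.rand(2, 3), torch.rand(2, 3)
# dim=-1 is equivalent to dim=1 for 2-D inputs.
assert torch.equal(torch.cat([a, b], dim=-1), torch.cat([a, b], dim=1))
```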

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24356016

Pulled By: ZolotukhinM

fbshipit-source-id: b759760bb8c765aeb128eb94d18af20cddd888a2
2020-10-16 01:13:14 -07:00
Raghavan Raman
a5c0dbc519 Add support for Softmax. (#45286)
Summary:
This PR adds support for Softmax in NNC.
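
A sketch of the standard numerically-stable formulation such a lowering typically uses (max-subtraction); this is an illustration, not NNC's exact code.

```python
import torch

def softmax(x, dim):
    # Subtracting the max keeps exp() from overflowing.
    m = x.max(dim=dim, keepdim=True).values
    e = (x - m).exp()
    return e / e.sum(dim=dim, keepdim=True)

x = torch.rand(2, 3)
assert torch.allclose(softmax(x, 1), torch.nn.functional.softmax(x, dim=1))
```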

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45286

Reviewed By: mrshenli

Differential Revision: D24042901

Pulled By: navahgar

fbshipit-source-id: 120bafe17586d3ecf0918f9aee852a7c3a8f4990
2020-10-08 23:57:02 -07:00
Elias Ellison
1197a38a63 [JIT] Bind log1p and lgamma (#45791)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45791

Most of the lowering for log1p and lgamma already existed; this adds the JIT integration.
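
Eager reference points for the two ops:

```python
import torch

x = torch.tensor([1e-8], dtype=torch.double)
# log1p stays accurate where log(1 + x) loses precision.
assert torch.allclose(torch.log1p(x), x, rtol=1e-6)
# lgamma(n) == log((n-1)!), e.g. lgamma(5) == log(24).
assert torch.isclose(torch.lgamma(torch.tensor(5.0)),
                     torch.tensor(24.0).log())
```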

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D24169536

Pulled By: eellison

fbshipit-source-id: a009c77a3471f3b5d378bad5de6d8e0880e9da3c
2020-10-08 12:06:34 -07:00
Mikhail Zolotukhin
4aca63d38a [TensorExpr] Change API for creating Load and Store expressions. (#45520)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45520

With this change `Load`s and `Store`s no longer accept `Placeholder`s in
their constructor and `::make` functions and can only be built with
`Buf`.
`Placeholder` gets its own `store`, `load`, `storeWithMask`, and
`loadWithMask` method for more convenient construction.

Test Plan: Imported from OSS

Reviewed By: glaringlee

Differential Revision: D23998789

Pulled By: ZolotukhinM

fbshipit-source-id: 3fe018e00c1529a563553b2b215f403b34aea912
2020-09-29 20:52:38 -07:00
Mikhail Zolotukhin
b86008ab75 [TensorExpr] Remove buf_ field from class Tensor. (#45390)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45390

Tensor objects should always refer to their Function's bufs. Currently
we never create a Tensor with a buffer different from that of its
function, but storing it in two places seems incorrect and dangerous.

Differential Revision: D23952865

Test Plan: Imported from OSS

Reviewed By: nickgg

Pulled By: ZolotukhinM

fbshipit-source-id: e63fc26d7078427514649d9ce973b74ea635a94a
2020-09-29 01:21:57 -07:00
Mikhail Zolotukhin
3c33695a6d [TensorExpr] Rename Buffer to Placeholder. (#45389)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45389

Differential Revision: D23952866

Test Plan: Imported from OSS

Reviewed By: nickgg

Pulled By: ZolotukhinM

fbshipit-source-id: 17eedd3ac17897501403482ac1866c569d247c75
2020-09-29 01:21:54 -07:00
Alex Suhan
76ee58e2ec [TensorExpr] Move inner loops vectorization logic to its own method (#45287)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45287

Test Plan: CI, build

Reviewed By: gmagogsfm

Differential Revision: D23913432

Pulled By: asuhan

fbshipit-source-id: 3bf8fe09753f349e3c857863a43d2b1fca5101c1
2020-09-25 02:29:36 -07:00
Alex Suhan
3dd0e362db [TensorExpr] Fix min and max for integral inputs in CUDA backend (#44984)
Summary:
For integral types, isnan is meaningless. Provide specializations for
maximum and minimum which don't call it.
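
In eager terms, the integral path is a plain comparison with no NaN handling:

```python
import torch

a = torch.tensor([3, -1], dtype=torch.int32)
b = torch.tensor([2, 5], dtype=torch.int32)
# No isnan check is needed (or possible) for integer dtypes.
assert torch.equal(torch.maximum(a, b),
                   torch.tensor([3, 5], dtype=torch.int32))
```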

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44984

Test Plan: python test/test_jit_fuser_te.py -k TestTEFuser.test_minmax_int_ops

Reviewed By: ezyang

Differential Revision: D23885259

Pulled By: asuhan

fbshipit-source-id: 2e6da2c43c0ed18f0b648a2383d510894c574437
2020-09-23 23:19:12 -07:00
Alex Suhan
0495998862 [TensorExpr] Disallow arithmetic binary operations on Bool (#44677)
Summary:
Arithmetic operations on Bool aren't fully supported in the evaluator.
Moreover, such semantics can be implemented by client code by inserting
explicit casts to widen and narrow to the desired types.
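
What "explicit casts" means in practice, sketched in eager mode:

```python
import torch

a = torch.tensor([True, False])
b = torch.tensor([True, True])
# Widen to int, do the arithmetic, then narrow back if Bool is wanted.
s = a.to(torch.int32) + b.to(torch.int32)
assert torch.equal(s, torch.tensor([2, 1], dtype=torch.int32))
assert torch.equal(s.to(torch.bool), torch.tensor([True, True]))
```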

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44677

Test Plan:
test_tensorexpr --gtest_filter=TensorExprTest.ExprDisallowBoolArithmetic
python test/test_jit_fuser_te.py

Reviewed By: agolynski

Differential Revision: D23801412

Pulled By: asuhan

fbshipit-source-id: fff5284e3a216655dbf5a9a64d1cb1efda271a36
2020-09-23 14:59:11 -07:00
Nick Gibson
69839ea3f6 [NNC] make inlining immediate (take 3) (#44231)
Summary:
This is a reup of https://github.com/pytorch/pytorch/issues/43885, with an extra commit that should fix the bugs that caused it to be reverted. Read that PR for general context.

The issue here was that we were still using the side maps `tensor_to_stmt_` and `stmt_to_tensor_` which get invalidated by any transform of the IR (rather than just any transform that isn't computeInline). I added a comment about this but didn't actually address our usages of it.

I've removed these maps and changed the `getLoopBodyFor` and `getLoopStatementsFor` helpers to search the root stmt directly.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44231

Reviewed By: albanD

Differential Revision: D23689688

Pulled By: nickgg

fbshipit-source-id: 1c6009a880f8c0cebf2300fd06b5cc9322bffbf9
2020-09-15 11:12:24 -07:00
Mikhail Zolotukhin
bd8e38cd88 [TensorExpr] Fuser: check node inputs' device before merging the node into a fusion group. (#44241)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44241

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D23554192

Pulled By: ZolotukhinM

fbshipit-source-id: fb03262520303152b83671603e08e7aecc24f5f2
2020-09-08 19:32:23 -07:00
Elias Ellison
5bd2902796 [JIT] Remove references to no longer generated _tanh_backward and _sigmoid_backward (#44138)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44138

If you look at the sigmoid and tanh backwards, they are composed of other ops:
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/runtime/symbolic_script.cpp#L786
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/runtime/symbolic_script.cpp#L164

So tanh_backward and sigmoid_backward are no longer generated; they are legacy ops.
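
The decompositions in question, checked in eager mode: sigmoid's backward is `grad * y * (1 - y)` and tanh's is `grad * (1 - y^2)`.

```python
import torch

x = torch.rand(4, requires_grad=True)
y = torch.sigmoid(x)
g, = torch.autograd.grad(y.sum(), x)
assert torch.allclose(g, y * (1 - y))   # sigmoid_backward

x2 = torch.rand(4, requires_grad=True)
y2 = torch.tanh(x2)
g2, = torch.autograd.grad(y2.sum(), x2)
assert torch.allclose(g2, 1 - y2 * y2)  # tanh_backward
```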

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D23543603

Pulled By: eellison

fbshipit-source-id: ce8353e53043cf969b536aac47c9576d66d4ce02
2020-09-05 01:41:36 -07:00
Mikhail Zolotukhin
6474057c76 Revert D23503636: [pytorch][PR] [NNC] make inlining immediate (take 2) and fix bugs
Test Plan: revert-hammer

Differential Revision:
D23503636 (70aecd2a7f)

Original commit changeset: cdbdc902b7a1

fbshipit-source-id: b5164835f874a56213de4bed9ad690164eae9230
2020-09-04 10:58:23 -07:00
Nick Gibson
70aecd2a7f [NNC] make inlining immediate (take 2) and fix bugs (#43885)
Summary:
A rework of `computeInline` that makes it work a bit better, particularly when combined with other transformations. Previously we stored Functions that were inlined and deferred the actual inlining of the function body until prepareForCodegen was called. This is a problem when transformations are applied to the LoopNest: the function body can differ from what appears in the root_stmt, and result in inlining that a) fails, b) reverses other transformations, or c) behaves as a weird, unpredictable combination of the two.

This PR changes that behaviour so that the inlining occurs in the root stmt immediately, which means it reflects any previous transformations, and any future transformations have a true view of the internal IR. It also has the benefit that inspecting the root statement gives an accurate view of it without needing to call prepareForCodegen. I also removed the difference between `computeInline` and `computeInlineWithRand`; we now handle calls to `rand()` in all branches.

This is a rework of https://github.com/pytorch/pytorch/issues/38696, with the agreed changes from ZolotukhinM and zheng-xq: we should only inline if the dimensions are trivial (i.e., they are vars, not exprs).

This PR is mostly tests, and I fixed a bunch of bugs I found along the way. Partial list:
* When inlining an expression involving rand, we would create random vars equal to the dimensionality of the enclosing Tensor, not the produced Tensor, meaning we'd use an incorrect value if the inlined tensor was smaller. E.g., `X[i] = rand(); A[i, j] = X[i]` would produce a tensor where `A[0, 0] != A[0, 1]` (the expected semantics are shown in the sketch after this list). This is fixed by inserting the Let binding of the random variable at the correct loop body.
* When inlining we'd replace all calls to `rand()` rather than just those present in the Tensor being inlined.
* `rand()` was treated symbolically by the simplifier, so we would aggregate or cancel calls to `rand()`. Fixed the hasher to hash all calls to `rand()` distinctly.
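
A sketch, in eager terms, of the semantics the first fix preserves: `X` holds one random value per `i`, so every `j` must observe the same value.

```python
import torch

# X[i] = rand(); A[i, j] = X[i]
X = torch.rand(3)
A = X.unsqueeze(1).expand(3, 4)
# Within a row, all columns share the single random value X[i].
assert torch.equal(A[:, 0], A[:, 1])
```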

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43885

Reviewed By: gmagogsfm

Differential Revision: D23503636

Pulled By: nickgg

fbshipit-source-id: cdbdc902b7a14d269911d978a74a1c11eab004fa
2020-09-03 16:49:24 -07:00
Mikhail Zolotukhin
b2aaf212aa [TensorExpr] Add option to enforce TensorExprKernel fallbacks. (#43972)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43972

When debugging, it is useful to disable the NNC backend to see whether
the bug is there or in the fuser logic.

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D23455624

Pulled By: ZolotukhinM

fbshipit-source-id: f7c0452a29b860afc806e2d58acf35aa89afc060
2020-09-02 18:34:24 -07:00
Alex Suhan
9db90fe1f3 [TensorExpr] Remove unused functions in kernel.cpp (#43966)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43966

Test Plan: build.

Reviewed By: ZolotukhinM

Differential Revision: D23456660

Pulled By: asuhan

fbshipit-source-id: c13411b61cf62dd5d038e7246f79a8682822b472
2020-09-01 20:25:16 -07:00
Bert Maher
c14a3613a8 Fix NaN propagation in TE fuser's min/max implementation (#43609)
Summary:
Per the eager-mode source of truth, min/max must propagate NaNs.
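
The eager behaviour referenced, sketched with the elementwise variants:

```python
import torch

nan = torch.tensor([float("nan")])
one = torch.tensor([1.0])
# NaN wins in either operand position, for both min and max.
assert torch.maximum(one, nan).isnan().all()
assert torch.maximum(nan, one).isnan().all()
assert torch.minimum(one, nan).isnan().all()
```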

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43609

Reviewed By: ZolotukhinM

Differential Revision: D23349184

Pulled By: bertmaher

fbshipit-source-id: 094eb8b89a02b27d5ecf3988d0f473c0f91e4afb
2020-09-01 02:10:13 -07:00
Alex Suhan
60ad7e9c04 [TensorExpr] Make sum available from Python (#43730)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43730

Test Plan:
python test/test_jit_fuser_te.py -k TestTEFuser.test_sum
test_tensorexpr --gtest_filter=TensorExprTest.KernelSum*

Reviewed By: ZolotukhinM

Differential Revision: D23407600

Pulled By: asuhan

fbshipit-source-id: e6da4690ae6d802f9be012e39e61b7467aa5285c
2020-08-29 10:38:21 -07:00
Protonu Basu
58a7e73a95 [TensorExpr] Block Codegen (#40054)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40054

Reviewed By: ZolotukhinM

Differential Revision: D22061350

Pulled By: protonu

fbshipit-source-id: 004f7c316629b16610ecdbb97e43036c72c65067
2020-08-28 09:53:42 -07:00
Alex Suhan
de84db2a9d [TensorExpr] Add aten::sum lowering to the kernel (#43585)
Summary:
Handles reductions over all dimensions and over selected dimensions, per PyTorch semantics.
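
The two reduction forms, in eager terms:

```python
import torch

x = torch.rand(2, 3, 4)
assert x.sum().shape == ()              # all dimensions
assert x.sum(dim=1).shape == (2, 4)     # a selected dimension
assert x.sum(dim=(0, 2)).shape == (3,)  # several selected dimensions
```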

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43585

Test Plan: test_tensorexpr

Reviewed By: bertmaher

Differential Revision: D23362382

Pulled By: asuhan

fbshipit-source-id: e8d8f1197a026be0b46603b0807d996a0de5d58c
2020-08-27 02:46:47 -07:00
Bert Maher
6c99d5611d [tensorexpr] Fix promotion of booleans (#43097)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43097

Boolean arguments weren't promoted, so if you tried to write a comparison with
types such as `Tensor(Bool) == Int` you'd fail typechecking inside the TE
engine.
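
The eager behaviour the fix matches: the Bool operand is promoted before the comparison.

```python
import torch

t = torch.tensor([True, False])
# Bool promotes against Int, so the comparison is well-typed.
assert torch.equal(t == 1, torch.tensor([True, False]))
assert torch.equal(t == torch.tensor([1, 0]), torch.tensor([True, True]))
```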

Test Plan: Imported from OSS

Reviewed By: protonu, zheng-xq

Differential Revision: D23167926

Pulled By: bertmaher

fbshipit-source-id: 47091a815d5ae521637142a5c390e8a51a776906
2020-08-18 15:19:38 -07:00
Mikhail Zolotukhin
73351ee91d [TensorExpr] Disallow fallback to JIT interpreter from TensorExprKernel (flip the default). (#42568)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42568

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22936175

Pulled By: ZolotukhinM

fbshipit-source-id: 62cb505acb77789ed9f483842a8b31eb245697b3
2020-08-05 14:13:49 -07:00
Mikhail Zolotukhin
ea9053b86d [TensorExpr] Handle constant nodes in shape inference. (#42566)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42566

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22936176

Pulled By: ZolotukhinM

fbshipit-source-id: 69d0f9907de0e98f1fbd56407df235774cb5b788
2020-08-05 14:13:44 -07:00
Mikhail Zolotukhin
b9c49f0e69 [TensorExpr] Support shape inference in TE for aten::cat. (#42387)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42387

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22879281

Pulled By: ZolotukhinM

fbshipit-source-id: 775e46a4cfd91c63196b378ee587cc4434672c89
2020-08-05 14:11:24 -07:00
Mikhail Zolotukhin
dcc4d11ffa [TensorExpr] Make tensorOrConstant non-templatized function. (#42202)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42202

Currently we use a template in order to be able to take both
`std::vector<ExprHandle>` and `std::vector<VarHandle>`. However, the
semantics of this function dictate that the only allowed option should
be the former: we're specifying indices for the tensor access we want
to generate. While it can be convenient to avoid converting a vector of
vars to a vector of exprs at the callsites, it makes the code less
explicit and thus more difficult to reason about.

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D22806429

Pulled By: ZolotukhinM

fbshipit-source-id: 8403af5fe6947c27213050a033e79a09f7075d4c
2020-07-31 20:05:24 -07:00
Mikhail Zolotukhin
2decccea2e [TensorExpr] Implement shape inference for TE. (#41451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41451

Since TE operates on a limited subset of ops with well-defined
semantics, we can easily infer the shapes of intermediate and output
tensors given the shapes of the inputs.

There are a couple of ops that are not yet supported in the shape
inference; once we add them, we can relax the shape info requirements
in the TE fuser: currently it requires all values in the fusion group
to have known shapes, and we can change that to only the inputs.
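
A minimal sketch (not NNC's actual code) of the elementwise case: right-align the shapes, pad with ones, and take the per-dim max, where 1 broadcasts.

```python
def broadcast_shape(a, b):
    a, b = list(a), list(b)
    # Left-pad the shorter shape with 1s so the dims right-align.
    while len(a) < len(b):
        a.insert(0, 1)
    while len(b) < len(a):
        b.insert(0, 1)
    assert all(x == y or 1 in (x, y) for x, y in zip(a, b))
    return tuple(max(x, y) for x, y in zip(a, b))

# Matches the fusion-group example above: adding a (32,32,32) tensor to
# a (32,32,1,1) tensor yields (32,32,32,32).
assert broadcast_shape((32, 32, 1, 1), (32, 32, 32)) == (32, 32, 32, 32)
```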

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D22543470

Pulled By: ZolotukhinM

fbshipit-source-id: 256bae921028cb6ec3af91977f12bb870c385f40
2020-07-31 20:05:21 -07:00
Mikhail Zolotukhin
f41bb1f92b [TensorExpr] Explicitly cast to bool results of comparison ops in kernel.cpp. (#42201)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42201

Previously, we used the operators <, >, ==, et al. and relied on the
dtype being picked automatically. That led to a wrong dtype being
picked for the result, but the choice was overwritten by the type
explicitly specified in the JIT IR we were lowering. Now we are moving
towards using shape inference instead of relying on all types being
specified in the IR, and that made this issue immediately pop up.
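
The invariant now enforced, in eager terms:

```python
import torch

a, b = torch.rand(4), torch.rand(4)
# Comparison results are Bool regardless of the operand dtypes.
assert (a < b).dtype == torch.bool
assert (torch.tensor([1, 2]) == 2).dtype == torch.bool
```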

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D22806428

Pulled By: ZolotukhinM

fbshipit-source-id: 89d2726340efa2bb3da45d1603bedc53955e14b9
2020-07-31 20:05:19 -07:00
Mikhail Zolotukhin
f8c5800bb5 [TensorExpr] Add debug dumps to kernel.cpp. (#42196)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42196

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D22803676

Pulled By: ZolotukhinM

fbshipit-source-id: 109372ca45d86478826190b868d005d2fb2c9ba7
2020-07-31 20:02:21 -07:00
Xiaoqiang Zheng
6bdfd6ae1a [TensorExpr] Fast sigmoid for LLVM (#39717)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39717

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D21949849

Pulled By: zheng-xq

fbshipit-source-id: f918bb2cb0ea647ce254fc51258af6fd01325f2d
2020-06-09 20:11:35 -07:00