Commit Graph

91 Commits

Author SHA1 Message Date
Nick Gibson
f2eac5df18 [NNC] Fix lowering of aten::remainder (#47611)
Summary:
Fix an issue with the TensorExpr lowering of aten::remainder with integral inputs. We were always lowering to fmod and never to Mod.
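A minimal eager-mode sketch of why the distinction matters (an illustration, not part of the commit): for negative integral inputs, `aten::remainder` takes the sign of the divisor (Python-style modulo), while `fmod` takes the sign of the dividend, so lowering the former to the latter changes results.

```
import torch

a = torch.tensor([-7, 7], dtype=torch.int64)
b = torch.tensor([3, -3], dtype=torch.int64)

# remainder (Mod) follows the sign of the divisor, Python-style
print(torch.remainder(a, b))  # tensor([ 2, -2])

# fmod follows the sign of the dividend, C-style
print(torch.fmod(a, b))       # tensor([-1,  1])
```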

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47611

Reviewed By: bertmaher, heitorschueroff

Differential Revision: D24846929

Pulled By: nickgg

fbshipit-source-id: adac4322ced5761a11a8e914debc9abe09cf5637
2020-11-09 21:45:42 -08:00
Raghavan Raman
8eb228a7f3 Add support for log_softmax (#47409)
Summary:
This diff adds support for the `log_softmax` op in NNC.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47409

Reviewed By: ejguan

Differential Revision: D24750203

Pulled By: navahgar

fbshipit-source-id: c4dacc7f62f9df65ae467f0d578ea03d3698273d
2020-11-06 13:29:27 -08:00
Nick Gibson
e985503d80 [NNC] Fix an issue with half-scalar vars coerced to float (Take 2) (#47448)
Summary:
Take 2 of this fix. I removed the repro from the issue, which is a bit flaky due to parallelism; it broke on Windows, but the flakiness isn't specific to Windows or to this fix, I think. I'll make sure all the tests pass this time (cc zou3519).

Fixes an issue where fp16 scalars created by the registerizer could be referenced as floats, causing invalid conversions which would crash in the NVRTC compile. I also noticed that we were inserting patterns like `float(half(float(X)))` and added a pass to collapse those down inside the CudaHalfScalarRewriter.

Fixes https://github.com/pytorch/pytorch/issues/47138
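A rough sketch, assuming a typical trigger (this is not the repro that was removed): a fused pointwise kernel over fp16 CUDA tensors in which values are reused, so the registerizer may cache the repeated fp16 loads in scalar variables.

```
import torch

@torch.jit.script
def fused(x, y):
    z = x * y
    return z * z + z  # repeated use of the same loads is what the registerizer may cache

if torch.cuda.is_available():
    a = torch.rand(1024, dtype=torch.half, device="cuda")
    b = torch.rand(1024, dtype=torch.half, device="cuda")
    for _ in range(5):  # run past the profiling runs so the TE kernel is compiled
        out = fused(a, b)
```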

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47448

Reviewed By: glaringlee

Differential Revision: D24765070

Pulled By: nickgg

fbshipit-source-id: 5297e647534d53657bef81f4798e8aa6a93d1fbd
2020-11-05 19:31:52 -08:00
Richard Zou
745899f926 Revert D24706475: [pytorch][PR] [NNC] Fix an issue in Cuda fusion with fp16 scalar vars coerced to float
Test Plan: revert-hammer

Differential Revision:
D24706475 (33cf7fddd2)

Original commit changeset: 9df72bbbf203

fbshipit-source-id: f16ff04818de4294713d5b97eab5b298c1a75a6b
2020-11-05 08:25:48 -08:00
Nick Gibson
33cf7fddd2 [NNC] Fix an issue in Cuda fusion with fp16 scalar vars coerced to float (#47229)
Summary:
Fixes an issue where fp16 scalars created by the registerizer could be referenced as floats, causing invalid conversions which would crash in the NVRTC compile. I also noticed that we were inserting patterns like `float(half(float(X)))` and added a pass to collapse those down inside the CudaHalfScalarRewriter.

Fixes https://github.com/pytorch/pytorch/issues/47138

Pull Request resolved: https://github.com/pytorch/pytorch/pull/47229

Reviewed By: agolynski

Differential Revision: D24706475

Pulled By: nickgg

fbshipit-source-id: 9df72bbbf203353009e98b9cce7ab735efff8b21
2020-11-04 15:48:12 -08:00
Mikhail Zolotukhin
5233ff9f15 [TensorExpr] Re-enable test for torch.cat, add a test for torch.cat being a single node in a fusion group. (#46447)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46447

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24356017

Pulled By: ZolotukhinM

fbshipit-source-id: 847c1d9c4f0f77f53ea3412a5663d486e78bccad
2020-10-16 20:26:48 -07:00
Mikhail Zolotukhin
d6de9d573a [TensorExpr] Properly handle input types promotion and special case of empty inputs for aten::cat. (#46500)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46500

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24373671

Pulled By: ZolotukhinM

fbshipit-source-id: b3be73a89a9ab6654212cb7094f32bf1c445e876
2020-10-16 20:26:46 -07:00
Mikhail Zolotukhin
0f668d95b6 [TensorExpr] Fix shape inference logic for aten::cat. (#46482)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46482

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24366778

Pulled By: ZolotukhinM

fbshipit-source-id: 000ff363b11599ba3827cdf2db3d4793878b84ab
2020-10-16 20:24:30 -07:00
Mikhail Zolotukhin
4359c5e036 [TensorExpr] Correctly handle negative dimensions in aten::cat when lowering to tensor expressions. (#46446)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46446

Fixes #46440.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24356016

Pulled By: ZolotukhinM

fbshipit-source-id: b759760bb8c765aeb128eb94d18af20cddd888a2
2020-10-16 01:13:14 -07:00
Nick Gibson
f3db68776c [NNC] Fix two more bugs in Cuda Half support (#46129)
Summary:
Fixes two bugs reported by https://github.com/pytorch/pytorch/issues/45953 in the NNC Cuda codegen which could break when using Half floats:

1. The Registerizer will generate new scalars with the type of the load being replaced, and doesn't have Cuda-specific logic to avoid using the half type. I've added a quick mutator to coerce these to float, similar to the existing load casting rules.

2. We're not handling explicit casts to Half inserted by the user (in the report, the user being the JIT). This is addressed by replacing these with casts to Float, since that's the type we do Half math in; a sketch follows below.

Fixes https://github.com/pytorch/pytorch/issues/45953.
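A hedged illustration of the second case (an assumed pattern, not the graph from the linked issue): an explicit cast to Half inside the fused region, which the codegen now rewrites into the Float domain it does Half math in.

```
import torch

@torch.jit.script
def with_half_cast(x, y):
    # the explicit .half() is the kind of user/JIT-inserted cast described above
    return (x * y).half() + x.half()

if torch.cuda.is_available():
    a = torch.rand(512, device="cuda")
    b = torch.rand(512, device="cuda")
    for _ in range(5):  # run past the profiling runs so the fused kernel is built
        out = with_half_cast(a, b)
```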

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46129

Reviewed By: glaringlee

Differential Revision: D24253639

Pulled By: nickgg

fbshipit-source-id: 3fef826eab00355c81edcfabb1030332cae595ac
2020-10-12 13:31:07 -07:00
Mikhail Zolotukhin
496d72d700 [TensorExpr] Disable and/or fix some failing tests. (#46146)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46146

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D24238545

Pulled By: ZolotukhinM

fbshipit-source-id: 0d8242da9d1c6960f7b5e9065c3e8defd3d32494
2020-10-10 13:54:25 -07:00
Raghavan Raman
a5c0dbc519 Add support for Softmax. (#45286)
Summary:
This PR adds support for Softmax in NNC.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45286

Reviewed By: mrshenli

Differential Revision: D24042901

Pulled By: navahgar

fbshipit-source-id: 120bafe17586d3ecf0918f9aee852a7c3a8f4990
2020-10-08 23:57:02 -07:00
Elias Ellison
1197a38a63 [JIT] Bind log1p and lgamma (#45791)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45791

Most of the lowering for log1p and lgamma already existed; this adds the JIT integration.
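A hedged usage sketch (not from the PR): with the JIT bindings in place, an elementwise chain like this can be handed to the TE fuser as a single fusion group.

```
import torch

@torch.jit.script
def special_chain(x):
    return torch.lgamma(x.abs() + 1.0) + torch.log1p(x * x)

y = special_chain(torch.rand(128))
```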

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D24169536

Pulled By: eellison

fbshipit-source-id: a009c77a3471f3b5d378bad5de6d8e0880e9da3c
2020-10-08 12:06:34 -07:00
Elias Ellison
338283057b [JIT] [3/3] Make sure fusion occurs in test_tensorexpr (#45790)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45790

Making sure that more tests invoke a run with a Fusion Group.

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D24169534

Pulled By: eellison

fbshipit-source-id: a2666df53fbb12c64571e960f59dbe94df2437e4
2020-10-08 12:06:25 -07:00
Elias Ellison
564296f051 [2/3] [JIT] Make sure fusion occurs in test_tensorexpr (#45789)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45789

Making sure that more tests invoke a run with a Fusion Group.

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D24169535

Pulled By: eellison

fbshipit-source-id: 54d7af434772ba52144b12d15d32ae30460c0c3c
2020-10-08 12:06:16 -07:00
Elias Ellison
1b97ffa07a [1/3] [JIT] Make sure fusion occurs in test_tensorexpr file (#45788)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45788

We were only running the traced graph once, which would not yet have been fused at that point. We should run for num_profiled_runs + 1, and also assert that all nodes in the graph were fused.
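A hedged sketch of that pattern, assuming `torch.jit.last_executed_optimized_graph` and a default of two profiled runs (the helper below is illustrative, not the test-suite code):

```
import torch

def assert_fused(fn, *args, num_profiled_runs: int = 2):
    # run past the profiling runs so the profiling executor has specialized
    # the graph and the TE fuser has had a chance to create a fusion group
    for _ in range(num_profiled_runs + 1):
        fn(*args)
    graph = torch.jit.last_executed_optimized_graph()
    fusion_groups = graph.findAllNodes("prim::TensorExprGroup")
    assert fusion_groups, "expected a TensorExprGroup in the optimized graph"
```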

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D24169537

Pulled By: eellison

fbshipit-source-id: 8499bb1a5bd9d2221b1f1c54d6352558cf07ba9a
2020-10-08 12:02:57 -07:00
Rong Rong
abedd9a274 Reduce size of test_unsqueeze to resolve consistent timeout issue (#45877)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45877

apex_test_L0_optimizers

Test Plan: `buck test mode/dev-tsan //caffe2/test:tensorexpr -- 'test_unsqueeze \(test_tensorexpr\.TestTensorExprFuser\)' --run-disabled`

Reviewed By: malfet

Differential Revision: D24126211

fbshipit-source-id: e38ba0168b6dd44459c070c01e3e39c93d5fae42
2020-10-06 10:33:20 -07:00
Mikhail Zolotukhin
5d748e6d22 [TensorExpr] Re-enable tests. (#44218)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/44218

Differential Revision: D23546100

Test Plan: Imported from OSS

Reviewed By: ngimel

Pulled By: ZolotukhinM

fbshipit-source-id: 4c4c5378ec9891ef72b60ffb59081a009e0df049
2020-09-07 15:52:03 -07:00
Elias Ellison
5bd2902796 [JIT] Remove references to no longer generated _tanh_backward and _sigmoid_backward (#44138)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44138

If you look at the sigmoid and tanh backwards, they are composed of other ops: https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/runtime/symbolic_script.cpp#L786
https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/runtime/symbolic_script.cpp#L164

So tanh_backward and sigmoid_backward are no longer generated / legacy ops.
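For reference, a hedged restatement of the math those symbolic_script entries compose from ordinary ops (not the symbolic_script code itself):

```
import torch

def tanh_backward(grad, output):
    # d/dx tanh(x) = 1 - tanh(x)^2, with output = tanh(x)
    return grad * (1 - output * output)

def sigmoid_backward(grad, output):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)), with output = sigmoid(x)
    return grad * output * (1 - output)
```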

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D23543603

Pulled By: eellison

fbshipit-source-id: ce8353e53043cf969b536aac47c9576d66d4ce02
2020-09-05 01:41:36 -07:00
Bram Wasti
e64879e180 [tensorexpr] Alias analysis tests (#44110)
Summary:
Some tests for alias analysis.

The first test aliases at the module level, and the second at the input level.

Please let me know if there are other alias situations!
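A hedged illustration of the two situations (assumed shapes and names, not the added tests themselves):

```
import torch

# module-level aliasing: two attributes of the module refer to the same tensor
class ModuleAlias(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.a = torch.rand(4)
        self.b = self.a  # alias of self.a

    def forward(self, x):
        return x * self.a + self.b

# input-level aliasing: the same tensor is passed for both inputs
@torch.jit.script
def mul_add(x, y):
    return x * y + y

t = torch.rand(4)
out = mul_add(t, t)  # x and y alias each other
```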

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44110

Reviewed By: nickgg

Differential Revision: D23509473

Pulled By: bwasti

fbshipit-source-id: fbfe71a1d40152c8fbbd8d631f0a54589b791c34
2020-09-03 14:52:47 -07:00
Bert Maher
33d51a9b32 Respect canFuseOn{CPU,GPU} in TE fuser (#43967)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43967

Test Plan: Imported from OSS

Reviewed By: asuhan

Differential Revision: D23469048

Pulled By: bertmaher

fbshipit-source-id: 1005a7ae08974059ff9d467492caa3a388070eeb
2020-09-02 18:00:25 -07:00
Bert Maher
c14a3613a8 Fix NaN propagation in TE fuser's min/max implementation (#43609)
Summary:
Per eager mode source-of-truth, NaNs shall be propagated by min/max.
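A hedged eager-mode reference for the expected behavior (the fused kernel should match this):

```
import torch

nan = float("nan")
a = torch.tensor([nan, 1.0])
b = torch.tensor([2.0, nan])

# in eager mode an output element is NaN if either input element is NaN
print(torch.max(a, b))  # tensor([nan, nan])
print(torch.min(a, b))  # tensor([nan, nan])
```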

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43609

Reviewed By: ZolotukhinM

Differential Revision: D23349184

Pulled By: bertmaher

fbshipit-source-id: 094eb8b89a02b27d5ecf3988d0f473c0f91e4afb
2020-09-01 02:10:13 -07:00
Protonu Basu
58a7e73a95 [TensorExpr] Block Codegen (#40054)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40054

Reviewed By: ZolotukhinM

Differential Revision: D22061350

Pulled By: protonu

fbshipit-source-id: 004f7c316629b16610ecdbb97e43036c72c65067
2020-08-28 09:53:42 -07:00
Mikhail Zolotukhin
c4e5ab6ff2 [TensorExpr] Disable a flaky test. (#43678)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43678

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D23363651

Pulled By: ZolotukhinM

fbshipit-source-id: 9557fbfda28633cea169836b02d034e9c950bc71
2020-08-26 18:35:24 -07:00
Mikhail Zolotukhin
3ec24f02af [TensorExpr] Start using typecheck in the fuser. (#43173)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43173

With this change the fuser starts to generate typechecks for the inputs of a
fusion group. For each fusion group we generate a typecheck and an if node:
the true block contains the fused subgraph, and the false block contains the
unoptimized original subgraph.
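A hedged way to see the resulting structure (assuming the profiling executor is enabled and `graph_for` reflects the guarded graph):

```
import torch

@torch.jit.script
def f(x, y):
    return x * y + y

a, b = torch.rand(8), torch.rand(8)
for _ in range(3):  # run past the profiling runs so the guarded graph is built
    f(a, b)

# expect roughly: prim::TypeCheck feeding a prim::If, with the fused
# prim::TensorExprGroup in the true block and the original subgraph as fallback
print(f.graph_for(a, b))
```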

Differential Revision: D23178230

Test Plan: Imported from OSS

Reviewed By: eellison

Pulled By: ZolotukhinM

fbshipit-source-id: f56e9529613263fb3e6575869fdb49973c7a520b
2020-08-25 18:13:32 -07:00
Mikhail Zolotukhin
d18566c617 [TensorExpr] Fuser: disallow aten::slice nodes. (#43365)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43365

We don't have shape inference for them yet.

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D23253418

Pulled By: ZolotukhinM

fbshipit-source-id: 9c38778b8a616e70f6b2cb5aab03d3c2013b34b0
2020-08-25 18:13:27 -07:00
Bert Maher
6c99d5611d [tensorexpr] Fix promotion of booleans (#43097)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43097

Boolean arguments weren't promoted, so if you tried to write a comparison with
types such as `Tensor(Bool) == Int` you'd fail typechecking inside the TE
engine.
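A hedged example of the kind of pattern that hits this path (assumed, not the failing case from the diff): a Bool tensor compared against an Int scalar inside a fusible expression.

```
import torch

@torch.jit.script
def bool_vs_int(x):
    mask = x > 0                    # Tensor(Bool)
    return (mask == 1).float() * x  # Bool-vs-Int comparison needs promotion

out = bool_vs_int(torch.randn(16))
```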

Test Plan: Imported from OSS

Reviewed By: protonu, zheng-xq

Differential Revision: D23167926

Pulled By: bertmaher

fbshipit-source-id: 47091a815d5ae521637142a5c390e8a51a776906
2020-08-18 15:19:38 -07:00
Yujun Zhao
e5adf45dde Add python unittest target to caffe2/test/TARGETS (#42766)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42766

**Summary**
Some Python tests are missing in `caffe2/test/TARGETS`; add them to make the target list more comprehensive.

According to [run_test.py](https://github.com/pytorch/pytorch/blob/master/test/run_test.py#L125), some tests are slower than others. Slow tests are added as independent targets, and the rest are grouped into one `others` target. The reason is that we want to reduce overhead, especially for code coverage collection: tests in one target can be run as a bundle, and their coverage can be collected together. Since the coverage collection procedure is typically time-expensive, this saves time.

Test Plan:
Run all the new test targets locally in dev server and record the time they cost.
**Statistics**

```
# jit target
real    33m7.694s
user    653m1.181s
sys     58m14.160s

--------- Compare to Initial Jit Target runtime: ----------------

real    32m13.057s
user    613m52.843s
sys     54m58.678s

```

```
# others target
real    9m2.920s
user    164m21.927s
sys     12m54.840s
```

```
# serialization target
real    4m21.090s
user    23m33.501s
sys     1m53.308s

```

```
# tensorexpr
real    11m28.187s
user    33m36.420s
sys     1m15.925s
```

```
# type target
real    3m36.197s
user    51m47.912s
sys     4m14.149s
```

Reviewed By: malfet

Differential Revision: D22979219

fbshipit-source-id: 12a30839bb76a64871359bc024e4bff670c5ca8b
2020-08-10 09:48:59 -07:00
Mikhail Zolotukhin
102abb877c Reland D22939119: "[TensorExpr] Fix a way we were creating np arrays in tests." (#42608)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42608

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22952745

Pulled By: ZolotukhinM

fbshipit-source-id: fd6a3efbfcaa876a2f4d27b507fe0ccdcb55a002
2020-08-05 15:14:23 -07:00
Kimish Patel
924a1dbe9b Revert D22939119: [TensorExpr] Fix a way we were creating np arrays in tests.
Test Plan: revert-hammer

Differential Revision:
D22939119 (882ad117cf)

Original commit changeset: 3388270af8ea

fbshipit-source-id: 7c8d159586ce2c4c21184fd84aa6da5183bc71ea
2020-08-05 08:25:47 -07:00
Mikhail Zolotukhin
882ad117cf [TensorExpr] Fix a way we were creating np arrays in tests. (#42575)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42575

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22939119

Pulled By: ZolotukhinM

fbshipit-source-id: 3388270af8eae9fd4747f06202f366887aaf5f36
2020-08-04 21:24:25 -07:00
Nikolay Korovaiko
3971777ebb Krovatkin/reenable test tensorexpr (#41445)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41445

Reviewed By: ZolotukhinM

Differential Revision: D22543075

Pulled By: Krovatkin

fbshipit-source-id: fd8c0a94f5b3aff34d2b444dbf551425fdc1df04
2020-07-15 10:42:40 -07:00
Nikolay Korovaiko
9b95f757af move num_profiled_runs to common_utils (#38687)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38687

Differential Revision: D21634080

Pulled By: Krovatkin

fbshipit-source-id: 55513124caf3885e475ffecd9d9f3dbc4729a573
2020-05-27 01:14:01 -07:00
Mikhail Zolotukhin
cdd1b9a891 [TensorExpr] Distinguish aten::max reduction op from aten::max elementwise op and only fuse the latter. (#38171)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38171

Test Plan: Imported from OSS

Differential Revision: D21487389

Pulled By: ZolotukhinM

fbshipit-source-id: ac28789bf2bea389f560de4d5b979e036295e96a
2020-05-11 17:45:59 -07:00
Nikolay Korovaiko
a80a438e37 correctly set and restore states in te tests (#37210)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37210

Differential Revision: D21238634

Pulled By: Krovatkin

fbshipit-source-id: 6462239753399c10c871baa5d5fdff5465cf2544
2020-04-24 20:16:51 -07:00
Nick Korovaiko
2f50c11954 add test_tensorexpr.py (#35776)
Summary:
Adding `test_tensorexpr.py` to our CI. There are a few complications. The first is that we now always run `SimpleIREval` as part of the simplifier, so the counts will always be greater than one; we could invest some effort in differentiating a real codegen call to `SimpleIREval` from calls made by the simplifier, but it's probably not that important. The second change turns a failure to retrieve a counter into a default value of 0, since the tests are structured to check for either the LLVM or the SimpleIREval backend, so it seems appropriate not to fail the test too early.
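A hedged sketch of the counting logic described above; `counters` and the counter names are hypothetical stand-ins for however the test reads its codegen counters:

```
def codegen_count(counters: dict, name: str) -> int:
    # a missing counter becomes 0 instead of failing the test too early
    return counters.get(name, 0)

def some_codegen_ran(counters: dict) -> bool:
    # accept either backend; simplifier calls mean the SimpleIREval count
    # can legitimately be greater than one
    return codegen_count(counters, "llvm") >= 1 or codegen_count(counters, "simple_ir_eval") >= 1
```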
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35776

Differential Revision: D20799333

Pulled By: Krovatkin

fbshipit-source-id: 2a94ff98e647180c6e6aea141a411c3376c509f9
2020-04-01 22:05:37 -07:00
Bram Wasti
a3e10d2a17 Expose enablement of TensorExpr fuser as env variable (#35341)
Summary:
This commit allows one to use an environment variable to enable the fuser in `torch/csrc/jit/tensorexpr/`:

```
PYTORCH_TENSOREXPR=1 python benchmark.py
```

This commit also changes the registration to happen by default, removing the requirement to call the Python-exposed `_jit_register_tensorexpr_fuser`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35341

Reviewed By: ZolotukhinM

Differential Revision: D20676348

Pulled By: bwasti

fbshipit-source-id: 4c997cdc310e7567c03905ebff72b3e8a4c2f464
2020-03-26 14:31:57 -07:00
Mikhail Zolotukhin
95833a49e6 [TensorExpr] Pull changes from bertmaher/pytorch_fusion. (#34842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34842

This PR (hopefully the last one of its kind) merges changes from a side
branch where the tensor-expressions-based fuser work has been done so far.
This PR is a squashed version of the changes in the side branch, which is
available here: https://github.com/bertmaher/pytorch

Differential Revision: D20478208

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 21556e009f1fd88099944732edba72ac40e9b9c0
2020-03-17 11:02:48 -07:00
Mikhail Zolotukhin
ea5c86c276 [TensorExpr] Add LLVM codegen. (#34228)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34228

This PR adds LLVM codegen to tensor expressions. LLVM is added as an
optional build dependency specified with the `USE_LLVM=<path_to_llvm>`
variable. If this variable is not set or LLVM is not found in the
specified path, the LLVM codegen is completely disabled.

Differential Revision: D20251832

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 77e203ab4421eb03afc64f8da17e0daab277ecc2
2020-03-16 11:49:34 -07:00
Mikhail Zolotukhin
35e7efeb9a [TensorExpr] Add CUDA codegen. (#34227)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34227

This PR adds CUDA support to tensor expressions.

Differential Revision: D20251836

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: ab36a55834cceff30c8371fef6cca1054a32f017
2020-03-16 11:49:29 -07:00
Mikhail Zolotukhin
42b2c8c65d [TensorExpr] Add a fuser pass based on tensor expressions. (#34226)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34226

The LLVM and Cuda backends are added in subsequent PRs, so at this point the fuser is pretty useless, but it can still be tested, and its logic is not going to change with the addition of the codegens.

Differential Revision: D20251838

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: 82b0d221fa89904ed526689d02a6c7676a8ce8de
2020-03-16 11:49:24 -07:00