Commit Graph

64 Commits

Author SHA1 Message Date
Shen Li
bb0377bb24 Expose torch.futures.Future (#39008)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39008

This commit adds a `torch.futures.Future` type and exposes its ctor,
`wait`, `then`, and `set_result` APIs. This type is currently a
wrapper of `c10::ivalue::Future` and mainly used by RPC for now. Later,
we could revamp c10d APIs to return this `Future` type as well. More
utils will be added into `torch.futures` package in followup PRs.
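
As a rough illustration of the `wait` / `then` / `set_result` trio described above, here is a minimal pure-Python sketch. `ToyFuture` is a hypothetical stand-in written for this note, not the `torch.futures.Future` implementation, and passing the completed future to the callback is an assumption of the sketch.

```python
import threading


class ToyFuture:
    """Toy stand-in for a future with wait/then/set_result (not the PyTorch class)."""

    def __init__(self):
        self._done = threading.Event()
        self._value = None
        self._callbacks = []
        self._lock = threading.Lock()

    def set_result(self, value):
        # Store the value, mark completion, then run callbacks registered so far.
        with self._lock:
            self._value = value
            self._done.set()
            callbacks, self._callbacks = self._callbacks, []
        for cb in callbacks:
            cb(self)

    def wait(self):
        # Block until set_result has been called, then return the value.
        self._done.wait()
        return self._value

    def then(self, callback):
        # Return a new future that completes with callback(self) once self is done.
        child = ToyFuture()

        def run(parent):
            child.set_result(callback(parent))

        with self._lock:
            if not self._done.is_set():
                self._callbacks.append(run)
                return child
        # Already completed: run the callback right away.
        run(self)
        return child


fut = ToyFuture()
doubled = fut.then(lambda f: f.wait() * 2)
fut.set_result(21)
print(doubled.wait())
```

Chaining through `then` returns a new future, so callers can keep composing callbacks without blocking until they finally call `wait`.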

Test Plan: Imported from OSS

Differential Revision: D21723022

Pulled By: mrshenli

fbshipit-source-id: 92e56160544e9bf00d11db3e8347a1b9707882c9
2020-06-02 10:12:56 -07:00
Jie
07518e120b [nvFuser] add torch.jit.fuser context manager (#38993)
Summary:
1. The `torch.jit.fuser(str)` context manager facilitates switching between backend fusers:
  str - 'fuser0' enables only the legacy fuser;
  str - 'fuser1' enables only NNC;
  str - 'fuser2' enables only nvFuser;
2. Clean up the updated Python tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38993

Reviewed By: nairbv, pbelevich

Differential Revision: D21800620

Pulled By: soumith

fbshipit-source-id: 7fe855f5a5b97368e5e84c98c28d04b2e1276c85
2020-06-01 10:52:40 -07:00
Jerry Zhang
85d0292c14 [quant][graphmode] Cleanup inplace API (#38827)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38827

Test Plan: Imported from OSS

Differential Revision: D21673481

fbshipit-source-id: becca38efcf720089407c981419b33f629a33e91
2020-05-29 11:13:25 -07:00
Kimish Patel
bb12e4dca0 Add JIT fusion pass to fuse quantized add and relu. (#38897)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38897

Quantized ops support add_relu. This pass finds the quantized add + relu
pattern and fuses it into add_relu.
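
A toy version of such a pattern-fusion pass, under heavy simplification: the graph is a flat list of `(op_name, output, inputs)` tuples, only directly adjacent nodes are matched, and the op-name strings are used purely as labels. The real pass works on TorchScript IR.

```python
def fuse_add_relu(ops):
    """ops: list of (op_name, output, inputs). Returns a new list with add+relu fused."""
    # Count how many times each value is consumed.
    uses = {}
    for _, _, inputs in ops:
        for value in inputs:
            uses[value] = uses.get(value, 0) + 1

    fused, skip = [], set()
    for i, (name, out, inputs) in enumerate(ops):
        if i in skip:
            continue
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if (
            name == "quantized::add"
            and nxt is not None
            and nxt[0] == "aten::relu"
            and nxt[2] == [out]
            and uses.get(out, 0) == 1  # the add result feeds only the relu
        ):
            # Emit one fused node that produces the relu's output directly.
            fused.append(("quantized::add_relu", nxt[1], inputs))
            skip.add(i + 1)
        else:
            fused.append((name, out, inputs))
    return fused


graph = [
    ("quantized::add", "c", ["a", "b"]),
    ("aten::relu", "d", ["c"]),
    ("aten::neg", "e", ["d"]),
]
print(fuse_add_relu(graph))
```

The single-use check matters: if another node also reads the add result, removing the add would leave that consumer without its input.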

Test Plan: buck run caffe2/test:quantization -- test_quantization.TestFusionPasses

Reviewed By: jerryzh168

Differential Revision: D21690909

fbshipit-source-id: 607cf72dde535df15eb7638841543ab2156af464
2020-05-27 14:16:57 -07:00
Elias Ellison
f90dc741eb [JIT] Normalize op aliases (#38735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38735

Follow up to my comment https://github.com/pytorch/pytorch/pull/36597/#issuecomment-613674329

This adds a pass to convert op aliases into a normalized form. Having two ops generated in our IR that do the same thing makes the IR harder to work with for downstream consumers, such as TorchScript passes, but also ONNX, glow, etc.

Another solution would have been to fix our code generation to only emit `aten::abs` from the start. This seems trickier, and doesn't really buy us much if we still have to expose `aten::absolute` in C++, as glaringlee of the C++ API thinks we should.
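
The idea of normalizing aliases can be sketched as a simple rename over op names. This is a toy model on a list of strings; the alias table holds only the `aten::absolute` / `aten::abs` pair named in this message, and `normalize_ops` is a hypothetical helper.

```python
# Alias -> canonical name. Only the pair mentioned in the commit message is listed.
ALIASES = {"aten::absolute": "aten::abs"}


def normalize_ops(op_names):
    # Rewrite each alias to its canonical form and leave other ops untouched.
    return [ALIASES.get(name, name) for name in op_names]


print(normalize_ops(["aten::absolute", "aten::add", "aten::abs"]))
```

After such a pass, downstream consumers only need to handle the canonical name.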

Bike shedding: maybe this should be `CanonicalizeOps` instead

Test Plan: Imported from OSS

Differential Revision: D21673108

Pulled By: eellison

fbshipit-source-id: c328618907de1af22e07f57fd27fa619978c2817
2020-05-21 21:47:17 -07:00
Elias Ellison
5183e3aa16 [JIT] Rename canonicalize ops (#38734)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38734

As far as I can tell, this pass only exists to canonicalize ops that are generated in the graph fuser, so it's kind of a misnomer.

Test Plan: Imported from OSS

Differential Revision: D21673109

Pulled By: eellison

fbshipit-source-id: b7bedf34ccaf1fcd442bfb2bbb990e64915f51d4
2020-05-21 21:45:15 -07:00
Nikita Shulga
4c0bf93a0e Revert D21057090: Remove useless copy on zip file load
Test Plan: revert-hammer

Differential Revision:
D21057090

Original commit changeset: e3d30a3b09f4

fbshipit-source-id: b24cbe77aae38b321882e7dcf41022710ee28ed0
2020-05-21 19:34:18 -07:00
davidriazati
455bf77da5 Remove useless copy on zip file load (#36362)
Summary:
Instead of copying to a buffer, then setting a tensor's storage with that buffer, create a storage directly from the file
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36362

Pulled By: driazati

Differential Revision: D21057090

fbshipit-source-id: e3d30a3b09f4d67bf4bb7a0dd7f4f60c3dd1a47e
2020-05-21 18:57:06 -07:00
Will Constable
6fd48e24f1 Add support, test for kwargs in jit._fork (#38357) (#38665)
Summary:
Closing 38357
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38665

Reviewed By: suo

Differential Revision: D21643697

Pulled By: wconstab

fbshipit-source-id: c292c037f87bc2bb69a4ca163d7107d5396c53a2
2020-05-19 13:02:46 -07:00
James Reed
db86c8c6f5 Test BC for built-in torchbind methods (#38560)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38560

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D21598067

Pulled By: jamesr66a

fbshipit-source-id: 26a0e92a5c2883326be261cf84b7e916ebfd60d8
2020-05-15 19:06:59 -07:00
David Reiss
6d642a6f6c Remove (most) Python 2 support from C++ code (#35614)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35614

Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well.

Test Plan: CI

Differential Revision: D20842876

Pulled By: dreiss

fbshipit-source-id: 18abf0d324ed2185ec6d27c864e935d856dcc6ad
2020-05-14 15:01:49 -07:00
Kimish Patel
f954dd7823 Add dropout removal pass. (#38253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38253

This pass removes dropout and dropout_ nodes when training is false. It
requires the freeze_module pass to have run first, since that pass does both
inlining and constant propagation; without it, the training variable remains
an attribute instead of a constant.
ghstack-source-id: 103939141
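
A toy sketch of the removal logic, assuming a flat list of node dicts in place of TorchScript IR. The `training` key models the constant produced by freezing: `False` means a known constant, while a missing key means the value is still an unresolved attribute. All names here are illustrative.

```python
def remove_dropout(nodes):
    """nodes: list of dicts with 'op', 'out', 'inputs', and optional 'training'."""
    replace = {}  # dropout output -> the value that should be used instead
    kept = []
    for node in nodes:
        # Rewire inputs that pointed at an already removed dropout node.
        inputs = [replace.get(v, v) for v in node["inputs"]]
        is_dropout = node["op"] in ("aten::dropout", "aten::dropout_")
        # Only drop the node when training is a known constant False.
        if is_dropout and node.get("training") is False:
            replace[node["out"]] = inputs[0]
            continue
        kept.append({**node, "inputs": inputs})
    return kept


frozen = [
    {"op": "aten::linear", "out": "h", "inputs": ["x", "w", "b"]},
    {"op": "aten::dropout", "out": "d", "inputs": ["h"], "training": False},
    {"op": "aten::relu", "out": "y", "inputs": ["d"]},
]
print([n["op"] for n in remove_dropout(frozen)])
```

When `training` is not a constant, the sketch leaves the node in place, which mirrors why the pass depends on freezing having run.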

Test Plan: python test/test_jit.py TestScript.test_remove_dropout

Reviewed By: dreiss

Differential Revision: D21505863

fbshipit-source-id: 42ea45804e4653b625b6a254c8d8480757264aa8
2020-05-12 14:38:34 -07:00
Shen Li
dad552666e Add then(callback)->Future API to ivalue::Future (#37311)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37311

Test Plan: Imported from OSS

Differential Revision: D21247827

Pulled By: mrshenli

fbshipit-source-id: f8fe0617ccb957aa747a78554a000ce2c4a58495
2020-05-11 21:58:56 -07:00
Shihao Xu
3d0279862d Consolidate builtin/python_udf RPC to return ivalue::Future like torchscript RPC does (#35154)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35154

This is for issue https://github.com/pytorch/pytorch/issues/34999.

close https://github.com/pytorch/pytorch/issues/34999.

https://github.com/pytorch/pytorch/issues/34997 need more work.

This will make a few work items easier, like 1) Dist autograd profiler, 2) JIT annotation for Future.

Test Plan:
```
buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork

buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork -- test_rref_forward_chain --stress-runs 100

buck build mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork && \
buck-out/gen/caffe2/test/distributed/rpc/rpc_fork\#binary.par \
-r test_call_method_on_rref
```

buck test mode/dev-nosan //caffe2/test/distributed/rpc:rpc_fork -- 'test_rref_proxy_class \(fb\.test_rpc_fork\.RpcTestWithFork\)' --stress-runs 100

test_rref_proxy_reuse
test_handle_send_exceptions

```
buck test mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork

buck build mode/dev-nosan //caffe2/test/distributed/rpc/jit:rpc_fork && \
buck-out/gen/caffe2/test/distributed/rpc/jit/rpc_fork\#binary.par \
-r test_script_call_python_return_future
```

Differential Revision: D7722184

fbshipit-source-id: bd92b855bfea4913d6672700590c57622fa86e0e
2020-05-08 21:28:56 -07:00
Jerry Zhang
0ed7fc581c [quant][graphmode][refactor] Split quantization.cpp (#37975)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37975

Test Plan:
.

Imported from OSS

Differential Revision: D21468497

fbshipit-source-id: 35cbf98a344ca6e4094d616a4040eacf017fd2de
2020-05-08 12:24:50 -07:00
Jerry Zhang
ff9a809ccd [quant][graphmode][refactor] Remove unused code in quantization.cpp (#37974)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37974

Differential Revision: D21468498

Pulled By: jerryzh168

fbshipit-source-id: 96f34db9f98474ec8e5d33e9b7c406b1637f5de8
2020-05-08 11:03:03 -07:00
James Reed
c1e7758b5e Back out "Revert D20229168: [quantization] Use torchbind for Linear PackedParams" (#38101)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38101

Original commit changeset: 29e8a4d3b8bf
ghstack-source-id: 103730417

Test Plan: waitforsadcastle

Differential Revision: D21471381

fbshipit-source-id: a922cdf31ba32021e7264ae1454c646c0bfd7ef4
2020-05-08 10:53:06 -07:00
Nikita Shulga
4bc0a7f86a Revert D20229168: [quantization] Use torchbind for Linear PackedParams
Test Plan: revert-hammer

Differential Revision:
D20229168

Original commit changeset: 3607cac9aa5b

fbshipit-source-id: 29e8a4d3b8bffd95ff6a58b46c4f1c1e23770304
2020-05-07 19:47:45 -07:00
James Reed
eaf9b28c55 [quantization] Use torchbind for Linear PackedParams (#34140)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/34140

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D20229168

Pulled By: jamesr66a

fbshipit-source-id: 3607cac9aa5b4b044572329742baed03350491c6
2020-05-07 19:03:44 -07:00
eellison
d5df055bbb [WIP][JIT] Add JIT backend registration API (#35833)
Summary:
**Summary**
This commit adds `torch::jit::RegisterBackend`, an API that allows
external backends to be registered for the execution of JIT subgraphs
outside the JIT interpreter. In order to register an external backend,
one must extend the provided abstract class `PyTorchBackendInterface` and provide
two additional functions: one that creates an instance of the aforementioned subclass
of `PyTorchBackendInterface`, and another that preprocesses a `ScriptModule` so that
it can run on the backend. Then, a `ScriptModule` that can compile and execute a given
JIT subgraph using the functions provided at registration time is generated
for each registered backend.
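
The registration flow described above can be modelled in a few lines of Python. This is only a sketch of the shape of the idea: the real API is the C++ `torch::jit::RegisterBackend`, and every name below (`ToyBackendInterface`, `register_backend`, `to_backend`, the `compile`/`execute` method names) is hypothetical.

```python
from abc import ABC, abstractmethod


class ToyBackendInterface(ABC):
    """Hypothetical stand-in for the abstract backend class."""

    @abstractmethod
    def compile(self, processed):
        ...

    @abstractmethod
    def execute(self, handle, inputs):
        ...


_REGISTRY = {}


def register_backend(name, create_backend, preprocess):
    # Store the two functions supplied at registration time.
    if name in _REGISTRY:
        raise ValueError(f"backend already registered: {name}")
    _REGISTRY[name] = (create_backend, preprocess)


def to_backend(name, module):
    # Build a callable that compiles once and then executes on the backend.
    create_backend, preprocess = _REGISTRY[name]
    backend = create_backend()
    handle = backend.compile(preprocess(module))
    return lambda *inputs: backend.execute(handle, inputs)


class UppercaseBackend(ToyBackendInterface):
    def compile(self, processed):
        return processed

    def execute(self, handle, inputs):
        return [handle(x) for x in inputs]


register_backend("upper", UppercaseBackend, lambda module: module)
run = to_backend("upper", str.upper)
print(run("a", "b"))
```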

**Testing**
This commit adds a unit test that uses a minimal test backend
to make sure that the registration endpoint and generated
`ScriptModule` work.

```
$ python test/test_jit.py TestBackends
Fail to import hypothesis in common_utils, tests are not derandomized
.
----------------------------------------------------------------------
Ran 1 test in 0.183s

OK

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35833

Differential Revision: D21231955

Pulled By: SplitInfinity

fbshipit-source-id: 452db1123d0e5d83f97fe5da8a00fdfdb50dbef9
2020-05-07 18:15:26 -07:00
Mikhail Zolotukhin
a44824c9ed [TensorExpr] Allow to enable/disable fallback mechanism thru an envvar PYTORCH_TENSOREXPR_FALLBACK. (#37971)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37971

Test Plan: Imported from OSS

Reviewed By: protonu

Differential Revision: D21444831

Pulled By: ZolotukhinM

fbshipit-source-id: c75f58772a4730e8f40f05491f9e5afa4aa3ed30
2020-05-07 12:20:31 -07:00
Jerry Zhang
70f375becf [quant] ConvPackedParams with TorchBind (#35923)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35923

(Note: this ignores all push blocking failures!)

Test Plan:
tbd

Imported from OSS

Differential Revision: D20957089

fbshipit-source-id: 74d8bd628ccba64e902ea6ebabc2b883924050b0
2020-05-05 20:18:36 -07:00
Jerry Zhang
9b3911c073 [quant][graphmode][refactor] rename SwapDequant and refactor code handling general ops (#37555)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37555

Test Plan:
.

Imported from OSS

Differential Revision: D21393514

fbshipit-source-id: 5bc9fa0f0be25f4c35a64acb23513f64ed07e230
2020-05-05 11:20:15 -07:00
Mikhail Zolotukhin
7fa968b10d [TensorExpr] Add python bindings for TE fuser. (#37831)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37831

Test Plan: Imported from OSS

Reviewed By: jackm321

Differential Revision: D21404947

Pulled By: ZolotukhinM

fbshipit-source-id: 8467346d4fd8413985a33832fb3994d3ead746dc
2020-05-05 10:58:30 -07:00
Elias Ellison
c516f84525 [JIT] Add Lower Tuples Call & Run remove mutation after list unrolling (#36829)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36829

This changes the IR complexity from the previous PR for the following tests:
```
('Name', 'Ifs/Loops', 'non-tensor ops')
Before:  ('max_unpool1d', 0, 3)
After:  ('max_unpool1d', 0, 0)
Before:  ('max_unpool2d', 0, 3)
After:  ('max_unpool2d', 0, 0)
Before:  ('max_unpool3d', 0, 4)
After:  ('max_unpool3d', 0, 0)
Before:  ('adaptive_max_pool2d', 0, 3)
After:  ('adaptive_max_pool2d', 0, 0)
Before:  ('adaptive_max_pool3d', 0, 4)
After:  ('adaptive_max_pool3d', 0, 0)
Before:  ('adaptive_avg_pool2d', 0, 3)
After:  ('adaptive_avg_pool2d', 0, 0)
Before:  ('adaptive_avg_pool3d', 0, 4)
After:  ('adaptive_avg_pool3d', 0, 0)
Before:  ('upsample', 13, 68)
After:  ('upsample', 4, 28)
Before:  ('upsample', 13, 68)
After:  ('upsample', 0, 5)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 13, 67)
After:  ('interpolate', 4, 27)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 13, 67)
After:  ('interpolate', 4, 27)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 13, 67)
After:  ('interpolate', 4, 27)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 13, 67)
After:  ('interpolate', 4, 27)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 13, 57)
After:  ('interpolate', 4, 21)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 13, 57)
After:  ('interpolate', 4, 21)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 13, 57)
After:  ('interpolate', 4, 21)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 13, 77)
After:  ('interpolate', 4, 33)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 13, 77)
After:  ('interpolate', 4, 33)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 13, 77)
After:  ('interpolate', 4, 33)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 14, 68)
After:  ('interpolate', 0, 4)
Before:  ('interpolate', 15, 103)
After:  ('interpolate', 1, 23)
Before:  ('interpolate', 14, 70)
After:  ('interpolate', 0, 6)
Before:  ('interpolate', 15, 103)
After:  ('interpolate', 1, 21)
Before:  ('interpolate', 14, 70)
After:  ('interpolate', 0, 6)
Before:  ('interpolate', 15, 91)
After:  ('interpolate', 1, 13)
Before:  ('interpolate', 14, 59)
After:  ('interpolate', 0, 3)
Before:  ('interpolate', 15, 93)
After:  ('interpolate', 1, 16)
Before:  ('interpolate', 14, 61)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 15, 111)
After:  ('interpolate', 1, 28)
Before:  ('interpolate', 14, 77)
After:  ('interpolate', 0, 5)
Before:  ('interpolate', 15, 113)
After:  ('interpolate', 1, 27)
Before:  ('interpolate', 14, 79)
After:  ('interpolate', 0, 7)
Before:  ('test_nn_AdaptiveMaxPool2d_single', 0, 3)
After:  ('test_nn_AdaptiveMaxPool2d_single', 0, 0)
Before:  ('test_nn_AdaptiveMaxPool2d_tuple', 0, 3)
After:  ('test_nn_AdaptiveMaxPool2d_tuple', 0, 0)
Before:  ('test_nn_AdaptiveMaxPool3d_single', 0, 4)
After:  ('test_nn_AdaptiveMaxPool3d_single', 0, 0)
Before:  ('test_nn_AdaptiveMaxPool3d_tuple', 0, 4)
After:  ('test_nn_AdaptiveMaxPool3d_tuple', 0, 0)
Before:  ('test_nn_AdaptiveMaxPool3d_single_nonatomic', 0, 4)
After:  ('test_nn_AdaptiveMaxPool3d_single_nonatomic', 0, 0)
Before:  ('test_nn_AdaptiveMaxPool3d_tuple_nonatomic', 0, 4)
After:  ('test_nn_AdaptiveMaxPool3d_tuple_nonatomic', 0, 0)
Before:  ('test_nn_AdaptiveAvgPool2d_single', 0, 3)
After:  ('test_nn_AdaptiveAvgPool2d_single', 0, 0)
Before:  ('test_nn_AdaptiveAvgPool2d_single_1x1output', 0, 3)
After:  ('test_nn_AdaptiveAvgPool2d_single_1x1output', 0, 0)
Before:  ('test_nn_AdaptiveAvgPool2d_tuple', 0, 3)
After:  ('test_nn_AdaptiveAvgPool2d_tuple', 0, 0)
Before:  ('test_nn_AdaptiveAvgPool3d_single', 0, 4)
After:  ('test_nn_AdaptiveAvgPool3d_single', 0, 0)
Before:  ('test_nn_AdaptiveAvgPool3d_tuple', 0, 4)
After:  ('test_nn_AdaptiveAvgPool3d_tuple', 0, 0)
```

Test Plan: Imported from OSS

Differential Revision: D21160758

Pulled By: eellison

fbshipit-source-id: 68ccbf3af74398e8dbad7e6bedb639635dafdb2e
2020-04-28 23:28:02 -07:00
Nikolay Korovaiko
a80a438e37 correctly set and restore states in te tests (#37210)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/37210

Differential Revision: D21238634

Pulled By: Krovatkin

fbshipit-source-id: 6462239753399c10c871baa5d5fdff5465cf2544
2020-04-24 20:16:51 -07:00
Elias Ellison
9cbeb0faed [JIT] Dont optimize shape peepholes on inline (#36404)
Summary:
With https://github.com/pytorch/pytorch/pull/35562, we are running peephole optimization on inlining to reduce the number of nodes that are copied.

The tracer encodes the sizes in the graph like:
```
graph(%0 : Double(7)):
  %1 : Function = prim::Constant[name="tensor_size"]()
  %2 : Tensor = prim::CallFunction(%1, %0)
  return (%2)
```

However, people would like to reuse the graph with different shapes, so running size invalidations would invalidate that. Long term, it might be better for the tracer to not include shape information, but there are downstream users of that.

Separates out FuseAddMM from peephole so that now there is a single `disable_size_optimizations` parameter, and onnx explicitly invokes fuseaddmm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36404

Differential Revision: D20968974

Pulled By: eellison

fbshipit-source-id: 56f8f1699e3b0adeeccdfd5a67bb975fd41a2913
2020-04-15 17:49:48 -07:00
Negin Raoof
f99a28f515 [ONNX] Adding a pass to replace interpolate function with aten::__interpolate (#35744)
Summary:
Since aten::__interpolate was removed in https://github.com/pytorch/pytorch/pull/34514, we need a pass to replace the interpolate function with aten::__interpolate for ONNX export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35744

Reviewed By: hl475

Differential Revision: D20907041

Pulled By: houseroad

fbshipit-source-id: f2d2cdfec47389245c50f538267124eedf682adf
2020-04-14 23:16:22 -07:00
Mikhail Zolotukhin
765bf8f03d Remove duplicate bindings from torch/csrc/jit/python/init.cpp. (#36492)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/36492

Test Plan: Imported from OSS

Differential Revision: D20995235

Pulled By: ZolotukhinM

fbshipit-source-id: 6afa3a956e57c2fb94bb29d332177be73a2bac2a
2020-04-13 12:28:32 -07:00
Kimish Patel
d559a47933 Enable relu fusion with prepacked linear/conv. (#35705)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35705

Introduces a pass for relu fusion.

Test Plan:
python test/test_xnnpack_integration.py

Imported from OSS

Differential Revision: D20746592

fbshipit-source-id: 6c22f60a20e9121618c85077b9b58fb8d4082b3b
2020-04-03 15:38:45 -07:00
Mikhail Zolotukhin
af5121f62a Invoke TensorExpr fuser pass from a graph executor. (#35913)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35913

The pass itself is still disabled by default, but with this change we
don't need to register it as a custom pass anymore. It allows us to
control its behavior with env variables more easily.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D20827189

Pulled By: ZolotukhinM

fbshipit-source-id: e74d90b5e46422e7ab7bc40974a805220da50fbc
2020-04-03 12:20:26 -07:00
Christian Sarofeen
6d24f8fe21 Infrastructure for a new CUDA Fuser (#34785)
Summary:
**Summary:** This PR contains the infrastructure of a new CUDA fuser. This CUDA fuser is based on many of the same principles of TensorExpressions and Halide, however, the implementation is written from the ground up. The fusion pass itself is similar to the default CUDA fuser, however, it has undergone some refactoring and is using the new code generation infrastructure. For those who are interested in how the code generation in this PR works, I would recommend reviewing _test/cpp/jit/test_gpu_fusion.cpp_ as well as the long comment section at the beginning of _torch/csrc/jit/codegen/cuda/transform_replay.h_. One of the largest differences between our approach and that of TVM/Halide is the concept of "TensorView". TensorView from a high level should be thought of similarly to how we think of working with Tensors in PyTorch. It's an N-D object which can undergo transformations that change its dimensionality. Dimensionality changes are done through the operations split/merge/reorder/computeAt. These transformations are similar to split/fuse/reorder/compute_at of TVM; they modify how a tensor is iterated over to generate GPU code. Interestingly, in our scheme these transformations are applied to tensors and only impact how that tensor is generated.
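
To make the split/merge/reorder vocabulary concrete, here is a toy model that tracks only the iteration extents of each axis. `ToyView` and its method signatures are hypothetical and do not reproduce the TensorView API from this PR; in particular, the argument convention for `reorder` is an assumption of the sketch.

```python
import math


class ToyView:
    """Tracks only the per-axis extents of an N-D iteration domain."""

    def __init__(self, sizes):
        self.sizes = list(sizes)

    def split(self, axis, factor):
        # Replace one axis of extent n with two axes: (ceil(n / factor), factor).
        n = self.sizes[axis]
        self.sizes[axis:axis + 1] = [math.ceil(n / factor), factor]
        return self

    def merge(self, axis):
        # Fuse axis and axis + 1 into one axis with the product of their extents.
        self.sizes[axis:axis + 2] = [self.sizes[axis] * self.sizes[axis + 1]]
        return self

    def reorder(self, order):
        # order[i] is the old position of the axis that ends up at position i.
        self.sizes = [self.sizes[i] for i in order]
        return self


view = ToyView([4, 6]).merge(0).split(0, 8)
print(view.sizes)
```

Merging `[4, 6]` gives one axis of extent 24, and splitting that by 8 gives `[3, 8]`, a layout where the inner axis could, for example, map to threads and the outer one to blocks.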

**Warning:** This PR is purposefully not feature complete with the current fuser. We wanted to separate out the infrastructure from the fusion capabilities. Once in, smaller incremental PRs will be submitted to expand capabilities of the fuser.

**Short term goals:**

Parity with current CUDA fuser (including performance):
- Dynamic shapes (no recompilation)
- Implicit handling of broadcast (broadcasted tensors are treated as tensors of the broadcasted size in the generated code)
- Dropout

**Mid-term goals:**

- Transposes fused with pointwise operations where transpose involves only 2 axes (across the fused operation).
- 1-D reductions fused with pointwise operations
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34785

Reviewed By: ZolotukhinM

Differential Revision: D20650977

Pulled By: soumith

fbshipit-source-id: ee39c95a880e1b9822e874ed4cc180971572bf63
2020-04-02 09:22:42 -07:00
Supriya Rao
a090de380c [quant][graph] Add quant fusion for dynamic quantization (#35586)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35586

This pass fuses the choose_qparams-quant-dequant sequence.
Fusion for the weight tensor is the same as in static quant.

Test Plan:
python test/test_quantize_script.py

Imported from OSS

Differential Revision: D20755680

fbshipit-source-id: b7443770642b6e6fa0fa9da8a44637e9b2d4df70
2020-03-30 23:34:56 -07:00
Supriya Rao
1f7ee7b6b7 [quant][graph] Add pass to insert quant dequant for dynamic quantization (#35448)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35448

Add _choose_qparams_per_tensor which returns scale and zero_point similar to the dynamic quantization in the operator
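
For intuition, the following sketch computes a per-tensor scale and zero point with the common affine-quantization formula. It assumes an unsigned 8-bit range (0 to 255) and is not claimed to match the exact arithmetic of `_choose_qparams_per_tensor`; `choose_qparams` is a hypothetical helper.

```python
def choose_qparams(values, qmin=0, qmax=255):
    # Extend the observed range to include zero so that 0.0 is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # Degenerate all-zero tensor: any positive scale works.
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    # Clamp the zero point into the quantized range.
    return scale, max(qmin, min(qmax, zero_point))


scale, zero_point = choose_qparams([-1.0, 0.0, 3.0])
print(scale, zero_point)
```

With these parameters a real value `x` maps to `round(x / scale) + zero_point`, clamped to the quantized range.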

Test Plan:
python test/test_quantize_script.py

Imported from OSS

Differential Revision: D20755679

fbshipit-source-id: c9066d8f1bb3e331809be26c4be806faafc9b981
2020-03-30 23:33:32 -07:00
Jerry Zhang
6fc2403951 [quant][graphmode] qconfig_dict support None (#35336)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35336

Test Plan:
python test/test_quantization.py

Imported from OSS

Differential Revision: D20655302

fbshipit-source-id: b453f3240ac487aa29629953b4d71274dbbc25fc
2020-03-29 12:47:47 -07:00
Nikolay Korovaiko
9e22d15f14 Enable tensorexpr cpp tests in CI. try #2 (#35454)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35454

Differential Revision: D20665160

Pulled By: Krovatkin

fbshipit-source-id: e04cbe92b2ee5a3288f3c4e5c83533bfea85bf85
2020-03-27 12:09:55 -07:00
Bram Wasti
a3e10d2a17 Expose enablement of TensorExpr fuser as env variable (#35341)
Summary:
This commit allows one to use an environment variable to enable the fuser in torch/csrc/jit/tensorexpr/

```
PYTORCH_TENSOREXPR=1 python benchmark.py
```

This commit also changes the registration to happen by default, removing the requirement for the python exposed "_jit_register_tensorexpr_fuser"
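
A minimal sketch of gating a feature on an environment variable, in the spirit of the `PYTORCH_TENSOREXPR=1` usage shown above. How PyTorch actually parses the variable is not stated here, so treating only the string "1" as enabled is an assumption, and `fuser_enabled` is a hypothetical helper.

```python
import os


def fuser_enabled(environ=os.environ, name="PYTORCH_TENSOREXPR"):
    # Assumption: only the exact string "1" turns the feature on.
    return environ.get(name, "0") == "1"


print(fuser_enabled({"PYTORCH_TENSOREXPR": "1"}), fuser_enabled({}))
```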
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35341

Reviewed By: ZolotukhinM

Differential Revision: D20676348

Pulled By: bwasti

fbshipit-source-id: 4c997cdc310e7567c03905ebff72b3e8a4c2f464
2020-03-26 14:31:57 -07:00
Meghan Lele
6384c2d81b [JIT] clang-format JIT code (#35115)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115

This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.

Testing:
Ran the script, CI.

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D20568523

Pulled By: SplitInfinity

fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b
2020-03-26 11:24:51 -07:00
Suraj Menon
aa01a95c6d Revert D20630760: [pytorch][PR] Enable NNC tests vol. i. add test_tensorexpr.py tests [WIP]
Test Plan: revert-hammer

Differential Revision:
D20630760

Original commit changeset: 7d2f27aca6b1

fbshipit-source-id: 28ac92b3390651a4a67061d6ebf208515b9b9463
2020-03-25 20:34:46 -07:00
Kimish Patel
dc2c4d02f9 Add a wrapper to wrap all optimization for mobile. (#35227)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35227

This wraps:
1. Conv BN folding (not mobile specific)
2. insert XNNPACK conv2d/Linear ops
3. Remove prepacking ops.

Test Plan: Imported from OSS

Differential Revision: D20603562

fbshipit-source-id: ff373af7112c070ec6198bac51845282e09ff1f8
2020-03-25 19:21:14 -07:00
Nikolay Korovaiko
f3a5081bd4 Enable NNC tests vol. i. add test_tensorexpr.py tests [WIP] (#34897)
Summary:
This PR adds tensorexpr cpp tests to test_jit.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34897

Differential Revision: D20630760

Pulled By: Krovatkin

fbshipit-source-id: 7d2f27aca6b1e23e3ffed1c765d8f590688118e3
2020-03-25 17:23:48 -07:00
Elias Ellison
aab4beb87f [JIT] Pass To Safely Remove Aten Inplace Ops (#33186)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33186

This helps create larger functional graphs. It has the potential to increase memory use, so in order to land this on by default we would probably also do a reuse of buffers pass.

This is currently O(n * | Removed Nodes | ) because we have to rebuild the alias Db each time we make a change. This pass is critical to creating functional graphs, so this might be a compelling use case to build incremental updates to alias Db.

Test Plan: Imported from OSS

Differential Revision: D20603189

Pulled By: eellison

fbshipit-source-id: 105db52bf38e02188ca6df6d36294466d3309a0a
2020-03-24 23:45:58 -07:00
Elias Ellison
5b2f8cef08 [JIT] Functional Graph Pass (#33020)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33020

This is a pass to create functional blocks. The other PRs in the stack help avoid some of the limitations that are often found in graphs. It's possible that this would work well with a graph that is frozen. Follow up work items that will help this pass:

- We don't currently have any capacity in alias analysis to tell whether a Value that came from the wildcard set "re-escapes" back into the wildcard set.
- More comments on the semantics of the graph and correctness conditions
- We could consider using dynamic dag if the perf of this is a limitation.
- Potentially make Functional Graphs Functional Blocks instead, so that we do not repeatedly copy constants, also to make IR read easier.

Test Plan: Imported from OSS

Differential Revision: D20603188

Pulled By: eellison

fbshipit-source-id: 6822a6e65f4cc2676f8f6445fe8aa1cb858ebeeb
2020-03-24 23:44:18 -07:00
davidriazati
44622bbda9 [jit] Add lazy script decorator (#34935)
Summary:
Stacked PRs
 * #34938 - [jit] Remove stray `script`
 * **#34935 - [jit] Add lazy script decorator**

Some users maintain libraries of code that is largely trace-able but not
script-able. However, some functions may need to be `torch.jit.script`ed if
they contain control flow so the tracer will use the compiler version.
This however impacts library start up time as in #33418, so this PR adds
a workaround in the form of a `torch.jit._lazy_script_while_tracing`
that will only initialize the compiler if the function is called while
actually tracing.
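
The lazy behaviour can be sketched with an ordinary decorator that defers an expensive step until it is needed. Everything here is a toy: `_tracing` stands in for "a trace is in progress", `toy_compile` stands in for script compilation, and the decorator is not the implementation of `torch.jit._lazy_script_while_tracing`.

```python
import functools

_tracing = False   # hypothetical flag; stands in for "a trace is in progress"
compile_count = 0  # counts how often the expensive step actually runs


def toy_compile(fn):
    # Stand-in for an expensive compilation step.
    global compile_count
    compile_count += 1
    return fn


def lazy_script_while_tracing(fn):
    compiled = None

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        nonlocal compiled
        if not _tracing:
            # Outside of tracing, run the plain Python function; nothing is compiled.
            return fn(*args, **kwargs)
        if compiled is None:
            # First call while tracing: compile once and cache the result.
            compiled = toy_compile(fn)
        return compiled(*args, **kwargs)

    return wrapper


@lazy_script_while_tracing
def clamp_positive(x):
    return x if x > 0 else 0


eager = clamp_positive(-3)
count_before = compile_count
_tracing = True
traced = clamp_positive(5)
print(eager, count_before, traced, compile_count)
```

Decorating a function costs nothing at import time in this model; the compile step runs only on the first call made while the flag is set.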

Pull Request resolved: https://github.com/pytorch/pytorch/pull/34935

Pulled By: driazati

Differential Revision: D20569778

fbshipit-source-id: d87c88c02b1abc86b283729ab8db94285d7d4853
2020-03-24 13:43:18 -07:00
Supriya Rao
55019d357e [quant][graphmode] Add observers for dynamic quant (#35121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35121

For dynamic quantization we insert observers at the input to mimic the quantization of activations that happens in the operator.
The observer for the weight is inserted as in static quant.

Test Plan:
python test/test_quantize_script.py

Sample output for single layer FC

graph(%self : __torch__.___torch_mangle_4.M,
      %x.2 : Tensor):
  %_observer_1 : __torch__.torch.quantization.observer.MinMaxObserver = prim::GetAttr[name="_observer_1"](%self)
  %x.1 : Tensor = prim::CallMethod[name="forward"](%_observer_1, %x.2)
  %2 : __torch__.torch.nn.modules.linear.___torch_mangle_5.Linear = prim::GetAttr[name="fc"](%self)
  %3 : Tensor = prim::CallMethod[name="forward"](%2, %x.1) # test/test_quantize_script.py:19:23
  return (%3)

graph(%self : __torch__.torch.nn.modules.linear.___torch_mangle_5.Linear,
      %input.1 : Tensor):
 %2 : Function = prim::Constant[name="linear"]()
 %3 : Tensor = prim::GetAttr[name="weight"](%self)
 %_observer_0 : __torch__.torch.quantization.observer.MinMaxObserver = prim::GetAttr[name="_observer_0"](%self)
 %7 : Tensor = prim::CallMethod[name="forward"](%_observer_0, %3)
 %4 : Tensor = prim::GetAttr[name="bias"](%self)
 %5 : Tensor = prim::CallFunction(%2, %input.1, %7, %4) # /home/supriyar/miniconda3/envs/pytorch_py3/lib/python3.7/site-packages/torch/nn/modules/linear.py:87:15
 return (%5)

Imported from OSS

Differential Revision: D20599144

fbshipit-source-id: 9a8fa0e8655b9908826b981dce8a11d86efce5df
2020-03-24 10:54:16 -07:00
Kimish Patel
e1c092fe3a Changes to transition to generic API for ops with weight prepacking (#35010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35010

semantics.

This PR moves all the XNNPACK-specific interfaces to a generic interface.
Accordingly, it removes XNNPACK-specific references from the API and some variable
names.
What has not yet changed:

TODO:
USE_XNNPACK is still used. This can be removed where no XNNPACK
specific things are done. e.g., RegisterOpContext.cpp and
xnnpack_rewrite.cpp.
The filename and structure also remain. Some of the generic class
definitions can be moved to a non-XNNPACK-specific folder.

Test Plan:
python test/test_xnnpack_integration.py

Imported from OSS

Differential Revision: D20526416

fbshipit-source-id: 2e1725345c44bbb26bdc448097a7384eca121387
2020-03-22 08:31:53 -07:00
Wanchao Liang
c21fde6421 [jit] make jit/rpc share the same PythonFutureWrapper (#35039)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35039

This is the initial step towards merging ivalue future and rpc future

Test Plan: Imported from OSS

Differential Revision: D20537164

Pulled By: wanchaol

fbshipit-source-id: d4f148c88e49ed6b0881ca4b4dd945ea24166183
2020-03-20 22:35:34 -07:00
Mikhail Zolotukhin
12f0052eee Add TensorExpr Fuser tests (resubmit). (#35085)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35085

Test Plan: Imported from OSS

Differential Revision: D20552334

Pulled By: ZolotukhinM

fbshipit-source-id: 628fcf4719a879f18978ff8a0a64afbb045df645
2020-03-20 13:19:31 -07:00
Natalia Gimelshein
3c90a90730 Revert D20540599: Add TensorExpr Fuser tests.
Test Plan: revert-hammer

Differential Revision:
D20540599

Original commit changeset: ced9b6657fe7

fbshipit-source-id: e8fa11f20207c35f39b3fbe6f45fc627715377c1
2020-03-19 18:37:32 -07:00
Mikhail Zolotukhin
7b59f41009 Add TensorExpr Fuser tests. (#35052)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/35052

Differential Revision: D20540599

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: ced9b6657fe72bca61833ab5d59bdaddcacd114b
2020-03-19 14:31:54 -07:00