Commit Graph

964 Commits

Mikhail Zolotukhin
b9c49f0e69 [TensorExpr] Support shape inference in TE for aten::cat. (#42387)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42387

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D22879281

Pulled By: ZolotukhinM

fbshipit-source-id: 775e46a4cfd91c63196b378ee587cc4434672c89
2020-08-05 14:11:24 -07:00
Kurt Mohler
df7c059428 Throw error if torch.set_deterministic(True) is called with nondeterministic CuBLAS config (#41377)
Summary:
For CUDA >= 10.2, the `CUBLAS_WORKSPACE_CONFIG` environment variable must be set to either `:4096:8` or `:16:8` to ensure deterministic CUDA stream usage. This PR adds some logic inside `torch.set_deterministic()` to raise an error if this environment variable is not set properly and CUDA >= 10.2.
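
A minimal standalone sketch of the check described above (the function name and structure here are illustrative, not the actual PyTorch internals):

```
#include <cstdlib>
#include <stdexcept>
#include <string>

// Illustrative sketch of the described check: with CUDA >= 10.2, deterministic
// mode requires CUBLAS_WORKSPACE_CONFIG to be ":4096:8" or ":16:8".
void check_cublas_config_for_determinism(int cuda_major, int cuda_minor) {
  if (cuda_major > 10 || (cuda_major == 10 && cuda_minor >= 2)) {
    const char* cfg = std::getenv("CUBLAS_WORKSPACE_CONFIG");
    const std::string value = cfg ? cfg : "";
    if (value != ":4096:8" && value != ":16:8") {
      throw std::runtime_error(
          "torch.set_deterministic(True) requires CUBLAS_WORKSPACE_CONFIG to be "
          ":4096:8 or :16:8 when CUDA >= 10.2");
    }
  }
}
```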

Issue https://github.com/pytorch/pytorch/issues/15359

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41377

Reviewed By: malfet

Differential Revision: D22758459

Pulled By: ezyang

fbshipit-source-id: 4b96f1e9abf85d94ba79140fd927bbd0c05c4522
2020-08-05 12:42:24 -07:00
Mikhail Zolotukhin
b3ffebda7a [TensorExpr] Properly handle all dtypes of the condition in evaluation of IfThenElse exprs. (#42495)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42495

Test Plan: Imported from OSS

Reviewed By: nickgg

Differential Revision: D22910753

Pulled By: ZolotukhinM

fbshipit-source-id: f9ffd3dc4c50fb3fb84ce6d6916c1fbfd3201c8f
2020-08-04 12:25:56 -07:00
Mikhail Zolotukhin
c334ebf1aa [TensorExpr] Properly handle all dtypes in evaluation of Intrinsics exprs. (#42494)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42494

Note that we're currently assuming that the dtypes of all the arguments and
the return value are the same.

Test Plan: Imported from OSS

Reviewed By: nickgg

Differential Revision: D22910755

Pulled By: ZolotukhinM

fbshipit-source-id: 7f899692065428fbf2ad05d22b4ca39cab788ae5
2020-08-04 12:25:54 -07:00
Mikhail Zolotukhin
38a9984451 [TensorExpr] Properly handle all dtypes in evaluation of CompareSelect exprs. (#42493)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42493

Test Plan: Imported from OSS

Reviewed By: nickgg

Differential Revision: D22910754

Pulled By: ZolotukhinM

fbshipit-source-id: cf7073d6ea792998a9fa3989c7ec486419476de0
2020-08-04 12:24:03 -07:00
Ann Shan
d707d4bf6d Implement a light SGD optimizer (#42137)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42137

This PR implements an SGD optimizer class similar to torch::optim::SGD, but it doesn't inherit from torch::optim::Optimizer, for use on mobile devices (or other lightweight use cases).

Adding Martin's comment for visibility: "SGD may be the only optimizer used in near future. If more client optimizers are needed, refactoring the full optim codes and reusing the existing code would be an option."
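
As a rough illustration of what such a lightweight optimizer boils down to, here is a hypothetical plain SGD step written against the public tensor API (a sketch only, not the mobile SGD class added in this PR):

```
#include <torch/torch.h>
#include <vector>

// Hypothetical sketch: one vanilla SGD step (no momentum or weight decay),
// roughly the update a lightweight mobile optimizer needs per iteration.
void sgd_step(std::vector<torch::Tensor>& params, double lr) {
  torch::NoGradGuard no_grad;  // parameter updates must not be recorded by autograd
  for (auto& p : params) {
    if (p.grad().defined()) {
      p.add_(p.grad(), -lr);  // p <- p - lr * grad(p)
    }
  }
}
```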

Test Plan: Imported from OSS

Reviewed By: iseeyuan

Differential Revision: D22846514

Pulled By: ann-ss

fbshipit-source-id: f5f46804aa021e7ada7c0cd3f16e24404d10c7eb
2020-08-03 17:27:53 -07:00
Nick Gibson
f47e00bdc3 [NNC] Bounds Inference: make inferred bounds respect gaps (#42185)
Summary:
A heavy refactor of bounds inference to fix some issues and bugs blocking its use for analyzing cross-thread interactions:
* We were merging all accesses to a Buf into a single bounds info entry, even if they did not overlap. E.g. if we accessed a[0:2] and a[5:6] we would merge that into a bound of a[0:6]. I've changed this behaviour to merge only overlapping bounds.
* We were not separating bounds of different kinds (e.g. Load vs Store) and would merge Store bounds into Load bounds, losing the information about what kind of access it was. E.g. this loop would produce bounds [{Load, 0, 10}] and now produces bounds [{Load, 0, 9}, {Store, 1, 10}]:
```
for i in 1 to 10...
  x[i] = x[i-1]
```
* Both ComputeAt and Rfactor relied on the overzealous merging and only used a single entry in the bounds list to determine the bounds of temporary buffers they created, which could result in temporary buffers being allocated smaller than the accesses to them. I've fixed Rfactor, but *not* ComputeAt - however all ComputeAt tests still pass (may require loop fusion to trigger this issue) - I will come back to it.

Being more precise about bounds is more complex: rather than taking the minimum of starts and the maximum of stops, we now need to determine whether two bounds overlap or are adjacent (as sketched below). There are many edge cases, so I've added a bunch of test coverage of the merging method.
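
A small standalone sketch of the merging rule described above (illustrative only, not the NNC implementation): two closed integer bounds are combined only when they overlap or are adjacent; otherwise they stay separate.

```
#include <algorithm>
#include <optional>

// Illustrative sketch: closed ranges [start, stop] merge only when they
// overlap or touch, e.g. [0, 2] and [5, 6] are kept as two separate bounds.
struct Bound { int start; int stop; };

std::optional<Bound> tryMerge(const Bound& a, const Bound& b) {
  if (a.start > b.stop + 1 || b.start > a.stop + 1) {
    return std::nullopt;  // there is a gap, so the bounds are not merged
  }
  return Bound{std::min(a.start, b.start), std::max(a.stop, b.stop)};
}
```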

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42185

Reviewed By: mruberry

Differential Revision: D22870391

Pulled By: nickgg

fbshipit-source-id: 3ee34fcbf0740a47259defeb44cba783b54d0baa
2020-07-31 20:22:04 -07:00
Mikhail Zolotukhin
2decccea2e [TensorExpr] Implement shape inference for TE. (#41451)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41451

Since TE operates on a limited subset of ops with well-defined
semantics, we can easily infer the shapes of intermediate and output tensors
given the shapes of the inputs.

There are a couple of ops that are not yet supported by the shape
inference; once we add them we can relax the shape-info requirement
in the TE fuser: currently it requires all values in the fusion group to
have known shapes, and we can change that to only the inputs.
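
As a rough illustration of the kind of rule involved (here for concatenation, which the `aten::cat` commit above builds on), a sketch rather than the actual TE implementation:

```
#include <cstdint>
#include <vector>

// Sketch of a shape-inference rule (not the TE implementation): the output of
// a concatenation matches its inputs except along `dim`, where sizes are summed.
std::vector<int64_t> inferCatShape(
    const std::vector<std::vector<int64_t>>& input_shapes, size_t dim) {
  std::vector<int64_t> out = input_shapes.front();
  out[dim] = 0;
  for (const auto& shape : input_shapes) {
    out[dim] += shape[dim];
  }
  return out;
}
```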

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D22543470

Pulled By: ZolotukhinM

fbshipit-source-id: 256bae921028cb6ec3af91977f12bb870c385f40
2020-07-31 20:05:21 -07:00
Wanchao Liang
a9e7e787f8 [jit] make clone work for interface type (#42121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42121

This PR changes the Module API to allow registering a module with a module
interface type, and therefore allows Module::clone to work in the case
where a module interface type is shared by two submodules.

The interface type will be shared by the new cloned instance in the same
compilation unit because it only contains a list of functionSchema, which,
unlike classType, does not involve any attributes.

Fixes https://github.com/pytorch/pytorch/issues/41882

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D22781205

Pulled By: wanchaol

fbshipit-source-id: f97f4b75970f0b434e38b5a1f778eda2c4e5109b
2020-07-31 10:24:27 -07:00
Edward Yang
352e15f1a2 Revert D22812445: Update TensorPipe submodule
Test Plan: revert-hammer

Differential Revision:
D22812445 (2335430086)

Original commit changeset: e6d824bb28f5

fbshipit-source-id: 606632a9aaf2513b5ac949e4d6687aa7563eae5d
2020-07-31 10:16:48 -07:00
Luca Wehrstedt
2335430086 Update TensorPipe submodule (#42225)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42225

Main changes:
- Consolidated CMake files to have a single entry point, rather than having a specialized one for PyTorch.
- Changed the way the preprocessor flags are provided, and changed their name.

There were a few instances in PyTorch's CMake files where we were directly adding TensorPipe's source directory as an include path, which, however, doesn't contain the auto-generated header we now added. We fix that by adding the `tensorpipe` CMake target as a dependency, so that the include paths defined by TensorPipe (which contain that auto-generated header) are used.

I'm turning off SHM and CMA for now because they have never been covered by the CI. I'll enable them in a separate PR so that if they turn out to be flaky we can revert that change without reverting this one.

Test Plan: CircleCI is all green.

Reviewed By: beauby

Differential Revision: D22812445

fbshipit-source-id: e6d824bb28f5afe75fd765de0430968174f3531f
2020-07-30 02:32:52 -07:00
Yujun Zhao
0444bac940 Add test to cross function
Summary: The function `cross_kernel_scalar` in `Aten/native/cpu/CrossKernel.cpp` is not covered; add tests to cover it.

Test Plan:
1. Test locally to check new lines are covered
2. CI

https://pxl.cl/1fZjG

Reviewed By: malfet

Differential Revision: D22834122

fbshipit-source-id: 0d50f3a3e6aee52cb6fdee2b9f5883f542c7b6e2
2020-07-29 22:48:52 -07:00
Yujun Zhao
9ea7476d9c Add test to lerp function (#42266)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42266

The functions `lerp_kernel_scalar` and `lerp_kernel_tensor` in `Aten/native/cpu/LerpKernel.cpp` are not covered; add tests to cover them.

Test Plan:
1. Test locally to check new lines are covered
2. CI

https://pxl.cl/1fXPd

Reviewed By: malfet

Differential Revision: D22832164

fbshipit-source-id: b1eaabbf8bfa08b4dedc1a468abfdfb619a50e3c
2020-07-29 22:47:37 -07:00
Ann Shan
4b108ca763 refactor save_data as a non-member function (#42045)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42045

This PR changes the save_data() member function of torch::jit::mobile::Module, which was introduced in #41403, to be the non-member function torch::jit::mobile::_save_parameters() (taking a mobile Module as its first argument).

In addition, this PR:
* adds a getter function _ivalue() for the mobile::Module object
* renames torch::jit::mobile::_load_mobile_data() to torch::jit::mobile::_load_parameters()
* refactors the import.h header file into import.h and import_data.h

Test Plan: Imported from OSS

Reviewed By: kwanmacher, iseeyuan

Differential Revision: D22766781

Pulled By: ann-ss

fbshipit-source-id: 5cabae31927187753a958feede5e9a28d71d9e92
2020-07-28 21:52:32 -07:00
lixinyu
5246bc4e87 register parameters correctly in c++ MultiheadAttention (#42037)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42037

This is to fix #41951

Test Plan: Imported from OSS

Reviewed By: yf225

Differential Revision: D22764717

Pulled By: glaringlee

fbshipit-source-id: e6da0aeb05a2356f52446e6d5fad391f2cd1cf6f
2020-07-27 13:58:11 -07:00
Yanan Cao
890b52e09f Reduce instability in runCleanUpPasses by reordering passes. (#41891)
Summary:
Currently constant pooling runs before const propagation, which can create more constants that need pooling. This gets in the way of serialization/deserialization stability, because each time a user serializes and deserializes a module, runCleanUpPasses is called on it, and doing so multiple times would lead to a different saved module each time.

This PR moves constant pooling after const propagation, which may slow down const propagation a little bit, but otherwise side-steps the aforementioned problem.

test_constant_insertion in test_jit.py is also updated because, after fixing the pass ordering, the number of constants is no longer fixed and it is extremely difficult to get the exact number with the current convoluted test structure. So for now, I changed the test to check only that CSE doesn't change the number of "prim::constant" nodes, rather than comparing against a known number. Also left a TODO to improve this test.

The ConstantPropagation pass is replaced by ConstantPropagationImmutableTypes because the latter is what is used in runCleanUpPasses. If not replaced, the former would create new CSE opportunities by folding more constants, which defeats the purpose of the test case.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41891

Reviewed By: colesbury

Differential Revision: D22701540

Pulled By: gmagogsfm

fbshipit-source-id: 8e60dbdcc54a93dac111d81b8d88fb39387224f5
2020-07-24 11:39:20 -07:00
Ann Shan
dfe7d27d0e implement lite parameter serializer (#41403)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/41403

Test Plan: Imported from OSS

Reviewed By: kwanmacher

Differential Revision: D22611633

Pulled By: ann-ss

fbshipit-source-id: b391e8c96234b2e69f350119a11f688e920c7817
2020-07-23 14:25:44 -07:00
Nick Gibson
aa91a65b59 [TensorExpr] Fix propagation of loop options when splitting loops (#40035)
Summary:
Fix a bug in SplitWithTail and SplitWithMask where loop_options such as Cuda block/thread bindings are overwritten by the split. This PR fixes this bug by propagating the loop options to the outer loop, which for axis bindings should be equivalent.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40035

Reviewed By: ZolotukhinM

Differential Revision: D22080263

Pulled By: nickgg

fbshipit-source-id: b8a9583fd90f69319fc4bb4db644e91f6ffa8e67
2020-07-22 11:49:07 -07:00
Bert Maher
941069ca09 [tensorexpr][trivial] Remove debug printing from test (#41806)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41806

It's generally good practice not to have tests spew output.

Test Plan:
`build/bin/test_tensorexpr`

Imported from OSS

Reviewed By: zheng-xq

Differential Revision: D22646833

fbshipit-source-id: 444e883307d058fe77e7550d436fa61b7d91a701
2020-07-21 15:54:31 -07:00
Nick Gibson
7ffdd765c8 [TensorExpr] more convenient outer Rfactor output (#40050)
Summary:
Auto-fuse the output loops of outer Rfactors so that the result is in a more convenient format for binding GPU axes.

An example:
```
  Tensor* c = Reduce("sum", {}, Sum(), b, {{m, "m"}, {n, "n"}, {k, "k"}});
  LoopNest loop({c});
  std::vector<For*> loops = loop.getLoopStmtsFor(c);
  auto v = loops.at(0)->var();
  loop.rfactor(c->body(), v);
```
Before:
```
{
  Allocate(tmp_buf, float, {m});
  sum[0] = 0.f;
  for (int m_1 = 0; m_1 < m; m_1++) {
    tmp_buf[m_1] = 0.f;
  }
  for (int m_1 = 0; m_1 < m; m_1++) {
    for (int n = 0; n < n_1; n++) {
      for (int k = 0; k < k_1; k++) {
        tmp_buf[m_1] = (tmp_buf[m_1]) + (b[((n_1 * m_1) * k_1 + k) + k_1 * n]);
      }
    }
  }
  for (int m_1 = 0; m_1 < m; m_1++) {
    sum[0] = (sum[0]) + (tmp_buf[m_1]);
  }
  Free(tmp_buf);
}
```

After:
```
{
  sum[0] = 0.f;
  for (int m = 0; m < m_1; m++) {
    Allocate(tmp_buf, float, {m_1});
    tmp_buf[m] = 0.f;
    for (int n = 0; n < n_1; n++) {
      for (int k = 0; k < k_1; k++) {
        tmp_buf[m] = (tmp_buf[m]) + (b[((n_1 * m) * k_1 + k) + k_1 * n]);
      }
    }
    sum[0] = (sum[0]) + (tmp_buf[m]);
    Free(tmp_buf);
  }
}
```

The existing Rfactor tests cover this case, although I did rename a few for clarity. This change broke the LLVMRFactorVectorizedReduction test because it now does what it's intended to do (vectorize a loop with a reduction in it) rather than nothing, and since that doesn't work yet, it correctly fails. I've disabled it for now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40050

Reviewed By: ZolotukhinM

Differential Revision: D22605639

Pulled By: nickgg

fbshipit-source-id: e359be53ea62d9106901cfbbc42d55d0e300e8e0
2020-07-21 14:44:26 -07:00
Ann Shan
1039bbf4eb add named parameters to mobile module (#41376)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41376

torch::jit::mobile::Module does not currently support accessing parameters via their attribute names, but torch::jit::Module does. This diff adds equivalent functionality to mobile::Module.

Test Plan: Imported from OSS

Reviewed By: iseeyuan

Differential Revision: D22609142

Pulled By: ann-ss

fbshipit-source-id: 1a5272ff336f99a3c0bb6194c6a6384754f47846
2020-07-20 15:57:49 -07:00
Ilia Cherniavskii
e7a09b4d17 RecordFunction in Dispatcher (#37587)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37587

Lifting RecordFunction up into the dispatcher code

Test Plan: Imported from OSS

Differential Revision: D21374246

fbshipit-source-id: 19f9c1719e6fd3990e451c5bbd771121e91128f7
2020-07-17 22:20:05 -07:00
Stanislau Hlebik
b774ce54f8 remediation of S205607
fbshipit-source-id: 798decc90db4f13770e97cdce3c0df7d5421b2a3
2020-07-17 17:19:47 -07:00
Stanislau Hlebik
8fdea489af remediation of S205607
fbshipit-source-id: 5113fe0c527595e4227ff827253b7414abbdf7ac
2020-07-17 17:17:03 -07:00
Heitor Schueroff de Souza
cf811d2fb3 retain undefined tensors in backward pass (#41490)
Summary:
Leave undefined tensors / None returned from custom backward functions as undefined/None instead of creating a tensor full of zeros. This change improves performance in some cases.

**This is BC-Breaking:** Custom backward functions that return None will now see it potentially being propagated all the way up to AccumulateGrad nodes. The potential impact is that the .grad field of leaf tensors, as well as the result of autograd.grad, may be undefined/None where it used to be a tensor full of zeros. Also, autograd.grad may raise an error; if so, consider using allow_unused=True ([see doc](https://pytorch.org/docs/stable/autograd.html?highlight=autograd%20grad#torch.autograd.grad)) if it applies to your case.
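
For example, a caller that previously relied on zero-filled gradients might now need something like the following (a hedged sketch using the C++ autograd API; with this change, entries of the result can come back undefined when `allow_unused` is set):

```
#include <torch/torch.h>

// Sketch: gradients that are not produced may now come back undefined
// instead of as zero-filled tensors, so callers should check defined().
void example() {
  auto x = torch::randn({2, 2}, torch::requires_grad());
  auto y = torch::randn({2, 2}, torch::requires_grad());
  auto out = (x * x).sum();  // y does not participate in the graph

  auto grads = torch::autograd::grad(
      /*outputs=*/{out}, /*inputs=*/{x, y},
      /*grad_outputs=*/{}, /*retain_graph=*/false,
      /*create_graph=*/false, /*allow_unused=*/true);

  if (!grads[1].defined()) {
    // The gradient for y is undefined/None rather than a tensor of zeros.
  }
}
```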

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41490

Reviewed By: albanD

Differential Revision: D22578241

Pulled By: heitorschueroff

fbshipit-source-id: f4966f4cb520069294f8c5c1691eeea799cc0abe
2020-07-17 12:42:50 -07:00
Mikhail Zolotukhin
5d7046522b [JIT] Teach IRPrinter and IRParser to handle 'requires_grad' and 'device' as a part of type info. (#41507)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41507

These fields have always been a part of tensor types; this change just
makes them serializable through IR dumps.

Test Plan: Imported from OSS

Reviewed By: Krovatkin, ngimel

Differential Revision: D22563661

Pulled By: ZolotukhinM

fbshipit-source-id: f01aaa130b7e0005bf1ff21f65827fc24755b360
2020-07-17 10:27:04 -07:00
albanD
45c5bac870 [WIP] Fix cpp grad accessor API (#40887)
Summary:
Update the API for accessing grad in C++ to avoid unexpected thread-safety issues.
In particular, with the current API, a check like `t.grad().defined()` is not thread safe.

- This introduces `t.mutable_grad()`, which should be used when getting a mutable version of the saved gradient. This function is **not** thread safe.
- The `Tensor& grad()` API is now removed. We could not do a deprecation cycle, as most of our call sites use non-const Tensors and would pick the non-const overload; this would lead to most calls hitting the warning, which would be too verbose for users. (See the sketch below.)
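
A short sketch of the intended split in usage (assuming typical call sites; only the accessor names come from this PR):

```
#include <torch/torch.h>

// Sketch: read-only access goes through grad(); mutating the stored gradient
// goes through the explicit mutable_grad(), which the PR documents as not
// being thread safe.
void example(const torch::Tensor& t, torch::Tensor& leaf) {
  // Read path: inspect the gradient without mutating it.
  if (t.grad().defined()) {
    auto g_norm = t.grad().norm();
    (void)g_norm;
  }

  // Write path: explicitly ask for a mutable reference to the stored gradient.
  leaf.mutable_grad() = torch::zeros_like(leaf);
}
```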

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40887

Reviewed By: ezyang

Differential Revision: D22343932

Pulled By: albanD

fbshipit-source-id: d5eb909bb743bc20caaf2098196e18ca4110c5d2
2020-07-16 09:11:12 -07:00
Pritam Damania
ff6e560301 Add C++ end to end test for RPC and distributed autograd. (#36893)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36893

Adding an end to end test for running a simple training loop in C++
for the distributed RPC framework.

The goal of this change is to enable LeakSanitizer and potentially catch memory
leaks in the Future. Enabling LSAN with python multiprocessing is tricky and we
haven't found a solution for this. As a result, adding a C++ test that triggers
most of the critical codepaths would be good for now.

As an example, this unit test would've caught the memory leak fixed by:
https://github.com/pytorch/pytorch/pull/31030
ghstack-source-id: 107781167

Test Plan:
1) Verify the test catches memory leaks.
2) waitforbuildbot

Reviewed By: mrshenli

Differential Revision: D21112208

fbshipit-source-id: 4eb2a6b409253108f6b6e14352e593d250c7a64d
2020-07-15 12:59:19 -07:00
Meghan Lele
4972cf06a2 [JIT] Add out-of-source-tree to_backend tests (#41145)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41145

**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D22510076

Pulled By: SplitInfinity

fbshipit-source-id: f65964ef3092a095740f06636ed5b1eb0884492d
2020-07-14 10:57:04 -07:00
yyn19951228
98df9781a7 Impl for ParameterList (#41259)
Summary:
This is a new PR for https://github.com/pytorch/pytorch/issues/40850, https://github.com/pytorch/pytorch/issues/40987 and https://github.com/pytorch/pytorch/issues/41206 (which I unintentionally closed), as I had some issues with rebases on that one. Very sorry about that. I have also fixed the tests that failed in that PR.

This diff contains the implementation of the C++ API for ParameterList from https://github.com/pytorch/pytorch/issues/25883.
Refer to the Python API: bc9e8af218/torch/nn/modules/container.py (L376)
I'm not sure about some naming differences between the C++ API and the Python API; for example, should `append` be called `push_back`?
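
Conceptually, what ParameterList provides is indexed registration of a runtime-sized collection of parameters on a module; a rough stand-in using only `register_parameter` (this sketch deliberately does not use the new class itself) looks like:

```
#include <torch/torch.h>
#include <string>
#include <vector>

// Rough conceptual stand-in for what ParameterList automates: registering a
// runtime-sized collection of parameters under numeric names so that they are
// picked up by parameters(), serialization, optimizers, etc.
struct ToyListModule : torch::nn::Module {
  std::vector<torch::Tensor> params;

  explicit ToyListModule(const std::vector<torch::Tensor>& tensors) {
    for (size_t i = 0; i < tensors.size(); ++i) {
      params.push_back(register_parameter(std::to_string(i), tensors[i]));
    }
  }
};
```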

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41259

Test Plan: Add unit tests in this diff

Differential Revision: D22495780

Pulled By: glaringlee

fbshipit-source-id: 79ea3592db640f35477d445ecdaeafbdad814bec
2020-07-12 20:50:31 -07:00
Sebastian Messmer
9daba76ba1 Change to.dtype_layout to c10-full (#41169)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41169

-
ghstack-source-id: 107537240

Test Plan: waitforsandcastle

Differential Revision: D22289257

fbshipit-source-id: ed3cc06327951fa886eb3b8f1c8bcc014ae2bc41
2020-07-10 16:04:34 -07:00
Zino Benaissa
690946c49d Generalize constant_table from tensor only to ivalue (#40718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40718

Currently every constant except tensors must be inlined during serialization;
tensors are stored in the constant table. This patch generalizes this capability
to any IValue. This is particularly useful for non-ASCII string literals that
cannot be inlined.

Test Plan: Imported from OSS

Differential Revision: D22298169

Pulled By: bzinodev

fbshipit-source-id: 88cc59af9cc45e426ca8002175593b9e431f4bac
2020-07-09 09:09:40 -07:00
Rohan Varma
bf9cc5c776 Add callback with TLS state API in futures (#40326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40326

Adds a helper function `addCallbackWithTLSState` to both
torch/csrc/utils/future.h (which is used internally by the RPC framework) and the JIT
future. Uses this helper function to avoid manually passing in TLS state where it is needed, for RPC and `record_function_ops.cpp`. For example, the following:

```
at::ThreadLocalState tls_state;
fut->addCallback([tls_state = std::move(tls_state)]() {
  at::ThreadLocalStateGuard g(tls_state);
  some_cb_that_requires_tls_state();
});
```

becomes

```
fut->addCallbackWithTLSState(some_cb_that_requires_tls_state);
```
ghstack-source-id: 107383961

Test Plan: RPC Tests and added a test in test_misc.cpp

Differential Revision: D22147634

fbshipit-source-id: 46c02337b90ee58ca5a0861e932413c40d06ed4c
2020-07-08 23:25:35 -07:00
Brian Vaughan
2bc9ee97d1 Revert D22418731: [JIT] Add out-of-source-tree to_backend tests
Test Plan: revert-hammer

Differential Revision:
D22418731 (e2a291b396)

Original commit changeset: 621ba4efc1b1

fbshipit-source-id: 475ae24c5b612fe285035e5ebb92ffc66780a468
2020-07-08 13:11:45 -07:00
Meghan Lele
e2a291b396 [JIT] Add out-of-source-tree to_backend tests (#40842)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40842

**Summary**
This commit adds out-of-source-tree tests for `to_backend`. These tests check
that a Module can be lowered to a backend, exported, loaded (in both
Python and C++) and executed.

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Differential Revision: D22418731

Pulled By: SplitInfinity

fbshipit-source-id: 621ba4efc1b121fa76c9c7ca377792ac7440d250
2020-07-07 21:00:43 -07:00
Meghan Lele
5a4c45f8d1 [JIT] Move TestBackend to test directory (#40840)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40840

**Summary**
This commit moves the TestBackend used for the JIT backend
extension to the tests directory. It was temporarily placed
in the source directory while figuring out some details of
the user experience for this feature.

**Test Plan**
`python test/test_jit.py TestBackends`

**Fixes**
This commit fixes #40067.

Test Plan: Imported from OSS

Differential Revision: D22418682

Pulled By: SplitInfinity

fbshipit-source-id: 9356af1341ec4d552a41c2a8929b327bc8b56057
2020-07-07 21:00:38 -07:00
James Reed
c0f9bf9bea s/torch::jit::class_/torch::class_/ (#40795)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/40795

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D22314215

Pulled By: jamesr66a

fbshipit-source-id: a2fb5c6804d4014f8e437c6858a7be8cd3efb380
2020-07-06 15:53:33 -07:00
Christian Sarofeen
b9b4f05abf [nvFuser] Working towards reductions, codegen improvements (#40864)
Summary:
We have basic reduction fusion working and have improved the code generator to approach the performance of eager-mode reductions. Coming soon are pointwise-reduction fusions, done in a way that should prevent the possibility of hitting regressions. We are also working on performant softmax kernels in the code generator, which may be our next fusion target.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40864

Reviewed By: ngimel

Differential Revision: D22392877

Pulled By: soumith

fbshipit-source-id: 457448a807d628b1035f6d90bc0abe8a87bf8447
2020-07-06 14:52:49 -07:00
Sebastian Messmer
53af9df557 Unify boxed function signature between jit and c10 (#37034)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37034

c10 takes a Stack* in boxed functions while JIT took a Stack&.
c10 doesn't return anything while JIT returned an int which was always zero.

This changes JIT to follow the c10 behavior.
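
Concretely, after this change a boxed operation has roughly the following shape (a sketch using the JIT stack helpers; before this change, the JIT form took a `Stack&` and returned an `int` that was always zero):

```
#include <ATen/core/stack.h>

// Sketch of the unified boxed calling convention: pop inputs from the stack,
// push outputs back, and return void instead of a meaningless int.
void add_one_boxed(torch::jit::Stack* stack) {
  at::Tensor x = torch::jit::pop(*stack).toTensor();
  torch::jit::push(*stack, x.add(1));
}
```
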
ghstack-source-id: 106834069

Test Plan: unit tests

Differential Revision: D20567950

fbshipit-source-id: 1a7aea291023afc52ae706957e9a5ca576fbb53b
2020-06-29 19:24:26 -07:00
yyn19951228
4121d34036 Python/C++ API Parity: Add impl and tests for ParameterDict (#40654)
Summary:
This diff contains the implementation of the C++ API for ParameterDict from https://github.com/pytorch/pytorch/issues/25883; refer to https://github.com/pytorch/pytorch/issues/36904 and https://github.com/pytorch/pytorch/issues/28652.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40654

Test Plan: Add unit test in this diff

Differential Revision: D22273265

Pulled By: glaringlee

fbshipit-source-id: 9134a92c95eacdd53d5b24470d5f7edbeb40a488
2020-06-29 08:50:44 -07:00
Peter Bell
3dcc329746 Use tree-based sum for floats to avoid numerical instability (#39516)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/38716, fixes https://github.com/pytorch/pytorch/issues/37234

This algorithm does the summation along a single axis with multiple "levels" of accumulators, each of which is designed to hold the sum of an order of magnitude more values than the previous one.

e.g. if there are 2^16 elements, the first level will hold the sum of 2^4 elements, and so on in increasing powers of 2: 2^4, 2^8, 2^12 and finally 2^16.

This limits the differences in magnitude of the partial results being added together, and so we don't lose accuracy as the axis length increases.
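
A small scalar sketch of the idea (illustrative only, not the vectorized kernel): accumulate fixed-size chunks first, then sum the partial results, so each addition combines values of roughly similar magnitude.

```
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative two-level cascade: chunk sums are accumulated separately and
// then combined, limiting the magnitude gap between the operands of each add.
float cascadeSum(const std::vector<float>& v, std::size_t chunk = 16) {
  std::vector<float> partials;
  for (std::size_t i = 0; i < v.size(); i += chunk) {
    float acc = 0.f;
    for (std::size_t j = i; j < std::min(v.size(), i + chunk); ++j) {
      acc += v[j];
    }
    partials.push_back(acc);
  }
  float total = 0.f;
  for (float p : partials) total += p;
  return total;
}
```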

WIP to write a vectorized version.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39516

Reviewed By: ezyang

Differential Revision: D22106251

Pulled By: ngimel

fbshipit-source-id: b56de4773292439dbda62b91f44ff37715850ae9
2020-06-24 17:06:38 -07:00
Peter Bell
16f276cef9 Add C++-only int dim overloads to std-related operations (#40451)
Summary:
Fixes gh-40287

The `int -> bool` conversion takes higher precedence than `int -> IntArrayRef`. So, calling `std(0)` in C++ would select the `std(unbiased=False)` overload instead.
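
A short sketch of the ambiguity (assuming ordinary tensor code; the before/after behavior is as described in the commit message):

```
#include <torch/torch.h>

void example() {
  auto x = torch::randn({3, 4});

  // Before the fix, the literal 0 converted to bool before IntArrayRef, so this
  // resolved to std(/*unbiased=*/false) over all elements. With the added int
  // dim overloads it reduces over dimension 0, as intended.
  auto per_column = x.std(0);

  // Spelling the dimension out as an IntArrayRef was always unambiguous.
  auto per_column_explicit = x.std(torch::IntArrayRef{0});
  (void)per_column;
  (void)per_column_explicit;
}
```
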
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40451

Differential Revision: D22217926

Pulled By: ezyang

fbshipit-source-id: 7520792fab5ab6665bddd03b6f57444c6c729af4
2020-06-24 16:56:55 -07:00
Mike Ruberry
e66445878d Adds dynamic versioning pattern (#40279)
Summary:
BC NOTE:

This change makes it so modules saved with torch.jit.save in PyTorch 1.6 can be loaded by previous versions of PyTorch unless they use torch.div or (soon) torch.full. It also lets tensors saved using torch.save be loaded by previous versions. So this is the opposite of BC-breaking, but I'm using that label to highlight this issue since we don't have a "BC-improving" label.

PR NOTE:
When an operator's semantics change in PyTorch we want to do two things:

1) Preserve the semantics of older serialized Torchscript programs that use the operator
2) Ensure the new semantics are respected

Historically, this meant writing a Versioned Symbol that would remap older versions of the operator into current PyTorch code (1), and bumping the produced file format version (2). Unfortunately, bumping the produced file format version is a nuclear option for ensuring semantics are respected, since it also prevents older versions of PyTorch from loading anything (even tensors!) from newer versions.

Dynamic versioning addresses the nuclear consequences of bumping the produced file format version by only bumping it when necessary. That is, when an operator with changed semantics is detected in the serialized Torchscript. This will prevent Torchscript programs that use the changed operator from loading on earlier versions of PyTorch, as desired, but will have no impact on programs that don't use the changed operator.

Note that this change is only applicable when using torch.jit.save and torch.jit.load. torch.save pickles the given object using pickle (by default), which saves a function's Python code directly.

No new tests for this behavior are added since the existing tests for versioned division in test_save_load already validate that models with div are loaded correctly at version 4.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40279

Reviewed By: dzhulgakov

Differential Revision: D22168291

Pulled By: mruberry

fbshipit-source-id: e71d6380e727e25123c7eedf6d80e5d7f1fe9f95
2020-06-24 12:52:50 -07:00
Mike Ruberry
cb26661fe4 Throws runtime error when torch.full would infer a float dtype from a bool or integral fill value (#40364)
Summary:
BC-breaking NOTE:

In PyTorch 1.6, bool and integral fill values given to torch.full must set the dtype or out keyword arguments. In prior versions of PyTorch these fill values would return float tensors by default, but in PyTorch 1.7 they will return a bool or long tensor, respectively. The documentation for torch.full has been updated to reflect this.

PR NOTE:

This PR causes torch.full to throw a runtime error when it would have inferred a float dtype by being given a boolean or integer value. A versioned symbol for torch.full is added to preserve the behavior of already serialized Torchscript programs. Existing tests for this behavior being deprecated have been updated to reflect it now being unsupported, and a couple new tests have been added to validate the versioned symbol behavior. The documentation of torch.full has also been updated to reflect this change.
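
In practice this just means spelling out the dtype (or using a floating-point fill value); for example, with the C++ factory function (a sketch of the equivalent call, not taken from the PR):

```
#include <torch/torch.h>

void example() {
  // Integral fill value with an explicit dtype, rather than relying on an
  // inferred float dtype:
  auto a = torch::full({2, 3}, 7, torch::dtype(torch::kLong));

  // A floating-point fill value still yields a float tensor without extra options.
  auto b = torch::full({2, 3}, 7.0);
  (void)a;
  (void)b;
}
```
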
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40364

Differential Revision: D22176640

Pulled By: mruberry

fbshipit-source-id: b20158ebbcb4f6bf269d05a688bcf4f6c853a965
2020-06-23 23:27:22 -07:00
Deepali Chourasia
02e091902f Release DistAutogradContainer context for each dist_autograd test case (#38711)
Summary:
This fixes https://github.com/pytorch/pytorch/issues/38710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38711

Differential Revision: D22132057

fbshipit-source-id: 894280d164543c63beaec679c18f2059e7055b01
2020-06-18 20:58:55 -07:00
Xiang Gao
954a59a2f5 Add at::tensor(complex) and torch::tensor(complex) overload (#39793)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39793

Differential Revision: D22067181

Pulled By: anjali411

fbshipit-source-id: 3cec1289a8aa3a9cc6bd1fcdb2974f858f75f7bd
2020-06-18 16:20:27 -07:00
Sotiris Lamprinidis
41f2dbde31 Add AdamW to C++ frontend (#40009)
Summary:
This is a slightly modified Adam, following the Python implementation, and the `ProducesPyTorchValues` tests pass. I had a problem with another test, though (see commit c1a6241676ab84fc531c1c3a10f964aa5704092e): it seems that optimizing for two steps with the same optimizer vs. optimizing for two steps using freshly initialized objects will produce the same output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40009

Differential Revision: D22096053

Pulled By: glaringlee

fbshipit-source-id: a31a8f5488cb37c53752ddf15436efabdba67dc4
2020-06-18 15:28:12 -07:00
Mikhail Zolotukhin
79450edad3 [JIT] IRParser: properly parse negative numbers. (#39981)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39981

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D22032786

Pulled By: ZolotukhinM

fbshipit-source-id: b6c5237ac5c1c331d5053a620eb9a37a4c698125
2020-06-15 12:28:41 -07:00
Jeremy Lilley
569c85b45d [futures] Add assert to Future constValue() accessor, add hasValue(). (#39950)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39950

Per the comment in the code, constValue() should only be used in
the case where the future has completed and the value is not an error.
Add an assert to enforce this.

Also, add a hasValue() accessor for completeness.
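
A minimal sketch of the guarded access pattern this enables (assuming the `ivalue::Future` API described above):

```
#include <ATen/core/ivalue.h>

// Sketch: constValue() is only legal once the future has completed without an
// error, which callers can now check via hasValue() before accessing it.
void example(const c10::intrusive_ptr<c10::ivalue::Future>& fut) {
  if (fut->hasValue()) {
    c10::IValue v = fut->constValue();
    (void)v;
  }
}
```
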
ghstack-source-id: 105815597

Test Plan: buck test mode/dev-nosan caffe2/test/cpp/jit:

Differential Revision: D22021776

fbshipit-source-id: b59b6c775eab344068a76f4cd8c3a9dc1f2a174e
2020-06-15 12:11:22 -07:00
Kurt Mohler
124cdf2290 Add experimental deterministic flag (#38683)
Summary:
Adds a `torch.experimental.deterministic` flag to enforce deterministic algorithms across all of PyTorch.
Adds `torch.experimental.deterministic_error_level` to allow users to choose between error/warning/silent if determinism for an operation is not available.
Adds `torch.experimental.alert_not_deterministic()`, which should be called within operations that are not deterministic.
Offers both Python and ATen interfaces.

Issue https://github.com/pytorch/pytorch/issues/15359
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38683

Differential Revision: D21998093

Pulled By: ezyang

fbshipit-source-id: 23aabbddd20f6199d846f97764ff24d728163737
2020-06-12 08:44:06 -07:00