Commit Graph

6361 Commits

Author SHA1 Message Date
Nikitha Malgi
616503ac19 Merge branch 'cuda_stream_jit' of https://github.com/pytorch/pytorch into cuda_stream_jit 2020-11-13 08:14:10 -08:00
Nikitha Malgi
91142f77fe WIP changes 2020-10-26 09:30:51 -07:00
Nikitha Malgi
fa3801a44e WIP changes 2020-10-26 08:41:58 -07:00
Nikitha Malgi
68eb21b72a Adding JIT CUDA Stream support 2020-10-23 11:00:41 -07:00
Ansley Ussery
6c5f634657 Fix grammar and spelling errors (#46713)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46713

Test Plan: Imported from OSS

Reviewed By: Lilyjjo

Differential Revision: D24477771

Pulled By: ansley

fbshipit-source-id: bc39b63ab2158a5233e48b89bfaa97a4cfb1f7a1
2020-10-23 01:31:17 -07:00
albanD
27e2ea4cea Make add_relu an internal function (#46676)
Summary:
Cleanup for 1.7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46676

Reviewed By: gchanan

Differential Revision: D24458565

Pulled By: albanD

fbshipit-source-id: b1e4b4630233d3f1a4bac20e3077411d1ae17f7b
2020-10-22 18:08:15 -07:00
Hao Lu
d6519d4e9f [pt][static_runtime] Add option enable_out_variant (#46690)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46690

- Add option enable_out_variant to Static Runtime
- Add gflags --pt_cleanup_activations and --pt_enable_out_variant to the benchmark script

Reviewed By: yinghai, houseroad

Differential Revision: D24438107

fbshipit-source-id: c1185c0fee93edc0118542b2faa8bc4ffdd19075
2020-10-22 15:00:23 -07:00
Ivan Kobzarev
3112e23428 [py][vulkan][reland] Add is_vulkan to py api, add vulkan to device type parsing (#46655)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46655

Test Plan: Imported from OSS

Pulled By: IvanKobzarev

Reviewed By: mrshenli

Differential Revision: D24448984

fbshipit-source-id: 5000846a06077f7a5a06dd51da422d2a42f70820
2020-10-22 09:35:50 -07:00
albanD
143d1fd9f5 Namespace cleanup for 1.7 Part 2 (#46673)
Summary:
make valgrind_toggle and valgrind_supported_platform private functions

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46673

Reviewed By: gchanan

Differential Revision: D24458133

Pulled By: albanD

fbshipit-source-id: 6f3fad9931d73223085edbd3cd3b7830c569570c
2020-10-22 07:57:51 -07:00
Yi Wang
98aad933b6 [pytorch][PR] Record FutureNCCL callback stream on CUDA caching allocator (#45318)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45318

When calling `then()` from WorkNCCL, record the input data pointers in futureNCCLCallbackStream_ before the execution of the input callback.

Note that the recording cannot be directly added to the lambda used by addCallback in ProcessGroupNCCL.hpp. This is because the type of the future value in that context is a pyobject rather than a TensorList, and such a type cast would require pybind and introduce a Python dependency, which should not be allowed in the c10d library.

I have considered creating a util function in a separate file to support this type casting, and then placing it under torch/csrc directory where python dependency is allowed. However, torch/csrc has a dependency on c10d, so this will create a circular dependency.

Finally, a `record_stream_cb_` member is added to FutureNCCL, with a default value of nullptr. A default `record_stream_cb_` implementation is added to `PythonFutureWrapper`, where Python dependency is allowed.

In addition, a few lines are reformatted by lint.
caffe2/torch/csrc/distributed/c10d/init.cpp is only reformatted.

Closes: https://github.com/pytorch/pytorch/issues/44203

Test Plan:
buck test mode/dev-nosan caffe2/test/distributed:c10d -- ProcessGroupNCCLTest
buck test mode/dev-nosan caffe2/test/distributed:c10d  -- test_accumulate_gradients_no_sync_allreduce_with_then_hook
buck test mode/dev-nosan caffe2/test/distributed:c10d  -- test_ddp_comm_hook_allreduce_with_then_hook_nccl

Reviewed By: pritamdamania87

Differential Revision: D23910257

fbshipit-source-id: 66920746c41f3a27a3689f22e2a2d9709d0faa15
2020-10-22 01:49:47 -07:00
Yi Wang
adffd8eb6b Add const to the first arg 'grad' of Reducer::copy_grad_to_bucket (#46501)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46501

Gradients in this method will not be modified.
ghstack-source-id: 114851646

Test Plan: waitforbuildbot

Reviewed By: pritamdamania87

Differential Revision: D24374300

fbshipit-source-id: a2941891008f9f197a5234b50260218932d2d37d
2020-10-21 21:34:31 -07:00
Rahul Nambiar
adbb50ea67 Enabling alias annotation checks for all operations during autograd tests (#46601)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46601

* except excluded tests and magic methods.

https://github.com/pytorch/pytorch/issues/38731

Previously, we'd only run these tests for inplace operations. Since this covers a lot more tests, fixed these issues that came up when running them -
- Updated schema of conj() to reflect existing behaviour.
- Updated deepEquals method in check_alias_annotation.cpp to re-use the overloaded == operator. The previous implementation did not cover all types of IValues.
- Corrected the order in which inputs are passed during autograd testing of 'view' & 'reshape'.
- Subbed out aten::ger with the function it's aliased to, aten::outer, for testing. The alias annotation checking code doesn't handle aliased operators properly.
ghstack-source-id: 114830903

Test Plan: Ran all tests in test:jit and verified they pass.

Reviewed By: eellison

Differential Revision: D24424955

fbshipit-source-id: 382d7e2585911b81b1573f21fff1d54a5e9a2054
2020-10-21 20:01:57 -07:00
Shen Li
cebe87fe3a Revert D24379422: [py][vulkan] Add is_vulkan to py api, add vulkan to device type parsing
Test Plan: revert-hammer

Differential Revision:
D24379422 (e8fbe54cf5)

Original commit changeset: afab89bb9e17

fbshipit-source-id: 743c77e453239f10c155c67490cba5a42ab42f58
2020-10-21 08:23:05 -07:00
Ivan Kobzarev
e8fbe54cf5 [py][vulkan] Add is_vulkan to py api, add vulkan to device type parsing (#46511)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46511

Test Plan: Imported from OSS

Reviewed By: AshkanAliabadi

Differential Revision: D24379422

Pulled By: IvanKobzarev

fbshipit-source-id: afab89bb9e17c50934083598262bbe14ea82e893
2020-10-20 20:04:24 -07:00
Pritam Damania
cb3c1d17e4 Promote -Wcast-function-type to an error in builds. (#46356)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46356

Adding the flag `-Werror=cast-function-type` to ensure we don't allow
any invalid casts (ex: PyCFunction casts).

For more details see: https://github.com/pytorch/pytorch/issues/45419
ghstack-source-id: 114632980

Test Plan: waitforbuildbot

Reviewed By: albanD

Differential Revision: D24319759

fbshipit-source-id: 26ce4650c220e8e9dd3550245f214c7e6c21a5dc
2020-10-20 18:09:06 -07:00
Yanan Cao
42a70dc5a8 Implement all communication APIs in DistributedC10d new frontend (#46053)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46053

Reviewed By: wanchaol

Differential Revision: D24300487

Pulled By: gmagogsfm

fbshipit-source-id: 0d0b01c4f9d9e1d59dd17d7606ce47d54d61951d
2020-10-20 17:52:07 -07:00
Lillian Johnson
f83cf2dab3 [JIT] adding torch.jit.isinstance support (#46062)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46062

Adds support for torch.jit.isinstance in both eager and script mode

Example use:

```
import torch
from typing import Any, List

class TestModule(torch.nn.Module):
    def __init__(self):
        super(TestModule, self).__init__()

    def call(self, input1: str, input2: str) -> str:
        return input1

    def forward(self, input: Any) -> None:
        if torch.jit.isinstance(input, List[str]):
            for el in input:
                print(el)

TestModule().forward(["1","2"])
scripted_module = torch.jit.script(TestModule())
scripted_module(["1", "2"])
```

Test Plan: Imported from OSS

Reviewed By: bertmaher, zou3519

Differential Revision: D24264415

Pulled By: Lilyjjo

fbshipit-source-id: 039c95bddd854c414027ac8332832e6bc830b5b9
2020-10-20 16:47:49 -07:00
Ansley Ussery
fdc5261a20 Support %-based string formatting (#45976)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45976
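As a plain-Python illustration of the %-formatting semantics this PR brings to TorchScript (the function and values here are my own examples, not from the PR):

```python
def describe(name: str, count: int, ratio: float) -> str:
    # %-based formatting: %s (string), %d (integer), %.2f (fixed-point), %% (literal %)
    return "%s: %d items (%.2f%%)" % (name, count, ratio * 100)

print(describe("cache hits", 42, 0.875))  # cache hits: 42 items (87.50%)
```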

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D24374215

Pulled By: ansley

fbshipit-source-id: 2005fe7f09dc8d3c44c4bfdccab6b4dc46a5e517
2020-10-20 16:13:36 -07:00
Ashkan Aliabadi
4f5b55f722 Revert D24395956: [pytorch][PR] Replace flatten tensors with flatten loops.
Test Plan: revert-hammer

Differential Revision:
D24395956 (2f51ddb81f)

Original commit changeset: f3792903f206

fbshipit-source-id: ef70713f0f67f577b09674219631d22440ceec31
2020-10-20 15:42:23 -07:00
Pritam Damania
2b221a9599 Remove PyCFunction casts as much as possible. (#46227)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46227

Follow up from https://github.com/pytorch/pytorch/issues/45419, in
this PR I've removed as many PyCFunction casts as I could from the codebase.

The only ones I didn't remove were the ones with `METH_VARARGS | METH_KEYWORDS`, which have 3 parameters instead of 2 and had to be cast. Example: `{"copy_", (PyCFunction)(void(*)(void))THPStorage_(copy_), METH_VARARGS | METH_KEYWORDS, nullptr},`
ghstack-source-id: 114632704

Test Plan: waitforbuildbot

Reviewed By: albanD

Differential Revision: D24269435

fbshipit-source-id: 025cfd43a9a2a3e59f6b2951c1a78749193d77cf
2020-10-20 15:01:51 -07:00
Hao Lu
1a3ea46dbf [StaticRuntime] Threading model (#46219)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46219

- Refactor StaticRuntime and group common data structures, the jit graph, and the script module into a separate struct `InferenceModule`:
```
struct InferenceModule {
  explicit InferenceModule(const torch::jit::Module& m);
  explicit InferenceModule(std::shared_ptr<torch::jit::Graph> g);
  torch::jit::Module module;
  std::shared_ptr<torch::jit::Graph> graph;
  std::unique_ptr<c10::FunctionSchema> schema;

  std::unordered_map<Value*, size_t> value_to_reg;
  std::vector<size_t> input_regs; // inputs to the graph
  std::vector<size_t> output_regs; // outputs of the graph
  std::vector<size_t> internals;
};
```
which is stored in the PyTorchPredictor, as well as the static runtime, and shared across threads. Then this is what's left inside the Static Runtime:
```
  mutable std::vector<IValue> reg_;
  // The nodes we need to run
  std::vector<ProcessedNode> nodes_;
```
`reg_` holds all the weights and activations, which differ across threads during runs. `nodes_` holds the op nodes and input/output registers, and is the same across threads for now. We could potentially put other stateful data structures in it, so I kept it inside the static runtime. It could easily be moved into the `InferenceModule` if we decide not to add anything else to `ProcessedNode`.

- Added StaticRuntimeOptions so we can toggle certain optimizations on/off, for testing and benchmarking. `cleanup_activations` is an example.

- Integration with PyTorchPredictor. Added a lockfree stack in the PyTorchPredictor to hold all the static runtime instances. Benchmark shows that the `push` and `pop` combo takes about 80 ns, which is quite acceptable.

This diff focuses on threading model only. Benchmarks will be separate.
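The threading model above can be sketched as follows (a hedged illustration with made-up names, not the real classes: one immutable `InferenceModule` shared by all threads, plus a pool of per-call `StaticRuntime` instances holding the mutable state; a plain mutex stands in for the lock-free stack):

```python
import threading

class InferenceModule:
    def __init__(self, graph):
        self.graph = graph          # shared, read-only after construction

class StaticRuntime:
    def __init__(self, module):
        self.module = module
        self.reg = []               # per-instance activations, like reg_

class Predictor:
    def __init__(self, module):
        self.module = module
        self._pool = []                     # stand-in for the lock-free stack
        self._lock = threading.Lock()       # simplification: a plain mutex

    def run(self, inputs):
        with self._lock:                    # "pop" a runtime instance
            rt = self._pool.pop() if self._pool else StaticRuntime(self.module)
        try:
            return (rt.module.graph, list(inputs))   # placeholder for execution
        finally:
            with self._lock:                # "push" it back for reuse
                self._pool.append(rt)
```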

Reviewed By: bwasti

Differential Revision: D24237078

fbshipit-source-id: fd0d6347f02b4526ac17dec1f731db48424bade1
2020-10-20 14:37:30 -07:00
Raghavan Raman
2f51ddb81f Replace flatten tensors with flatten loops. (#46539)
Summary:
This diff changes `TensorExprKernel::generateStmt` to use flattened loops instead of flattened tensors.

Checked all tests on CPU as well as CUDA.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46539

Reviewed By: nickgg

Differential Revision: D24395956

Pulled By: navahgar

fbshipit-source-id: f3792903f2069bda37b571c9f0a840e6fb02f189
2020-10-20 12:16:18 -07:00
Shen Li
5003fd189c Add an option to getWriteableTensorData to avoid copy CUDA tensor to CPU (#46524)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46524

Test Plan: Imported from OSS

Reviewed By: wanchaol

Differential Revision: D24392794

Pulled By: mrshenli

fbshipit-source-id: 21bf81dfc6c1d81689f8278d81f4c8776bc76ec1
2020-10-20 08:54:58 -07:00
Mikhail Zolotukhin
e5ed037529 [StaticRuntime] Add a 'speed of light' benchmark. (#46308)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46308

This PR adds a hand optimized version of DeepAndWide model with the goal
of estimating overheads of static runtime. While static runtime is
currently much faster than the existing JIT interpreter, it would be
useful to understand how close we are to an absolutely 0-overhead
system. Currently, this "ideal" implementation is 2x faster than the
static runtime on batchsize=1.

Full benchmark results:
```
Running build/bin/static_runtime_bench
Run on (24 X 2394.71 MHz CPU s)
CPU Caches:
  L1 Data 32K (x24)
  L1 Instruction 32K (x24)
  L2 Unified 4096K (x24)
  L3 Unified 16384K (x24)
------------------------------------------------------------------------------
Benchmark                                       Time           CPU Iterations
------------------------------------------------------------------------------
BM_deep_wide_base/1                         59518 ns      59500 ns      10909
BM_deep_wide_base/8                         74635 ns      74632 ns       9317
BM_deep_wide_base/20                        82186 ns      82147 ns       9119
BM_deep_wide_fast/1                         13851 ns      13851 ns      49825 << new
BM_deep_wide_fast/8                         22497 ns      22497 ns      32089 << new
BM_deep_wide_fast/20                        23868 ns      23841 ns      31184 << new
BM_deep_wide_jit_graph_executor/1           62786 ns      62786 ns      10835
BM_deep_wide_jit_graph_executor/8           76730 ns      76718 ns       7529
BM_deep_wide_jit_graph_executor/20          78886 ns      78883 ns       8769
BM_deep_wide_jit_profiling_executor/1       69504 ns      69490 ns      10309
BM_deep_wide_jit_profiling_executor/8       75718 ns      75715 ns       9199
BM_deep_wide_jit_profiling_executor/20      75364 ns      75364 ns       9010
BM_deep_wide_static/1                       40324 ns      40318 ns      17232
BM_deep_wide_static/8                       50327 ns      50319 ns      13335
BM_deep_wide_static/20                      53075 ns      53071 ns      12855
BM_deep_wide_static_threaded/threads:8       6258 ns      49873 ns      14008
```

PS: The implementation could probably be optimized even more.

Differential Revision: D24300702

Test Plan: Imported from OSS

Reviewed By: dzhulgakov

Pulled By: ZolotukhinM

fbshipit-source-id: 7870bdef127c39d11bcaa4f03a60eb80a46be58e
2020-10-19 23:35:55 -07:00
Nick Gibson
17f8c329df [NNC] IRSimplifier rules for Compare and Mod (#46412)
Summary:
Adds new rules to the NNC IRSimplifier to take care of the following cases:

* Comparisons which are symbolic but have a constant difference. E.g. this is most useful in cases like `if (x > x + 4) ...` which we can now eliminate.

* Simplification of `Mod` nodes, including simple rules such as `0 % x` and `x % 1`, but also factorization of both sides to find common symbolic multiples. E.g. `(x * y) % x` can be cancelled out to `0`.

See tests for many more examples!
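A toy sketch of the two rule families (my own illustration of the simplifications named above, not the NNC code):

```python
def simplify_compare_gt(lhs_offset: int, rhs_offset: int) -> bool:
    # (x + lhs_offset) > (x + rhs_offset): x cancels, leaving a constant truth
    # value, so `if (x > x + 4)` folds to `if (false)` and can be eliminated.
    return lhs_offset > rhs_offset

def simplify_mod(lhs_factors: tuple, rhs: str):
    # (a * b * ...) % rhs -> 0 when rhs is one of the factors, e.g. (x * y) % x.
    # Returns None when no rule applies.
    return 0 if rhs in lhs_factors else None
```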

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46412

Reviewed By: navahgar

Differential Revision: D24396151

Pulled By: nickgg

fbshipit-source-id: abb954dc930867d62010dcbcd8a4701430733715
2020-10-19 19:37:09 -07:00
Richard Barnes
cda88e8e4b Fix interval midpoint calculation in register_op_utils
Summary: Interval midpoint calculations can overflow (integers). This fixes such an instance.
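The bug class being fixed can be sketched like this (32-bit arithmetic emulated in Python; the specific values are my own, not from the diff):

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def wrap_i32(v: int) -> int:
    # emulate 32-bit signed two's-complement wraparound
    return ((v - INT_MIN) % 2**32) + INT_MIN

lo, hi = INT_MAX - 1, INT_MAX
naive_mid = wrap_i32(lo + hi) // 2     # (lo + hi) overflows and goes negative
safe_mid = lo + (hi - lo) // 2         # the standard overflow-free midpoint

print(naive_mid < 0, safe_mid == INT_MAX - 1)  # True True
```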

Test Plan: Standard test rigs.

Reviewed By: iseeyuan

Differential Revision: D24392608

fbshipit-source-id: 0face1133d99cea342abbf8884b14262d50b0826
2020-10-19 16:11:22 -07:00
jiej
ac146c4820 [nvFuser] Switching to CudaFusionGuard from BailOut for nvfuser - update 2 (#46452)
Summary:
1. Added CudaFusionGuard as the custom TypeCheck for nvfuser; enabled dynamic shape support with profiling executor;
2. dropped support for legacy fuser;
3. re-enabled nvfuser tests;
4. added registration for profiling record to allow profiling on user specified nodes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46452

Reviewed By: zou3519, anjali411

Differential Revision: D24364642

Pulled By: ngimel

fbshipit-source-id: daf53a9a6b6636e1ede420a3a6d0397d4a8b450b
2020-10-19 15:44:31 -07:00
Iurii Zdebskyi
e7564b076c Refactor scalar list APIs to use overloads (#45673)
Summary:
Refactor foreach APIs to use overloads in case of scalar list inputs.
Tested via unit tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45673

Reviewed By: heitorschueroff

Differential Revision: D24053424

Pulled By: izdeby

fbshipit-source-id: 35976cc50b4acfe228a32ed26cede579d5621cde
2020-10-19 09:28:49 -07:00
Brian Hirsh
00c779a92b detect inplace modifications of views earlier (fix #21875) (#46204)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46204

Test Plan: Imported from OSS

Reviewed By: izdeby

Differential Revision: D24259500

Pulled By: bdhirsh

fbshipit-source-id: 223f8a07da4e4121009fc0a8b6760d90eef089b3
2020-10-19 08:58:33 -07:00
Nikolay Korovaiko
c3466dabaa Disable profiling when getGraphExecutorOptimize is unset (#46479)
Summary:
`getGraphExecutorOptimize` mandates that we don't do any optimizations beyond what's required to run graphs. In this scenario, we don't want to do any profiling either, since the profiling information will never be used.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46479

Reviewed By: ZolotukhinM

Differential Revision: D24368292

Pulled By: Krovatkin

fbshipit-source-id: a2c7618d459efca9cb0700c4d64d829b352792a8
2020-10-17 22:30:05 -07:00
Peter Bell
da95eec613 torch.fft: Two dimensional FFT functions (#45164)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45164

This PR implements `fft2`, `ifft2`, `rfft2` and `irfft2`. These are the last functions required for `torch.fft` to match `numpy.fft`. If you look at either NumPy or SciPy you'll see that the 2-dimensional variants are identical to `*fftn` in every way, except for the default value of `axes`. In fact you can even use `fft2` to do general n-dimensional transforms.
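The axes relationship can be sketched with a naive DFT (a pure-Python illustration of the idea, not the torch.fft implementation):

```python
import cmath

def dft(xs):
    # naive 1-D discrete Fourier transform
    n = len(xs)
    return [sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(xs))
            for k in range(n)]

def fftn_2d(mat):
    # n-dimensional transform of a 2-D input: 1-D DFT over rows, then columns
    rows = [dft(r) for r in mat]
    cols = [dft(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def fft2(mat):
    # the 2-D variant is just fftn with the default axes fixed to the last two dims
    return fftn_2d(mat)
```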

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24363639

Pulled By: mruberry

fbshipit-source-id: 95191b51a0f0b8e8e301b2c20672ed4304d02a57
2020-10-17 16:23:06 -07:00
Tao Xu
495070b388 [Metal] Add the Python binding for optimize_for_mobile (#46456)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46456

Add the python binding in CMake. The general workflow is

- Build pytorch -  `USE_PYTORCH_METAL=ON python setup.py install --cmake`
- Run optimize_for_mobile

```
import torch
from torch.utils.mobile_optimizer import optimize_for_mobile

scripted_model = torch.jit.load('./mobilenetv2.pt')
optimized_model = optimize_for_mobile(scripted_model, backend='metal')
torch.jit.export_opnames(optimized_model)
torch.jit.save(optimized_model, './mobilenetv2_metal.bc')
```
The exported ops are

```
['aten::adaptive_avg_pool2d', 'aten::add.Tensor', 'aten::addmm', 'aten::reshape', 'aten::size.int', 'metal::copy_to_host', 'metal_prepack::conv2d_run']
```
ghstack-source-id: 114559878

Test Plan:
- Sandcastle CI
- Circle CI

Reviewed By: kimishpatel

Differential Revision: D24356768

fbshipit-source-id: fb5c4c4b6316347b67edb4132da044a81470ddfd
2020-10-17 10:26:25 -07:00
Mikhail Zolotukhin
d6de9d573a [TensorExpr] Properly handle input types promotion and special case of empty inputs for aten::cat. (#46500)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46500

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24373671

Pulled By: ZolotukhinM

fbshipit-source-id: b3be73a89a9ab6654212cb7094f32bf1c445e876
2020-10-16 20:26:46 -07:00
Mikhail Zolotukhin
0f668d95b6 [TensorExpr] Fix shape inference logic for aten::cat. (#46482)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46482

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D24366778

Pulled By: ZolotukhinM

fbshipit-source-id: 000ff363b11599ba3827cdf2db3d4793878b84ab
2020-10-16 20:24:30 -07:00
Tao Xu
04e5fcc0ed [GPU] Introduce USE_PYTORCH_METAL (#46383)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46383

The old `USE_METAL` macro is actually being used by Caffe2. Here we introduce a new macro to enable Metal in PyTorch.
ghstack-source-id: 114499392

Test Plan:
- Circle CI
- The Person Segmentation model works

Reviewed By: linbinyu

Differential Revision: D24322018

fbshipit-source-id: 4e5548afba426b49f314366d89b18ba0c7e745ca
2020-10-16 18:19:32 -07:00
Raghavan Raman
fa108bd264 Add flatten loops transformation (#46365)
Summary:
This diff removes the dependency of flattening on tensors by performing flattening on loops instead.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46365

Reviewed By: ailzhang

Differential Revision: D24366347

Pulled By: navahgar

fbshipit-source-id: 4ba182f37212b6e4033cae13f8e75bc5144389f4
2020-10-16 17:05:26 -07:00
Ailing Zhang
8c629ecc9a [WIP] Move catchAll to Math (#45939)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/45939

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24165890

Pulled By: ailzhang

fbshipit-source-id: 72fe71ea95a738251b2fafc9eea4ab3831cf426b
2020-10-16 16:17:16 -07:00
Yi Wang
d1ca7ef33e [Gradient Compression] Rename the first arg of pybinding of _register_comm_hook: ddp_model -> reducer. (#46498)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46498

The name of the first arg, "ddp_model", is misleading, because it is actually the reducer of the DDP model rather than the entire model.

This method is called in the file caffe2/torch/nn/parallel/distributed.py:
`dist._register_comm_hook(self.reducer, state, hook)`
ghstack-source-id: 114531188

(Note: this ignores all push blocking failures!)

Test Plan: waitforbuildbot

Reviewed By: pritamdamania87

Differential Revision: D24372827

fbshipit-source-id: dacb5a59e87400d93a2f35da43560a591ebc5499
2020-10-16 16:12:42 -07:00
Zino Benaissa
11cc7f143d Run __setstate__ when cloning modules (#45858)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45858

When cloning a module that has `__setstate__` and `__getstate__` methods,
we need to invoke these methods to initialize the cloned module.

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D24116524

Pulled By: bzinodev

fbshipit-source-id: a5111638e2dc903781f6468838c000850d1f9a74
2020-10-16 15:55:31 -07:00
Kurt Mohler
28f8372bf4 Avoid mat1 references in mm_mat1_backward (#45777)
Summary:
Avoiding references to `mat1` in `mm_mat1_backward` is a first step to solving issue https://github.com/pytorch/pytorch/issues/42371
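Why mat1 need not be referenced: for C = mat1 @ mat2, the gradient with respect to mat1 depends only on the incoming grad and mat2. A pure-Python sketch of that identity (not the autograd kernel itself):

```python
def matmul(a, b):
    # plain list-of-lists matrix multiply
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def mm_mat1_grad(grad, mat2):
    # d(mat1) = grad @ mat2^T -- no reference to mat1 is needed
    return matmul(grad, transpose(mat2))
```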

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45777

Reviewed By: malfet

Differential Revision: D24347967

Pulled By: albanD

fbshipit-source-id: f09a8149d9795481b5ed5b48fdd0e598ba027d0b
2020-10-16 13:52:44 -07:00
Richard Barnes
be0c431874 Fix implicit cast in custom_function (#46445)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46445

Fix an instance in which a truncated integer prevents downstream type safety checks.

Test Plan: I'm not sure what's appropriate here.

Reviewed By: albanD

Differential Revision: D24339292

fbshipit-source-id: 15748ec64446344ff1a8344005385906d3484d7c
2020-10-16 10:58:02 -07:00
Mikhail Zolotukhin
4359c5e036 [TensorExpr] Correctly handle negative dimensions in aten::cat when lowering to tensor expressions. (#46446)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46446

Fixes #46440.
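A sketch of the negative-dimension normalization such a lowering needs (illustrative shape-level Python, not the tensor-expression code):

```python
def normalize_dim(dim: int, ndim: int) -> int:
    # negative dims index from the end: -1 is the last dim, -ndim the first
    if not -ndim <= dim < ndim:
        raise IndexError("dim out of range")
    return dim + ndim if dim < 0 else dim

def cat_output_shape(shapes, dim):
    # aten::cat concatenates along `dim`; all other dims must match
    d = normalize_dim(dim, len(shapes[0]))
    out = list(shapes[0])
    out[d] = sum(s[d] for s in shapes)
    return tuple(out)
```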

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D24356016

Pulled By: ZolotukhinM

fbshipit-source-id: b759760bb8c765aeb128eb94d18af20cddd888a2
2020-10-16 01:13:14 -07:00
Jinwoo Park
92921c82bb Add named tuple's error message and workaround for RET failure (#46347)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46347

Added error messages & workarounds for the case where a named tuple is returned from a function of a class in PyTorch Mobile.

To identify the error cases (returning a NamedTuple type), I used the following conditions:
1) ins.op == RET  (for returning)
2) type->kind() == TypeKind::TupleType  (for pruning non-tuple types)
3) type->cast<TupleType>()->name()  (for pruning the plain Tuple type)
  - I could use the type's str (str() or repr_str()) directly, but instead I checked whether it has the "name" attribute. Please comment on this choice.

[Information of Tuple and NamedTuple types]
1. Tuple
type->str(): (int, int)
type->repr_str(): Tuple[int, int]
type->kind(): TypeKind::TupleType          # different from other types
type->cast<NamedType>(): True
type->cast<NamedType>()->name(): False     # different from NamedTuple

2. NamedTuple
type->str(): __torch__.myNamedTuple
type->repr_str(): __torch__.myNamedTuple
type->kind(): TypeKind::TupleType          # different from other types
type->cast<NamedType>(): True
type->cast<TupleType>()->name(): True      # different from Tuple

(From the next diff, I will handle the other error cases: 1) returning List<module class>, Dict<module class> and 2) accessing Module class's member functions)
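A Python-level analogue of the "does the tuple type carry a name" check (illustrative only; the actual condition is evaluated on TupleType in C++):

```python
import collections

def is_named_tuple_class(tp) -> bool:
    # NamedTuple subclasses are tuples that carry field names ("_fields");
    # the plain built-in tuple does not
    return isinstance(tp, type) and issubclass(tp, tuple) and hasattr(tp, "_fields")

Point = collections.namedtuple("Point", ["x", "y"])  # a named tuple class
```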
ghstack-source-id: 114361762

Test Plan:
[Added test results]
  buck test mode/dev caffe2/test:mobile -- 'test_unsupported_return'

  Summary
    Pass: 2
    ListingSuccess: 1
    Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/7036874440497926

[Whole test results]
  buck test mode/dev caffe2/test:mobile -- 'test'

  Summary
    Pass: 11
    ListingSuccess: 1
    Finished test run: https://our.intern.facebook.com/intern/testinfra/testrun/4503599664074084

Reviewed By: iseeyuan

Differential Revision: D24291962

fbshipit-source-id: a1a9e24e41a5f1e067738f59f1eae34d07cba31a
2020-10-15 17:41:06 -07:00
Ailing Zhang
d278e83e69 Update VariableTypeManual.cpp to not use catchAllKernel. (#46353)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46353

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D24319416

Pulled By: ailzhang

fbshipit-source-id: e6ca74919949f757112a35e8fce74bded45dcfde
2020-10-15 17:10:28 -07:00
albanD
849bc77ee4 Add quick fix for view/inplace issue with DDP (#46406)
Summary:
As per title, temporary mitigation for https://github.com/pytorch/pytorch/issues/46242 for which https://github.com/pytorch/pytorch/pull/46296 will be a proper fix.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46406

Reviewed By: malfet

Differential Revision: D24339689

Pulled By: albanD

fbshipit-source-id: 0726e5abe4608d8ffcd7846cbaaffbb8564b04ab
2020-10-15 15:13:11 -07:00
Ivan Yashchuk
528158af47 Updated derivatives for complex mm, mv, ger, bmm, triangular_solve (#45737)
Summary:
This PR updates derivatives for a few functions so that `gradgradcheck` for `torch.cholesky` is passed ([ref](https://github.com/pytorch/pytorch/pull/45267#discussion_r494439967)).

Some tests (that call to `bmm_cuda`) fail with with `RuntimeError: _th_bmm_out not supported on CUDAType for ComplexDouble`
until PR https://github.com/pytorch/pytorch/issues/42553 is merged.

Ref. https://github.com/pytorch/pytorch/issues/33152
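For the mm case, the complex-aware rule amounts to the standard adjoint formulas (my summary of the textbook result, not quoted from the PR):

```latex
% For C = A B with complex-valued A and B, the backward uses the conjugate
% transpose (H) rather than the plain transpose:
\bar{A} = \bar{C}\, B^{H}, \qquad \bar{B} = A^{H}\, \bar{C}
```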

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45737

Reviewed By: bdhirsh

Differential Revision: D24279917

Pulled By: anjali411

fbshipit-source-id: 7b696d2cfc2ef714332c2e3e5d207e257be67744
2020-10-15 11:27:30 -07:00
Elias Ellison
908c23579d [JIT] Revert Freezing shared type PR (#46285)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/45902 by reverting https://github.com/pytorch/pytorch/pull/42457

The test case introduced by https://github.com/pytorch/pytorch/pull/42457 was fixed by https://github.com/pytorch/pytorch/pull/46250, which I'm assuming is the real source of the bug.

In the future it would be good to provide repros for freezing issues without including a quantization dependency; there was another issue in freezing (see: https://github.com/pytorch/pytorch/pull/46054) whose root cause was the same quantization issue https://github.com/pytorch/pytorch/pull/46250.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46285

Reviewed By: bdhirsh

Differential Revision: D24288739

Pulled By: eellison

fbshipit-source-id: b69ee8c713f749cd93d5eba370c3eafed86568bb
2020-10-15 10:57:30 -07:00
Yanan Cao
86abc8cd48 [JIT] Make InsertInstruction overflow check a warning instead of fatal (#46369)
Summary:
This diff restores the previous behavior of silently allowing overflow when inserting instructions. The behavior was changed recently in https://github.com/pytorch/pytorch/issues/45382, but the change started to break some existing use cases that have overflow problems.

Restoring the original behavior, but throwing a warning to unblock existing use cases where overflowing happens.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46369

Reviewed By: kwanmacher, wanchaol, fbhuba

Differential Revision: D24324345

Pulled By: gmagogsfm

fbshipit-source-id: 1c0fac421d4de38f070e21059bbdc1b788575bdf
2020-10-14 23:09:53 -07:00
Alexander Golynski
e7e919fc34 Add warning on ProcessGroup and ProcessGroup::Work APIs (#46220)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/46220

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24294437

Pulled By: gmagogsfm

fbshipit-source-id: 198f8e5760beeb1d18740f971647d2537afb3dd6
2020-10-14 16:27:37 -07:00
BowenBao
b28b5d3c68 [ONNX] Update squeeze test for opset 9 (#45369)
Summary:
Only under static axes does opset 9 support a no-op squeeze when the dim is not 1.
Updating the test case where it was setting dynamic axes.
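The no-op behavior the test relies on can be sketched at the shape level (my illustration, not the exporter code):

```python
def squeeze_shape(shape, dim):
    # aten::squeeze(dim) removes the dim only when its size is 1;
    # otherwise it is a no-op
    if shape[dim] == 1:
        return shape[:dim] + shape[dim + 1:]
    return shape
```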

Pull Request resolved: https://github.com/pytorch/pytorch/pull/45369

Reviewed By: anjali411

Differential Revision: D24280180

Pulled By: bzinodev

fbshipit-source-id: d7cda88ab338a1c41a68052831dcebe739a3843c
2020-10-14 12:53:13 -07:00