Commit Graph

441 Commits

Author SHA1 Message Date
Meghan Lele
6c1c1111de [JIT] Add reference semantics to TorchScript classes (#44324)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44324

**Summary**
This commit adds reference semantics to TorchScript class types;
modifications made to them within TorchScript will be visible in Python.

**Test Plan**
This commit adds a unit test to `TestClassType` that checks that
modifications made to a class type instance passed into TorchScript are
visible in Python after executing the scripted function or module.

**Fixes**
This commit closes #41421.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D24912807

Pulled By: SplitInfinity

fbshipit-source-id: d64ac6211012425b040b987e3358253016e84ca0
2021-06-30 14:27:17 -07:00
Mengwei Liu
10fc58620e [PyTorch][NASProfiler] Add moduleHierarchy Python API to print out hierarchical information about a Node (#60384)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60384

Currently inlining module graph will drop module hierarchy info on Python side. Here we retrieve the module hierarchy from cpp side and expose it to a new Python API on Node called `moduleHierarchy()`.

Test Plan:
Usage:
```
torch._C._jit_pass_inline(module.graph)
torch._C._jit_pass_propagate_shapes_on_graph(module.graph)
node = module.graph.findNode("quantized::conv2d_relu")
'top(' + module.original_name + ').' + node.moduleHierarchy() + '.' + node.kind()
```
Output:
```
'top(QuantWrapper).module(FBNetHR).0(Sequential).xif0_0(ConvBNRelu).conv(ConvReLU2d).quantized::conv2d_relu'
```

Reviewed By: kimishpatel

Differential Revision: D29252169

fbshipit-source-id: 74163a87f919e061e5e75dfebc4c5cdbe8489d93
2021-06-30 01:32:31 -07:00
Bert Maher
93772792e3 [nnc] Get rid of fuser trigger counters (#57334)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57334

Here's a possibly controversial PR.  These counters got in the way of
generalizing the fuser tests to handle arbitrary devices, and I guess I'm just
generally skeptical that they provide much value.  While true that they let us
observe whether fusion groups were created, we already have assertions based on
the shape of the graph, and I'm not sure that I trust those any less than these
counters.

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D29471484

Pulled By: bertmaher

fbshipit-source-id: f6d76f6e72dbfb581acff1d834b0c74500941b57
2021-06-29 22:22:15 -07:00
Lily Johnson
0dd90cceaf [package] track storages across lifetime of PackageExporter (#59735)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735

1. Fixes ABA storage identity problem during serialization for `torch.package` by keeping reference of serialized storages through lifetime of `PackageExporter` to prevent reuse of memory address. Achieved by extending logic used in solution to mobile's same issue.
2. Adds determinism to naming scheme of serialized storages in export code paths which utilize `tensor_cdata_naming_scheme`(introduced 2nd mapping in `StorageContext`, now maps `storage cdata ptr` -> `unique id`, `unique id` -> `c10::Storage`)
3. Additionally uses presence of a storage in the `StorageContext` instance as marker for if a storage has been serialized or not, removing the need to scan the `PythonStreamWriter` for presence of the storage's serialization file

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D29075276

Pulled By: Lilyjjo

fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
2021-06-29 14:16:54 -07:00
Ansley Ussery
0fbc471d10 Support default values on NamedTuple fields (#54682)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54682

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D27327241

Pulled By: ansley

fbshipit-source-id: 76546f1770d50ebc3435bba3b74540e3c6be8a1c
2021-06-26 15:18:21 -07:00
Hariom Narang
9d1d799034 Added API to change logging levels for JIT (#58821)
Summary:
Description:
- Before this, logging level could only be changed by changing the env
variable "PYTORCH_JIT_LOG_LEVEL"
    - Can change the level from python now
- Have not added stream configuration for now
- Configuration is stored in a singleton class managing the options

Issue Link: https://github.com/pytorch/pytorch/issues/54188

Gotchas:
- Created separate functions
`::torch::jit::get_jit_logging_levels/set_jit_logging_levels` instead of
using the singleton class's method directly
    - This is because when running test cases, two different instances
    of the singleton are created for the test suite and the actual code
    (`jit_log.cpp`)
    - On using these methods directly, `is_enabled` calls the singleton
    in `jit_log.cpp` while we are setting the config using another
    singleton
    - See: https://stackoverflow.com/questions/55467246/my-singleton-can-be-called-multiple-times

API:
- To set the level: `torch._C._jit_set_logging_option("level")`
- To get the level: `torch._C._jit_get_logging_option()`

Testing:
- UTs were added for C++
- A very simple UT was added for python to just check if the API is
being called correctly
- The API was checked by running trace in a sample python file
    - Set env variable to "" and used `_jit_set_logging_option` in python to set the variable to `>dead_code_elimination`
    - The error output had logs of form [DUMP..] [UPDATE...] etc

Fixes https://github.com/pytorch/pytorch/issues/54188

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58821

Reviewed By: soulitzer

Differential Revision: D29116712

Pulled By: ZolotukhinM

fbshipit-source-id: 8f2861ee2bd567fb63b405953d035ca657a3200f
2021-06-21 16:10:49 -07:00
Richard Barnes
b162d95e46 Fix a number of lint perf and safety issues in torch (#59897)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59897

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D29037012

fbshipit-source-id: 7c16286d5fc2b67964fb65f8374dfff4d1a7aefb
2021-06-15 13:14:51 -07:00
Meghan Lele
d9d7d5e24a [torch] Remove migration warning for ScriptDict
Summary:
This commit removes the warning that suggests that users script their
dictionaries before passing them into TorchScript code. The ScriptDict feature
is not fully ready, so it does not make sense to recommend this yet.

Test Plan:
Sandcastle.

In addition, the PyPER test broken by the original diff passes:

```
buck test mode/opt //caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt -- --exact 'caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt - caffe2.torch.fb.training_toolkit.backend.tests.test_model_materializer_full_sync_lwt.ModelMaterializerFullSyncLwtTest: test_materialization_determinism_cpu' --run-disabled
```

Differential Revision: D28891351

fbshipit-source-id: 2a3a00cde935d670fb1dc7fd8c709ae9c2ad8cdc
2021-06-03 20:55:40 -07:00
Bin Bao
add291cf66 [JIT] Add a phase to perform inplace<->functional conversion for activation operators (#57477)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57477

Currently the conversion only deals with activation operators. The legality check is somewhat strict for now.

Test Plan:
```
python test/test_jit.py -k test_functional_to_inplace_activation
python test/test_jit.py -k test_inplace_to_functional_activation
```

Reviewed By: mrshenli

Differential Revision: D28155153

Pulled By: desertfire

fbshipit-source-id: df092830c4dff3ce9578ff76285eb7a566b7d81b
2021-06-03 06:43:23 -07:00
Richard Barnes
3979cb0656 irange for size_t (#55320)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55320

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27572577

fbshipit-source-id: 97710fd2bb1303006b05828a0d1343b0b59ccb03
2021-06-03 01:04:13 -07:00
Meghan Lele
484d53f4a0 [torch][JIT] Warn only once when using unscripted dictionary (#59287)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59287

D27211605 added a warning in `toIValue` that warns users to script their
dictionaries before passing them to TorchScript functions in order to get some
performance benefits and reference semantics. However, this warning is emitted
every time `toIValue` is called (e.g. when a dictionary is passed to
TorchScript function), which can lead to noisy log output. This diff changes
this changes to use `TORCH_WARN_ONCE` instead.

Test Plan: Sandcastle, OSS CI.

Reviewed By: hyuen

Differential Revision: D28824468

fbshipit-source-id: e651eade4380abaf77c6c8a81ec4e565b0c2c714
2021-06-02 11:41:37 -07:00
eellison
d8cbba3ee2 [JIT] Disable Complete Shape Inlining For Testing Purposes (#56966)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56966

This PR adds a toggle to shape analysis which won't inline complete tensor shapes as constants into the shape compute graph, which is a good stress test on the partial evaluation pipeline.

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D28444664

Pulled By: eellison

fbshipit-source-id: a62e424515a8837a4b596546efa93af5e8e61f10
2021-05-27 17:57:48 -07:00
eellison
f66fbb1e2e Add unary/binary ops necessary for mobilenet (#56828)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56828

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D28444660

Pulled By: eellison

fbshipit-source-id: 656673e6139550f2752c0d3ac2fb8731f4bf9bbb
2021-05-27 17:56:30 -07:00
Meghan Lele
b14c3205fd [JIT] Add torch._C.ScriptDict (#52659)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52659

**Summary**
This commit adds `torch._C.ScriptDict`, a dictionary type that has reference
semantics across the Python/TorchScript boundary. That is, modifications
made to instances of `torch._C.ScriptDict` in TorchScript are visible in
Python even when it is not returned from the function. Instances can be
constructed by passing an instance of a Python dictionary to
`torch.jit.script`. In the case of an empty dictionary, its type is
assumed to be `Dict[str, Tensor]` to be consistent with the handling of
empty dictionaries in TorchScript source code.

`torch._C.ScriptDict` is implemented using a modified version of pybind's `stl_bind.h`-style bindings attached to `ScriptDict`, `ScriptDictIterator` and `ScriptDictKeyIterator`, wrapper classes around `c10::impl::GenericDict` and `c10::impl::GenericDict::iterator`. These bindings allow instances of `torch._C.ScriptDict` to be used as if it were a regular `dict` Python. Reference semantics are achieved by simply retrieving the `IValue` contained in `ScriptDict` in `toIValue` (invoked when converting Python arguments to `IValues` before calling TorchScript code).

**Test Plan**
This commit adds `TestScriptDict` to `test_list_dict.py`, a set of tests
that check that all of the common dictionary operations are supported
and that instances have reference semantics across the
Python/TorchScript boundary.

Differential Revision:
D27211605
D27211605

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Pulled By: SplitInfinity

fbshipit-source-id: 446d4e5328375791aa73eb9e8b04dfe3465af960
2021-05-27 10:25:30 -07:00
Ansley Ussery
5268b5a29a Add parsing logic for Tuple[()] annotation (#58340)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58340

Test Plan: Imported from OSS

Reviewed By: jamesr66a

Differential Revision: D28459502

Pulled By: ansley

fbshipit-source-id: 4bb188448d66269b42b068858b895debac86e9ee
2021-05-25 12:12:43 -07:00
Kimish Patel
e067675167 [Pytorch] Provide API to preserve source range and callstack information during graph rewrite (#58300)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58300

Current state: During graph rewriting that can fuse nodes or add nodes
result in new nodes without debug information that was available in
original node. Thus we lose this information during graph rewrite.

This PR changes graph rewriting API to let user specify how the values
in the replacement pattern map to values in the pattern to be matched.
Then the graph rewriting will copy source range and inlined callstack
from the matched nodes onto the nodes being inserted.

(Note: this ignores all push blocking failures!)

Test Plan:
python test/test_jit.py
TestJit.test_pattern_based_rewrite_with_source_range_preserved

Imported from OSS

Reviewed By: malfet

Differential Revision: D28512465

fbshipit-source-id: 863173c29de726be85b3acbd3ddf3257eea36d13
2021-05-25 09:18:59 -07:00
Meghan Lele
0b8931fe4b [torch][JIT] Predicate uses of RPC APIs on torch.distributed.rpc.is_available() (#58887)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58887

There are some callsites of `torch.distributed.rpc.XXX` APIs that are compiled
or not based on `USE_RPC`. However, `torch::deploy`, at least for now,
is compiled with `USE_RPC=1`, but the `torch.distributed.rpc.XXX` APIs used by
the aforementioned pieces of code are not available (i.e.
`torch.distributed.rpc.is_available()` returns `False`). This can cause
Torchscript compilation to fail, even if the code being compiled doesn't use
RPC.

This commit fixes this problem (at least temporarily) by predicating the use
all thse `torch.distributed.rpc` APIs on the value of
`torch.distributed.rpc.is_available()`.

Test Plan: Ran packaged XLM-R model with C++ benchmark.

Reviewed By: suo

Differential Revision: D28660925

fbshipit-source-id: fbff7c7ef9596549105e79f702987a53b04ba6f9
2021-05-24 21:53:53 -07:00
Zhengxu Chen
2b0ec9c3cf Reapply "[jit] Implement ScriptProfile to collect instruction profiles." (#58783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58783

This reverts commit fc804b5def.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D28617037

Pulled By: zhxchen17

fbshipit-source-id: 645de2ede20500a5c218d6ec3c7faae94de37a14
2021-05-24 18:23:21 -07:00
Jacob Szwejbka
1c5f63d86d [Pytorch Edge] Model Ops compatibility api (#57501)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57501

Add an api _get_model_ops_and_info to get root operators and versioning info of a model in both cxx and python, and the input can be from a file path or buffer.
ghstack-source-id: 129620112

Test Plan: unit test.

Reviewed By: xcheng16, raziel

Differential Revision: D28162765

fbshipit-source-id: 4413c1e906b8a872e4a717d849da37347adbbea4
2021-05-24 12:00:06 -07:00
Elias Ellison
5313bafd31 [JIT] integer value refinement (#56438)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56438

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D27924239

Pulled By: eellison

fbshipit-source-id: ace54fcb594853f30c242369ea203b0eb5527ac1
2021-05-21 08:51:01 -07:00
Elias Ellison
5cebf29b4e Add list len refinement (#55926)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55926

This is necessary for code like conv2d where we wish to share a generic convolution shape function logic with that of conv2d but for conv2d always infer the output is dimension 4. I'm also hoping the refinement algorithm here could be refactored out and used to support refining tensor types from user annotations. i have a length comment explaining how this works, and the logic outside of data structures is pretty small and contained. Additionally, you might check out https://fb.quip.com/X7EVAdQ99Zzm for a very similar description of how to refine values based on comparison operators.

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D27750997

Pulled By: eellison

fbshipit-source-id: d962415af519ac37ebc9de88f2e1ea60a1374f7c
2021-05-21 08:50:54 -07:00
Elias Ellison
9fd2306036 Add handling of symbolic shapes (#55925)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55925

This sets up the initial handling of symbolic shapes. As in the test, it doesn't work perfectly yet because it needs a couple other optimization passes. The basic description is pretty simple: we resolve tensor dimension indices to the same Value *, and before extracting out the output Tensor shape we substitute in symbolic shapes. We don't substitute during optimization because they are represented as negative numbers so we don't want them inadvertently used in Constant prop or something else.

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D27750996

Pulled By: eellison

fbshipit-source-id: 6984e7276b578f96b00fc2025cef0e13f594b6e6
2021-05-21 08:50:52 -07:00
Elias Ellison
f39471a171 Initial Symbolic Shape Analysis (#54809)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809

I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving out that for now.

There is still a ton left to do, I'm posting this initial version so we can get something on master multiple can work on. List of many remaining steps to do:

- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage of coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)

For this review, I am mostly looking for comments related to the implementation of symolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to api/registration/graph level analysis as those are all planned to be changed. I am fine landing this as is or waiting until necessary components of the TODOs above are finished.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D27750998

Pulled By: eellison

fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03
2021-05-21 08:49:46 -07:00
Edward Yang
fc804b5def Revert D28133579: [jit] Implement ScriptProfile to collect instruction profiles.
Test Plan: revert-hammer

Differential Revision:
D28133579 (034a238bab)

Original commit changeset: e7e30e961513

fbshipit-source-id: 5a7756468b4f2eeed24d2abb7b52ab46d081a95e
2021-05-21 08:18:40 -07:00
Zhengxu Chen
034a238bab [jit] Implement ScriptProfile to collect instruction profiles. (#57397)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57397

Introduces two main classes in C++ runtime:

ScriptProfile is the implementation for enalbing and disabling interpreter
profiling in C++. This should be only used from Python, and we will add
corresponding Python API in the next diff.

InstructionSpan is a utility class to instrument execution of each single
instruction. A start timestamp is recorded in the consturctor, and an end
timestamp is recorded in the destructor. During destruction, this will send
runtime data to all enabled ScriptProfile instances.

Test Plan:
build/bin/test_jit --gtest_filter='ScriptProfileTest.Basic'

Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D28133579

fbshipit-source-id: e7e30e96151367022793ab3ad323f01c51ad4a3b
2021-05-20 14:11:03 -07:00
Raghavan Raman
3fe72d30dc [NNC] Optimize conditionals that correspond to the form generated for aten::cat op. (#57673)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57673

Test Plan: Imported from OSS

Reviewed By: bertmaher

Differential Revision: D28231374

Pulled By: navahgar

fbshipit-source-id: 1777a63df4e5ebed6d515683bd772a88be465b3a
2021-05-18 14:23:48 -07:00
Luca Wehrstedt
5a238eb96e Fix deadlock in Future due to lock inversion with GIL (#58382)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58382

Calling markCompleted on a Future now first acquires the Future's mutex (as usual) but then sometimes tries to acquire the GIL during the DataPtr extraction while still holding the Future's mutex. (This happens when the value passed to markCompleted is a Python object). This can cause a deadlock if someone else calls any of the other methods of Future while holding the GIL.

There are two solutions to this: avoid holding the Future's mutex when extracting DataPtrs, and avoid holding the GIL while invoking the Future's method. In this PR I'm going for the latter, because it's a very simple immediate fix, but I believe this is brittle and that we should probably also consider the former fix.
ghstack-source-id: 129105358

Test Plan: The repro in https://github.com/pytorch/pytorch/issues/58239 now doesn't deadlock.

Reviewed By: mrshenli

Differential Revision: D28472816

fbshipit-source-id: 1bc9bca426dd004f9eb2568db1ffd38f014450e2
2021-05-17 10:53:19 -07:00
Lillian Johnson
9403fe17ce [torch.package/TorchScript] logic to enable sharing of tensors on load (#57573)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57573

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D28226975

Pulled By: Lilyjjo

fbshipit-source-id: bc8cb3e8052fa18336c437e0601d8b0028fd1895
2021-05-14 08:21:43 -07:00
Lillian Johnson
3ad11803f7 [torch.Package/TorchScript] ScriptModuleSerializer add unified format (#56299)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56299

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D27832545

Pulled By: Lilyjjo

fbshipit-source-id: 1b2880a8458f99bd66a8c9656c5ca700f43cffe8
2021-05-14 08:21:40 -07:00
Lillian Johnson
07de11c26d [torch.Package/TorchScript] TS serialization importer to handle unified format (#54891)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54891

Changed TorchScript's jit/serialization importer logic to handle both original TS serialization format and new unified TS format

Original TS file format:
```
resnet.pt
├── data  # tensor data
│   ├── 94286146172688
│   ├── 94286146172784
│   └── ...
├── code/  # TorchScript code
│   ├── __torch__
│   │   ├── torch
│   │   │   └── nn ...
│   │   └── torchvision ...
│   ├── __torch__.py
│   └── __torch__.py.debug_pkl
├── data.pkl  # the ScriptModule object, pickled by the TS pickler
├── version  # version metadata
├── constants.pkl  # any tensor constants present in the TS code
└── extra
     ├── name_of_file
     └── foo
```

Unified file format:
```
─── package_name.pt
    ├── .data
    │   ├── ts_code # code shared between models
    │   │   ├── 0
    │   │   │   ├── constants.pkl
    │   │   │   └── data.pkl
    │   │   ├── 1
    │   │   │   ├── constants.pkl
    │   │   │   └── data.pkl
    │   │   └── code
    │   │       ├── __torch__
    │   │       │   ├── torch
    │   │       │   │   └── nn ...
    │   │       │   └── torchvision ...
    │   │       ├── __torch__.py
    │   │       └── __torch__.py.debug_pkl
    │   ├── 0.storage
    │   ├── 1.storage
    │   ├── <many more storages>
    │   ├── 201.storage
    │   ├── extern_modules
    │   └── version
    └── res
        ├── mod.pkl  # maps to ts_id 0 and .data/ts_code/0
        └── mod2.pkl # maps to ts_id 1 and .data/ts_code/1
```

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D27832548

Pulled By: Lilyjjo

fbshipit-source-id: 4a6e84c3a9bac8eed6a4e4afc2ac76dd691858b0
2021-05-14 08:20:34 -07:00
Dhruv Matani
38e606d056 [RFC] Add method torch.jit._clone_module_with_class (#56152)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56152

Currently, the Bundled Inputs API mutates the module in-place. It adds class methods and not instance methods. This results in a small problem that one can't re-run an already executed cell in Bento if the class has already been subject to bundled inputs.

In addition, there is no way to add bundled inputs to a module that has bundled inputs added already. This API provides a way to solve this problem as well by adding an `ignored_methods` to the call to `clone()` by allowing the implementation of bundled inputs to pass in the methods that it will add as `ignored_methods` so that when it does try to add those methods, it will be able to do so successfully.

We'll have to be careful when ignoring those methods during the call to `torch.jit._clone_module_with_class` since any bundled input that relies on a user-provided method will need to be preserved and not ignored during the clone.

Looking for feedback on whether this is an acceptable direction.
ghstack-source-id: 128908360

Test Plan:
Added unit test and ran it as `buck test //caffe2/test:mobile`

Also see this Bento Notebook: https://www.internalfb.com/intern/anp/view/?id=550829

Reviewed By: gmagogsfm

Differential Revision: D27788394

fbshipit-source-id: 48109cd4583506d4efdb345e4ba31385db23a273
2021-05-13 22:31:05 -07:00
BowenBao
346dc88bfa [ONNX] Support registering custom export for prim::PythonOp from torch.autograd.Function (#55630) (#57600)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57600

Demo script:

```python
import torch

class MyReLU(torch.autograd.Function):
    staticmethod
    def forward(ctx, input, scalar_tuple, scalar, scalar_list):
        ctx.save_for_backward(input)
        return input.clamp(min=scalar)
    staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

class MyModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_a = torch.nn.Linear(2, 2)
        self.linear_b = torch.nn.Linear(2, 2)
        self.relu = MyReLU.apply
    def forward(self, x):
        h = self.linear_a(x)
        h = self.relu(h, (5, 3), 2, [1, 2, 3])
        h = self.linear_b(h)
        return h

"""
User define how to export prim::PythonOp into custom op.
"""
def symbolic_pythonop(g, n, *args, **kwargs):
    # Print information:
    print('arguments of ', kwargs['name'], ':')
    print('original node: ', n)
    for i, out in enumerate(n.outputs()):
        print('original output {}: {}, requires grad: {}'.format(i, out, out.requiresGrad()))
    import torch.onnx.symbolic_helper as sym_helper
    for i, arg in enumerate(args):
        print('arg {}: {}, requires grad: {}'.format(i, arg, arg.requiresGrad() if sym_helper._is_value(arg) else False))
    for k, v in kwargs.items():
        print('key: ', k, ' v: ', v)

    # TODO: all inputs (tensors and scalars) are in args.
    #       backend can define CustomDomain::PythonOp and how info are stored however it deem fit.
    return g.op("CustomDomain::PythonOp", args[0], name_s=kwargs['name'])

torch.onnx.register_custom_op_symbolic("::prim_PythonOp", symbolic_pythonop, 9)

# Define input.
x = torch.tensor([[0.3971, 0.7544],
                  [0.5695, 0.4388]], requires_grad=True)

model = MyModule()
# Forward.
y = model(x)

torch.onnx.export(model, (x,), 'model.onnx', opset_version=12, verbose=True)
```

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D28393528

Pulled By: SplitInfinity

fbshipit-source-id: e0d55b7c737c5916fda08a3b26b3306037f970df

Co-authored-by: BowenBao <bowbao@microsoft.com>
2021-05-13 13:42:49 -07:00
neginraoof
1de3525ca8 [ONNX] Handle PackedParams inputs for _propagate_and_assign_input_shapes (#56449) (#57079)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57079

Testing onnx 1.9 release, we see that the old bug is triggered for the caffe2 test:
`pytest test/onnx/test_pytorch_onnx_caffe2_quantized.py::TestQuantizedOps::test_small_model`
This is because the graph inputs
```python
graph(%x.1 : Tensor,
      %conv1._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
      %conv2._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
      %fc.bias : Float(10, strides=[1], requires_grad=0, device=cpu),
      %fc.weight : Float(10, 72, strides=[72, 1], requires_grad=0, device=cpu)):
```
contains `Conv2dPackedParamsBase` which is a PackedParams.
When we do flatten, we will flatten to several tensors, then the shape inference for input misaligned.
This PR record how may tensors got flattened in PackeParams, and skip by these number rather than 1, then the UT passed.
Note that tuple case should still follow the original logic.

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D28393949

Pulled By: malfet

fbshipit-source-id: 98d48aad27e5ca03fb10d260f8e625478d996ee2

Co-authored-by: David <jiafa@microsoft.com>
2021-05-12 15:20:26 -07:00
Chen Lai
8c04593c0a [PyTorch Edge] Add backport to export old bytecode models (#56802)
Summary:
Add an api to backport a model vn to model vi. It accept an input model (file or buffer) and output a model (file or buffer) with an expected bytecode version.

In this change, the input is a model and it can come from a file or buffer. The output is a model and can be either file path or buffer.

When backport fails, function return false with a warning message :
```
/Users/chenlai/pytorch/cmake-build-debug/bin/test_jit --gtest_filter=LiteInterpreterTest.BackPortByteCodeModelV4:LiteInterpreterTest/*.BackPortByteCodeModelV4:*/LiteInterpreterTest.BackPortByteCodeModelV4/*:*/LiteInterpreterTest/*.BackPortByteCodeModelV4 --gtest_color=no
Testing started at 2:32 PM ...
CUDA not available. Disabling CUDA and MultiCUDA tests

[W backport.cpp:419] Warning: Backport doesn't support backport to version3 (function _backport_for_mobile_impl)
Process finished with exit code 0
```

## Test
1. Run both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py`.
2. Run all prod models with backport api.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56802

ghstack-source-id: 128425510

Test Plan: CI

Reviewed By: raziel, iseeyuan

Differential Revision: D27844651

fbshipit-source-id: 8a803cf6c76433ee0a3049b1a5570585d569f8d6
2021-05-07 18:14:33 -07:00
Luca Wehrstedt
36e47af58b Pass reference to parent future in callbacks (#57635)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57635

Note: this PR looks massive, but it's just one simple change, codemodded many times.

In many cases, a callback needs to access the value/error produced by the parent future. In Python this was easy because the callback was invoked with the parent future as argument, and could thus inspect it. In C++ the callbacks didn't take any arguments, thus in many cases we worked around this by capturing the future in its own callback. This is risky (leads to reference cycle and thus memory leak) and must be done carefully (spoiler: sometimes we weren't).
ghstack-source-id: 128296580

Test Plan: CI

Reviewed By: wanchaol

Differential Revision: D28178783

fbshipit-source-id: 6de02c4568be42123372edc008f630d5ddae0081
2021-05-07 03:59:18 -07:00
Luca Wehrstedt
8e9bbd3113 Make DataPtr extraction in CUDAFuture faster for Python values (#56918)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56918

Re-importing a Python module each time is a bit expensive, and it's unnecessary because this is a private module which won't change and thus we can cache the value once we first extract it.

ghstack-source-id: 128184666

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D27985910

fbshipit-source-id: be40ae9b67ab8ea6c07bc2cb9a78d2c2c30b35d3
2021-05-06 01:12:53 -07:00
Yi Huang (Symphony)
ba78bf1363 [standaloneRunner] fix another GIL mutithreading issue exposed by torch::jit::toIValue() (#57688)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57688

P412982836 says that `torch::jit::toIValue()` will also touch GIL through `torch::jit::createGenericDict()` (P412848640)
So we have to move `torch::jit::toIValue()` out of multithreading execution

Reviewed By: hyuen

Differential Revision: D28236527

fbshipit-source-id: 43a33dbcfc828cc42c5e1230c8f5cb415bf7bde4
2021-05-05 21:41:04 -07:00
Chen Lai
fb9a32b7b4 [PyTorch][Edge] Add api to get bytecode model version (#56801)
Summary:
Add an api `_get_bytecode_version` to get version number given a bytecode model in both cxx and python, and the input can be both from file path and buffer.
## Test
CI (new added unit test will run as part of `pytorch_core-buck`)

1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56801

ghstack-source-id: 128169647

Test Plan:
CI (new added unit test will run as part of `pytorch_core-buck`)

1. run test_lite_interpreter.cpp
2. `python test/mobile/test_bytecode.py`

Reviewed By: iseeyuan

Differential Revision: D27961417

fbshipit-source-id: f786cc9573d855feecff0b4fe8e5363e25f5728c
2021-05-05 09:17:26 -07:00
Luca Wehrstedt
58bc003487 Add pybind type caster for c10::Device (#57292)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57292

In Future (and soon in other places too) we need to receive a list of devices from Python-land. We don't want to just take their indices because we need full devices in order to infer the type from them. torch.device is not defined through pybind, it's defined through a plain `PyModule_AddObject` call with CPython, thus pybind isn't naturally able to understand and convert it. However we can provide a custom type caster which fixes that. We have this already for at::Tensor, at::Generator, ...
ghstack-source-id: 127916268

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28092732

fbshipit-source-id: 1c31d0b85a4d5c9e7bde8161efbb7574d505157c
2021-05-01 16:11:10 -07:00
Scott Wolchok
b87d3fa432 [PyTorch][jit] Don't allow create() on singleton types (#56807)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56807

If I understand correctly, there's no reason to create your own instance of these global singleton types.
ghstack-source-id: 127312270

Test Plan: CI

Reviewed By: SplitInfinity

Differential Revision: D27973447

fbshipit-source-id: f12df69d185f1baaa45f2ac6eac70570a7a65912
2021-04-30 10:28:50 -07:00
Luca Wehrstedt
311ad5e3af Merge CUDAFuture into ivalue::Future (#57052)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57052

This PR caps a stack whose goal was to merge CUDAFuture into ivalue::Future. CUDAFuture used to be a subclass of ivalue::Future, which was already pretty good, but it meant that in several places we needed `#ifdef`s or registries in order to create the right type of class, which was annoying. We've made CUDAFuture device-agnostic, by using generic helpers, so that it doesn't depend on CUDA. Now all its code can be inserted into ivalue::Future.

This PR does this very naively, by copy-pasting CUDAFuture's code into the (previously empty) virtual methods of ivalue::Future. This helps ensure the correctness of this PR, as it's straightforward to see it behaves exactly like before. However we probably want to polish it a bit later to iron out so wrinkles.
ghstack-source-id: 127713138

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28036829

fbshipit-source-id: 3e5b16402f5dc245c1fcb9d7bf06db64dcb0d2a3
2021-04-29 09:31:52 -07:00
Luca Wehrstedt
71c2f88b90 Make CUDAFuture handle any kind of device type (#57051)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57051

Make CUDAFuture autodetect the devicetype from its arguments (which thus change from DeviceIndices to full Devices). This in fact transforms CUDAFuture into a AnythingFuture, since it's not tied to CUDA in any way anymore. Having made it fully device-agnostic, we'll merge it into ivalue::Future in the next PR.
ghstack-source-id: 127713134

(Note: this ignores all push blocking failures!)

Test Plan: CI

Reviewed By: mrshenli

Differential Revision: D28032711

fbshipit-source-id: 8ba23b1b0d97f61db8693cd5f3c7bae7989a9bcd
2021-04-29 09:31:50 -07:00
Nikita Shulga
4cb534f92e Make PyTorch code-base clang-tidy compliant (#56892)
Summary:
This is an automatic change generated by the following script:
```
#!/usr/bin/env python3
from subprocess import check_output, check_call
import os

def get_compiled_files_list():
    import json
    with open("build/compile_commands.json") as f:
        data = json.load(f)
    files = [os.path.relpath(node['file']) for node in data]
    for idx, fname in enumerate(files):
        if fname.startswith('build/') and fname.endswith('.DEFAULT.cpp'):
            files[idx] = fname[len('build/'):-len('.DEFAULT.cpp')]
    return files

def run_clang_tidy(fname):
    check_call(["python3", "tools/clang_tidy.py", "-c", "build", "-x", fname,"-s"])
    changes = check_output(["git", "ls-files", "-m"])
    if len(changes) == 0:
        return
    check_call(["git", "commit","--all", "-m", f"NOLINT stubs for {fname}"])

def main():
    git_files = check_output(["git", "ls-files"]).decode("ascii").split("\n")
    compiled_files = get_compiled_files_list()
    for idx, fname in enumerate(git_files):
        if fname not in compiled_files:
            continue
        if fname.startswith("caffe2/contrib/aten/"):
            continue
        print(f"[{idx}/{len(git_files)}] Processing {fname}")
        run_clang_tidy(fname)

if __name__ == "__main__":
    main()
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56892

Reviewed By: H-Huang

Differential Revision: D27991944

Pulled By: malfet

fbshipit-source-id: 5415e1eb2c1b34319a4f03024bfaa087007d7179
2021-04-28 14:10:25 -07:00
Jacob Szwejbka
60a5ebfac2 [Pytorch Edge] Remove methods_to_optimize arg (#57045)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57045

Went back and adjusted the previous optimizations to just be applied to every function.
Cleaned up api to match.

ghstack-source-id: 127214412
ghstack-source-id: 127536155

Test Plan: unit test

Reviewed By: kimishpatel

Differential Revision: D27950859

fbshipit-source-id: 214e83d5a19b452747fe223615815c10fa4aee58
2021-04-27 14:54:13 -07:00
Pritam Damania
dc8a8cea79 Move caffe2 signal_handler to c10. (#56717)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56717

The signal_handler was under the caffe2 namespacee but was being used
by PyTorch as well.

I've fixed this my moving it to the c10 namespace where now both C2 and PyTorch
can use it.

The signal_handler interface in caffe2/utils/signal_handler.h is kept the same
for backward compatiblity for C2, but most of the commmon code is moved to c10.
ghstack-source-id: 127446929

Test Plan: waitforbuildbot

Reviewed By: ezyang

Differential Revision: D27946738

fbshipit-source-id: d6228d1a0108f4c807d405e7a0bb799c5375388f
2021-04-26 23:08:12 -07:00
Luca Wehrstedt
a688b29750 Support custom Python classes in CUDAFuture (#56516)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56516

One problem with CUDAFuture's extraction of DataPtrs from IValues is that it only supported Python objects that could be converted to "regular" IValues (e.g., lists/dicts/tuples of ints/strings/tensors/...). One notable exception are custom Python classes, which are in fact a very common data type transferred over RPC. The only solution we found for those is to use the Python pickler to extract the tensors contained in them.

We can't insert a Python dependency directly into CUDAFuture, so instead I'm proposing to use the same indirection technique used to support `getSubValues` on Python objects: define some methods on the abstract class `PyObjectHolder` (which can be used by CUDAFuture) but only implement them in the concrete subclass `ConcretePyObjectHolder` (which is only built when Python support is enabled).

I am a bit worried about the performance toll of this (pickling isn't exactly known to be cheap) but I think we should start by providing a functionally complete API. We already have ideas on how to make this faster if needed, for example by having users provide a custom DataPtr extractor tailored to their class via a decorator. (Or just use TorchScript).
ghstack-source-id: 127295014

Test Plan: Added a test later in the stack

Reviewed By: mrshenli

Differential Revision: D27887189

fbshipit-source-id: 9d27e4e62390b836e5bb4f06f401cc002f0cf95b
2021-04-24 07:06:28 -07:00
Luca Wehrstedt
15ca379bde Add CUDA support to a user-created torch.futures.Future (#56517)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56517

Currently a torch.futures.Future could wrap a CUDAFuture, but it could not create one from scratch. This prevented users from using CUDAFutures in some occasions, for example when using `rpc.functions.async_execution`, or in their own code. I don't see any reason for such a limitation, hence here I add support for this.
ghstack-source-id: 127261554

Test Plan: Added a test later in the stack

Reviewed By: mrshenli

Differential Revision: D27887190

fbshipit-source-id: ecbb39c1ad7cd189d478ded9c361448f05a270ad
2021-04-23 08:13:56 -07:00
BowenBao
818ce1d0d2 Add standardOps match more input type in ORT (#53813) (#56172)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56172

Enable the standardOps include **Add\Sub\Mul\Div\Gemm\Pow\Mod**  with low precision input in ORT

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D27866136

Pulled By: SplitInfinity

fbshipit-source-id: f2cf5649fffefd68c0cc7b6dce94198751636727
2021-04-21 17:58:08 -07:00
BowenBao
9986b109d2 [ONNX] Fix assign input shape for tuple inputs & primitive type inputs (#54112) (#56164)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56164

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D27866139

Pulled By: SplitInfinity

fbshipit-source-id: c59f5a07df685e1ccdc4860d603ec422ec80d188
2021-04-20 23:00:37 -07:00
Zhengxu Chen
8176ab6ca0 [JIT] Put explicit error message on class attribute accesses. (#55723)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55723

Resolving https://github.com/pytorch/pytorch/issues/51139

Test Plan:
python test/test_jit.py TestClassType.test_unresolved_attributes

Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D27691960

fbshipit-source-id: 1d078a4ab25af1a73109ca6ef0333a67a634bff6
2021-04-16 15:47:10 -07:00
Bert Maher
8e82e932f3 Reland: D27652485: [nnc] Enable CPU fusion only when num_threads == 1" (#56120)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56120

This reverts commit ad17fadbfc (D27786457).

The big annoyance here is that depending on the threading mode you may not be
able to toggle num_threads at will, so the fusion tests won't fail.

I hate this solution, but I'm adding a secondary override for the TE fuser.
Now you need to both turn on fusion (_jit_override_can_fuse_on_cpu), and you're
OK if you're running with 1 thread, or you can add
`_jit_set_texpr_parallel_cpu_enabled` to enable it anyways.

This is (a) mainly for tests, since a real user probably won't fiddle aimlessly
with the thread count, and (b) will go away once NNC's threading support is
fully baked.

Test Plan: Imported from OSS

Reviewed By: Krovatkin

Differential Revision: D27788199

Pulled By: bertmaher

fbshipit-source-id: 070d04474f15e9689dbdf8cc1fde43050c6506b1
2021-04-15 15:50:18 -07:00
Edward Yang
6ec71ed4f9 Replace all direct cdata access with THPVariable_Unpack (#55799)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55799

I'm going to change the implementation of cdata soon so I need to
abstract over cdata access with a function.  Additionally, many
users are casting manually casting to THPVariable to access
the member so I can remove these unsafe casts in the client code
(the implementation, of course, is still doing an unsafe cast.)

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D27712130

Pulled By: ezyang

fbshipit-source-id: 95fcc013bf3913d67f2c634068eb5b3aab144cb3
2021-04-15 08:57:04 -07:00
James Reed
71a5314591 Fix ScriptMethod dispatch on __torch_function__ (#56103)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56103

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D27784142

Pulled By: jamesr66a

fbshipit-source-id: 555dcb7c3a98b8fb9e9ca9b499cafad54e819aa7
2021-04-15 08:46:43 -07:00
Nikitha Malgi
88c06d9dfc Add cuda device synchronization support in JIT (#55469)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55469

Test Plan: Imported from OSS

Reviewed By: ZolotukhinM

Differential Revision: D27749077

Pulled By: nikithamalgifb

fbshipit-source-id: bce3d331ab781cf3232b47b4f02ef504b9eadc7e
2021-04-14 09:13:07 -07:00
Nikita Shulga
6a39613f35 [BE] Make torch/csrc/jit/tensorexpr/ clang-tidy clean (#55628)
Summary:
Mostly auto-generated changes using
```
 python3 tools/clang_tidy.py -c build -x torch/csrc/jit/tensorexpr/eval.cpp -s
```
With following common patterns manually fixed
- Use ` = default` instead of `{}`
- deleted methods should be public
- Use pass-by-value + std::move instead of pass-by-reference+copy

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55628

Reviewed By: walterddr

Differential Revision: D27655378

Pulled By: malfet

fbshipit-source-id: 92be87a08113435d820711103ea9b0364182c71a
2021-04-08 19:44:14 -07:00
Jacob Szwejbka
20d7916a6a [Pytorch Mobile] Fold Conv BatchNorm for functions besides forward (#54619)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54619

Minor refactor to conv batchnorm folding to work on other functions besides forward
ghstack-source-id: 125767010

Test Plan: unit test and {P339453712}

Reviewed By: kimishpatel

Differential Revision: D27301452

fbshipit-source-id: 4e0cc544a171a970583979a496b2908935124497
2021-04-06 13:07:12 -07:00
Nikitha Malgi
197f9f0826 Merge CUDA Streams and Events (#53902)
Summary:
-----------
- Updates current_stream and default stream API's to take `optional[device]` argument
- Adds parsing logic to replace `torch.cuda.Stream` and `torch.cuda.Event` -> `torch.classes.cuda.Stream` and `torch.classes.cuda.Event` for JIT
- Merges StreamContext manager for both Eager and JIT.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53902

Test Plan:
------
Run JIT tests:
python test/test_jit.py -v TestCUDA

Run eager tests:
python test/test_cuda.py -v TestCuda

Reviewed By: glaringlee

Differential Revision: D27494627

Pulled By: nikithamalgifb

fbshipit-source-id: b30b0570e38a33fb335c83762eb06ffd46a44b5c
2021-04-05 08:19:55 -07:00
Mike Ruberry
c0ac0fef4e Revert D27448156: irange for size_t
Test Plan: revert-hammer

Differential Revision:
D27448156 (041b4431b2)

Original commit changeset: 585da57d4de9

fbshipit-source-id: 8e047c29f391c0166e0a1a87c3fb2a0854377365
2021-04-03 19:14:00 -07:00
Richard Barnes
041b4431b2 irange for size_t (#55163)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55163

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27448156

fbshipit-source-id: 585da57d4de91c692b6360d65f7b8a66deb0f8c1
2021-04-02 23:22:29 -07:00
Meghan Lele
6866c033d5 [JIT] Add recursive scripting for class type module attributes (#55124)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55124

**Summary**
This commit modifies type inference (used by the module scripting code)
so that it tries to script the type of any class instances that it
encounters. This enables recursive, automatic scripting of class type
module attributes.

**Test Plan**
This commit adds a test case for this to `TestClassType`.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D23971883

Pulled By: SplitInfinity

fbshipit-source-id: 7a5a2e7c12ee68cbdeb0a07e6aaf98734a79cb06
2021-04-02 12:16:21 -07:00
Negin Raoof
cd9dd653e9 [ONNX] Support primitive type input/outputs and attributes (#53550) (#54864)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54864

Support primitive type attributes. Needed for Silero model.

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D27408982

Pulled By: SplitInfinity

fbshipit-source-id: 16b291eedbe9f9bb31d7664a29a484555df53755
2021-03-31 21:14:20 -07:00
Rohan Varma
a37fbf9b45 [Futures] Bump log verbosity when ignoring cb errors in python future. (#54476)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54476

Per title. For `add_done_callback`, we log but swallow exceptions in order to keep consistent with what concurrent.futures python library does, see discussion in https://github.com/pytorch/pytorch/pull/45675.

Although, it would be good to improve the verbosity here as this can be a source of confusion if users are setting a different future via `add_done_callback`, and an error is hit resulting in an unexpected hang (see https://github.com/pytorch/pytorch/issues/52132 for more details on how this can happen).
ghstack-source-id: 125300389

Test Plan: CI

Reviewed By: lw

Differential Revision: D27253004

fbshipit-source-id: 72ed21c8fb6d27de5797c17fc46b762f893e6fea
2021-03-31 15:17:06 -07:00
Jianyu Huang
7fc03dd7c9 Back out "[pytorch][PR] Merge CUDA Streams and Events" (#54996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54996

Original commit changeset: 45d9fee9a582

Test Plan: CI

Reviewed By: jspark1105

Differential Revision: D27444718

fbshipit-source-id: deb627230817923eaf84ade50ecb14bfbce4e779
2021-03-31 10:21:35 -07:00
Michael Suo
8a170fbacd [package] fix mangling issues with TorchScript (#54915)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54915

TorchScript and torch.package have different mangling schemes. To avoid
them interfering with each other, we should undo the torch.package
mangling before processing anything with TorchScript (since TS
independently makes sure that no names collide).

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D27410472

Pulled By: suo

fbshipit-source-id: d1cc013c532d9abb7fb9615122bc465ded4785bb
2021-03-31 00:58:05 -07:00
anjali411
1bccd48465 Allow creating SugaredValue for a complex valued IValue and deserialization logic for "infj" and "nanj" global constants (#54328)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54328

Test Plan: Imported from OSS

Reviewed By: nikithamalgifb

Differential Revision: D27369134

Pulled By: anjali411

fbshipit-source-id: aec26750a6fc8917ee15306684b743d13a91570c
2021-03-29 14:46:29 -07:00
Nikitha Malgi
416ba5c48f Merge CUDA Streams and Events (#53902)
Summary:
-----------
- Updates current_stream and default stream API's to take `optional[device]` argument
- Adds parsing logic to replace `torch.cuda.Stream` and `torch.cuda.Event` -> `torch.classes.cuda.Stream` and `torch.classes.cuda.Event` for JIT
- Merges StreamContext manager for both Eager and JIT.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53902

Test Plan:
------
Run JIT tests:
python test/test_jit.py -v TestCUDA

Run eager tests:
python test/test_cuda.py -v TestCuda

Reviewed By: SplitInfinity

Differential Revision: D27285996

Pulled By: nikithamalgifb

fbshipit-source-id: 45d9fee9a582b5f4c82330f5f99eb88584804270
2021-03-26 14:19:39 -07:00
anjali411
f9ca0d87a7 Teach Python TS frontend to parse complex literals (#52881)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52881

**This PR adds:**
1. logic to parse complex constants (complex literals of the form `bj`)
2. logic to parse complex lists
3. support for complex constructors: `complex(tensor/int/float/bool, tensor/int/float/bool)`
4. Limited operator support
     - `add`, `sub`, `mul`, `torch.tensor`, `torch.as_tensor`

**Follow-up work:**
1. Add complex support for unary and other registered ops.
2. support complex constructor with string as input (this is supported in Python eager mode).
3. Test all emitXYZ for all XYZ in `ir_emitter.cpp` (currently only emitConst, emitValueToTensor are tested). e.g., test loops etc.
4. onnx doesn't support complex tensors, so we should error out with a clear and descriptive error message.

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D27245059

Pulled By: anjali411

fbshipit-source-id: af043b5159ae99a9cc8691b5a8401503fa8d6f05
2021-03-24 08:12:17 -07:00
Christian Puhrsch
2668149b8c Export torch::jit::toIValue (#54449)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/54448

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54449

Reviewed By: SplitInfinity

Differential Revision: D27243154

Pulled By: cpuhrsch

fbshipit-source-id: fc21d6ce251b868356ad8ea13ae891fb56e311ce
2021-03-22 17:17:18 -07:00
Bin Bao
4626886f21 [JIT] Add CUDNN Conv-Add-Relu fusion for Frozen Model Optimization (#52102)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52102

Test Plan: Imported from OSS

Reviewed By: eellison

Differential Revision: D26646100

fbshipit-source-id: 7f7a82cc0b42c958b9e0c854b3b5dc6ea7cfff6c
2021-03-18 15:18:52 -07:00
James Reed
255b103c1b [WIP] Function to retrieve inspect.Signature instances for PyTorch ops (#53830)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53830

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26982802

Pulled By: jamesr66a

fbshipit-source-id: 18fddc9f3f34b09e173de59f2fe886f8eedd000e
2021-03-17 20:41:27 -07:00
Jacob Szwejbka
8f61b13e80 [Pytorch Mobile] Optimize Non Forward for Mobile (#53314)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53314

Introduction of api for optimizing non forward functions for mobile. As of this diff, all functions that you say to optimize will be preserved, and those functions will be run through canonical optimization. The intention is to stack each further optimization onto separate diffs since they touch multiple files, and it seems like it'd be a nightmare to review.
ghstack-source-id: 123909414

Test Plan:
torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward", "foo"]) runs fine

torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize={"foo"}) optimizes just foo if the model doesnt define forward otherwise optimizes foo and forward

torch.utils.mobile_optimizer.optimize_for_mobile(net, methods_to_optimize=["forward"]) runs fine

torch.utils.mobile_optimizer.optimize_for_mobile(net) runs fine if the model defines forward, Throws otherwise

Reviewed By: kimishpatel

Differential Revision: D26618689

fbshipit-source-id: 5bff1fb3f3f6085c4a649a8128af9c10f0fa9400
2021-03-17 14:31:24 -07:00
Thomas Viehmann
fd5c1123e4 wrap AliasDb in Python (#51336)
Summary:
Also added a wrapper tlemo 's graphviz export to string.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51336

Reviewed By: ezyang

Differential Revision: D26150809

Pulled By: eellison

fbshipit-source-id: 9beafce5cbdc1785b986b71c3cd986c1087faa11
2021-03-17 12:55:22 -07:00
BowenBao
57d1df071f [ONNX] Support inplace operations on inplace indexing (#52063) (#53306)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53306

* [ONNX] Fix for sequence of mutations in blocks (#51577)

Fixes consecutive mutations in a tensor inside blocks.
Also, support append and pop in blocks.

* Support inplace operations + indexing

* Clean up old pass for remove mutations

* Add loop test

* Fixes for set attr in loops

* Removing the new jit API flag

* [ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795)

With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes however, can not be supported by shape inference, since at that stage the operators in the graph are still ATen operators.

This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.

The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.

    The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.

    The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.

This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on Bool type, the code was outdated that back then Bool type doesn't exist.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.

~~PR depends on #51603~~

* Fix after merge

* clang

* Fix clang

* Fix clang

* Fix warning message.

* Fixes for non-model param attributes

* Fix for caffe2

* Additional test

* clang

* Skip test for lower opsets

* fix clang-tidy

* Update init.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Update remove_inplace_ops_for_onnx.cpp

* Fix for clang formatting

Test Plan: Imported from OSS

Reviewed By: pbelevich, malfet

Differential Revision: D26922416

Pulled By: SplitInfinity

fbshipit-source-id: e7108620b39b6404c594910786c4d275fee59d84

Co-authored-by: Bowen Bao <bowbao@microsoft.com>
2021-03-12 02:49:11 -08:00
BowenBao
3f9c803fe8 [ONNX] Redesign onnx pass to enable shape type dependent pattern conversion - cont (#51795) (#53304)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53304

With the introduction of ONNX shape inference, shape and type are inferred on the fly as operators get converted from ATen to ONNX when running symbolic function. This resolves the shape/type requirement for the symbolic functions. The pre-onnx passes however, can not be supported by shape inference, since at that stage the operators in the graph are still ATen operators.

This PR is to update the design of ONNX pass, to enable a mechanism of capturing subgraphs of ATen operators of certain patterns, and convert them later, when shape/type information of upstream operators are available.

The new design will require pre-onnx passes that need shape/type to be written in two parts, encapsulation and conversion.

    The encapsulation part will find the nodes of patterns, like how pre-onnx passes were written previously. But instead of converting the nodes, it will encapsulate them into a sub-block of a new placeholder node. This part is called before onnx pass, so it runs before calling symbolic functions.

    The conversion part will be called inside the onnx pass. In onnx pass, run_symbolic_func will be called for each node in topological order. When it reaches the placeholder node, the conversion part will be invoked. It will convert the nodes inside the sub-block based on pattern. By that time, it will have shape/type of upstream operators available. After the conversion is complete, the placeholder node will be removed, and nodes inside its sub-block converted. Run_symbolic_func will be called for these nodes, and they will be converted from ATen operator to ONNX operator.

This PR includes several other fixes, listed below.
* ~~replace helper.cpp with onnx_utils.cpp for holding utility functions.~~
* fix EraseNumberTypes on Bool type, the code was outdated that back then Bool type doesn't exist.
* ~~enable onnx shape inference in export with parameter/initializer data.~~
* other code clean ups.
* fix insertion of identity nodes for loop opset 13 sequence output.

~~PR depends on #51603~~

Test Plan: Imported from OSS

Reviewed By: SplitInfinity

Differential Revision: D26922417

Pulled By: malfet

fbshipit-source-id: 14ed06158d539e2451c2e5e63ba1b32fb0f75095
2021-03-11 10:30:09 -08:00
Nikitha Malgi
cfaa0bf286 [JIT] Update Namespace from cuda to _cuda (#53378)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/53378

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26970607

Pulled By: nikithamalgifb

fbshipit-source-id: 20a55dd9c0071c5870a4b176d30cb9c1e1496687
2021-03-11 00:52:01 -08:00
Michael Suo
b4d8f4af82 [package] implement get_resource_reader API (#51674)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51674

See
https://docs.python.org/3/library/importlib.html#importlib.abc.ResourceReader

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26237034

Pulled By: suo

fbshipit-source-id: 4c19f6172d16b710737528d3de48372873b9368d
2021-03-10 12:11:11 -08:00
Meghan Lele
60ed8fb244 [JIT] Enable ModuleList non-literal indexing (#53410)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53410

**Summary**
This commit enables indexing into `ModuleList` using a non-literal
index if the LHS of the assignment statement of which the indexing is
the RHS is annotated with an interface type.

This feature already exists for `ModuleDict`, and this commit builds on
top of that implementation. A `prim::ModuleContainerIndex` operator is
emitted for any statement of the form `lhs: InterfaceType =
module_container[idx]`. The same operator has to be used for both
`ModuleDict` and `ModuleList` because serialization does not preserve
the metadata that indicates whether a `Module` is a `ModuleDict` or
`ModuleList`.

**Testing**
This commit extends the existing unit tests for non-literal `ModuleDict`
indexing to test non-literal `ModuleList` indexing.

**Fixes**
This commit fixes #47496.

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D26857597

Pulled By: SplitInfinity

fbshipit-source-id: d56678700a264d79aae3de37ad6b08b080175f7c
2021-03-09 16:11:34 -08:00
Sean Silva
34d9278c19 Remove notion of "level" from Module::dump_to_str. (#52539)
Summary:
The code uses `torch::jit::jit_log_prefix` for handling recursive
indenting in most places in this function. There was one place that was
using "level", but it was buggy -- it would result in a compounding
superlinear indent. Note that changing it to "level+1" doesn't fix the
bug.

Before/after:
https://gist.github.com/silvasean/8ee3ef115a48de6c9c54fbc40838d8d7

The new code establishes a recursive invariant for
`Module::dump_to_str`: the function returns the module printed at the
base indent level (i.e. no indent). `torch::jit:log_prefix` is used
to prefix recursive calls. The code was already nearly there, except for
this spurious use of "level".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52539

Reviewed By: navahgar

Differential Revision: D26773657

Pulled By: gmagogsfm

fbshipit-source-id: ab476f0738bf07de9f40d168dd038dbf62a9a79e
2021-03-09 05:45:57 -08:00
Raghavan Raman
d3cde6c23c [NNC] Implementation for aten::cat without conditionals. (#53128)
Summary:
This PR adds an implementation for `aten::cat` in NNC without any conditionals. This version is not enabled by default.

Here is the performance of some micro benchmarks with and without conditionals. There is up to 50% improvement in performance without conditionals for some of the shapes.

aten::cat implementation in NNC **with** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 5.44 us, SOL 0.26 GB/s, algorithmic 0.51 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.75 us, SOL 1.05 GB/s, algorithmic 2.10 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.87 us, SOL 4.05 GB/s, algorithmic 8.11 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 14.52 us, SOL 8.31 GB/s, algorithmic 16.62 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 9.58 us, SOL 6.84 GB/s, algorithmic 13.68 GB/s
```
aten::cat implementation in NNC **without** conditionals
```
$ python -m benchmarks.tensorexpr --device cpu --mode fwd --jit_mode trace --cpu_fusion --cat_wo_conditionals concat
pt: concat2d2input_fwd_cpu_1_160_1_14_1: 4.67 us, SOL 0.30 GB/s, algorithmic 0.60 GB/s
pt: concat2d2input_fwd_cpu_1_580_1_174_1: 5.65 us, SOL 1.07 GB/s, algorithmic 2.14 GB/s
pt: concat2d2input_fwd_cpu_20_160_20_14_1: 6.10 us, SOL 4.56 GB/s, algorithmic 9.12 GB/s
pt: concat2d2input_fwd_cpu_20_580_20_174_1: 7.44 us, SOL 16.22 GB/s, algorithmic 32.44 GB/s
pt: concat2d2input_fwd_cpu_8_512_8_512_1: 6.46 us, SOL 10.14 GB/s, algorithmic 20.29 GB/s
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53128

Reviewed By: bertmaher

Differential Revision: D26758613

Pulled By: navahgar

fbshipit-source-id: 00f56b7da630b42bc6e7ddd4444bae0cf3a5780a
2021-03-07 22:57:02 -08:00
James Reed
1fe6a6507e [WIP][FX] Fix tracing support for torchbind (#52884)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52884

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D26675801

Pulled By: jamesr66a

fbshipit-source-id: 8e5100bcea17589a53163abf6ab991658e11fa3a
2021-03-05 23:40:16 -08:00
Bram Wasti
56f8379802 [static runtime] Move all heavy constructor logic into InferenceModule (renamed to StaticModule) (#51564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51564

Constructor logic was spread throughout InferenceModule and StaticRuntime.  This diff unifies the two.  After a lot of discussion on this diff D25961626 it became apparent that `clone` is uglier than a cheap StaticRuntime.

This means StaticRuntime is effectively StaticModule and the only code in the new StaticRuntime is the `run` functions.

```
graph, schema = PrepareForStaticModule(torchscript_module)
sm = StaticModule(graph, schema, options)
sm(inputs)
// or create many cheap runtimes with the module
sr = StaticRuntime(sm)
sr(inputs)
```

Changelist:
- Rename InferenceModule StaticModule
- Move all logic for construction into StaticModule
- Create a new StaticRuntime that only has a unique memory planner (everything else is in StaticModule)
- Update comments with explanation
- Propagate all changes to predictor integration
- Propagate all changes to python integration
- Change semantics to be a bit more PyTorch-standard (no "run" calls, no "get_" getters).

Test Plan:
buck test //caffe2/test:static_runtime
buck test caffe2/benchmarks/static_runtime:static_runtime_cpptest

Reviewed By: hlu1

Differential Revision: D25592967

fbshipit-source-id: 8233bed03137ce129137af2d44bce0095033ef0f
2021-03-05 10:15:26 -08:00
Joel Schlosser
6557ea0509 Context manager for hiding source ranges (#53188)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/52456

## Background

Provides a context manager `_hide_source_ranges()` that disables printing graph source ranges by default. It can be overridden on a per-graph basis if desired.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/53188

Test Plan:
```
python test/test_jit.py TestJit.test_hide_source_ranges_context_manager
```

```python
import torch

torch.jit.script
def foo(x):
    return torch.add(x, x)

print(foo.graph)
with torch.jit._hide_source_ranges():
    print(foo.graph)

    # Override context manager
    print(foo.graph.str(print_source_ranges=True))

print(foo.graph)
```

```
graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3)
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)

graph(%x.1 : Tensor):
  %3 : int = prim::Constant[value=1]()
  %4 : Tensor = aten::add(%x.1, %x.1, %3) # /Users/jbschlosser/misc/example.py:5:11
  return (%4)
```

Reviewed By: walterddr, zhangguanheng66

Differential Revision: D26817070

Pulled By: jbschlosser

fbshipit-source-id: e9d123452c616b0a9dda9e134ef6c2886f229d9b
2021-03-04 09:11:08 -08:00
Tugsbayasgalan Manlaibaatar
4008df3507 Add property binding in torchbind (#50670)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50670

This PR adds property support to Torchbind. There are two cases that it needs to work:

**Torchscript**
Inside Torchscript, we don't go through pybind so there is no issue with accessing properties through ClassType.

**Eager Mode**
In Eager Mode, Torchbind creates ScriptObject which we cannot dynamically add (aka access) properties after initializing it. (https://stackoverflow.com/questions/1325673/how-to-add-property-to-a-class-dynamically
) Therefore we created a Python wrapper (ScriptObjectWrapper) around ScriptObject where we can use property method to set properties.  By doing so, we can look up wrapped object's property through __getattr__ method of the ScriptObjectWrapper. This logic is inspired from https://github.com/pytorch/pytorch/pull/44324

Test Plan:
test cases in test_torchbind.py

Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26632781

fbshipit-source-id: dd690887cfda0c48ff0d104aa240ce0ab09055bc
2021-03-03 14:25:52 -08:00
Elias Ellison
bfae3789ba Move conv to mkldnn (#51483)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51483

This PR moves the conv weights of a frozen model to MKLDNN, and AOT reorders the weights. When the weights are already in MKLDNN, just computing a single conv by converting the input and output from/to mkldnn provides large speedups. I benchmark'd the results of the top 200 shapes in predictor [here](https://www.internalfb.com/phabricator/paste/view/P171537938), as well as verified that it sped up popular models in torchvision.

Test Plan: Imported from OSS

Reviewed By: navahgar

Differential Revision: D26696703

Pulled By: eellison

fbshipit-source-id: 0b4441bee4f6e0890a4540fbca3bb5e58b8c5adf
2021-03-01 21:19:27 -08:00
jiej
4d94ee566e Ge v1 (#52136)
Summary:
This is a second attempt to use graph executor to run forward on a gradient. This allows a secondary chance to profile intermediate tensor introduced by autodiff.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52136

Reviewed By: pbelevich

Differential Revision: D26693978

Pulled By: Krovatkin

fbshipit-source-id: 91dde8009a210950af8e5173668ada241e16dd52
2021-02-28 00:53:13 -08:00
Meghan Lele
1d6bd15790 [JIT] Add torch._C._jit submodule (#52910)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52910

**Summary**
PR #52158 tried to move all JIT bindings from `torch._C` to a new
submodule `torch._C._jit`, but that...did not go well. This pull request
adds the new `torch._C._jit` submodule, but does not migrate the
existing bindings. Instead, it adds a unit test that fails if any new
bindings are added to `torch._C`. A comment in the test instructs
developers to add their new binding to the allowlist if it really should
be in `torch._C`, or to add it to the appropriate submodule (e.g
`torch._C._jit`, for example). The idea is to prevent the issue
described in #51691 from getting *worse* if it cannot be fixed.

**Test Plan**
Continuous integration.

**Fixes**
This commit fixes #51691.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D26698373

Pulled By: SplitInfinity

fbshipit-source-id: ec9f5426051227a513d4fd09512b624420e0100b
2021-02-26 16:05:05 -08:00
Lillian Johnson
b72a72a477 torch.Package extend PyTorchStreamWriter to track written records (#52218)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52218

Test Plan: Imported from OSS

Reviewed By: suo

Differential Revision: D26429794

Pulled By: Lilyjjo

fbshipit-source-id: 5f68e7991c673ada629d0370c705520243d0637a
2021-02-22 15:02:41 -08:00
Nikolay Korovaiko
847d1d4d53 add debug_flush_compilation_cache to Method (#52317)
Summary:
Forgot to add `debug_flush_compilation_cache ` to `Method` as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52317

Reviewed By: bdhirsh

Differential Revision: D26583313

Pulled By: Krovatkin

fbshipit-source-id: 1b3e503950cc3314796aff53b3b8038d16767870
2021-02-22 12:31:09 -08:00
Zachary DeVito
60518d10f6 [deploy] torch::deploy API (#51754)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51754

This API allows you to manage multiple python interpreters in a single
process to deploy PyTorch models packaged with torch.package.

torch/csrc/deploy/deploy.h contains the API definition
torch/csrc/deploy/test_deploy.cpp has some examples.

Notes:
* mutex is added to PyTorchStreamReader to make it safe to use from multiple threads at once.
* USE_DEPLOY is only true for the special libtorch_deployinterpreter.so library, when enabled
  we use a hash table to maintain PyObject <> at::Tensor mappping rather than the internal pointer
  in Tensor since >1 interpreter may have a reference to the tensor.
* serialization.py has some additional functions for creating pickle objects
  but keeping storages in memory for use transfering tensors between interpreters

Test Plan: Imported from OSS

Reviewed By: wconstab

Differential Revision: D26329468

Pulled By: zdevito

fbshipit-source-id: d75f4ebb9a27f1d911179d9996041bcb3ca04a07
2021-02-18 02:30:08 -08:00
Nikolay Korovaiko
0019a20a2b [WIP] Add a _flush_compilation_cache for testing (#52001)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52001

Reviewed By: eellison

Differential Revision: D26371876

Pulled By: Krovatkin

fbshipit-source-id: db773d7124916bad31e80bdd7bb9b4170060977b
2021-02-16 10:49:38 -08:00
Meghan Lele
73de98204d [JIT] Add static method support for TorchBind (#51177)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51177

**Summary**
This commit adds support for static methods to TorchBind. Just like
pybind, the API for declaring a static method is `def_static(...)`. A
static method must be called on the class directly, and can be called
both in Python as well as TorchScript.

Support for static methods is implemented in a manner similar to that of
instance methods. Registered static functions are wrapped in a layer of
unboxing logic, their schemas are inferred using templates and
metaprogramming, and they are added to the `ClassType` object
corresponding to the TorchBind class on which they are registered.
ScriptClass has been extended to support a `__getattr__` function so
that static methods of TorchBind classes can be invoked in Python. The
implementation of `__getattr__` returns `ScriptClassFunctionPtr`, a
version of `StrongFunctionPtr` without a compilation unit (since the
functions of a TorchBind class live inside the TorchBind registry).
Within TorchScript, TorchBind static functions are desugared in
`PythonClassValue::attr` by looking them up on the class type of the
`PythonClassValue` instance.

**Test Plan**
This commit adds a unit test that tests a simple static method on a
TorchBind class.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26356942

Pulled By: SplitInfinity

fbshipit-source-id: 1b6a9bc2e5f3e22071ad78e331a0201fbbf7ab30
2021-02-13 19:41:27 -08:00
Yanan Cao
705fa7e964 [Usability] Capture argument names for traced functions and modules (#51775)
Summary:
Previously `torch.jit.trace` relies on AutoGrad hooks to infer name of tensors in computation, including those of function/method arguments. This often doesn't work out because:

- These names often do not exist
- Tracer uses argument name of first tensor operation on each tensor as inferred argument names. These tensor operations have programmatically-generated names like `argument_1`

This PR extracts argument names directly from Python functions and pass them down to tracer, which then assigns them to correct graph inputs. This way, we always have the correct argument names captured in IR.

This is useful for both debugging and supporting using `InterfaceType` to represent traced modules.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51775

Reviewed By: izdeby

Differential Revision: D26273105

Pulled By: gmagogsfm

fbshipit-source-id: 934a385041137dc3731bb6fa8657b11532fed9e5
2021-02-10 18:28:08 -08:00
Thomas Viehmann
bd6248106b Keep alive graph when creating iterators from it (#51951)
Summary:
Previously, the graph might have been delete while Python still has iterators, leading to segfaults.

This does not fully work for iterators from Nodes and Blocks as they may be invalidated when the owning graph goes out of scope. I will look into these separately.

Fixes https://github.com/pytorch/pytorch/issues/50454

Pull Request resolved: https://github.com/pytorch/pytorch/pull/51951

Reviewed By: mrshenli

Differential Revision: D26352629

Pulled By: SplitInfinity

fbshipit-source-id: 67299b6cbf1ac7ab77f8703a0ca8f1162e03fcd4
2021-02-10 11:09:51 -08:00
Michael Suo
c357f8b826 [package] make torch.package produce unified format (#51826)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51826

Looks like this:
```
resnet.pt
├── .data  # Data folder named so it can't clash with torch.package codemodules.
│   │      # Names/extensions automatically added to avoid namingconflicts.
│   ├── 94286146172688.storage   # tensor data
│   ├── 94286146172784.storage
│   ├── extern_modules           # torch.package metadata
│   ├── version                  # version metadata
│   └── ...
├── model  # package pickled model created w/
│   │      # exporter.save_pickel('model','model.pkl', resnet_model)
│   └── model.pkl
└── torchvision  # all code dependencies for packaged picked
    └── models   # models are captured as source files
            ├── resnet.py
                    └── utils.py
```

Since `version` is hardcoded in our zip reader/writer implementation,
add it as an option that defaults to "version" but accepts other
locations for putting the version metadata.

Test Plan: Imported from OSS

Reviewed By: zdevito

Differential Revision: D26295649

Pulled By: suo

fbshipit-source-id: 2d75feeb7de0f78196b4d0b6e2b814a7d58bd1dd
2021-02-09 07:45:59 -08:00
Yanan Cao
1065c2d5b6 Fix clang-tidy warnings in python_sugared_value.{h,cpp} (#51703)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51703

Reviewed By: gchanan

Differential Revision: D26245798

Pulled By: gmagogsfm

fbshipit-source-id: 01620adca820968324687982cc48390ff9336d20
2021-02-04 21:29:40 -08:00
Rohan Varma
c941730b96 [JIT/Futures] support set_exception api (#50983)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50983

There is currently no way to handle/propagate errors with the python-based futures API (they are raised correctly if set with an error, but this is only possible from C++).

This diff allows the Future's `unwrap_func` to be set in python optionally, so users can set futures completed with an exception and the error will throw as expected. This is mostly to support the following use case in the next diff:

```
ret_fut = torch.futures.Future(unwrap_func = lambda python_result: {
    # throw exception if needed
    if isinstance(python_result, Exception):
        throw python_result
})

rpc_fut = rpc.rpc_async(...) # RPC future that times out
# Goal is to propagate RPC error to this future
rpc_fut.add_done_callback(
res => {
    # Note that ret_fut.set_result(res.wait()) won't propagate the error
    try:
        ret_fut.set_result(res.wait())
    except Exception as e:
        ret_fut.set_result(e)
}
)
```
ghstack-source-id: 121021434

Test Plan:
unittest
```
buck test mode/dev-nosan mode/no-gpu //caffe2/test:futures -- te
st_unwrap --print-passing-details
```

Reviewed By: mrshenli

Differential Revision: D25950304

fbshipit-source-id: 7ee61e98fcd783b3f515706fa141d538e6d2174d
2021-02-04 20:22:19 -08:00
Thomas Viehmann
86861095fa Graceful invalidation of Python Node/Value/Block when C++ object is deleted (#50326)
Summary:
Previously we might have gotten segfaults and all, now it raises an exception.
Thread safety hasn't been an objective.

I have a followup to expand the Python interface for the API.

Fixes https://github.com/pytorch/pytorch/issues/49969.

wanchaol

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50326

Reviewed By: pbelevich

Differential Revision: D26096234

Pulled By: gmagogsfm

fbshipit-source-id: 5425772002eb4deb3830ed51eaa3964f22505840
2021-02-04 01:34:46 -08:00
anjali411
18a7ec7d7d Update the JIT complex type name to be consistent with Python (#51476)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51476

Test Plan: Imported from OSS

Reviewed By: ezyang

Differential Revision: D26179237

Pulled By: anjali411

fbshipit-source-id: 6a5c60c8545eb42416583836b8038ceffd3f3244
2021-02-03 09:59:08 -08:00
Yanan Cao
351ee1ece7 Remove duplicate check for THPLayout in toSugaredValue (#51543)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51543

Reviewed By: Lilyjjo

Differential Revision: D26202297

Pulled By: gmagogsfm

fbshipit-source-id: f0d40c9d73b579a68e34c54b004d329fd3b76ff3
2021-02-02 12:34:29 -08:00
Meghan Lele
751c30038f [JIT] Properly convert Python strings implictly to device (#51340)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51340

**Summary**
`toIValue` assumes that any value passed for an argument of type
`torch.device` is a valid device object, even when it is not. This can
lead to device type arguments of functions being assigned incorrect
values (see #51098).

This commit adds an explicit check that the passed in object is indeed a
`torch.device` using `THPDevice_Check` and only then does is it
converted to an `IValue`. Since implicit conversion from strings to
devices is generally allowed, if `THPDevice_Check` fails, it is assumed
that the object is a string and an `IValue` containing a `c10::Device`
containing the passed in string is returned.

**Test Plan**
This commit adds a unit test to `test_jit.py` to test that invalid
strings passed as devices are not longer silently accepted.

**Fixes**
This commit fixes #51098.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D26187190

Pulled By: SplitInfinity

fbshipit-source-id: 48c990203431da30f9f09381cbec8218d763325b
2021-02-02 10:57:56 -08:00