Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43298
IR emitter uses `ModuleValue` to represent ScriptModules and emit IR for
attribute access, submodule access, etc.
`ModuleValue` relies on two pieces of information, the JIT type of the
module, and the `ConcreteModuleType`, which encapsulates Python-only
information about the module.
ScriptModules loaded from a package used to create a dummy
ConcreteModuleType without any info in it. This led to divergences in
behavior during compilation.
This PR makes the two ways of constructing a ConcreteModuleType equivalent,
modulo any py-only information (which, by definition, is never present in
packaged files anyway).
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D23228738
Pulled By: suo
fbshipit-source-id: f6a660f42272640ca1a1bb8c4ee7edfa2d1b07cc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43284
The IR emitter looks for attributes on modules like:
1. Check the JIT type for the attribute
2. Check the originating Python class, in order to fulfill requests for, e.g. static methods or ignored methods.
In the case where you do:
```
inner_module = torch.jit.load("inner.pt")
wrapped = Wrapper(inner_module) # wrap the loaded ScriptModule in an nn.Module
torch.jit.script(wrapped)
```
The IR emitter may check for attributes on `inner_module`. There is no
originating Python class for `inner_module`, since it was directly
compiled from the serialized format.
Due to a bug in the code, we don't guard for this case an a segfault
results if the wrapper asks for an undefined attribute. The lookup in
this case looks like:
1. Check the JIT type for the attribute (not there!)
2. Check the originating Python class (this is a nullptr! segfault!)
This PR guards this case and properly just raises an attribute missing
compiler error instead of segfaulting.
Test Plan: Imported from OSS
Reviewed By: bertmaher
Differential Revision: D23224337
Pulled By: suo
fbshipit-source-id: 0cf3060c427f2253286f76f646765ec37b9c4c49
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43965
As part of a larger effort to unify the API between the lite interpreter and full JIT:
- implement torch::jit::mobile::Method, a proxy for torch::jit::mobile::Function
- add support for overloaded operator() to mobile Method and Function
- mobile find_method now returns a c10::optional<Method> (so signature matches full jit)
- moves some implementation of Function from module.cpp to function.cpp
ghstack-source-id: 111161942
Test Plan: CI
Reviewed By: iseeyuan
Differential Revision: D23330762
fbshipit-source-id: bf0ba0d711d9566c92af31772057ecd35983ee6d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43248
We add the support of __torch_function__ override for C++ custom op. The logic is the same as the other components, like torch.nn.Module.
Refactored some code a little bit to make it reusable.
Test Plan: buck test //caffe2/test:fx -- test_torch_custom_ops
Reviewed By: bradleyhd
Differential Revision: D23203204
fbshipit-source-id: c462a86e407e46c777171da32d7a40860acf061e
Summary:
It is often that the conversion from torch operator to onnx operator requires input rank/dtype/shape to be known. Previously, the conversion depends on tracer to provide these info, leaving a gap in conversion of scripted modules.
We are extending the export with support from onnx shape inference. If enabled, onnx shape inference will be called whenever an onnx node is created. This is the first PR introducing the initial look of the feature. More and more cases will be supported following this PR.
* Added pass to run onnx shape inference on a given node. The node has to have namespace `onnx`.
* Moved helper functions from `export.cpp` to a common place for re-use.
* This feature is currently experimental, and can be turned on through flag `onnx_shape_inference` in internal api `torch.onnx._export`.
* Currently skipping ONNX Sequence ops, If/Loop and ConstantOfShape due to limitations. Support will be added in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40628
Reviewed By: mrshenli
Differential Revision: D22709746
Pulled By: bzinodev
fbshipit-source-id: b52aeeae00667e66e0b0c1144022f7af9a8b2948
Summary:
In case we want to store binary files using `ScriptModule.save(..., _extra_files=...)` functionality. With python3 we can just use bytes only and not bother about it.
I had to do a copy-pasta from pybind sources, maybe we should upstream it, but it'd mean adding a bunch of template arguments to `bind_map` which is a bind untidy.
Let me know if there's a better place to park this function (it seems to be the only invocation of `bind_map` so I put it in the same file)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43241
Reviewed By: zdevito
Differential Revision: D23205244
Pulled By: dzhulgakov
fbshipit-source-id: 8f291eb4294945fe1c581c620d48ba2e81b3dd9c
Summary:
[Re-review tips: nothing changed other than a type in python_ir.cpp to fix a windows build failure]
Adds code printing for enum type
Enhance enum type to include all contained enum names and values
Adds code parsing for enum type in deserialization
Enabled serialization/deserialization test in most TestCases. (With a few dangling issues to be addressed in later PRs to avoid this PR grows too large)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43460
Reviewed By: albanD
Differential Revision: D23284929
Pulled By: gmagogsfm
fbshipit-source-id: e3e81d6106f18b7337ac3ff5cd1eeaff854904f3
Summary:
This patch allows to freeze model that utilizes interfaces. Freezing works
under the user assumption that the interfase module dones not aliases with
any value used in the model.
To enable freezing of such modules, added an extra pramater:
torch._C._freeze_module(module, ignoreInterfaces = True)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41860
Reviewed By: eellison
Differential Revision: D22670566
Pulled By: bzinodev
fbshipit-source-id: 41197a724bc2dca2e8495a0924c224dc569f62a4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42963
* Adds code printing for enum type
* Enhance enum type to include all contained enum names and values
* Adds code parsing for enum type in deserialization
* Enabled serialization/deserialization test in most TestCases. (With a few dangling issues to be addressed in later PRs to avoid this PR grows too large)
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D23223281
Pulled By: gmagogsfm
fbshipit-source-id: 716d1866b7770dfb7bd8515548cfe7dc4c4585f7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42869
We realized that when we invoke a simple callback that divides the tensors by `world_size` after `allreduce`, the performance was almost 50% lower in terms of QPS compared to the case where a simple `allreduce` hook is used with no `then` callback.
The main problem was as we call `work.wait()` before invoking `then` callback, we were synchronizing `work`'s stream with the default PyTorch stream inside [`runHook`](https://github.com/pytorch/pytorch/blob/master/torch/csrc/distributed/c10d/reducer.cpp#L609) and stalling the backward computation.
In that PR, we ensure that FutureNCCL's `then` callback is not stalling the backward computation. Assuming single-process single-device, `FutureNCCL` gets a new stream from device's pool using `at::cuda::getStreamFromPool` to run `callback` and before invoking the `callback` inline it synchronizes `WorkNCCL`'s stream by callback's stream not the default stream.
ghstack-source-id: 110208431
Test Plan: Run performance benchmark tests to validate performance issue is resolved. Also, `python test/distributed/test_c10d.py` to avoid any odd issues.
Reviewed By: pritamdamania87
Differential Revision: D23055807
fbshipit-source-id: 60e50993f1ed97497514eac5cb1018579ed2a4c5
Summary:
The ONNX spec for the Squeeze operator:
> Remove single-dimensional entries from the shape of a tensor. Takes a parameter axes with a list of axes to squeeze. If axes is not provided, all the single dimensions will be removed from the shape. If an axis is selected with shape entry not equal to one, an error is raised.
Currently, as explained in issue https://github.com/pytorch/pytorch/issues/36796, it is possible to export such a model to ONNX, and this results in an exception from ONNX runtime.
Fixes https://github.com/pytorch/pytorch/issues/36796.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38476
Reviewed By: hl475
Differential Revision: D22158024
Pulled By: houseroad
fbshipit-source-id: bed625f3c626eabcbfb2ea83ec2f992963defa19
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42389
**Summary**
This commit adds support for properties to TorchScript classes,
specifically for getters and setters. They are implemented essentially
as pointers to the methods that the corresponding decorators decorate,
which are treated like regular class methods. Deleters for properties
are considered to be out of scope (and probably useless for TorchScript
anyway).
**Test Plan**
This commit adds a unit test for a class with a property that has both
getter and setter and one that has only a getter.
`python test/test_jit.py TestClassType.test_properties`
Test Plan: Imported from OSS
Reviewed By: eellison, ppwwyyxx
Differential Revision: D22880232
Pulled By: SplitInfinity
fbshipit-source-id: 4828640f4234cb3b0d4f3da4872a75fbf519e5b0
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/42133
Test Plan:
We save a module with module debugging information as follows.
```
import torch
m = torch.jit.load('./detect.pt')
# Save module without debug info
m._save_for_lite_interpreter('./detect.bc')
# Save module with debug info
m._save_for_lite_interpreter('./detect.bc', _save_debug_info_in_bytecode=True)
```
Size of the file without module debugging information: 4.508 MB
Size of the file with module debugging information: 4.512 MB
Reviewed By: kimishpatel
Differential Revision: D22803740
Pulled By: taivu1998
fbshipit-source-id: c82ea62498fde36a1cfc5b073e2cea510d3b7edb
Summary:
The premise of this approach is that a small subset of neural networks are well represented by a data flow graph. The README contains more information.
The name is subject to change, but I thought it was a cute reference to fire.
suo let me know if you'd prefer this in a different spot. Since it lowers a JIT'd module directly I assumed the JIT folder would be appropriate. There is no exposed Python interface yet (but is mocked up in `test_accelerant.py`)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42753
Reviewed By: zou3519
Differential Revision: D23043771
Pulled By: bwasti
fbshipit-source-id: 5353731e3aae31c08b5b49820815da98113eb551
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42740
Adds a pass to hoist conv packed params to root module.
The benefit is that if there is nothing else in the conv module,
subsequent passes will delete it, which will reduce module size.
For context, freezing does not handle this because conv packed
params is a custom object.
Test Plan:
```
PYTORCH_JIT_LOG_LEVEL=">hoist_conv_packed_params.cpp" python test/test_mobile_optimizer.py TestOptimizer.test_hoist_conv_packed_params
```
Imported from OSS
Reviewed By: kimishpatel
Differential Revision: D23005961
fbshipit-source-id: 31ab1f5c42a627cb74629566483cdc91f3770a94
Summary:
[5/N] Implement Enum JIT support
Implement Enum class iteration
Add aten.ne for EnumType
Supported:
Enum-typed function arguments
using Enum type and comparing them
Support getting name/value attrs of enums
Using Enum value as constant
Support Enum-typed return values
Support iterating through Enum class (enum value list)
TODO:
Support serialization and deserialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42661
Reviewed By: SplitInfinity
Differential Revision: D22977364
Pulled By: gmagogsfm
fbshipit-source-id: 1a0216f91d296119e34cc292791f9aef1095b5a8
Summary:
in `_jit_pass_onnx`, symbolic functions are called for each node for conversion. However, there are nodes that cannot be converted without additional context. For example, the number of outputs from split (and whether it is static or dynamic) is unknown until the point where it is unpacked by listUnpack node. This pass does a preprocess, and prepares the nodes such that enough context can be received by the symbolic function.
* After preprocessing, `_jit_pass_onnx` should have enough context to produce valid ONNX nodes, instead of half baked nodes that replies on fixes from later postpasses.
* `_jit_pass_onnx_peephole` should be a pass that does ONNX specific optimizations instead of ONNX specific fixes.
* Producing more valid ONNX nodes in `_jit_pass_onnx` enables better utilization of the ONNX shape inference https://github.com/pytorch/pytorch/issues/40628.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41832
Reviewed By: ZolotukhinM
Differential Revision: D22968334
Pulled By: bzinodev
fbshipit-source-id: 8226f03c5b29968e8197d242ca8e620c6e1d42a5
Summary:
* move both under new file `fixup_onnx_controlflow`
* move the fixup to where the ONNX loop/if node is created, as oppose to running the fixup as postpass. This will help with enable onnx shape inference later.
* move `fuseSequenceSplitConcat` to `Peephole`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40943
Reviewed By: mrshenli
Differential Revision: D22709999
Pulled By: bzinodev
fbshipit-source-id: 51d316991d25dc4bb4047a6bb46ad1e2401d3d2d
Summary:
Raise and assert used to have a hard-coded error message "Exception". User provided error message was ignored. This PR adds support to represent user's error message in TorchScript.
This breaks backward compatibility because now we actually need to script the user's error message, which can potentially contain unscriptable expressions. Such programs can break when scripting, but saved models can still continue to work.
Increased an op count in test_mobile_optimizer.py because now we need aten::format to form the actual exception message.
This is built upon an WIP PR: https://github.com/pytorch/pytorch/pull/34112 by driazati
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41907
Reviewed By: ngimel
Differential Revision: D22778301
Pulled By: gmagogsfm
fbshipit-source-id: 2b94f0db4ae9fe70c4cd03f4048e519ea96323ad
Summary:
[3/N] Implement Enum JIT support
* Add enum value as constant support
* Add sugared value for EnumClass
Supported:
Enum-typed function arguments
using Enum type and comparing them
Support getting name/value attrs of enums
Using Enum value as constant
TODO:
Add PyThon sugared value for Enum
Support Enum-typed return values
Support serialization and deserialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42085
Reviewed By: eellison
Differential Revision: D22758042
Pulled By: gmagogsfm
fbshipit-source-id: 5c6e571686c0b60d7fbad59503f5f94b3b3cd125
Summary:
Currently constant pooling runs before const propagation, which can create more constants that need pooling. This can get in the way of serialization/deserialization stability because each time user serializes and deserializes a module, runCleanUpPasses is called upon it. Doing so multiple times would lead to different saved module.
This PR moves constant pooling after const propagation, which may slow down const propagation a little bit, but would otherwise side-step aforementioned problem.
test_constant_insertion in test_jit.py is also updated because after fixing the pass ordering, the number of constants is no longer a constant and it is extremely difficult to get the exact number with the current convoluted test structure. So for now, I changed the test to check only that CSE doesn't change number of "prim::constant" rather than comparing against a known number. Also left a TODO to improve this test.
ConstantPropagation pass is replaced by ConstantPropagationImmutableTypes because the latter is used in runCleanUpPasses. If not replaced, the former would create new CSE opportunities by folding more constants. This voids the purpose of the test case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41891
Reviewed By: colesbury
Differential Revision: D22701540
Pulled By: gmagogsfm
fbshipit-source-id: 8e60dbdcc54a93dac111d81b8d88fb39387224f5
Summary:
Add pass that fuses Conv and Batchnormalization nodes into one node Conv.
This pass is only applied in inference mode (training is None or TrainingMode.Eval).
Since this pass needs access to param_dict it is written outside peephole file where these kind of passes (fusing multiple nodes into one) is usually placed.
This PR also adds wrapper skipIfNoEmbed to skip debug_embed_params test:
Pass that fuses Conv and Batchnorm changes the params of resnet model and parameters of onnx and pytorch model won't match. Since parameters are not matching, debug_embed_params test for test_resnet will fail and that is expected, therefore debug_embed_params test for test_resnet should be skipped.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40547
Reviewed By: gchanan
Differential Revision: D22631687
Pulled By: bzinodev
fbshipit-source-id: fe45812400398a32541e797f727fd8697eb6d8c0
Summary:
* Add EnumType and AnyEnumType as first-class jit type
* Add Enum-typed IValue
* Enhanced aten::eq to support Enum
Supported:
Enum-typed function targuments
using Enum type and comparing them
TODO:
Add PyThon sugared value for Enum
Support getting name/value attrs of enums
Support Enum-typed return values
Support enum values of different types in same Enum class
Support serialization and deserialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41390
Reviewed By: eellison
Differential Revision: D22524388
Pulled By: gmagogsfm
fbshipit-source-id: 1627154a64e752d8457cd53270f3d14aea4b1150
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41414
This diff exports replaceAllUsesAfterNodeWith to PythonAPI.
Test Plan: Tested locally. Please let me know if there is a set of unit tests to be passed outside of the default ones triggered by Sandcastle.
Reviewed By: soumith
Differential Revision: D22523211
fbshipit-source-id: 3f075bafa6208ada462abc57d495c15179a6e53d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41146
**Summary**
This commit adds support for using `Modules` that have been lowered as
submodules in `ScriptModules`.
**Test Plan**
This commit adds execution and save/load tests to test_backends.py for
backend-lowered submodules.
**Fixes**
This commit fixes#40069.
Test Plan: Imported from OSS
Reviewed By: ailzhang
Differential Revision: D22459543
Pulled By: SplitInfinity
fbshipit-source-id: 02e0c0ccdce26c671ade30a34aca3e99bcdc5ba7
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40252
As title says.
Test Plan:
python test/test_mobile_optimizer.py
Imported from OSS
Differential Revision: D22126825
fbshipit-source-id: a1880587ba8db9dee0fa450bc463734e4a8693d9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39343
Building on top of previous PR that adds fused add_relu op, this PR adds
a JIT pass to transform input graph to find all fusable instancs of add
+ relu and fuses them.
Test Plan:
python test/test_jit.py TestJit.test_add_relu_fusion
Imported from OSS
Differential Revision: D21822396
fbshipit-source-id: 12c7e8db54c6d70a2402b32cc06c7e305ffbb1be
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40718
Currently only constant except tensor must be inlined during serialization.
Tensor are stored in the contant table. This patch generalizes this capability
to any IValue. This is particularly useful for non ASCII string literal that
cannot be inlined.
Test Plan: Imported from OSS
Differential Revision: D22298169
Pulled By: bzinodev
fbshipit-source-id: 88cc59af9cc45e426ca8002175593b9e431f4bac
Summary:
This should be in its own file...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/41137
Reviewed By: jamesr66a
Differential Revision: D22437922
Pulled By: eellison
fbshipit-source-id: 1b62dde1a4ebac673b5c60aea4f398f734d62501
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40902
See the bottom of this stack for context.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D22360210
Pulled By: suo
fbshipit-source-id: 4275127173a36982ce9ad357aa344435b98e1faf
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40841
**Summary**
This commit adds support for using `Modules` that have been lowered as
submodules in `ScriptModules`.
**Test Plan**
This commit adds execution and save/load tests to test_backends.py for
backend-lowered submodules.
**Fixes**
This commit fixes#40069.
Test Plan: Imported from OSS
Differential Revision: D22418716
Pulled By: SplitInfinity
fbshipit-source-id: d2b2c6d5d2cf3042a620b3bde7d494f1abe28dc1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40939
Previously, when we would do shape analysis by running the op with representative inputs, we would always set the grad property to false. This led to a wrong static analysis when we would create differentiable subgraphs, and propagate shapes without also propagating requires_grad, and then uninline them.
Test Plan: Imported from OSS
Differential Revision: D22394676
Pulled By: eellison
fbshipit-source-id: 254e6e9f964b40d160befe0e125abe1b7aa2bd5e