Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59814
Using OpInfos to test shape analysis. By default, we just check that we don't give incorrect answers; then, if `assert_jit_shape_analysis` is true, we test that we correctly propagate the full shape. And it found a couple of bugs 😃
Test Plan: Imported from OSS
Reviewed By: Krovatkin
Differential Revision: D30200058
Pulled By: eellison
fbshipit-source-id: 6226be87f5390277cfa5a1fffaa1b072d4bc8803
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62200
This commit brings back the `RemoveInplaceOps` pass removed in D29523283 (dec5aa2260) that apparently had a bunch of internal users.
Test Plan: danthe3rd
Reviewed By: danthe3rd
Differential Revision: D29833316
fbshipit-source-id: 6cf13d463ab0a5e50ba3eb3243f79a9c51623809
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61791
During inlining we attach an InlinedCallStack to the nodes being inlined. In
the process we attach module information as well, so that if a
CallMethod is being inlined we know which class instance and class type
the method belongs to. However, CallMethod can be calling a method of
the same object to which the graph belongs, e.g.:
```
def forward(self, input):
    x = input + 10
    return self.forward_impl_(x, input)
```
Here `forward_impl_` is a method defined on the same class in which `forward`
is defined. The existing module hierarchy annotation will mislabel this as an
unknown instance, since the method is not associated with the output of a
GetAttr node (it would be if we had called `self.conv.forward_impl_` for
example).
The change in this PR resolves this by creating a placeholder name "SELF"
for the module instance, indicating that you can traverse the InlinedCallStack
backwards to find the first node with a name != SELF, which is the name
of the object.
e.g.:
TOP(ResNet)::forward.SELF(ResNet)::_forward_impl.layer1(Sequential)::forward.0(BasicBlock)::forward.conv1(Conv2d)::forward.SELF(Conv2d)::_conv_forward
Test Plan:
Add test
Imported from OSS
Reviewed By: larryliu0820
Differential Revision: D29745443
fbshipit-source-id: 1525e41df53913341c4c36a56772454782a0ba93
Summary:
The GoogleTest `TEST` macro is non-compliant with the `cppcoreguidelines-avoid-non-const-global-variables` check, as is `DEFINE_DISPATCH`, so the check is disabled and the existing `NOLINTNEXTLINE` suppressions for it are removed.
All changes but the ones to `.clang-tidy` are generated using the following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`; do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008
Reviewed By: driazati, r-barnes
Differential Revision: D29838584
Pulled By: malfet
fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61806
Currently, if you do `save_pickle` on a ScriptModule, then `save_pickle`
on a tensor, this would result in a `0.storage` tensor being written
*twice* to the zip archive. This would cause weird bugs on the
deserializing side (this presented as an ASAN-detected heap buffer overflow,
because we tried to read more memory from a tensor than we actually
had).
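A minimal repro sketch of the failing pattern (module and file names are illustrative):
```python
import torch
from torch.package import PackageExporter

class M(torch.nn.Module):
    def forward(self, x):
        return x + 1

# Saving a ScriptModule and then a tensor into the same package previously
# wrote the tensor's `0.storage` entry to the zip archive twice.
with PackageExporter("repro.pt") as pe:
    pe.save_pickle("model", "model.pkl", torch.jit.script(M()))
    pe.save_pickle("data", "tensor.pkl", torch.ones(3))
```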
Turns out this was because when we did:
```
self.storage_context = self.script_module_serializer.storage_context()
```
it returned a new copy of the storage context, so we weren't actually
assigning unique names to tensors!!
This PR fixes the issue by making `(De)SerializationStorageContext`
non-copyable and fixing up the parts of the bindings that returned it by
copy.
Differential Revision: D29748969
Test Plan: Imported from OSS
Reviewed By: Lilyjjo
Pulled By: suo
fbshipit-source-id: c2f89ab270e07e7a111fb35c545b5e07b804dc3c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61520
This commit widens the exception caught by the try-catch block that checks if
an object passed to a scripted function is a `ScriptList`. It turns out that
there are internal tests that do not throw a `py::cast_error`, so catching only
that is not sufficient.
Test Plan: Ran the failing tests in T94889011.
Reviewed By: Chillee
Differential Revision: D29560815
fbshipit-source-id: 442258f8997146d833a9d5db923e1f6359f2bfdd
Summary:
* Minor: spelling, grammar.
* Add calls to `GRAPH_DUMP()` where they were missing.
* Add or expand a few comments.
* Move a few comments to seemingly more appropriate spots.
* In canonicalize_graph_fuser_ops.cpp inline `runnableInputs()` since it
was only called in one place and had a misleading comment and
confusing name.
* In `PeepholeOptimizeImpl::optimizeBlock()`, set `changed = true;` when
removing `aten::is_complex`. Pretty sure its absence was a bug.
* Delete unused `_jit_pass_remove_inplace_ops` and its
implementation `RemoveInplaceOps()`.
* In `preprocessCaffe2Ops()`, remove redundant check for nested optional
types. It was already checked in `checkONNXCompatibility()`.
* In `EncoderBase::AddAttribute`, log the unexpected attribute kind.
I don't remember the repro case now but I did hit this error at some
point and this additional logging made it easier to understand.
* In `fuseConvBatchNorm()` in eval_peephole.cpp, consistently use
camelCase instead of snake_case for local variables.
* Add curly braces around the bodies of if statements and loops.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60390
Reviewed By: Krovatkin
Differential Revision: D29523283
Pulled By: SplitInfinity
fbshipit-source-id: 4e16c5648616f53da07d68dab7fdf252e06a0752
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60244
Do the 2GB size check for protocol buffer serialization at a later point, to avoid false alarms in cases like shape inference where no serialization actually happens.
Test Plan: Imported from OSS
Reviewed By: zou3519, ZolotukhinM
Differential Revision: D29494910
Pulled By: SplitInfinity
fbshipit-source-id: 4c36d26de9a94e5d6cf78f332d4dffc46588ebf0
Co-authored-by: BowenBao <bowbao@microsoft.com>
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52832
**Summary**
This commit adds `torch._C.ScriptList`, a list type that has reference
semantics across the Python/TorchScript boundary. That is, modifications
made in TorchScript to instances of `torch._C.ScriptList`
are visible in Python even when the instance is not returned from the function.
`torch._C.ScriptList` is implemented using a modified version of pybind's
`stl_bind.h`-style bindings attached to `ScriptList` and `ScriptListIterator`,
wrapper classes around `c10::impl::GenericList` and
`c10::impl::GenericList::iterator`. These bindings allow instances of
`torch._C.ScriptList` to be used as if they were
regular `list`s in Python. Reference semantics are achieved by simply
retrieving the `IValue` contained in the `ScriptList` in `toIValue` (invoked
when converting Python arguments to `IValue`s before calling TorchScript
code).
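A hedged sketch of the resulting behavior, assuming (as with `ScriptDict` later in this log) that instances are created by passing a Python list to `torch.jit.script`:
```python
import torch
from typing import List

@torch.jit.script
def append_four(xs: List[int]):
    xs.append(4)

plain = [1, 2, 3]
append_four(plain)      # a plain list crosses the boundary by copy...
print(plain)            # ...so the mutation is lost: [1, 2, 3]

slist = torch.jit.script([1, 2, 3])   # a torch._C.ScriptList
append_four(slist)      # shares its IValue with the TorchScript code
print(list(slist))      # the mutation is visible: [1, 2, 3, 4]
```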
**Test Plan**
This commit adds `TestScriptList` to `test_list_dict.py`, a set of tests
that check that all of the common list operations are supported
and that instances have reference semantics across the
Python/TorchScript boundary.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D29478121
Pulled By: SplitInfinity
fbshipit-source-id: 652cc25cfa37debe28db9527504846f22abd8b54
Summary:
This PR suppresses clang-tidy warnings in the codebase (for now) so that we can re-enable clang-tidy checks on master.
I ran this script to add the `NOLINTNEXTLINE` comments (on a devserver):
```bash
python3 setup.py develop
# Uses same script that's run on CI and adds the -j (parallel), -s (add comments), -k (continue if diagnostic errors are found) options
python3 tools/clang_tidy.py \
-j \
-s \
-k \
-v \
--paths torch/csrc/ \
-g"-torch/csrc/jit/passes/onnx/helper.cpp" \
-g"-torch/csrc/jit/passes/onnx/shape_type_inference.cpp" \
-g"-torch/csrc/jit/serialization/onnx.cpp" \
-g"-torch/csrc/jit/serialization/export.cpp" \
-g"-torch/csrc/jit/serialization/import.cpp" \
-g"-torch/csrc/jit/serialization/import_legacy.cpp" \
-g"-torch/csrc/onnx/init.cpp" \
-g"-torch/csrc/cuda/nccl.*" \
-g"-torch/csrc/cuda/python_nccl.cpp" \
-g"-torch/csrc/autograd/FunctionsManual.cpp" \
-g"-torch/csrc/generic/*.cpp" \
-g"-torch/csrc/jit/codegen/cuda/runtime/*" \
-g"-torch/csrc/deploy/interpreter/interpreter.cpp" \
-g"-torch/csrc/deploy/interpreter/interpreter.h" \
-g"-torch/csrc/deploy/interpreter/interpreter_impl.h" \
-g"-torch/csrc/deploy/interpreter/test_main.cpp"
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60649
Test Plan: Verified changes by re-running the script (without the `-s` option) and seeing no warnings/errors.
Reviewed By: walterddr, janeyx99
Differential Revision: D29504258
Pulled By: 1ntEgr8
fbshipit-source-id: 78310b30ee8213b73ddb4771ad874665323e7a4e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44324
**Summary**
This commit adds reference semantics to TorchScript class types;
modifications made to them within TorchScript will be visible in Python.
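A minimal sketch of the behavior this enables (class and function names are illustrative):
```python
import torch

@torch.jit.script
class Counter(object):
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1

@torch.jit.script
def bump(c: Counter):
    c.increment()

c = Counter()
bump(c)
print(c.count)  # 1 -- the mutation made inside TorchScript is visible here
```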
**Test Plan**
This commit adds a unit test to `TestClassType` that checks that
modifications made to a class type instance passed into TorchScript are
visible in Python after executing the scripted function or module.
**Fixes**
This commit closes #41421.
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D24912807
Pulled By: SplitInfinity
fbshipit-source-id: d64ac6211012425b040b987e3358253016e84ca0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60384
Currently, inlining a module graph drops the module hierarchy info on the Python side. Here we retrieve the module hierarchy from the C++ side and expose it through a new Python API on Node called `moduleHierarchy()`.
Test Plan:
Usage:
```
torch._C._jit_pass_inline(module.graph)
torch._C._jit_pass_propagate_shapes_on_graph(module.graph)
node = module.graph.findNode("quantized::conv2d_relu")
'top(' + module.original_name + ').' + node.moduleHierarchy() + '.' + node.kind()
```
Output:
```
'top(QuantWrapper).module(FBNetHR).0(Sequential).xif0_0(ConvBNRelu).conv(ConvReLU2d).quantized::conv2d_relu'
```
Reviewed By: kimishpatel
Differential Revision: D29252169
fbshipit-source-id: 74163a87f919e061e5e75dfebc4c5cdbe8489d93
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57334
Here's a possibly controversial PR. These counters got in the way of
generalizing the fuser tests to handle arbitrary devices, and I guess I'm just
generally skeptical that they provide much value. While it's true that they let us
observe whether fusion groups were created, we already have assertions based on
the shape of the graph, and I'm not sure that I trust those any less than these
counters.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D29471484
Pulled By: bertmaher
fbshipit-source-id: f6d76f6e72dbfb581acff1d834b0c74500941b57
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59735
1. Fixes the ABA storage-identity problem during serialization for `torch.package` by keeping references to serialized storages through the lifetime of `PackageExporter`, to prevent reuse of a memory address. Achieved by extending the logic used to solve the same issue on mobile.
2. Adds determinism to the naming scheme of serialized storages in the export code paths which utilize `tensor_cdata_naming_scheme` (introduces a 2nd mapping in `StorageContext`, which now maps `storage cdata ptr` -> `unique id` and `unique id` -> `c10::Storage`)
3. Additionally uses the presence of a storage in the `StorageContext` instance as a marker for whether a storage has been serialized, removing the need to scan the `PythonStreamWriter` for the presence of the storage's serialization file
Test Plan: Imported from OSS
Reviewed By: suo
Differential Revision: D29075276
Pulled By: Lilyjjo
fbshipit-source-id: 15a5c30b1de99c5bd7079388f2db9b6ece2eca12
Summary:
Description:
- Before this, the logging level could only be changed through the env
variable "PYTORCH_JIT_LOG_LEVEL"
- The level can now be changed from Python
- Stream configuration has not been added for now
- The configuration is stored in a singleton class managing the options
Issue Link: https://github.com/pytorch/pytorch/issues/54188
Gotchas:
- Created separate functions
`::torch::jit::get_jit_logging_levels/set_jit_logging_levels` instead of
using the singleton class's methods directly
- This is because when running test cases, two different instances
of the singleton are created, one for the test suite and one for the actual
code (`jit_log.cpp`)
- If those methods were used directly, `is_enabled` would call the singleton
in `jit_log.cpp` while we set the config through the other singleton
- See: https://stackoverflow.com/questions/55467246/my-singleton-can-be-called-multiple-times
API:
- To set the level: `torch._C._jit_set_logging_option("level")`
- To get the level: `torch._C._jit_get_logging_option()`
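For example, mirroring the env-variable usage noted under Testing below:
```python
import torch

# Equivalent to launching with PYTORCH_JIT_LOG_LEVEL=">dead_code_elimination"
torch._C._jit_set_logging_option(">dead_code_elimination")
print(torch._C._jit_get_logging_option())  # '>dead_code_elimination'
```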
Testing:
- UTs were added for C++
- A very simple UT was added for Python to just check that the API is
being called correctly
- The API was checked by running trace on a sample Python file
- Set the env variable to "" and used `_jit_set_logging_option` in Python to set the option to `>dead_code_elimination`
- The error output had logs of the form `[DUMP ...]`, `[UPDATE ...]`, etc.
Fixes https://github.com/pytorch/pytorch/issues/54188
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58821
Reviewed By: soulitzer
Differential Revision: D29116712
Pulled By: ZolotukhinM
fbshipit-source-id: 8f2861ee2bd567fb63b405953d035ca657a3200f
Summary:
This commit removes the warning that suggests that users script their
dictionaries before passing them into TorchScript code. The ScriptDict feature
is not fully ready, so it does not make sense to recommend this yet.
Test Plan:
Sandcastle.
In addition, the PyPER test broken by the original diff passes:
```
buck test mode/opt //caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt -- --exact 'caffe2/torch/fb/training_toolkit/backend/tests:test_model_materializer_full_sync_lwt - caffe2.torch.fb.training_toolkit.backend.tests.test_model_materializer_full_sync_lwt.ModelMaterializerFullSyncLwtTest: test_materialization_determinism_cpu' --run-disabled
```
Differential Revision: D28891351
fbshipit-source-id: 2a3a00cde935d670fb1dc7fd8c709ae9c2ad8cdc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59287
D27211605 added a warning in `toIValue` that tells users to script their
dictionaries before passing them to TorchScript functions in order to get some
performance benefits and reference semantics. However, this warning is emitted
every time `toIValue` is called (e.g. whenever a dictionary is passed to a
TorchScript function), which can lead to noisy log output. This diff changes
it to use `TORCH_WARN_ONCE` instead.
Test Plan: Sandcastle, OSS CI.
Reviewed By: hyuen
Differential Revision: D28824468
fbshipit-source-id: e651eade4380abaf77c6c8a81ec4e565b0c2c714
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56966
This PR adds a toggle to shape analysis that prevents inlining complete tensor shapes as constants into the shape compute graph, which is a good stress test of the partial evaluation pipeline.
Test Plan: Imported from OSS
Reviewed By: bdhirsh
Differential Revision: D28444664
Pulled By: eellison
fbshipit-source-id: a62e424515a8837a4b596546efa93af5e8e61f10
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52659
**Summary**
This commit adds `torch._C.ScriptDict`, a dictionary type that has reference
semantics across the Python/TorchScript boundary. That is, modifications
made to instances of `torch._C.ScriptDict` in TorchScript are visible in
Python even when the instance is not returned from the function. Instances can be
constructed by passing an instance of a Python dictionary to
`torch.jit.script`. In the case of an empty dictionary, its type is
assumed to be `Dict[str, Tensor]`, consistent with the handling of
empty dictionaries in TorchScript source code.
`torch._C.ScriptDict` is implemented using a modified version of pybind's `stl_bind.h`-style bindings attached to `ScriptDict`, `ScriptDictIterator` and `ScriptDictKeyIterator`, wrapper classes around `c10::impl::GenericDict` and `c10::impl::GenericDict::iterator`. These bindings allow instances of `torch._C.ScriptDict` to be used as if they were regular `dict`s in Python. Reference semantics are achieved by simply retrieving the `IValue` contained in the `ScriptDict` in `toIValue` (invoked when converting Python arguments to `IValue`s before calling TorchScript code).
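A short sketch of the construction and reference semantics described above:
```python
import torch
from typing import Dict

@torch.jit.script
def add_entry(d: Dict[str, int]):
    d["b"] = 2

sd = torch.jit.script({"a": 1})   # a torch._C.ScriptDict
add_entry(sd)                     # the mutation happens inside TorchScript...
print(sd["b"])                    # ...and is visible in Python: 2

empty = torch.jit.script({})      # assumed to be Dict[str, Tensor]
```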
**Test Plan**
This commit adds `TestScriptDict` to `test_list_dict.py`, a set of tests
that check that all of the common dictionary operations are supported
and that instances have reference semantics across the
Python/TorchScript boundary.
Differential Revision: D27211605
Test Plan: Imported from OSS
Reviewed By: gmagogsfm
Pulled By: SplitInfinity
fbshipit-source-id: 446d4e5328375791aa73eb9e8b04dfe3465af960
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58300
Current state: graph rewrites that fuse nodes or add nodes can produce new
nodes without the debug information that was available on the original nodes,
so we lose this information during a graph rewrite.
This PR changes the graph rewriting API to let the user specify how the values
in the replacement pattern map to values in the pattern to be matched.
The graph rewriter will then copy the source range and inlined callstack
from the matched nodes onto the nodes being inserted.
(Note: this ignores all push blocking failures!)
Test Plan:
python test/test_jit.py
TestJit.test_pattern_based_rewrite_with_source_range_preserved
Imported from OSS
Reviewed By: malfet
Differential Revision: D28512465
fbshipit-source-id: 863173c29de726be85b3acbd3ddf3257eea36d13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58887
There are some callsites of `torch.distributed.rpc.XXX` APIs that are compiled
or not based on `USE_RPC`. However, `torch::deploy`, at least for now,
is compiled with `USE_RPC=1`, but the `torch.distributed.rpc.XXX` APIs used by
the aforementioned pieces of code are not available (i.e.
`torch.distributed.rpc.is_available()` returns `False`). This can cause
TorchScript compilation to fail, even if the code being compiled doesn't use
RPC.
This commit fixes the problem (at least temporarily) by predicating the use
of all these `torch.distributed.rpc` APIs on the value of
`torch.distributed.rpc.is_available()`.
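That is, a guard of roughly this shape around each such callsite:
```python
import torch.distributed.rpc as rpc

if rpc.is_available():
    # Only reference torch.distributed.rpc APIs (e.g. RRef) when the
    # backing C++ bindings were actually built into this interpreter.
    from torch.distributed.rpc import RRef
```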
Test Plan: Ran packaged XLM-R model with C++ benchmark.
Reviewed By: suo
Differential Revision: D28660925
fbshipit-source-id: fbff7c7ef9596549105e79f702987a53b04ba6f9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57501
Add an API, `_get_model_ops_and_info`, to get the root operators and versioning info of a model, in both C++ and Python; the input can be a file path or a buffer.
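A hedged usage sketch, assuming the Python entry point is exposed as `torch.jit.mobile._get_model_ops_and_info` (the file name is illustrative):
```python
from torch.jit.mobile import _get_model_ops_and_info

# Accepts a file path or a file-like buffer; returns the model's root
# operators together with their versioning info.
ops_and_info = _get_model_ops_and_info("model.ptl")
```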
ghstack-source-id: 129620112
Test Plan: unit test.
Reviewed By: xcheng16, raziel
Differential Revision: D28162765
fbshipit-source-id: 4413c1e906b8a872e4a717d849da37347adbbea4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55926
This is necessary for code like conv2d, where we wish to share the generic convolution shape-function logic with conv2d but always infer that conv2d's output has dimension 4. I'm also hoping the refinement algorithm here can be refactored out and used to support refining tensor types from user annotations. I have a lengthy comment explaining how this works, and the logic outside of the data structures is pretty small and contained. Additionally, you might check out https://fb.quip.com/X7EVAdQ99Zzm for a very similar description of how to refine values based on comparison operators.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750997
Pulled By: eellison
fbshipit-source-id: d962415af519ac37ebc9de88f2e1ea60a1374f7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55925
This sets up the initial handling of symbolic shapes. As in the test, it doesn't work perfectly yet because it needs a couple of other optimization passes. The basic description is pretty simple: we resolve tensor dimension indices to the same `Value*`, and before extracting the output tensor shape we substitute in symbolic shapes. We don't substitute during optimization because they are represented as negative numbers, so we don't want them inadvertently used in constant propagation or something else.
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D27750996
Pulled By: eellison
fbshipit-source-id: 6984e7276b578f96b00fc2025cef0e13f594b6e6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54809
I'm going to post on dev-discuss soon with a more thorough explanation of the design and advantages of this shape analysis, so I'm leaving that out for now.
There is still a ton left to do; I'm posting this initial version so we can get something on master that multiple people can work on. List of many remaining steps:
- [ ] Add symbolic shapes support
- [ ] Bind shape functions for operators in C++
- [ ] Make classes of operators share the same shape function (e.g. pointwise, broadcast two inputs)
- [ ] Refactor APIs
- [ ] Only iteratively optimize shape function while a change has been made
- [ ] Expand coverage to common ops
- [ ] Add shape analysis pass on Graph that handles Ifs and Loops
- [ ] Allow concurrent reads to the operator map
- [ ] Successive applications of same inputs to same shape function (e.g. series of pointwise ops)
For this review, I am mostly looking for comments related to the implementation of symbolic_shape_analysis.cpp, with the caveats listed above. I am not really looking for comments related to the API/registration/graph-level analysis, as those are all planned to change. I am fine landing this as-is or waiting until the necessary components of the TODOs above are finished.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D27750998
Pulled By: eellison
fbshipit-source-id: 4338b99e8651df076291c6b781c0e36a1bcbec03
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57397
Introduces two main classes in the C++ runtime:
`ScriptProfile` is the implementation for enabling and disabling interpreter
profiling in C++. This should only be used from Python, and we will add
the corresponding Python API in the next diff.
`InstructionSpan` is a utility class to instrument execution of each single
instruction. A start timestamp is recorded in the constructor, and an end
timestamp is recorded in the destructor. During destruction, it sends
runtime data to all enabled `ScriptProfile` instances.
Test Plan:
build/bin/test_jit --gtest_filter='ScriptProfileTest.Basic'
Imported from OSS
Reviewed By: gmagogsfm
Differential Revision: D28133579
fbshipit-source-id: e7e30e96151367022793ab3ad323f01c51ad4a3b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58382
Calling markCompleted on a Future now first acquires the Future's mutex (as usual) but then sometimes tries to acquire the GIL during the DataPtr extraction while still holding the Future's mutex. (This happens when the value passed to markCompleted is a Python object). This can cause a deadlock if someone else calls any of the other methods of Future while holding the GIL.
There are two solutions to this: avoid holding the Future's mutex when extracting DataPtrs, or avoid holding the GIL while invoking the Future's methods. In this PR I'm going for the latter, because it's a very simple immediate fix, but I believe this is brittle and that we should probably also consider the former fix.
ghstack-source-id: 129105358
Test Plan: The repro in https://github.com/pytorch/pytorch/issues/58239 now doesn't deadlock.
Reviewed By: mrshenli
Differential Revision: D28472816
fbshipit-source-id: 1bc9bca426dd004f9eb2568db1ffd38f014450e2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56152
Currently, the Bundled Inputs API mutates the module in place: it adds class methods, not instance methods. This results in a small problem where one can't re-run an already-executed cell in Bento if the class has already been subject to bundled inputs.
In addition, there is no way to add bundled inputs to a module that already has them. This API solves that problem as well by adding an `ignored_methods` argument to the call to `clone()`: the bundled-inputs implementation passes in the methods that it will add as `ignored_methods`, so that when it does try to add those methods, it will be able to do so successfully.
We'll have to be careful when ignoring those methods during the call to `torch.jit._clone_module_with_class`, since any bundled input that relies on a user-provided method will need to be preserved and not ignored during the clone.
Looking for feedback on whether this is an acceptable direction.
ghstack-source-id: 128908360
Test Plan:
Added unit test and ran it as `buck test //caffe2/test:mobile`
Also see this Bento Notebook: https://www.internalfb.com/intern/anp/view/?id=550829
Reviewed By: gmagogsfm
Differential Revision: D27788394
fbshipit-source-id: 48109cd4583506d4efdb345e4ba31385db23a273
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57079
Testing onnx 1.9 release, we see that the old bug is triggered for the caffe2 test:
`pytest test/onnx/test_pytorch_onnx_caffe2_quantized.py::TestQuantizedOps::test_small_model`
This is because the graph inputs
```python
graph(%x.1 : Tensor,
%conv1._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
%conv2._packed_params : __torch__.torch.classes.quantized.Conv2dPackedParamsBase,
%fc.bias : Float(10, strides=[1], requires_grad=0, device=cpu),
%fc.weight : Float(10, 72, strides=[72, 1], requires_grad=0, device=cpu)):
```
contains `Conv2dPackedParamsBase`, which is a PackedParams. When we flatten it, it flattens into several tensors, so the shape inference for the inputs becomes misaligned.
This PR records how many tensors get flattened from a PackedParams and skips by that number rather than by 1; with that, the UT passes.
Note that the tuple case should still follow the original logic.
Test Plan: Imported from OSS
Reviewed By: SplitInfinity
Differential Revision: D28393949
Pulled By: malfet
fbshipit-source-id: 98d48aad27e5ca03fb10d260f8e625478d996ee2
Co-authored-by: David <jiafa@microsoft.com>
Summary:
Add an API to backport a model from version n to version i. It accepts an input model (file or buffer) and outputs a model (file or buffer) with the expected bytecode version.
In this change, the input model can come from a file or a buffer, and the output model can be either a file path or a buffer.
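A hedged usage sketch, assuming the Python entry point is exposed as `torch.jit.mobile._backport_for_mobile` (file names and the target version are illustrative):
```python
from torch.jit.mobile import _backport_for_mobile

# Backport the input model to bytecode version 5; returns True on success and
# False (with a warning like the one below) when the target version is
# unsupported.
ok = _backport_for_mobile("model.ptl", "model_v5.ptl", 5)
```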
When backport fails, the function returns false with a warning message:
```
/Users/chenlai/pytorch/cmake-build-debug/bin/test_jit --gtest_filter=LiteInterpreterTest.BackPortByteCodeModelV4:LiteInterpreterTest/*.BackPortByteCodeModelV4:*/LiteInterpreterTest.BackPortByteCodeModelV4/*:*/LiteInterpreterTest/*.BackPortByteCodeModelV4 --gtest_color=no
Testing started at 2:32 PM ...
CUDA not available. Disabling CUDA and MultiCUDA tests
[W backport.cpp:419] Warning: Backport doesn't support backport to version3 (function _backport_for_mobile_impl)
Process finished with exit code 0
```
## Test
1. Run both `caffe2/test/cpp/jit/test_lite_interpreter.cpp` and `caffe2/test/mobile/test_bytecode.py`.
2. Run all prod models with backport api.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56802
ghstack-source-id: 128425510
Test Plan: CI
Reviewed By: raziel, iseeyuan
Differential Revision: D27844651
fbshipit-source-id: 8a803cf6c76433ee0a3049b1a5570585d569f8d6