Summary:
Same as title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61021
Test Plan: pytest test/test_nnapi.py::TestNNAPI
Reviewed By: anshuljain1
Differential Revision: D29480746
fbshipit-source-id: 7217c8f3a811db8c3c373f3e7ca31caf9502ef22
Summary:
Add support for aten::slice op in the NNAPI model converter
* If start = 0 and end = max, the slice is converted as an identity
* Flexible shapes can be passed through
* Flexible shapes can't be sliced over
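The identity case can be sketched shape-wise. This is an illustrative helper, not the converter's actual code; the sentinel handling assumes aten::slice's open-ended `end` defaults to int64 max, as in TorchScript:

```python
import sys

def slice_is_identity(start, end, dim_size):
    """Return True when slicing [start:end] covers the whole dimension of
    size dim_size, i.e. the slice op can be dropped as an identity.
    `end` may be None or the int64-max sentinel produced for open-ended
    slices."""
    if start != 0:
        return False
    return end is None or end == sys.maxsize or end >= dim_size
```

For example, `x[:, 0:]` over a dimension of size 4 is an identity slice, while `x[:, 0:2]` is not.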
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59364
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_slice
Reviewed By: anshuljain1
Differential Revision: D28881039
fbshipit-source-id: 3c1c630ff27b5bba6eda403d87570c61d43ae90e
Summary:
* Add support for aten::detach op in the NNAPI model converter as a no-op
* Also add flexible op support for add_pointwise_simple_unary_op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58543
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_detatch
Reviewed By: anshuljain1
Differential Revision: D28531942
fbshipit-source-id: 4387dbbbadd8ce6b690841f3a903e68a380b849d
Summary:
Add support for the aten::flatten op in the NNAPI model converter.
Startup-time variable size support isn't included, since the target shape
is passed as an input operand to the NNAPI op. Runtime variable size
support is planned.
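The shape-as-input constraint can be illustrated with a small sketch of how a flatten output shape is computed ahead of time (a hypothetical helper, not the converter's code):

```python
from functools import reduce
from operator import mul

def flatten_shape(in_shape, start_dim, end_dim):
    """Compute the output shape of aten::flatten(input, start_dim, end_dim).
    Because NNAPI's reshape takes the target shape as a second *input*
    operand, every dimension here must be known at conversion time; a 0
    (flexible) dimension in the flattened range cannot be resolved."""
    if end_dim < 0:
        end_dim += len(in_shape)
    flat = reduce(mul, in_shape[start_dim:end_dim + 1], 1)
    return list(in_shape[:start_dim]) + [flat] + list(in_shape[end_dim + 1:])

# Flattening a (2, 3, 4, 5) tensor over dims 1..-1 gives (2, 60).
```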
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60885
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_flatten
Reviewed By: anshuljain1
Differential Revision: D29451725
fbshipit-source-id: 8902745f7758c8cc88ad4b4ce02b8301ff894bd4
Summary:
Add support for aten::div op in the NNAPI model converter. Add variable
size input test as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58541
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_div
Reviewed By: anshuljain1
Differential Revision: D28531943
fbshipit-source-id: e96342146f6de216f7b88443618edfc54963747c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58540
Add support for aten::to op in the NNAPI model converter for simple
cases like to("cpu"), to("gpu")
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_to
Reviewed By: anshuljain1
Differential Revision: D28531941
fbshipit-source-id: 0c934f7aceaff2669307c3426efe32046d8c44f3
Summary:
Add support for aten::softmax op in the NNAPI model converter with
flexible size
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58539
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_softmax
Reviewed By: anshuljain1
Differential Revision: D28531946
fbshipit-source-id: 8633f3e3f7f52795f9866ff16ad0867ea36a19e8
Summary:
Add support for the aten::avgpool2d op in the NNAPI model converter with
variable size support
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58538
Test Plan: pytest test/test_nnapi.py::TestNNAPI::test_avgpool2d
Reviewed By: anshuljain1
Differential Revision: D28531944
fbshipit-source-id: 43ff8c9389365698c282f204042b49c7ec84d824
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57563
Add flexible size support for upsample_nearest2d op in nnapi model conversion
Test Plan:
pytest test/test_nnapi.py
Imported from OSS
Reviewed By: dreiss
Differential Revision: D28200847
fbshipit-source-id: 901fe3f6e68e4c16ece730f3ffa68dc88c6ed6c3
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57562
Add flexible size support for qadd op in nnapi model conversion
Test Plan:
pytest test/test_nnapi.py
Imported from OSS
Reviewed By: dreiss
Differential Revision: D28200849
fbshipit-source-id: d5b2ea8e9eb8ae405ff2c960f7549cef60bc0991
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57561
Add flexible size support for conv2d op in nnapi model conversion
Test Plan:
pytest test/test_nnapi.py
Imported from OSS
Reviewed By: dreiss
Differential Revision: D28200848
fbshipit-source-id: d94ccf48a3d8453aa8e96c7cac02948c4cd870cc
Summary:
Fixes https://github.com/pytorch/pytorch/issues/48141
~Mypy is complaining about a missing arg in a function call.~
```bash
torch/backends/_nnapi/serializer.py:806: error: Too few arguments for "_do_add_binary" [call-arg]
Found 1 error in 1 file (checked 1140 source files)
```
9392137dbe/torch/backends/_nnapi/serializer.py (L804-L806)
~dreiss, would you mind taking a look when you have some cycles to spare and see what would be the appropriate value for `fuse_code` here? Thanks :)~
Edit: https://github.com/pytorch/pytorch/issues/48925 got merged a couple of days ago. The blocking part is now unblocked, and I just pushed the changes to make mypy happy again. This PR is ready for review.
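For context, the missing argument is NNAPI's fused-activation code. A minimal sketch of the fixed call shape; the constant values follow NNAPI's FuseCode enum, but the helper itself is a hypothetical stand-in for `_do_add_binary`:

```python
# NNAPI FuseCode values (from the NeuralNetworks.h enum).
ANEURALNETWORKS_FUSED_NONE = 0
ANEURALNETWORKS_FUSED_RELU = 1

def add_binary(op_code, in0, in1, out, fuse_code=ANEURALNETWORKS_FUSED_NONE):
    """Hypothetical stand-in for _do_add_binary: every NNAPI binary op
    carries a fuse_code operand, so every caller must supply one
    (FUSED_NONE when no activation is fused into the op)."""
    return (op_code, in0, in1, fuse_code, out)
```

The mypy error above flagged a call site that omitted `fuse_code` entirely.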
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48142
Reviewed By: ezyang
Differential Revision: D28006249
Pulled By: walterddr
fbshipit-source-id: 5e43eeba7143512a549efaad31541f86718add7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54701
We need NNAPI models to support inputs (and, by extension, intermediate
values and outputs) whose shape is only determined at load time. For
example, a vision model's input shape might be dependent on the aspect
ratio of the device camera. While NNAPI has full support for variable
shapes (by setting components of the operand shape to 0), the guidance
we have received is that vendor-provided drivers for real hardware are
not able to support this efficiently. Therefore, we take a hybrid
approach where shapes are calculated at model load time to
semi-dynamically construct our NNAPI model. While this doesn't let us
have truly dynamic input shapes, it does allow us to ensure that the
vendor driver only sees fixed shapes, so we get maximum performance.
In this initial commit, only PReLU supports dynamic shapes. Additional
operators will be converted in separate diffs.
- In order to convert a flexible-shape model, the user supplies inputs
with shapes containing dimensions of size 0 for the flexible
dimensions.
- During conversion, we generate code to compute the shapes of all
intermediates and outputs as a function of the input shapes.
- We no longer run the input model to produce the output templates.
Instead, we generate code to return properly-sized templates, given
the input shapes.
- All of this generated code goes into a "ShapeComputeModule" that is
used by the NnapiModule during initialization.
- The ShapeComputeModule mutates the serialized model to fill in the
computed sizes for each operand. This requires us to change the dtype
for the serialized model to int32, but this should be fine because
everything in it is already 4-byte aligned.
- NnapiInitWrapper no longer exists. Instead, initialization is
performed on the first run, based on the real arguments. We plan to
provide an API for doing eager initialization.
- Unit test updated to allow separate arguments to be given for trace,
conversion, and inference. A flexible-shape test case was added for
PReLU.
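The load-time shape resolution above can be sketched in miniature. This is a hypothetical illustration of the idea, not the generated ShapeComputeModule:

```python
def resolve_operand_shapes(template_shapes, input_shapes):
    """Fill in flexible (0) dimensions at load time.
    `template_shapes` maps operand name -> shape with 0 for flexible dims;
    `input_shapes` gives the real shapes observed for the inputs at load
    time. For a pointwise op like PReLU, the output shape simply mirrors
    its input's resolved shape."""
    resolved = {}
    for name, shape in template_shapes.items():
        real = input_shapes.get(name, shape)
        resolved[name] = [r if t == 0 else t for t, r in zip(shape, real)]
    return resolved

# An input declared as (1, 3, 0, 0) at conversion time might become
# (1, 3, 224, 320) once the camera aspect ratio is known.
```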
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536796
Pulled By: dreiss
fbshipit-source-id: 105585f247987b1e6ec6946a6fe44401237cb0a0
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54700
This is an internal method just to make it more clear what
len(self.operands) is doing.
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536794
Pulled By: dreiss
fbshipit-source-id: 678cee8a47df6757dd2e6feabf2560fd82d32e26
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54699
We'll soon be adding support for flexible-size tensors to the NNAPI
converter, but it won't be added to all ops at once. Create
get_tensor_operand_by_jitval_fixed_size as a wrapper for
get_tensor_operand_by_jitval that verifies that the argument has a fixed
shape. Update all call sites. As flexible size support is added to
each op, the call sites can be converted back and proper size checks
added.
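The wrapper's job can be sketched as a shape assertion (hypothetical signatures mirroring the names in the summary):

```python
def get_tensor_operand_by_jitval(operands, jitval):
    """Stand-in lookup: returns (operand_id, shape) for a JIT value."""
    return operands[jitval]

def get_tensor_operand_by_jitval_fixed_size(operands, jitval):
    """Same lookup, but rejects flexible shapes (0-valued dimensions),
    so ops without flexible-size support fail loudly at conversion
    time instead of serializing a model that cannot run."""
    op_id, shape = get_tensor_operand_by_jitval(operands, jitval)
    if any(d == 0 for d in shape):
        raise ValueError(f"Op requires a fixed shape, got {shape} for {jitval}")
    return op_id, shape
```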
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536791
Pulled By: dreiss
fbshipit-source-id: 6fb1fea814d767b6ff263fd8b88240a51be74777
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54698
"mf" was short for memory format, but the concept that this variable
represents was renamed to "dim_order", so rename the variable.
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536793
Pulled By: dreiss
fbshipit-source-id: 2b31c70da1ff221a7833e67486690fa606f01dea
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54697
Previously, models being converted to NNAPI were expected to take inputs
as separate arguments, but the generated NNAPI model could only take
multiple inputs as a list. Now the generated model always takes inputs
(single or multiple) as separate tensor arguments.
Previously, models being converted to NNAPI were expected to return
outputs as a single tensor or tuple of tensors, but the generated NNAPI
model would return multiple outputs as a list. Now the generated model
returns a tuple as well (or single tensor).
Internally, we decide what output format to use (single tensor or tuple)
based on the conversion process, rather than by running the model.
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536790
Pulled By: dreiss
fbshipit-source-id: c0f93c85d450757e568985947cc2f32043795859
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54696
This was originally developed for a Python version where array was not
available.
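In spirit, the stdlib array module covers what the custom code did:

```python
from array import array

# Pack float32 weights for the serialized model; array('f') gives
# 4-byte machine floats without any hand-rolled packing code.
weights = array("f", [0.5, -1.25, 2.0])
blob = weights.tobytes()

# Round-trip: the consuming side can rebuild the values the same way.
restored = array("f")
restored.frombytes(blob)
```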
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536792
Pulled By: dreiss
fbshipit-source-id: 39e5507e37d4f91871113439fe752a4d5373eaba
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/48812
This came up in a squeeze-and-excitation model. Starting with an NHWC
tensor T, we perform a mean operation across H and W, giving an NxC
tensor, which (after some fully connected layers) is reshaped to
NxCx1x1, then multiplied with T. To handle this, we detect the specific
case of a binary op with one NHWC input and one contiguous input with
H,W == 1,1 and allow the op to be applied (after transposing the
contiguous input).
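Shape-wise, the special case looks like this (an illustrative check, not the converter's code):

```python
def nhwc_broadcast_ok(nhwc_shape, contig_shape):
    """Detect the squeeze-and-excitation pattern: a binary op between an
    NHWC operand and a contiguous N x C x 1 x 1 operand. The contiguous
    side can then be transposed to N x 1 x 1 x C and broadcast normally."""
    if len(nhwc_shape) != 4 or len(contig_shape) != 4:
        return False
    n, h, w, c = nhwc_shape  # dims as NNAPI will see them
    return contig_shape == (n, c, 1, 1)

def transpose_nchw_1x1_to_nhwc(shape):
    """Move the channel dim of an N x C x 1 x 1 shape to the end."""
    n, c, h, w = shape
    return (n, h, w, c)
```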
Test Plan: Unit test.
Reviewed By: axitkhurana
Differential Revision: D25317939
Pulled By: dreiss
fbshipit-source-id: b4c17ab3b874d1a7defa04664010ba82115f1c20
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54695
Previously, torch.nn.Linear was calling aten::addmm internally. Now
it's calling aten::linear, so add support for that.
Test Plan: Unit test
Reviewed By: axitkhurana
Differential Revision: D27536795
Pulled By: dreiss
fbshipit-source-id: 42c8d2a80b20ac12ed9bba599c5e0e874256bb13
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47518
This was left over from an old version of the code. The idea was that
instead of indexing into separate tensors for each weight, you could
bundle them all into a single file and use different offsets into that
file. With the current design, this is nontrivial to support, so drop
the code for now.
Test Plan: CI
Reviewed By: axitkhurana
Differential Revision: D25317935
Pulled By: dreiss
fbshipit-source-id: e26ab3a8d437cb1bbb50319209fa56d9c571ce61
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/46780
This is in prototype status, but pretty functional. There are two major
parts.
- Model converter. This is a pure Python component that consumes a
model in TorchScript format, converts the operations into NNAPI
semantics, and serializes the model in a custom format. It then wraps
the result in a new TorchScript model that can invoke NNAPI under the
hood.
- Runtime. This is a TorchBind object that deserializes the model and
sends the result to NNAPI. This is fairly simple since the serialized
format is basically just a list of NNAPI calls to make, so most of the
code is spent on bounds checking.
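The bounds-checking flavor of the runtime can be sketched over a hypothetical serialized layout; the real format is custom, so the struct usage here is only illustrative:

```python
import struct

def read_operand_values(blob, offset, count):
    """Read `count` little-endian int32 values starting at `offset`,
    refusing to read past the end of the buffer. Much of the runtime's
    deserializer consists of checks of exactly this kind."""
    need = count * 4
    if offset < 0 or offset + need > len(blob):
        raise ValueError(f"serialized model truncated: need {need} bytes at {offset}")
    return list(struct.unpack_from(f"<{count}i", blob, offset))
```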
A few notes on the design.
- Currently, all tensor sizes need to be fixed, and those fixed sizes
are burned directly into the serialized model. This will probably
need to change. NNAPI supports variable-sized tensors, but the
important hardware backends do not. However, we're seeing use cases
crop up where the input size is not known until around the time that
the model is loaded (for example, it might depend on the camera aspect
ratio). I think the proper fix here is to remove the code in the
converter that eagerly calculates the sizes of the intermediate
tensors and replace it with a code generator that will generate some
TorchScript code that will perform those calculations at model load
time. This way, we will be able to support models that have
variable-sized inputs while still only showing fixed-sized operands to
NNAPI.
- The important hardware backends want operands to be in NHWC order, but
PyTorch natively represents all tensors as NCHW. The strategy for
this is to keep NCHW during most of the conversion process, but track
an additional value per operand representing the "dimension order".
The dimension order gets propagated through convolutions and pointwise
ops. When we're ready to serialize the model, we reorder the
dimensions for "channels last" operands to NHWC.
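The final reorder step amounts to applying a fixed permutation (a sketch of the shape side; the operand data would be permuted with the same indices):

```python
NCHW_TO_NHWC = (0, 2, 3, 1)  # keep N, move C after H and W

def to_nhwc_shape(nchw_shape):
    """Reorder an NCHW shape to NHWC for a channels-last operand at
    serialization time."""
    return tuple(nchw_shape[i] for i in NCHW_TO_NHWC)
```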
Test Plan:
Some local testing with FB prod models. I'll need to add some examples
and automated tests.
Reviewed By: iseeyuan
Differential Revision: D24574040
Pulled By: dreiss
fbshipit-source-id: 6adc8571b234877ee3666ec0c0de24da35c38a1f