Commit Graph

119 Commits

Author SHA1 Message Date
Nikita Shulga
d7caef7996 [CI] Update clang-format (#116002)
To 17.0.6, built using https://github.com/pytorch/test-infra/blob/main/.github/workflows/clang-tidy-linux.yml

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116002
Approved by: https://github.com/suo
2023-12-18 14:58:46 +00:00
Jacob Szwejbka
e8996055a9 [iOS][PTMCoreMLCompiler] update other deprecated function (#114177)
Summary: The old way was deprecated.

Test Plan: ci

Reviewed By: kirklandsign

Differential Revision: D51172622

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114177
Approved by: https://github.com/kirklandsign
2023-11-21 01:36:00 +00:00
Jacob Szwejbka
ff592f1038 [iOS][PTMCoreMLCompiler] Refactor use of deprecated writeToFile:atomically: (#113377)
Summary:
The NSString writeToFile:atomically: method was deprecated in iOS 2.0.
This diff replaces it with a call to writeToFile:atomically:encoding:error:

duplicate of D51003188 to fix gh permissions

Test Plan: ci

Differential Revision: D51164941

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113377
Approved by: https://github.com/kirklandsign
2023-11-09 21:08:23 +00:00
Digant Desai
5845fc2fa6 [PyTorch][Coreml] Bubble up NSError from loadModel (#109444)
Summary: This can help debug issues, especially fc/bc issues with coremltools, when a model fails to load.

Test Plan:
On a MacBook, in fbsource:
```
arc focus2 -b pp-ios -a ModelRunner -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple --auto-test-schemes --force-with-wrong-xcode
```
It builds and runs the Playground app with a bunch of Core ML models on my iPhone. Here is one example:
https://pxl.cl/3nSPn

Also, to exercise this code I forcefully triggered an MLModel ctor failure by setting `modelURL = nil`, and as expected got this:
```
libc++abi: terminating due to uncaught exception of type c10::Error: Error loading MLModel Error details:  Localized_description: nil value for URL Domain: com.apple.CoreML Code: 3 User Info: {
    NSLocalizedDescription = "nil value for URL";
} Input Shapes: N/A

Exception raised from compile at xplat/caffe2/torch/csrc/jit/backends/coreml/objc/PTMCoreMLBackend.mm:162 (most recent call first):
(no backtrace available)
```

whereas the previous message would have been:
```
Loading MLModel failed
```

Unrelated issues:
* P829736691 - seen when running MaskRCNN on Core ML with the Playground app. Only happens sometimes.
* P829741377 - seen with the Metal operator tests in the Playground app.

Differential Revision: D49349726

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109444
Approved by: https://github.com/kimishpatel
2023-09-19 20:08:37 +00:00
cyy
483f748dd5 [BE] Enforce missing override keyword (#104032)
This PR enables `-Winconsistent-missing-destructor-override` and `-Winconsistent-missing-override`
and fixes violations.
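For illustration, a minimal sketch of the pattern these warnings flag and the style they enforce (an invented example, not code from this PR):

```
struct Base {
  virtual ~Base() = default;
  virtual void run() {}
  virtual void stop() {}
};

// Mixing `override` with plain `virtual` on overriding members in one class
// is what -Winconsistent-missing-override reports.
struct Inconsistent : Base {
  void run() override {}
  virtual void stop() {}  // warning: missing `override`
};

// The enforced style: every overriding member, including the destructor,
// is marked `override`.
struct Consistent : Base {
  ~Consistent() override = default;
  void run() override {}
  void stop() override {}
};

int main() {
  Consistent c;
  c.run();
}
```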

<!--
copilot:summary
-->
### <samp>🤖 Generated by Copilot at 47e904e</samp>

This pull request updates the code of various classes and operators in the `caffe2` and `aten` subdirectories to use the `override` specifier instead of the `virtual` keyword for destructors and other virtual functions that override a base class function. This improves the code readability, quality, and consistency with C++ best practices. It also modifies the `./CMakeLists.txt` file to enable warnings for these specifiers, but disable errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104032
Approved by: https://github.com/malfet
2023-06-24 02:34:24 +00:00
mikey dagitses
93aac15d82 make torch/csrc/jit/backends/coreml/objc/PTMCoreMLFeatureProvider.mm data_ptr-correct (#100886)
make torch/csrc/jit/backends/coreml/objc/PTMCoreMLFeatureProvider.mm data_ptr-correct

Summary:
https://developer.apple.com/documentation/coreml/mlmultiarray shows
that this is looking for a mutable input and is permitted to mutate
the data in subsequent operations.
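For illustration, a minimal libtorch sketch of the distinction the data_ptr-correct effort draws; `const_data_ptr`/`mutable_data_ptr` are the accessors on recent `at::Tensor`, and this is not the PR's code:

```
#include <torch/torch.h>

int main() {
  at::Tensor t = torch::zeros({2, 3});

  // Read-only access: documents that nothing writes through this pointer.
  const float* ro = t.const_data_ptr<float>();

  // An API such as MLMultiArray that is permitted to mutate the buffer
  // should be handed a mutable pointer, which is what this change uses.
  float* rw = t.mutable_data_ptr<float>();
  rw[0] = 1.0f;

  return ro[0] == 1.0f ? 0 : 1;
}
```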

Test Plan: Rely on CI.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100886
Approved by: https://github.com/Skylion007
2023-05-09 15:35:48 +00:00
Max Ren
fa077377ea [PtE][CoreML] Create modelID as value not reference (#98655)
Summary:
https://www.internalfb.com/logview/details/instagram_ios_crashes/d5fd49a99f3ee21a82b66861de797711

CoreML is crashing in torch::jit::mobile::coreml::CoreMLBackend::compile(c10::IValue, c10::Dict<c10::IValue, c10::IValue>) (PTMCoreMLBackend.mm<175>)

This is related to the crash here https://www.internalfb.com/logview/details/instagram_ios_crashes/a8a317c8da13cd577529e1763364f496/?trace_key=8002f84f5ea00ac68b0dfb91878c754a&selected-logview-tab=shared

kimishpatel's original fix, D44386623, passed modelID by value instead of by reference; however, I believe it just moved the error to the loadModel invocation.

Unless we create a copy of modelID at the loadModel invocation, it remains a reference to the string within the preprocessed IValue payload. When the payload is deallocated, modelID is no longer valid, but the dispatched thread still tries to use it, causing the error.
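For illustration, a simplified C++ stand-in for the bug class and the by-value fix (assumed names; this is not the actual Objective-C++ backend code):

```
#include <future>
#include <iostream>
#include <memory>
#include <string>

// BUG: the async task captures a reference to a string owned by the caller's
// payload; if the payload is freed before the task runs, this is a
// use-after-free.
std::future<void> loadModelUnsafe(const std::string& modelID) {
  return std::async(std::launch::async,
                    [&modelID] { std::cout << "loading " << modelID << "\n"; });
}

// FIX: pass and capture modelID by value, so the task owns its own copy,
// independent of the payload's lifetime.
std::future<void> loadModelSafe(std::string modelID) {
  return std::async(std::launch::async,
                    [id = std::move(modelID)] { std::cout << "loading " << id << "\n"; });
}

int main() {
  auto payload = std::make_unique<std::string>("model-123");
  auto task = loadModelSafe(*payload);
  payload.reset();  // deallocating the payload is now harmless
  task.wait();
}
```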

Test Plan:
```
Running with tpx session id: 2a77b7b1-7594-4479-8ac3-c01db29cf5cc
Trace available for this run at /tmp/tpx-20230407-173155.849234-2a77b7b1-7594-4479-8ac3-c01db29cf5cc/trace.log
RemoteExecution session id: reSessionID-2a77b7b1-7594-4479-8ac3-c01db29cf5cc-tpx
I0407 17:31:55.970502 780835 ConfigeratorDomainConfigs.cpp:177] Notify user with updated size: 92 removed size: 0
Started reporting to test run: https://www.internalfb.com/intern/testinfra/testrun/1970325002807752
    ✓ ListingSuccess: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests : 13 tests discovered (0.177)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchBITests/testBITextModel (0.028)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchBITests/testBIXRayModel (0.167)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmComplexDouble (0.001)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmComplexFloat (0.001)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmDouble (0.001)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCPUBlasTests/testGemmFloat (0.001)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testGanModel (0.303)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testMCSModel (0.395)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testMCSModelInvalidInputShape (0.305)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchCoreMLTests/testXirpModel (0.110)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPytorchFamFlDictModel (0.014)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPytorchFamFlModel (0.005)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - PyTorchDynamicPyTorchTests/testDynamicPyTorchXirpModel (0.065)
    ✓ Pass: //fbobjc/Apps/Internal/PyTorchPlayground:PyTorchPlaygroundTests - main (13.177)
```

Differential Revision: D44808433

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98655
Approved by: https://github.com/SS-JIA, https://github.com/tiandiao123, https://github.com/kirklandsign
2023-04-11 01:05:13 +00:00
Kimish Patel
100b396b9b [Pytorch][coreml]Pass backend and modelid by value (#97566)
Summary:
Due to async dispatch, passing by reference may cause a crash.

Test Plan: CI

Reviewed By: mcr229

Differential Revision: D44386623

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97566
Approved by: https://github.com/mcr229
2023-03-28 06:34:55 +00:00
Kyle Yoon
b992199487 [pytorch][coreml] Use from_blob instead of empty in pack_outputs (#96564)
Summary:
We don't want to depend on any ops when loading a model on Core ML, and `at::empty` is considered an op.

So replace it with `from_blob`.
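For illustration, a minimal libtorch sketch of the difference (not the PR's code):

```
#include <iostream>
#include <torch/torch.h>

int main() {
  // Stands in for a buffer the backend already owns (e.g. an MLMultiArray's storage).
  float buffer[6] = {0, 1, 2, 3, 4, 5};

  // at::empty allocates fresh storage and goes through the op registry,
  // which is what the summary wants to avoid at model-load time.
  at::Tensor allocated = at::empty({2, 3});

  // from_blob wraps the existing memory without allocating; the caller must
  // keep `buffer` alive for the tensor's lifetime.
  at::Tensor wrapped = torch::from_blob(buffer, {2, 3});

  std::cout << wrapped << std::endl;
  (void)allocated;
}
```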

Test Plan:
Run Core ML backend to ensure it works for existing use cases.

Also test running Core ML backend without any ops.

Differential Revision: D43961679

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96564
Approved by: https://github.com/f-meloni, https://github.com/kimishpatel
2023-03-13 20:23:43 +00:00
Nikita Shulga
4242e698a3 [BE][MPS] Add MPS to clang format (#96562)
I'm getting tired of asking people to add a space after `if` and all that jazz, so let the linter do that.
Add a section for the Objective-C language, where the column width is extended to 120 characters and `AlignAfterOpenBracket` is set to `Align`.

All `.mm` changes in this PR were made by running the linter as follows:
```
lintrunner --take CLANGFORMAT --all-files --apply-patches
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96562
Approved by: https://github.com/seemethere, https://github.com/janeyx99, https://github.com/ZainRizvi, https://github.com/izaitsevfb, https://github.com/PaliC, https://github.com/albanD
2023-03-10 23:17:54 +00:00
Kyle Yoon
8693604bc6 coreml - Wrap Core ML execute and forward calls in autorelease pool (#95384)
Summary:
When performing inference using the Core ML delegate, memory increases indefinitely. This is due to Core ML allocating memory within `predictionFromFeatures:error:`. It seems that the autorelease pool does not release the return values of the prediction method until inference stops completely, so we need to release them manually with `autoreleasepool` ([per Apple guidance in the Apple Developer Forums](https://developer.apple.com/forums/thread/692425)).

This commit wraps the `execute` function of `PTMCoreMLBackend` in an `autoreleasepool`, since that is the scope in which the return values of `predictionFromFeatures:error:` live. One is also added in `PTMCoreMLExecutor` for good measure.

Differential Revision: D43520767

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95384
Approved by: https://github.com/mcr229
2023-02-25 01:06:36 +00:00
Kyle Yoon
a257486bdd coreml_delegate - Add input shape in error when throwing from predicting (#95249)
Summary: This change adds the input shape to the error message when CoreML throws an error.

Test Plan: testMCSModelInvalidInputShape tests that the assert throws when invalid input shapes are provided.

Differential Revision: D43449112

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95249
Approved by: https://github.com/mcr229
2023-02-23 00:45:44 +00:00
Aaron Gokaslan
0247ed27cc Apply Clang-Tidy readability-container-size-empty (#93236)
Not only is this change usually shorter and more readable, it can also yield better performance: `size()` is not guaranteed to be a constant-time operation on every container, but `empty()` always is.
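For illustration, the kind of rewrite this check performs (a made-up example, not from this PR):

```
#include <string>
#include <vector>

bool hasWork(const std::vector<int>& queue, const std::string& name) {
  // Before the check: queue.size() > 0 && name.size() != 0
  // After the check:  the equivalent empty() calls below.
  return !queue.empty() && !name.empty();
}
```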

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
2023-01-29 23:28:19 +00:00
Nikita Shulga
fb18c29486 [BE] Tweak Meta copyright headers (#90805)
s/Facebook, Inc./Meta Platforms, Inc/
s/Confidential and proprietary./This source code is licensed under the BSD-style license/

Per https://www.internalfb.com/intern/wiki/Open_Source/Licenses/Straight_BSD/

Also, add a linter that prevents adding those in the future.

Fixes https://github.com/pytorch/pytorch/issues/90187
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90805
Approved by: https://github.com/zpao
2022-12-14 20:30:31 +00:00
maxren
496c8ae760 [xnnpack][lite-int] Handle Constant Data (#89445)
Handle constant data for xnnpack delegation. This allows us to handle new modules such as:

```
class Module(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self._constant = torch.ones(4, 4, 4)

    def forward(self, x):
        return x + self._constant
```

This is the precursor work to handling convolution, as we need to serialize constant data (weights).

Differential Revision: [D41050349](https://our.internmc.facebook.com/intern/diff/D41050349/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89445
Approved by: https://github.com/digantdesai
2022-11-22 02:20:54 +00:00
maxren
7beb151889 [xnnpack][executorch] remove unordered_set from xnn_compiler (#89231)
Remove unordered_set from XNNCompiler for Executorch.

While some STL libraries are unavoidable, and I think it should be OK for the delegate to pull in these libraries, unordered_set wasn't really needed, and we should be serializing the number of external ids anyway.

After this, the backend classes should be good to hg copy into Executorch.

Differential Revision: [D41227391](https://our.internmc.facebook.com/intern/diff/D41227391/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89231
Approved by: https://github.com/salilsdesai, https://github.com/cccclai
2022-11-18 07:07:19 +00:00
maxren
637e764ec5 [xnnpack][executorch] Pass xnnexecutor pointer to compileModel() (#89090)
Here we pass XNNExecutor* to compileModel so that the XNNExecutor can be allocated by the runtime. This signature change is for Executorch:

```
XNNExecutor compileModel(void* buffer) --> void compileModel(void* buffer, XNNExecutor* executor)
```

The intended use case for allocating the executor and compiling the serialized flatbuffer:

```
XNNExecutor* executor = runtime_allocator->allocateList<jit::xnnpack::delegate::XNNExecutor>(1);
XNNCompiler::compileModel(processed.buffer, executor);

```

Differential Revision: [D41208387](https://our.internmc.facebook.com/intern/diff/D41208387/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89090
Approved by: https://github.com/digantdesai
2022-11-17 04:29:25 +00:00
maxren
d1f48f05ce [xnnpack][Bug Fix] Pass serialized model by reference (#89089)
Two changes:
- Remove XNNCompiler's dependence on std::string by passing void*
- Grab ser_model by reference: this bug was causing data pointers given to xnn_runtime to be freed, because ser_model was on the stack.
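For illustration, a simplified stand-in for why the stack copy dangles and why taking the buffer by reference fixes it (invented names, not the actual XNNCompiler code):

```
#include <string>

struct Runtime {
  const void* weights = nullptr;  // xnn_runtime keeps borrowed pointers like this
};

// BUG: taking ser_model by value makes it a local copy, so the pointer handed
// to the runtime dangles as soon as this function returns.
Runtime buildUnsafe(std::string ser_model) {
  Runtime rt;
  rt.weights = ser_model.data();
  return rt;  // ser_model (and the bytes weights points to) is destroyed here
}

// FIX: take the serialized model by reference so the pointer refers to the
// caller-owned buffer, which stays alive for the runtime's lifetime.
Runtime buildSafe(const std::string& ser_model) {
  Runtime rt;
  rt.weights = ser_model.data();
  return rt;
}

int main() {
  std::string serialized(1024, '\0');  // stands in for the flatbuffer bytes
  Runtime rt = buildSafe(serialized);
  return rt.weights != nullptr ? 0 : 1;
}
```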

Differential Revision: [D41208380](https://our.internmc.facebook.com/intern/diff/D41208380/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89089
Approved by: https://github.com/digantdesai
2022-11-17 04:17:23 +00:00
maxren
366f1b2c2f [xnnpack][lite-int] Freeze/Inline module to remove reference to self (#88863)
We need to inline the graph before converting from TorchScript to the xnnpack flatbuffer, and remove the graph's dependence on self.

This will later help us work with constant data.

Differential Revision: [D41049858](https://our.internmc.facebook.com/intern/diff/D41049858/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88863
Approved by: https://github.com/digantdesai
2022-11-17 04:14:57 +00:00
Chen Lai
2452e3f99a Update xnnpack graph schema to use xnode and xvalue (#89036)
There are different node definitions, like [Node in autograd](https://www.internalfb.com/code/fbsource/fbcode/caffe2/torch/csrc/autograd/function.h?lines=108-609&reveal=108-609), ONNX nodes, etc. Understandably, a namespace can be used where nodes from different definitions are used together; however, it's still better to slightly differentiate the names.

Differential Revision: [D41002324](https://our.internmc.facebook.com/intern/diff/D41002324/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89036
Approved by: https://github.com/mcr229
2022-11-15 10:34:45 +00:00
Chen Lai
8c46a5de3a Add debug handle to xnnpack schema (#89033)
As the title says, add three things to the schema:
1. a debug handle for each node
2. a file identifier, so we can sanity-check that we are getting the xnnpack schema flatbuffers file instead of some other random binary
3. a file extension, so the dumped binary will end up with its own extension like `myschema.xnnpack` (maybe it can have a better name) instead of the default extension `.bin`

Differential Revision: [D40906970](https://our.internmc.facebook.com/intern/diff/D40906970/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89033
Approved by: https://github.com/mcr229
2022-11-15 09:49:54 +00:00
Kazuaki Ishizaki
e0c194f10b Fix typos in messages under torch (#88961)
This PR fixes typos in messages and parameter names in C++ source and header files under the `torch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88961
Approved by: https://github.com/albanD
2022-11-14 19:06:41 +00:00
maxren
37b468ac77 [xnnpack][lite-int][on-device] rebuild serialized modules at runtime (#88780)
This is the on-device runtime work. We change compile and execute from our earlier hacky solution to what will actually run at runtime.

First we rebuild our graph from the serialized flatbuffer string. We also introduce a runtime wrapper that inherits from CustomClassHolder, which allows us to forward the built xnngraph runtime along to our execute function.

Once the subgraph object has been rebuilt, we pass it along to the runtime wrapper to forward on to execute.

At execute time we prep the inputs/outputs and invoke the runtime through our runtime wrapper. Finally we return those results from execution.

Differential Revision: [D39413031](https://our.internmc.facebook.com/intern/diff/D39413031/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39413031/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88780
Approved by: https://github.com/digantdesai
2022-11-10 21:35:28 +00:00
maxren
3a4e8736ad [xnnpack][on-device] compiler --> executor object (#88779)
#### XNN Compiler Object
This is purely to abstract away the subgraph rebuild from the flatbuffer object. compileModel returns an executor object which we can use to set up inputs and run forward.

#### Executorch Considerations
We include ATen/Utils for TORCH_CHECK; this will be changed when moving to Executorch.

Differential Revision: [D40733163](https://our.internmc.facebook.com/intern/diff/D40733163/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88779
Approved by: https://github.com/digantdesai
2022-11-10 21:09:22 +00:00
maxren
d5e1e2f0fc [xnnpack][on-device] executor class (#88778)
# Executor Class

The executor object wraps our xnn_runtime object. The ideal flow for this object looks like this:

```
executor.set_inputs(vector<tensor> inputs, vector<tensor> outputs)
executor.forward()
```

This will likely be returned by our delegate's compile and handed over to execute in order to run inference using the xnn runtime.

##### Executorch Considerations
```
#include <ATen/Functions.h>
#include <ATen/Utils.h>
```
These ATen headers are included in order to use at::Tensor when setting the inputs. This will change for Executorch, because we will be switching from at::Tensor to whatever tensor abstraction ET uses. It seems they have the same call for `.data_ptr<float>()`, so realistically all the logic here will stay the same.

ATen/Utils is used for TORCH_CHECK. We will switch to ET_CHECK_MESSAGE for executorch.

Differential Revision: [D40733121](https://our.internmc.facebook.com/intern/diff/D40733121/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88778
Approved by: https://github.com/digantdesai
2022-11-10 21:01:46 +00:00
Max Ren
826b4a9c2d [coreml] delegate multiple outputs (#88345)
Summary:
https://www.internalfb.com/code/fbsource/[c0e4da0b5c7fff3b4e31e4611033c30cabdc6aef]/fbcode/caffe2/torch/csrc/jit/backends/backend_detail.cpp?lines=268-276

It seems that the TorchScript added there,
`$unpack, = self.__backend.execute( ...`

has a comma after unpack that forces the result of execute to contain exactly one item. So with this fix, when the number of outputs is > 1, execute returns a list of lists of outputs (basically, the outputs are put in another list before being put into the list we return):
```
[[output1, output2, output3, ...]]
```
instead of
```
[output1, output2, output3, ...]
```

Do we want to fix this in backend_detail? Or should we make the change in our delegate to accommodate the TorchScript? Raising this question here; requesting cccclai and kimishpatel for approval.

Test Plan: unblocked models for chengxiangyin; models in pytorch playground all pass unit tests

Reviewed By: kimishpatel, cccclai

Differential Revision: D40328684

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88345
Approved by: https://github.com/jmdetloff, https://github.com/Skylion007
2022-11-03 20:05:53 +00:00
maxren
3aa7a52855 [xnnpack][lite-int][4/n] introduce serialization to delegate (#87908)
We introduce the serializer created in the previous diff into our XNNGraph builder; the purpose of this is to serialize parts of the graph as we build it. At the end, we can finish and serialize the xnngraph into a std::string for use when we forward it along to the on-device runtime.

The next diff will rebuild the xnngraph from the serialization introduced here, so testing the serialization of the graph will be done in the next diff.

Differential Revision: [D39335580](https://our.internmc.facebook.com/intern/diff/D39335580/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39335580/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87908
Approved by: https://github.com/digantdesai
2022-11-01 01:48:32 +00:00
maxren
8287c1d964 [xnnpack][lite-int][3/n] flatbuffer serializer class (#87907)
Create a serializer class that allows us to serialize the xnnpack graph-creation arguments. This essentially abstracts away the flatbuffer API manipulation and serialization that we deal with.

As a result we can call
```
XNNSerializer::serializeAddNode()
XNNSerializer::serializeTensorValue()
XNNSerializer::finishAndSerialize
```
to serialize the graph

Differential Revision: [D39196312](https://our.internmc.facebook.com/intern/diff/D39196312/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39196312/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87907
Approved by: https://github.com/digantdesai
2022-11-01 01:44:18 +00:00
maxren
7bf819b181 [xnnpack]lite-int][2/n] flatbuffer xnn_value schema (#87906)
serializer schema for xnnpack graphs

Differential Revision: [D39003170](https://our.internmc.facebook.com/intern/diff/D39003170/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87906
Approved by: https://github.com/digantdesai
2022-11-01 01:39:41 +00:00
maxren
905d532d39 [xnnpack][lite-int][1/n] flatbuffer buck rules (#87826)
Write a placeholder schema.fbs file for now to set up the buck gen rules. The generated schema file will be used in the xnnpack namespace and reserved for serialization/deserialization of our xnnpack lowered graph.

Steps Accomplished

- Buck rules to compile the flatbuffer schema
- Added header file to preprocess
- Everything compiles correctly

Differential Revision: [D38999169](https://our.internmc.facebook.com/intern/diff/D38999169/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D38999169/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87826
Approved by: https://github.com/digantdesai
2022-11-01 01:36:52 +00:00
maxren
aa1f9a1bd7 [xnnpack][lite-int][graph-build] torchscript -> xnnpack graph (#87824)
At this point we perform conversion from TorchScript IR to the xnnpack graph. Currently we only support converting Add nodes and fp32 tensor values.

As a caveat, we are not building this at runtime. For testing we just run the xnn graph once ahead of time with sample inputs and forward it to execute. This is only for testing and will be changed in a later diff. It lets us check that graph creation is sound.

Differential Revision: [D39838851](https://our.internmc.facebook.com/intern/diff/D39838851/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87824
Approved by: https://github.com/digantdesai, https://github.com/salilsdesai
2022-11-01 01:24:56 +00:00
maxren
b013eb5447 [xnnpack][lite-int][graph-build] graph passes and op checking (#87128)
This is the beginning of building the xnnpack graph from the TorchScript IR. We first massage the TorchScript graph using a few graph passes that perform things such as unused-self-argument removal and constant propagation.
This also performs tracing for us, so the model does not have to be prepped by tracing before being lowered by us.

The other check we perform is a walk over the TorchScript IR to identify any nodes that are not lowerable/supported, throwing an error that calls out the specific nodes that are not lowerable.
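For illustration, a rough sketch of this flow using stock TorchScript passes as stand-ins; the pass choice and the supported-op set here are assumptions, not the PR's code:

```
#include <memory>
#include <set>
#include <stdexcept>
#include <string>

#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/passes/constant_propagation.h>
#include <torch/csrc/jit/passes/inliner.h>

// Clean up the TorchScript graph, then reject anything the delegate cannot lower.
void prepareAndCheck(std::shared_ptr<torch::jit::Graph>& graph) {
  torch::jit::Inline(*graph);              // flatten calls into one graph
  torch::jit::ConstantPropagation(graph);  // fold constants ahead of time

  const std::set<std::string> supported = {"aten::add", "prim::Constant"};  // invented set
  std::string unsupported;
  for (torch::jit::Node* n : graph->nodes()) {
    if (supported.count(n->kind().toQualString()) == 0) {
      unsupported += std::string(n->kind().toQualString()) + " ";
    }
  }
  if (!unsupported.empty()) {
    throw std::runtime_error("Cannot lower nodes: " + unsupported);
  }
}
```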

Differential Revision: [D39838338](https://our.internmc.facebook.com/intern/diff/D39838338/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D39838338/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87128
Approved by: https://github.com/salilsdesai
2022-10-25 22:08:29 +00:00
maxren
155b885806 [xnnpack][lite-int] preprocess (#86980)
Split up original preprocess diff:

This diff introduces the skeleton structure of the delegate APIs, first introducing the method compile spec error handling. For now it just outputs an empty tensor object upon execute, but it proves that the delegate APIs are working and that a new xnnpack delegate backend has been added.

Differential Revision: [D38562918](https://our.internmc.facebook.com/intern/diff/D38562918/)

**NOTE FOR REVIEWERS**: This PR has internal Meta-specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D38562918/)!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86980
Approved by: https://github.com/salilsdesai, https://github.com/cccclai
2022-10-14 22:07:12 +00:00
maxren
1a7409c771 [CoreML][ios_crash] Use special throw macro when encountering CoreML API errors (#86938)
Error messages from `TORCH_CHECK` are stripped during production builds via `-DSTRIP_ERROR_MESSAGES`. This diff introduces a new macro `COREML_CHECK` which will always preserve the error message. This macro is used when encountering errors produced by CoreML API calls so that we have enough context to debug.
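For illustration, a rough sketch of the idea; this is not the actual `COREML_CHECK` definition, and it throws a plain `std::runtime_error` rather than using PyTorch's error machinery:

```
#include <sstream>
#include <stdexcept>
#include <string>

// Unlike TORCH_CHECK under -DSTRIP_ERROR_MESSAGES, the message here is always
// built and attached to the thrown exception.
#define COREML_CHECK_SKETCH(cond, msg)                                     \
  do {                                                                     \
    if (!(cond)) {                                                         \
      std::ostringstream oss;                                              \
      oss << "Check failed: " #cond " at " << __FILE__ << ":" << __LINE__ \
          << ": " << (msg);                                                \
      throw std::runtime_error(oss.str());                                 \
    }                                                                      \
  } while (false)

int main() {
  void* compiledModel = nullptr;  // stands in for a failed CoreML API call
  try {
    COREML_CHECK_SKETCH(compiledModel != nullptr, "MLModel compilation returned nil");
  } catch (const std::exception& e) {
    return 0;  // the full message survives in production builds
  }
  return 1;
}
```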

Differential Revision: [D40351013](https://our.internmc.facebook.com/intern/diff/D40351013/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86938
Approved by: https://github.com/salilsdesai
2022-10-14 21:06:25 +00:00
John Detloff
652707abc0 Don't cache model specs within PTMCoreMLCompiler (#85136)
Summary: It turns out disk cache space is more limited than I realized: Instagram starts evicting cached items at 10 MB. We don't actually need to cache the model specs; once the model is compiled, all we need is the compiled model. With this diff, after model compilation succeeds we clean up the model specs from disk.

Test Plan: Delete Instagram from the device to ensure an empty cache; build, launch the camera, open an MCS or Segmentation effect, and confirm it loads and works correctly. Restart the app and launch again to confirm it can load the compiled model from the cache as well.

Differential Revision: D39562009

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85136
Approved by: https://github.com/kimishpatel
2022-09-17 03:24:44 +00:00
John Detloff
dbd38f63f5 Include CoreML error description in exception thrown when inference fails (#84804)
Summary:
Catch the error in PTMCoreMLBackend and throw an exception when inference fails. This way the error description will be available in the logged crash, as opposed to crashing with a less descriptive exception.

I'll be drafting follow-up diffs to actually catch exceptions in the segmentation shim next. Ideally we would fail inference gracefully and not crash at all, but at least after this diff we'll have the full diagnostic info.

Test Plan: Force an error, and confirm its description appears in the exception thrown via the console

Differential Revision: D39407865

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84804
Approved by: https://github.com/mcr229
2022-09-13 05:09:15 +00:00
Max Ren
5c0c8f2ce3 [coreml][bug] coreml gpu flag not set (#84725)
Summary:
Delegated CoreML models with the cpuAndGPU flag set do not properly run models on the CPU.

- The fix will allow us to target models on the CPU

Test Plan: brymkowski can you test this on your performance benchmarks?

Reviewed By: salilsdesai

Differential Revision: D39361382

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84725
Approved by: https://github.com/jmdetloff
2022-09-09 19:32:40 +00:00
John Detloff
21bc77ca96 Remove CoreMLMemoryObserver (#83703)
Summary: We added this observer to help us diagnose memory issues that have since been resolved. It should be safe to clean this up.

Test Plan: This diff just removes logging, so just build IG and confirm there are no errors.

Differential Revision: D38843701

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83703
Approved by: https://github.com/mcr229
2022-08-23 22:50:09 +00:00
John Detloff
7d5db63076 Store compilation os version on a per model basis (#82661)
Summary: Ensure that all models are recompiled when the OS updates, instead of just the first model loaded after the OS update. To do this we need to save the compilation OS version for each model.
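For illustration, a simplified sketch of the per-model bookkeeping this implies; the names and in-memory map are assumptions, and the real code persists the version on disk per compiled model:

```
#include <string>
#include <unordered_map>

class CompiledModelCache {
 public:
  explicit CompiledModelCache(std::string current_os) : os_(std::move(current_os)) {}

  // A cached compiled model is only reusable if it was compiled on the
  // current OS version; otherwise the caller must recompile it.
  bool needsRecompile(const std::string& model_id) const {
    auto it = compiled_os_.find(model_id);
    return it == compiled_os_.end() || it->second != os_;
  }

  void markCompiled(const std::string& model_id) { compiled_os_[model_id] = os_; }

 private:
  std::string os_;
  std::unordered_map<std::string, std::string> compiled_os_;
};

int main() {
  CompiledModelCache cache("17.0");
  if (cache.needsRecompile("segmentation")) {
    cache.markCompiled("segmentation");  // recompile, then record the OS version
  }
  return cache.needsRecompile("segmentation") ? 1 : 0;
}
```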

Test Plan: Build and run IG, launch a segmentation effect, confirm it renders correctly. To test the OS upgrade flow you could manually change '[UIDevice currentDevice].systemVersion' to a different string and launch again, and confirm you can still access effects.

Differential Revision: D38361641

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82661
Approved by: https://github.com/mcr229
2022-08-04 04:29:07 +00:00
Edward Z. Yang
df69660832 Revert "Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"" (#82599)
This reverts commit 532b8a9e00.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82599
Approved by: https://github.com/albanD
2022-08-02 19:37:02 +00:00
John Detloff
7d490dba72 CoreML backend should log on failure not success (#82604)
### Description
The original intention of this code was to log whenever inference fails, not whenever it succeeds. This change updates the logic of should_log to match this intention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82604
Approved by: https://github.com/SS-JIA
2022-08-02 04:14:43 +00:00
PyTorch MergeBot
532b8a9e00 Revert "Add a lint rule for torch/csrc/util/pybind.h include (#82552)"
This reverts commit 9465c0e0b5.

Reverted https://github.com/pytorch/pytorch/pull/82552 on behalf of https://github.com/zengk95 due to This seems to be breaking windows binary wheels
2022-08-01 20:25:35 +00:00
Edward Z. Yang
9465c0e0b5 Add a lint rule for torch/csrc/util/pybind.h include (#82552)
We define specializations for pybind11 defined templates
(in particular, PYBIND11_DECLARE_HOLDER_TYPE) and consequently
it is important that these specializations *always* be #include'd
when making use of pybind11 templates whose behavior depends on
these specializations, otherwise we can cause an ODR violation.

The easiest way to ensure that all the specializations are always
loaded is to designate a header (in this case, torch/csrc/util/pybind.h)
that ensures the specializations are defined, and then add a lint
to ensure this header is included whenever pybind11 headers are
included.

The existing grep linter didn't have enough knobs to do this
conveniently, so I added some features.  I'm open to suggestions
for how to structure the features better.  The main changes:

- Added an --allowlist-pattern flag, which turns off the grep lint
  if some other line exists.  This is used to stop the grep
  lint from complaining about pybind11 includes if the util
  include already exists.

- Added --match-first-only flag, which lets grep only match against
  the first matching line.  This is because, even if there are multiple
  includes that are problematic, I only need to fix one of them.
  We don't /really/ need this, but when I was running lintrunner -a
  to fixup the preexisting codebase it was annoying without this,
  as the lintrunner overall driver fails if there are multiple edits
  on the same file.

I excluded any files that didn't otherwise have a dependency on
torch/ATen, this was mostly caffe2 and the valgrind wrapper compat
bindings.

Note the grep replacement is kind of crappy, but clang-tidy lint
cleaned it up in most cases.

See also https://github.com/pybind/pybind11/issues/4099

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82552
Approved by: https://github.com/albanD
2022-08-01 17:16:58 +00:00
John Detloff
665fd30a5a Load CPU MLModel first, and configured MLModel async (#80941)
Summary:
MLModel loads much faster when compute units are set to CPU only. It seems that when loading with compute units set to all, a large amount of preprocessing work is done during init.

So, in order to speed up our effect load time, we load a CPU MLModel synchronously and a configured MLModel asynchronously. When the second model finishes loading, about 600 ms later, we swap the models out.

So, for about half a second inference will occur on the CPU, but after that it will kick over to the GPU or NPU.

On iPhone 12 I'm seeing a > 10x improvement in load time as recorded by RenderTimeLogger.cpp.
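For illustration, the general shape of this load-then-swap pattern in a simplified C++ sketch (not the Objective-C implementation):

```
#include <future>
#include <memory>
#include <mutex>
#include <string>

struct Model {
  std::string computeUnits;  // "cpu_only" or "all"
};

class ModelHolder {
 public:
  void load(const std::string& path) {
    // Fast, synchronous load: the CPU-only model is ready right away.
    set(std::make_shared<Model>(Model{"cpu_only:" + path}));

    // Slow, asynchronous load: later predictions pick it up once it finishes.
    pending_ = std::async(std::launch::async, [this, path] {
      auto full = std::make_shared<Model>(Model{"all:" + path});  // expensive init
      set(full);
    });
  }

  std::shared_ptr<Model> current() {
    std::lock_guard<std::mutex> lock(mu_);
    return model_;
  }

 private:
  void set(std::shared_ptr<Model> m) {
    std::lock_guard<std::mutex> lock(mu_);
    model_ = std::move(m);
  }

  std::mutex mu_;
  std::shared_ptr<Model> model_;
  std::future<void> pending_;
};

int main() {
  ModelHolder holder;
  holder.load("model.mlmodelc");
  auto m = holder.current();  // immediately usable; may still be the CPU model
  return m ? 0 : 1;
}
```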

Test Plan:
- Add an override to https://www.internalfb.com/intern/qe2/ig_ios_person_segmentation_universe to opt into the coreml segmentation model
- Launch IG camera and apply an effect that uses segmentation, such as green screen
- Confirm that segmentation works.

https://pxl.cl/277JL

Reviewed By: kimishpatel

Differential Revision: D37597965

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80941
Approved by: https://github.com/mcr229, https://github.com/kimishpatel
2022-07-07 00:08:34 +00:00
John Detloff
f36a5d23ce Prevent potential memory leaks in CoreML backend (#79928)
Differential Revision: D37266033

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79928
Approved by: https://github.com/mcr229
2022-07-05 23:36:01 +00:00
Stephen Jia
93d84c0fcf [coreml] Use special throw macro when encountering CoreML API errors (#77429)
Summary: Error messages from `TORCH_CHECK` are stripped during production builds via `-DSTRIP_ERROR_MESSAGES`. This diff introduces a new macro `COREML_CHECK` which will always preserve the error message. This macro is used when encountering errors produced by CoreML API calls so that we have enough context to debug.

Test Plan:
Test in pytorch playground:

```
arc focus2 -b pp-ios -a ModelRunner -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2:torch_mobile_coreApple  -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple  -a ModelRunnerDevOps -a //xplat/caffe2:torch_mobile_all_opsApple -fd --force-with-wrong-xcode
```

Differential Revision: D36378286

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77429
Approved by: https://github.com/kimishpatel
2022-05-16 18:02:48 +00:00
Stephen Jia
ccd5fa506f [iOS][coreml] Add CoreML memory observer Round 2
Summary:
Add an observer to `PTMCoreMLExecutor` so we can inspect OOMs in production to help with T115554493.

The behaviour of the logger is as follows:

1. Each time a model is compiled, there is a chance we publish all logs to QPL. This is determined by the randomly generated `_model_load_id` and `_sample_thresh`.
2. If we are publishing all logs, then every `_sample_every` inferences will be logged via QPL.
3. Every QPL log will collect memory metrics before and after model compilation/inference
4. If memory pressure is not normal (remaining mem < 400 MB) before or after compilation/inference, then that compilation/inference will be logged to QPL no matter what.
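For illustration, a simplified sketch of that sampling logic; the constants and names here are assumptions, not the production values:

```
#include <cstdint>
#include <random>

class MemoryObserverSampler {
 public:
  MemoryObserverSampler()
      : model_load_id_(std::random_device{}()),
        publish_all_(model_load_id_ % kSampleThresh == 0) {}

  // Decide whether a given inference should be logged to QPL.
  bool shouldLog(uint64_t inference_count, uint64_t remaining_mem_bytes) const {
    // Rule 4: always log when memory pressure is abnormal.
    if (remaining_mem_bytes < 400ull * 1024 * 1024) {
      return true;
    }
    // Rules 1-2: only a sampled fraction of model loads publish at all, and
    // then only every kSampleEvery-th inference.
    return publish_all_ && (inference_count % kSampleEvery == 0);
  }

 private:
  static constexpr uint64_t kSampleThresh = 100;  // assumed sampling rate
  static constexpr uint64_t kSampleEvery = 10;    // assumed cadence
  uint64_t model_load_id_;
  bool publish_all_;
};

int main() {
  MemoryObserverSampler sampler;
  (void)sampler.shouldLog(0, 1ull << 30);
  return 0;
}
```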

Previous diff got reverted due to OSS CI failures. Fixed those failures in this iteration.

Test Plan:
We can test in pytorch playground and inspect the QPL logs through Flipper:

```
arc focus2 -b pp-ios -a ModelRunner -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2:torch_mobile_coreApple  -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple  -a ModelRunnerDevOps -a //xplat/caffe2:torch_mobile_all_opsApple -a coreml_memory_observer -a //xplat/perflogger:perfloggerApple -fd --force-with-wrong-xcode
```

To check results in Hive/Scuba, test in instagram:

```
arc focus2 -b igios-no-extensions -a //fbobjc/Apps/Instagram/AppLibraries/Core/QPL/IGPerformanceLogging:IGPerformanceLogging -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2:torch_mobile_coreApple  -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple -a //xplat/caffe2:torch_mobile_all_opsApple -a //xplat/perflogger:perfloggerApple -a coreml_memory_observerApple -c pt.enable_qpl=1 --force-with-wrong-xcode
```

Note that we need to change `_sample_thresh` to ensure logs show up.

Reviewed By: kimishpatel

Differential Revision: D35970823

fbshipit-source-id: 2cb4d73931f35a0fa7f362b9fb44b9f0a00aeb82
(cherry picked from commit 1e965acf72958fc59d0cd0b4958e54955cc1adf2)
2022-05-03 17:30:55 +00:00
Jane Xu
122999919c Revert D35511873: [iOS][coreml] Add CoreML memory observer
Test Plan: revert-hammer

Differential Revision:
D35511873 (22fb929405)

Original commit changeset: 59f2fa2d0211

Original Phabricator Diff: D35511873 (22fb929405)

fbshipit-source-id: be83517fee4ac58e6ca4f126cba7b088625ddd1f
(cherry picked from commit 448e9c8794dc9bd77442583eefeef2d6cb2a5308)
2022-04-27 15:16:39 +00:00
Stephen Jia
22fb929405 [iOS][coreml] Add CoreML memory observer (#76251)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76251

Add an observer to `PTMCoreMLExecutor` so we can inspect OOMs in production to help with T115554493.

The behaviour of the logger is as follows:

1. Each time a model is compiled, there is a chance we publish all logs to QPL. This is determined by the randomly generated `_model_load_id` and `_sample_thresh`.
2. If we are publishing all logs, then every `_sample_every` inferences will be logged via QPL.
3. Every QPL log will collect memory metrics before and after model compilation/inference
4. If memory pressure is not normal (remaining mem < 400 MB) before or after compilation/inference, then that compilation/inference will be logged to QPL no matter what.

Test Plan:
We can test in pytorch playground and inspect the QPL logs through Flipper:

```
arc focus2 -b pp-ios -a ModelRunner -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2:torch_mobile_coreApple  -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple  -a ModelRunnerDevOps -a //xplat/caffe2:torch_mobile_all_opsApple -a coreml_memory_observer -a //xplat/perflogger:perfloggerApple -fd --force-with-wrong-xcode
```

To check results in Hive/Scuba, test in instagram:

```
arc focus2 -b igios-no-extensions -a //fbobjc/Apps/Instagram/AppLibraries/Core/QPL/IGPerformanceLogging:IGPerformanceLogging -a //xplat/caffe2/c10:c10Apple -a //xplat/caffe2:torch_mobile_coreApple  -a //xplat/caffe2/fb/dynamic_pytorch:dynamic_pytorch_implApple -a //xplat/caffe2:coreml_delegateApple -a //xplat/caffe2:torch_mobile_all_opsApple -a //xplat/perflogger:perfloggerApple -a coreml_memory_observerApple -c pt.enable_qpl=1 --force-with-wrong-xcode
```

Note that we need to change `_sample_thresh` to ensure logs show up.

Reviewed By: kimishpatel

Differential Revision: D35511873

fbshipit-source-id: 59f2fa2d021178ceab1fcf5ee94b2f15ceca32ee
(cherry picked from commit 8b8af55410ea1231693ee980c80d8a749f5ad870)
2022-04-26 23:54:22 +00:00
Nikita Shulga
78305ad2b7 Fix sign-compare in nnapi backend
Prerequisite change for enabling `-Werror=sign-compare` across the PyTorch repo.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75083

Approved by: https://github.com/ngimel
2022-04-05 00:08:04 +00:00