This change introduces a mechanism to test ONNX export based on sample inputs registered in OpInfo, similar to how MPS and other components of PyTorch are tested. It provides test coverage on ops and dtypes previously unattainable with manually created test models. This is the best way for us to discover gaps in exporter support, especially for ops with partial existing support.
This test is adapted from https://github.com/pytorch/pytorch/blob/master/test/test_mps.py
This PR also:
- Updates `sqrt` to support integer inputs to match PyTorch behavior
- Adds `pytest-subtests` for unittest subtest support in the new test file
I only enabled a few ops (`t`, `ceil`, and `sqrt`) because otherwise too many tests would fail due to (1) unsupported dtypes in the exporter, (2) unimplemented dtype support in onnxruntime, and (3) unexpected inputs to `verification.verify`.
Subsequent PRs should first improve `verification.verify` so that it accepts any legal input to a PyTorch model, then incrementally fix the symbolic functions to enable more test cases.
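For illustration, a minimal sketch of the OpInfo-driven pattern (not the actual test file; the wrapper module and output path are made up, and the ONNX Runtime verification step is omitted):
```python
import torch
from torch.testing._internal.common_methods_invocations import op_db

TESTED_OPS = {"t", "ceil", "sqrt"}

for op in (o for o in op_db if o.name in TESTED_OPS):
    for sample in op.sample_inputs("cpu", torch.float32):

        class Wrapper(torch.nn.Module):  # hypothetical wrapper so export can trace the op
            def forward(self, x):
                return op.op(x, *sample.args, **sample.kwargs)

        torch.onnx.export(Wrapper(), (sample.input,), "/tmp/op_sample.onnx")
```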
Fixes #85363
Design #88118
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86182
Approved by: https://github.com/BowenBao
Summary: Currently, build_mobile.sh doesn't allow lite interpreter builds or tracing-based selective builds. build_mobile.sh is used for host builds of PyTorch for mobile deployment.
Additionally, certain flags such as `USE_BLAS` were not being respected as they should be. This change addresses that as well.
Test Plan: Build using:
```
cat /tmp/selected_ops.yaml
- aten::add
- aten::sub
```
```
BUILD_PYTORCH_MOBILE_WITH_HOST_TOOLCHAIN=1 USE_LIGHTWEIGHT_DISPATCH=0 BUILD_LITE_INTERPRETER=1 SELECTED_OP_LIST=/tmp/selected_ops.yaml ./scripts/build_mobile.sh
```
```
cat /tmp/main.cpp
#include <torch/csrc/jit/mobile/import.h>
#include <torch/csrc/jit/mobile/module.h>

int main() {
  // Load the lite-interpreter model and run its forward method once.
  auto m = torch::jit::_load_for_mobile("/tmp/path_to_model.ptl");
  auto res = m.forward({});
  return 0;
}
```
Test using:
```
g++ /tmp/main.cpp -L build_mobile/lib/ -I build_mobile/install/include/ -lpthread -lc10 -ltorch_cpu -ltorch -lXNNPACK -lpytorch_qnnpack -lcpuinfo -lclog -lpthreadpool -lgloo -lkineto -lfmt -ldl -lc10
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84647
Approved by: https://github.com/JacobSzwejbka, https://github.com/cccclai
We're no longer building Caffe2 mobile as part of our CI, and it adds a lot of clutter to our makefiles. Any lingering internal dependencies will use the buck build and so won't be affected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84338
Approved by: https://github.com/dreiss
This PR adds an internal wrapper around the [beartype](https://github.com/beartype/beartype) library to perform runtime type checking in `torch.onnx`. It uses beartype when it is found in the environment and reduces to a no-op when beartype is not found.
Setting the env var `TORCH_ONNX_EXPERIMENTAL_RUNTIME_TYPE_CHECK=ERRORS` turns the feature on; setting `TORCH_ONNX_EXPERIMENTAL_RUNTIME_TYPE_CHECK=DISABLED` disables all checks. When the variable is not set and `beartype` is installed, a warning message is emitted.
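As a rough illustration of the behavior under these settings (the decorator name `runtime_typecheck` is hypothetical, not the actual `torch.onnx` internal):
```python
import functools
import os
import warnings

try:
    from beartype import beartype as _beartype
except ImportError:
    _beartype = None


def runtime_typecheck(func):  # hypothetical name, not the actual torch.onnx internal
    mode = os.getenv("TORCH_ONNX_EXPERIMENTAL_RUNTIME_TYPE_CHECK", "")
    if _beartype is None or mode == "DISABLED":
        return func  # no-op when beartype is missing or checks are disabled
    checked = _beartype(func)
    if mode == "ERRORS":
        return checked  # let type-hint violations raise

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Default: demote the violation to a warning, then run the original call anyway.
        try:
            return checked(*args, **kwargs)
        except Exception as e:  # in practice only beartype's violation exceptions
            warnings.warn(str(e))
        return func(*args, **kwargs)

    return wrapper
```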
Now when users call an API with invalid arguments, e.g.
```python
torch.onnx.export(conv, y, path, export_params=True, training=False)
# training should be a TrainingMode, not a bool
```
they get
```
Traceback (most recent call last):
File "bisect_m1_error.py", line 63, in <module>
main()
File "bisect_m1_error.py", line 59, in main
reveal_error()
File "bisect_m1_error.py", line 32, in reveal_error
torch.onnx.export(conv, y, cpu_model_path, export_params=True, training=False)
File "<@beartype(torch.onnx.utils.export) at 0x1281f5a60>", line 136, in export
File "pytorch/venv/lib/python3.9/site-packages/beartype/_decor/_error/errormain.py", line 301, in raise_pep_call_exception
raise exception_cls( # type: ignore[misc]
beartype.roar.BeartypeCallHintParamViolation: @beartyped export() parameter training=False violates type hint <class 'torch._C._onnx.TrainingMode'>, as False not instance of <protocol "torch._C._onnx.TrainingMode">.
```
When `TORCH_ONNX_EXPERIMENTAL_RUNTIME_TYPE_CHECK` is not set and `beartype` is installed, a warning message is emitted:
```
>>> torch.onnx.export("foo", "bar", "f")
<stdin>:1: CallHintViolationWarning: Traceback (most recent call last):
File "/home/justinchu/dev/pytorch/torch/onnx/_internal/_beartype.py", line 54, in _coerce_beartype_exceptions_to_warnings
return beartyped(*args, **kwargs)
File "<@beartype(torch.onnx.utils.export) at 0x7f1d4ab35280>", line 39, in export
File "/home/justinchu/anaconda3/envs/pytorch/lib/python3.9/site-packages/beartype/_decor/_error/errormain.py", line 301, in raise_pep_call_exception
raise exception_cls( # type: ignore[misc]
beartype.roar.BeartypeCallHintParamViolation: @beartyped export() parameter model='foo' violates type hint typing.Union[torch.nn.modules.module.Module, torch.jit._script.ScriptModule, torch.jit.ScriptFunction], as 'foo' not <protocol "torch.jit.ScriptFunction">, <protocol "torch.nn.modules.module.Module">, or <protocol "torch.jit._script.ScriptModule">.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/justinchu/dev/pytorch/torch/onnx/_internal/_beartype.py", line 63, in _coerce_beartype_exceptions_to_warnings
return func(*args, **kwargs)
File "/home/justinchu/dev/pytorch/torch/onnx/utils.py", line 482, in export
_export(
File "/home/justinchu/dev/pytorch/torch/onnx/utils.py", line 1422, in _export
with exporter_context(model, training, verbose):
File "/home/justinchu/anaconda3/envs/pytorch/lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/home/justinchu/dev/pytorch/torch/onnx/utils.py", line 177, in exporter_context
with select_model_mode_for_export(
File "/home/justinchu/anaconda3/envs/pytorch/lib/python3.9/contextlib.py", line 119, in __enter__
return next(self.gen)
File "/home/justinchu/dev/pytorch/torch/onnx/utils.py", line 95, in select_model_mode_for_export
originally_training = model.training
AttributeError: 'str' object has no attribute 'training'
```
We see the error is caught right when the type mismatch happens, improving on what would otherwise surface later as `AttributeError: 'str' object has no attribute 'training'`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83673
Approved by: https://github.com/BowenBao
* Move memory-heavy tests from `test_pytorch_onnx_onnxruntime.py` to
`test_models_onnxruntime.py`. The former is run in parallel in CI,
while the latter is not. One behavior change is that the moved tests
are now only covered by default-opset export.
* Refactor and create a base class for tests that export a model to ONNX
and verify it with ONNX Runtime. The new base class is parameterized
by `opset_version` and `is_script` (a rough sketch follows after this
list). Further work can be done to refactor the existing test classes
in `test_pytorch_onnx_onnxruntime.py`. See #75630
* Reduce unnecessarily large tensor size in
`test_pytorch_onnx_onnxruntime.py` to further reduce memory usage
and test time.
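A rough sketch of the parameterization idea, assuming the `parameterized` package; class names, opset values, and the body of `run_test` are illustrative rather than the exact code in the test suite:
```python
import itertools
import unittest

import torch
from parameterized import parameterized_class


@parameterized_class(
    [
        {"opset_version": opset, "is_script": scripted}
        for opset, scripted in itertools.product((14, 15, 16), (False, True))
    ]
)
class TestONNXRuntime(unittest.TestCase):
    opset_version = 16  # overridden by each parameterization
    is_script = False

    def run_test(self, model, inputs):
        # Optionally script the model, export it at self.opset_version,
        # then run with ONNX Runtime and compare against PyTorch outputs.
        if self.is_script:
            model = torch.jit.script(model)
        ...
```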
After this PR, the running time for `test_pytorch_onnx_onnxruntime.py`
is reduced from `1338.82s (0:22:18)` to `225.07s (0:03:45)`,
benchmarked on 10900x with `-n 10`.
Fixes #79179
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79640
Approved by: https://github.com/justinchuby, https://github.com/garymm
Should fix #78844
Custom op related tests use an inline C++ extension to build a custom
operator from a C++ source snippet. Only two test cases became flaky after
the switch to parallel runs, and both use the inline C++ extension. Reverting
these tests to run in a single process to try to resolve the flakiness.
Reverts the test skip added previously in #78936.
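For context, an illustrative sketch of the inline C++ extension pattern these tests rely on (the operator name and snippet are made up, not the actual test code):
```python
import torch
from torch.utils.cpp_extension import load_inline

cpp_source = """
#include <torch/script.h>

torch::Tensor custom_add(torch::Tensor self, torch::Tensor other) {
  return self + other;
}

static auto registry =
    torch::RegisterOperators("custom_namespace::custom_add", &custom_add);
"""

# is_python_module=False builds a shared library whose static initializer
# registers the operator with the dispatcher when the library is loaded.
load_inline(
    name="custom_add_ext",
    cpp_sources=cpp_source,
    is_python_module=False,
    verbose=True,
)

print(torch.ops.custom_namespace.custom_add(torch.ones(2), torch.ones(2)))
```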
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78944
Approved by: https://github.com/janeyx99, https://github.com/garymm
Currently `torch.onnx.export(.., operator_export_type=OperatorExportTypes.ONNX_ATEN_FALLBACK)` only issues ATen ops through explicit requests (e.g. `g.at()` calls) inside each op's symbolic function. This is done based on specific conditions such as `operator_export_type == OperatorExportTypes.ONNX_ATEN_FALLBACK` or `is_caffe2_aten_fallback()`.
This PR extends the ATen fallback mechanism to scenarios where the symbolic function raises `RuntimeError` during export. The idea is that partial implementations of existing ONNX ops can fall back to ATen as a last resort. That is valuable because each operator can have many input combinations, and not all of them are always implemented.
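A usage-level sketch of exporting with the fallback enabled; the model and op below are stand-ins, not code from this PR:
```python
import torch


class Model(torch.nn.Module):
    def forward(self, x):
        # Stand-in op: imagine its symbolic raises RuntimeError for this input combination.
        return torch.nn.functional.gelu(x)


# With ONNX_ATEN_FALLBACK, such a node would be emitted as an ATen op
# instead of failing the whole export.
torch.onnx.export(
    Model(),
    (torch.randn(2, 3),),
    "model_aten_fallback.onnx",
    operator_export_type=torch.onnx.OperatorExportTypes.ONNX_ATEN_FALLBACK,
)
```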
A minor fix was done to make sure the `overload_name` attribute is added to explicit ATen op fallback requests when a symbolic is not registered to a particular op.
PS: The behavior for builds with BUILD_CAFFE2=1 is unchanged to ensure backward compatibility.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74759
Approved by: https://github.com/garymm, https://github.com/msaroufim
`torch.cuda.synchronize()` is a heavy hammer and distorts benchmarking results a lot. `Timer` provides results that are closer to the kernel times observed in the profiler.
If you want, instead of `blocked_autorange` you can use `timeit`, which repeats the stmt a fixed number of times.
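A minimal sketch of the Timer-based approach, assuming a CUDA device is available; the workload is illustrative:
```python
import torch
from torch.utils.benchmark import Timer

x = torch.randn(1024, 1024, device="cuda")
t = Timer(stmt="x @ x", globals={"x": x})

print(t.blocked_autorange(min_run_time=1.0))  # adaptive number of runs
print(t.timeit(100))                          # fixed number of runs
```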
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75393
Approved by: https://github.com/davidberard98
Summary:
And add a new tool to update it in the future, which follows the policy
of using "latest as of 18 months ago". This policy is meant to balance:
* recent enough to increase the odds of being able to successfully
export
* old enough to increase the odds of the exported model being runnable by
different ONNX implementations
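A hedged sketch of the selection logic implied by that policy; the release dates and function name below are hypothetical, not the tool's actual data:
```python
import datetime

# Hypothetical release dates; the real tool would derive these from ONNX itself.
_OPSET_RELEASE_DATES = {
    12: datetime.date(2020, 5, 8),
    13: datetime.date(2020, 10, 30),
    14: datetime.date(2021, 4, 23),
}


def default_opset(today: datetime.date) -> int:
    cutoff = today - datetime.timedelta(days=18 * 30)  # roughly 18 months
    return max(v for v, released in _OPSET_RELEASE_DATES.items() if released <= cutoff)
```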
Related changes:
* test_models.py: explicitly fix opset_version to 9 rather than relying on default. Caffe2 doesn't support newer versions.
* symbolic_helper.py:
* Remove a misleading comment
* Remove unnecessary check in `_set_opset_version`
* Use a range to define `_onnx_stable_opsets`
* test_pytorch_common.py:
* Rename a variable from min -> max. I think it was a copy-paste error.
* Make skip test messages more informative.
* Remove unused `skipIfONNXShapeInference`. More on that below.
* test_pytorch_onnx_onnxruntime.py:
* Make all the `TestCase` classes explicitly specify opset version.
* Make `test_unsupported_pad` respect `opset_version` by using `run_test`
* Unrelated simplification: make it obvious that all tests run with `onnx_shape_inference=True`. AFAICT this was already the case.
* There was one test that was entirely disabled (test_tolist) because it was asking to be skipped whenever `onnx_shape_inference=True`, but it was always True. I changed the model being tested so as to preserve the intended test coverage but still have the test actually pass.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73898
Reviewed By: msaroufim
Differential Revision: D35264615
Pulled By: malfet
fbshipit-source-id: cda8fbdffe4cc8210d8d96e659e3a9adf1b5f1d2
(cherry picked from commit b5e639e88828d34442282d0b50c977e610a2ba3a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75108
- Add an option to only run some graphs
- Add NNC static vs. dynamic comparison
- Update `make_tensor` because it wasn't using strides
Test Plan: Imported from OSS
Reviewed By: ejguan
Differential Revision: D35374000
Pulled By: eellison
fbshipit-source-id: df16b8647f2309a8837207cacba55d30f46845ce
(cherry picked from commit 19feb54db049186972b47548cf3d83e76512adfd)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74076
Extends the repro script to CPU and NNC. Usage, as documented in the file:
```
1. Run your script and pipe into a log file
PYTORCH_JIT_LOG_LEVEL=">>tensorexpr_fuser" python3 my_test.py &> log.txt
2. Run log_extract:
log_extract.py log.txt --baseline --nnc
```
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D34946883
Pulled By: eellison
fbshipit-source-id: 644012dbbca0b490820ef83e761c06b0dd009e52
(cherry picked from commit 5256c8f3ff8545033d1335cc96d34194abda1370)