This reworks the DORT backend factory function to support the options kwarg of torch.compile, and defines a concrete OrtBackendOptions type that can be used to influence the backend.
Caching is also implemented in order to reuse backends with equal options.
Wrapping the backend in auto_autograd also becomes an option, which allows `OrtBackend` to always be returned as the callable for torch.compile; wrapping happens internally if opted into (True by default).
Lastly, expose options for configuring preferred execution providers (will be attempted first), whether or not to attempt to infer an ORT EP from a torch found device in the graph or inputs, and finally the default/fallback EPs.
### Demo
The following demo runs `Gelu` through `torch.compile(backend="onnxrt")` using various backend options through a dictionary form and a strongly typed form. It additionally exports the model through both the ONNX TorchScript exporter and the new TorchDynamo exporter.
```python
import math
import onnx.inliner
import onnxruntime
import torch
import torch.onnx
torch.manual_seed(0)
class Gelu(torch.nn.Module):
def forward(self, x):
return x * (0.5 * torch.erf(math.sqrt(0.5) * x) + 1.0)
@torch.compile(
backend="onnxrt",
options={
"preferred_execution_providers": [
"NotARealEP",
"CPUExecutionProvider",
],
"export_options": torch.onnx.ExportOptions(dynamic_shapes=True),
},
)
def dort_gelu(x):
return Gelu()(x)
ort_session_options = onnxruntime.SessionOptions()
ort_session_options.log_severity_level = 0
dort_gelu2 = torch.compile(
Gelu(),
backend="onnxrt",
options=torch.onnx._OrtBackendOptions(
preferred_execution_providers=[
"NotARealEP",
"CPUExecutionProvider",
],
export_options=torch.onnx.ExportOptions(dynamic_shapes=True),
ort_session_options=ort_session_options,
),
)
x = torch.randn(10)
torch.onnx.export(Gelu(), (x,), "gelu_ts.onnx")
export_output = torch.onnx.dynamo_export(Gelu(), x)
export_output.save("gelu_dynamo.onnx")
inlined_model = onnx.inliner.inline_local_functions(export_output.model_proto)
onnx.save_model(inlined_model, "gelu_dynamo_inlined.onnx")
print("Torch Eager:")
print(Gelu()(x))
print("DORT:")
print(dort_gelu(x))
print(dort_gelu2(x))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107973
Approved by: https://github.com/BowenBao
In PR #93822 the `fx2trt` backend was removed which registered the `tensorrt` backend names to point to `fx2trt` / `torch_tensorrt` and move the name to `onnxrt`. We want to reserve the name `tensorrt` for `torch_tensorrt` to prevent any confusion but due to code-freeze we cannot complete the integration and set up testing for the next release. So we propose leaving out the `tensorrt` name until we can set up the backend and testing for it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94632
Approved by: https://github.com/frank-wei
These backends have been broken for some time. I tried to get them
running again, but as far as I can tell they are not maintained.
Installing torch_tensorrt downgrades PyTorch to 1.12. If I manually
bypass that downgrade, I get import errors from inside fx2trt. Fixes that
re-add these are welcome, but it might make sense to move these wrappers
to the torch_tensorrt repo once PyTorch 2.0 support is added.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93822
Approved by: https://github.com/frank-wei