Commit Graph

1710 Commits

Author SHA1 Message Date
Justin Chu
d11720efdb [ONNX] Remove unused logic from internal verification module (#161449)
Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161449
Approved by: https://github.com/xadupre, https://github.com/titaiwangms
ghstack dependencies: #161323
2025-09-02 16:22:49 +00:00
Justin Chu
524b78d4f6 [ONNX] Refactor torchscript based exporter (#161323)
Refactor the torchscript-based exporter logic, moving it to a single (private) location for better code management. The original public module and method APIs are preserved.

- Updated module paths in `torch/csrc/autograd/python_function.cpp` accordingly
- Removed `check_onnx_broadcast` from `torch/autograd/_functions/utils.py` because it is private and unused

@albanD / @soulitzer could you review changes in `torch/csrc/autograd/python_function.cpp` and
`torch/autograd/_functions/utils.py`? Thanks!

## BC Breaking
- **Deprecated members in `torch.onnx.verification` are removed**

Differential Revision: [D81236421](https://our.internmc.facebook.com/intern/diff/D81236421)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161323
Approved by: https://github.com/titaiwangms, https://github.com/angelayi
2025-09-02 16:10:30 +00:00
Ti-Tai Wang
da838f65af [ONNX] Drop draft_export in exporter API (#161454)
If the onnx exporter falls back to draft_export on big models, it takes a very long time and can spam the printout, which keeps users from seeing their stack trace with strict=False.

We could consider making a separate API for draft_export as a debugging tool, or combining it with report=True when the "model is small".

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161454
Approved by: https://github.com/justinchuby
2025-08-26 22:13:43 +00:00
Justin Chu
36ac916929 [ONNX] Fix lower opset version support in dynamo=True (#161056)
After we switched to constructing the registry with the specified opset version in dynamo=True, support for opset<18 was broken because there would be no torchlib ops registered for these opsets. I updated the registry creation logic to always use opset 18 if the requested opset is lower, and use the version converter (as designed) to target those opsets.
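
The fallback described above can be sketched as follows. This is a minimal illustration, not the exporter's code: `MIN_TORCHLIB_OPSET` and `plan_export` are hypothetical names, and the only facts taken from the message are that requested opsets below 18 are built at opset 18 and then handed to the version converter.

```python
# Hypothetical sketch of the opset selection described in the message above.
# None of these names come from torch.onnx.
MIN_TORCHLIB_OPSET = 18  # lowest opset the registry is built with (per the message)


def plan_export(requested_opset: int) -> tuple[int, bool]:
    """Return (opset used to build the registry, whether down-conversion is needed)."""
    if requested_opset < MIN_TORCHLIB_OPSET:
        # Build the graph at opset 18, then rely on the version converter
        # to reach the lower opset the user asked for.
        return MIN_TORCHLIB_OPSET, True
    return requested_opset, False


print(plan_export(17))  # (18, True)
print(plan_export(20))  # (20, False)
```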

This requires onnxscript>=0.4 (https://github.com/pytorch/pytorch/pull/161312)

Fixes https://github.com/onnx/onnx/issues/7235

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161056
Approved by: https://github.com/titaiwangms
2025-08-23 05:04:36 +00:00
Justin Chu
38a492d40d [ONNX] Remove unused _onnx_supported_ops (#161322)
Signed-off-by: Justin Chu <justinchuby@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161322
Approved by: https://github.com/titaiwangms
2025-08-23 02:42:25 +00:00
Justin Chu
0d9da384ef Bump onnxscript to 0.4.0 in CI (#161312)
Use onnxscript apis for torch 2.9.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161312
Approved by: https://github.com/titaiwangms, https://github.com/malfet
2025-08-22 23:23:08 +00:00
Justin Chu
419a2dbf5f [ONNX] Remove enable_fake_mode and exporter_legacy (#161222)
Remove enable_fake_mode and exporter_legacy entirely. Even though this is bc breaking, `enable_fake_mode` is no longer compatible with the latest version of transformers, and so it is no longer useful.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161222
Approved by: https://github.com/titaiwangms
2025-08-22 22:15:27 +00:00
PyTorch MergeBot
82c7a1eb4b Revert "[ONNX] Default to dynamo export (#159646)"
This reverts commit 11b6ceb7b4.

Reverted https://github.com/pytorch/pytorch/pull/159646 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/159646#issuecomment-3198507767))
2025-08-18 21:41:32 +00:00
Justin Chu
11b6ceb7b4 [ONNX] Default to dynamo export (#159646)
Set dynamo=True and enable fallback.

1. Implemented the compatible behavior where a BytesIO object is accepted as `f`
2. Updated tests to explicitly set dynamo=False

#151693

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159646
Approved by: https://github.com/titaiwangms
2025-08-16 04:48:58 +00:00
Shiva Kaul
e299926f72 [ONNX] Fix doc typo for symbolic_multi_out (#160702)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160702
Approved by: https://github.com/justinchuby
2025-08-15 14:34:42 +00:00
Ti-Tai Wang
566c6d52ef [ONNX] Fix the export of the model having none as output (#160200)
Fixes #160150

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160200
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-08-08 23:09:34 +00:00
IlyasMoutawwakil
c859ba7114 Make onnx export SDPA match aten behavior (#159973)
This PR makes onnx sdpa export match the behavior of aten sdpa when boolean mask is used.
@justinchuby

```python
import onnxruntime as ort
import torch

class ScaledDotProductAttention(torch.nn.Module):
    def forward(self, query, key, value, attn_mask):
        return torch.nn.functional.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask)

model = ScaledDotProductAttention()
attn_mask = torch.ones(2, 4, 8, 8).bool()  # boolean mask for attention
attn_mask[0, 0, 0, :] = False  # masking an entire row (padding token)
query = key = value = torch.randn(2, 4, 8, 16)
output = model(query, key, value, attn_mask)

torch.onnx.export(
    model,
    (query, key, value, attn_mask),
    "scaled_dot_product_attention.onnx",
    input_names=["query", "key", "value", "attn_mask"],
    output_names=["output"],
    dynamo=False,  # or True
)
ort_session = ort.InferenceSession("scaled_dot_product_attention.onnx")

np_inputs = {"query": query.numpy(), "key": key.numpy(), "value": value.numpy(), "attn_mask": attn_mask.numpy()}
onnx_outputs = ort_session.run(None, np_inputs)[0]

torch.testing.assert_close(output, torch.tensor(onnx_outputs), equal_nan=True)
```
Without this change, the script above fails the assertion because the ort model outputs NaNs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159973
Approved by: https://github.com/xadupre, https://github.com/titaiwangms
2025-08-07 04:06:07 +00:00
Justin Chu
73ee323380 [ONNX] RMS Norm (#159377)
- Implement rms norm using onnx RMSNormalization-23
- Use the correct eps for float32
  eaadd1282c/aten/src/ATen/native/cuda/layer_norm_kernel.cu (L1844-L1866)
  <img width="743" height="107" alt="image" src="https://github.com/user-attachments/assets/a6fd45aa-01d9-4667-924d-3012232cfcde" />

- Created facility to run tests with the reference runtime by extending ONNXProgram and assert_onnx_program.

Fix https://github.com/pytorch/pytorch/issues/159257
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159377
Approved by: https://github.com/titaiwangms
2025-07-30 18:55:47 +00:00
Nikita Shulga
6d071bd65d Remove numpy dependency from onnx (#159177)
One should not expect numpy to be there during onnx import
Forward fix for https://github.com/pytorch/pytorch/pull/157734
Added regression test to `test_without_numpy` function

Test plan: Run `python -c "import sys;sys.path.insert(0, 'fake_numpy');import torch; import torch.onnx"` with/without this fix
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159177
Approved by: https://github.com/atalman, https://github.com/justinchuby, https://github.com/titaiwangms, https://github.com/cyyever, https://github.com/Skylion007, https://github.com/andrewboldi
2025-07-27 13:23:03 +00:00
Aaron Orenstein
e20736bf1d Don't GC as often when collecting cudagraphs (#158193)
TL;DR: Cuts vLLM cudagraph collection from 80s -> 24s

Stop garbage collecting by default on every cudagraph recording. The old behavior can be re-enabled by setting `TORCH_CUDAGRAPH_GC=1` or the config `force_cudagraph_gc`.

We were previously garbage collecting at the beginning of each cudagraph
capture. vLLM collects 5427 graphs and most of those garbage collections weren't
actually collecting any memory (CPU or GPU). This changes it to not collect more
than every 10s so if we're capturing in a loop we don't burn all our cycles
looking for garbage.

(These numbers have a lot of variance from run to run but give the correct
general scale)
```
       | calls | total | synchronize |  gcs | collect | empty cache | sys freed | cuda freed |
-------+-------+-------+-------------+------+---------+-------------+-----------+------------+
before |  5427 |   78s |       1.48s | 5427 |  53.22s |       1.21s |    145855 | 1539309568 |
-------+-------+-------+-------------+------+---------+-------------+-----------+------------+
after  |  5427 |   24s |          0s |    3 |   1.53s |       0.84s |       592 | 1539309568 |
-------+-------+-------+-------------+------+---------+-------------+-----------+------------+
```
total - this is the total time reported by vLLM's "Graph capturing finished" log.
The rest of these are measured in torch.cuda.graphs.graph.__enter__():
  calls - number of times torch.cuda.graphs.graph.__enter__ was called
  synchronize - this is the duration taken by the cuda.synchronize call
  gcs - number of times gc.collect was called
  collect - this is the duration taken by the gc.collect call
  empty cache - this is the duration taken by the torch.cuda.empty_cache call
  sys freed - the number of bytes reported freed by gc.collect
  cuda freed - the number of bytes reported freed by torch.cuda.memory_reserved

So it seems like the heavy lifting is done by torch.cuda.empty_cache() which is
fairly quick.
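
The throttling idea described above (collect at most once per interval instead of on every capture) can be sketched in plain Python. This is an assumption-laden illustration, not the code in `torch/cuda/graphs.py`: `ThrottledGC` and its parameters are hypothetical names, and a fake clock stands in for real time so the loop finishes instantly.

```python
import gc
import time


class ThrottledGC:
    """Run gc.collect() at most once per `min_interval` seconds (illustrative only)."""

    def __init__(self, min_interval: float = 10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_collect = None  # time of the last collection, None if never run
        self.collections = 0

    def maybe_collect(self) -> bool:
        now = self.clock()
        if self.last_collect is not None and now - self.last_collect < self.min_interval:
            return False  # collected recently; skip the expensive scan
        gc.collect()
        self.last_collect = now
        self.collections += 1
        return True


# Simulate many captures in a tight loop; the fake clock advances 0.01 s per call.
fake_now = [0.0]


def fake_clock() -> float:
    fake_now[0] += 0.01
    return fake_now[0]


throttle = ThrottledGC(min_interval=10.0, clock=fake_clock)
for _ in range(5427):
    throttle.maybe_collect()

# Only a handful of collections happen instead of one per capture.
print(throttle.collections)
```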

Cudagraph results from the TorchInductor Performance DashBoard (this is from the original version using the GC clock so the real results will be slightly better than this):
<img width="1494" height="382" alt="image" src="https://github.com/user-attachments/assets/69b705ef-47ce-4b6e-9733-1ec941cad93d" />

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158193
Approved by: https://github.com/ngimel
2025-07-24 21:37:11 +00:00
Pian Pawakapan
39b54b78d7 [export] runtime asserts for while HOP subgraphs (#158467)
Differential Revision: D78431075

For #158366
- Calls runtime asserts pass for HOP subgraphs (in reenter_make_fx)
- For while_loop only (can be expanded), clones input tensors for subgraph tracing, so unbacked memos (item, nonzero, etc.) aren't reused

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158467
Approved by: https://github.com/ydwu4
2025-07-23 00:34:18 +00:00
Justin Chu
767791943d [ONNX] Set default opset to 20 (#158802)
Bump the default opset to 20, which is a newer opset and the highest one the torchscript exporter supports.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158802
Approved by: https://github.com/titaiwangms
2025-07-22 19:55:05 +00:00
Alexander Novikov
0971637c11 Fix torch.tensor warning in ONNX symbolic_opset10 export (#158835)
Fix PyTorch tensor copying warning in ONNX export

## Problem

PyTorch ONNX exporter was generating a warning about incorrect tensor copying method:

```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158835
Approved by: https://github.com/justinchuby
2025-07-22 16:32:49 +00:00
Zain Rizvi
193b29ee0c [BE][EZ] Minor doc fixes (#158574)
[BE] Minor doc fixes
2025-07-18 10:34:55 -05:00
Ti-Tai Wang
3f83e3eeca [ONNX] Remove legacy registration and dispatcher (#158283)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158283
Approved by: https://github.com/Skylion007, https://github.com/justinchuby
ghstack dependencies: #158258, #158262, #158282
2025-07-15 21:00:49 +00:00
Ti-Tai Wang
e4c17d5e1c [ONNX] Remove fx_onnx_interpreter.py (#158282)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158282
Approved by: https://github.com/Skylion007, https://github.com/justinchuby
ghstack dependencies: #158258, #158262
2025-07-15 20:46:06 +00:00
Ti-Tai Wang
205241a0d5 [ONNX] Remove legacy dynamo graph extractor (#158262)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158262
Approved by: https://github.com/justinchuby
ghstack dependencies: #158258
2025-07-15 20:21:49 +00:00
Aaron Orenstein
250ae2531c Fix types in graphs.py (#158192)
Added type annotations for torch/cuda/graphs.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158192
Approved by: https://github.com/oulgen
2025-07-15 19:49:38 +00:00
Ti-Tai Wang
5606c516fd [ONNX] Remove legacy Dort (#158258)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158258
Approved by: https://github.com/justinchuby, https://github.com/malfet
2025-07-15 19:14:06 +00:00
albanD
058fb1790f Fix compilation and "import torch" issues for cpython 3.14 (#158184)
Beginning of process for 3.14 bringup.

State of things from this PR:
- Nothing too scary looking from the Dynamo CPython side, nothing we heavily rely on seems to be missing @williamwen42
- The existing check that makes torch.compile() nicely fail is working as expected. So all these empty functions shouldn't cause any weirdness.
- The `__module__` update changes look suspicious, we should investigate what is the reason and impact of that, in particular for our public API checking @jbschlosser
- Leaving the weakref.py thread safety change as a follow up to keep this a bit simpler. I vendored the whole struct in the meantime FYI @ezyang

EDIT: The `__module__` change is even more cursed than I thought due to changes to the Union and Optional types, where the `__module__` field cannot be changed anymore. See https://github.com/python/cpython/issues/132139 for details.
For now, I'm just skipping the `__module__` setting for 3.14 which will trip the public API checks. Will revisit once I have a final answer on the cpython issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158184
Approved by: https://github.com/msaroufim
2025-07-15 05:06:55 +00:00
Ti-Tai Wang
5fb07acbc3 [ONNX] Remove legacy modularization (#158257)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158257
Approved by: https://github.com/justinchuby
ghstack dependencies: #158255, #158256
2025-07-15 04:36:01 +00:00
Ti-Tai Wang
336bff6d58 [ONNX] Remove legacy graph passes (#158256)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158256
Approved by: https://github.com/justinchuby
ghstack dependencies: #158255
2025-07-15 04:27:30 +00:00
Ti-Tai Wang
12151c96d9 [ONNX] Remove legacy io_adapter (#158255)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158255
Approved by: https://github.com/justinchuby
2025-07-15 03:39:18 +00:00
Ti-Tai Wang
2eff14c445 [ONNX] Delete torch.onnx.dynamo_export (#158130)
It has been deprecated since torch==2.7.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158130
Approved by: https://github.com/justinchuby
2025-07-12 02:30:47 +00:00
Ti-Tai Wang
08e9dd280f [ONNX] Support symbolic arguments in onnx exporter (#157734)
Prior to this PR, torch.onnx.export(..., dynamo=True, verify=True, report=True) did not support symbolic arguments. For example:

```python
import torch

class M(torch.nn.Module):
    def forward(self, a, x):
        return a + torch.tensor(1) + x

op = torch.onnx.export(M(), (1, torch.ones(2)),
                       dynamic_shapes=(torch.export.Dim.DYNAMIC, {0: torch.export.Dim.DYNAMIC}),
                       dynamo=True, report=True)
```

Symbolic arguments are like constant arguments in that they don't have tensor_meta either. Besides, torch.export.export supports model inputs that are constants, which is different from the legacy issue: https://github.com/pytorch/pytorch/issues/99534 where we tried to get the FX graph directly from dynamo export. Thus, `_remove_non_tensor` is deleted from args processing.

NOTE: If a ConstantArgument shows up in exported_program, it was kept to align the number of inputs with the nn.Module, but it is irrelevant to the model graph, which is why the input is omitted in the ONNX model.

The test `test_constant_argument_user_input_is_omitted_in_onnx_graph` needs #157719
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157734
Approved by: https://github.com/justinchuby
2025-07-09 21:15:45 +00:00
Xuehai Pan
4cc8b60d1b [BE][1/16] fix typos in torch/ (#156311)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156311
Approved by: https://github.com/albanD
2025-07-09 11:02:22 +00:00
Xuehai Pan
db259bd6b8 [BE][12/16] fix typos in torch/ (#156602)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156602
Approved by: https://github.com/justinchuby, https://github.com/albanD
ghstack dependencies: #156318, #156320
2025-07-02 22:55:29 +00:00
xadupre
0105cd89ab [ONNX] Fix conversion of attention - 4D (#157130)
Fixes a wrong conversion to onnx found while investigating #149662.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157130
Approved by: https://github.com/gramalingam, https://github.com/justinchuby, https://github.com/titaiwangms

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-07-02 18:05:10 +00:00
Justin Chu
5692cbb818 [ONNX] Delete symbolic caffe2 (#157102)
Caffe2 is removed from pytorch. This is a clean up.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157102
Approved by: https://github.com/titaiwangms, https://github.com/cyyever
2025-06-28 05:22:02 +00:00
Justin Chu
36fd1ac932 [ONNX] Bump onnxscript api for torch 2.8 (#157017)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157017
Approved by: https://github.com/titaiwangms, https://github.com/malfet
2025-06-27 17:39:17 +00:00
Ti-Tai Wang
a7b29c88b1 [ONNX] Preserve all legacy exporter params in fallback (#156659)
Fixes #151693

Prior to this PR, the fallback did not carry over all user parameters. This PR preserves them to ensure a smooth transition for users.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156659
Approved by: https://github.com/justinchuby
2025-06-24 05:28:55 +00:00
Justin Chu
fbbab794ef [ONNX] Implement Attention-23 (#156431)
Implement Attention-23 using sdpa and flexattention.

- I used copilot for this.
- Also updated the conversion logic to remove trailing None inputs.

@gramalingam @kunal-vaishnavi @titaiwangms
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156431
Approved by: https://github.com/titaiwangms

Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-06-20 23:54:57 +00:00
Ti-Tai Wang
3644b41a7c [ONNX] Note on attention op symbolic function (#156441)
Follow-up to https://github.com/pytorch/pytorch/pull/156367
Explains why num_heads is provided even though the ONNX Attention op does not need it in the torch case. Thread: https://github.com/pytorch/pytorch/pull/156367#discussion_r2155727038

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156441
Approved by: https://github.com/justinchuby
2025-06-19 21:00:05 +00:00
Yuan Yao
02080c2cd9 Fix num_heads inference in ONNX Attention-23 exporter (#156367)
Fixes issue in torch-onnx exporter for Attention: https://github.com/pytorch/pytorch/issues/156105

Previously, the num_heads attribute inferred by the exporter was incorrect. It should be read from input dimension -3, not dimension 3:

![image](https://github.com/user-attachments/assets/26f10e15-bc98-42ac-807a-2e089a7d996a)
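
The difference between the two indices can be seen with a plain shape tuple. The sketch below assumes a 4-D query laid out as (batch, num_heads, seq_len, head_size); the concrete sizes are made up for illustration.

```python
# Illustrative only: a query laid out as (batch, num_heads, seq_len, head_size).
q_shape = (2, 4, 8, 16)

num_heads_wrong = q_shape[3]   # positive index 3 is the last axis here: head_size
num_heads_right = q_shape[-3]  # third axis from the end: num_heads

print(num_heads_wrong, num_heads_right)  # 16 4
```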

But in fact, [torch sdpa](https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html) doesn't support combined num_heads and head_size dimensions like [ONNX](https://onnx.ai/onnx/operators/onnx__Attention.html) does, so this num_heads attribute is not needed.

Extending support to rank>4 can be left as future work if there is a use case for that. The translation logic would look like: Reshape(Q,K,V to 4d) -> Attention -> Reshape(Y to original rank).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156367
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms
2025-06-19 09:40:01 +00:00
Justin Chu
1e474cc9c8 [ONNX] Fix how shapes are computed for float4 (#156353)
Changed the way we compute shapes for unpacked float4. Previously we always added a last dimension [2] to the existing shape, but this doesn't really make sense because it prevents us from being able to represent any shape other than those with a last dim [2]. I updated the logic to be `[*shape[:-1], shape[-1]*2]`, which doubles the last dimension. This is more in line with what we see in practice when people are using 4bit types, and it allows us to represent any shape with an even dimension at the end, which is much more reasonable in my opinion.
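
The two shape rules from the message can be compared side by side. The function names below are hypothetical and exist only for this illustration; the expressions themselves are the ones quoted in the message.

```python
def unpacked_float4_shape_old(shape):
    # Previous behavior per the message: append a trailing dimension of size 2.
    return [*shape, 2]


def unpacked_float4_shape_new(shape):
    # New behavior: double the last dimension instead.
    return [*shape[:-1], shape[-1] * 2]


packed = [3, 4]
print(unpacked_float4_shape_old(packed))  # [3, 4, 2]
print(unpacked_float4_shape_new(packed))  # [3, 8]
```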

Also clarified in https://github.com/pytorch/pytorch/pull/148791#discussion_r2155395647
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156353
Approved by: https://github.com/titaiwangms
2025-06-18 22:28:02 +00:00
xadupre
e6252f62ef [ONNX] Implements converter for higher order ops scan (#154513)
Fixes #151327

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154513
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-06-17 00:54:07 +00:00
Justin Chu
f810e98143 [ONNX] Update default opset to 18 (#156023)
Update the default opset for the torchscript exporter to 18 to match the dynamo exporter, because support was actually added and tested in https://github.com/pytorch/pytorch/pull/118828. In the next version we should plan to update to opset 21 or higher. This change also removes the hard limit on the torchscript exporter for more flexibility.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156023
Approved by: https://github.com/Skylion007
2025-06-16 08:40:49 +00:00
Aaron Orenstein
e95e8eed0a mypy 1.16.0 (#155821)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155821
Approved by: https://github.com/ezyang, https://github.com/zou3519
2025-06-14 18:18:43 +00:00
Ti-Tai Wang
bf897b4cea [ONNX] Support 0/1 on dynamic dimension (#155717)
Prior to this PR, the exporter did not support a dynamic dim when the traced inputs had size 0/1 in that dim. After https://github.com/pytorch/pytorch/pull/148696, this is supported by torch.export.export. This PR adds the patch to torch.onnx.export.

However, a known pitfall remains because of the difference between eager and export. The compiler needs to decide the exported shape ahead of time, and whether "hidden broadcasting" is applied results in a different export.

For example,

```python
import torch

class Model(torch.nn.Module):
    def forward(self, x, y, z):
        return torch.cat((x, y), axis=1) + z

model = Model()
x = torch.randn(2, 3)
y = torch.randn(2, 5)
z = torch.randn(1, 8)
model(x, y, z)

DYN = torch.export.Dim.DYNAMIC
ds = {0: DYN, 1: DYN}

with torch.fx.experimental._config.patch(backed_size_oblivious=True):
    ep = torch.export.export(model, (x, y, z), dynamic_shapes=(ds, ds, ds))

print(ep)
"""
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, x: "f32[s7, s16]", y: "f32[s7, s43]", z: "f32[s7, s16 + s43]"):
             #
            sym_size_int: "Sym(s7)" = torch.ops.aten.sym_size.int(x, 0)
            sym_size_int_1: "Sym(s16)" = torch.ops.aten.sym_size.int(x, 1)
            sym_size_int_2: "Sym(s7)" = torch.ops.aten.sym_size.int(y, 0)
            sym_size_int_3: "Sym(s43)" = torch.ops.aten.sym_size.int(y, 1)
            sym_size_int_4: "Sym(s7)" = torch.ops.aten.sym_size.int(z, 0)
            sym_size_int_5: "Sym(s16 + s43)" = torch.ops.aten.sym_size.int(z, 1)

             # File: /home/titaiwang/pytorch/test_export.py:7 in forward, code: return torch.cat((x, y), axis=1) + z
            cat: "f32[s7, s16 + s43]" = torch.ops.aten.cat.default([x, y], 1);  x = y = None

             #
            eq: "Sym(True)" = sym_size_int_2 == sym_size_int;  sym_size_int_2 = None
            _assert_scalar_default = torch.ops.aten._assert_scalar.default(eq, "Runtime assertion failed for expression Eq(s58, s35) on node 'eq'");  eq = _assert_scalar_default = None
            add_1: "Sym(s16 + s43)" = sym_size_int_1 + sym_size_int_3;  sym_size_int_1 = sym_size_int_3 = None
            eq_1: "Sym(True)" = add_1 == sym_size_int_5;  add_1 = sym_size_int_5 = None
            _assert_scalar_default_1 = torch.ops.aten._assert_scalar.default(eq_1, "Runtime assertion failed for expression Eq(s16 + s43, s23) on node 'eq_1'");  eq_1 = _assert_scalar_default_1 = None
            eq_2: "Sym(True)" = sym_size_int == sym_size_int_4;  sym_size_int = sym_size_int_4 = None
            _assert_scalar_default_2 = torch.ops.aten._assert_scalar.default(eq_2, "Runtime assertion failed for expression Eq(s35, s7) on node 'eq_2'");  eq_2 = _assert_scalar_default_2 = None

             # File: /home/titaiwang/pytorch/test_export.py:7 in forward, code: return torch.cat((x, y), axis=1) + z
            add: "f32[s7, s16 + s43]" = torch.ops.aten.add.Tensor(cat, z);  cat = z = None
            return (add,)

Graph signature:
    # inputs
    x: USER_INPUT
    y: USER_INPUT
    z: USER_INPUT

    # outputs
    add: USER_OUTPUT

Range constraints: {s7: VR[0, int_oo], s16: VR[0, int_oo], s43: VR[0, int_oo], s16 + s43: VR[0, int_oo]}
"""
ep.module()(x, y, z)
"""
Traceback (most recent call last):
  File "/home/titaiwang/pytorch/test_export.py", line 20, in <module>
    ep.module()(x, y, z)
  File "/home/titaiwang/pytorch/torch/fx/graph_module.py", line 840, in call_wrapped
    return self._wrapped_call(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titaiwang/pytorch/torch/fx/graph_module.py", line 416, in __call__
    raise e
  File "/home/titaiwang/pytorch/torch/fx/graph_module.py", line 403, in __call__
    return super(self.cls, obj).__call__(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titaiwang/pytorch/torch/nn/modules/module.py", line 1767, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titaiwang/pytorch/torch/nn/modules/module.py", line 1873, in _call_impl
    return inner()
           ^^^^^^^
  File "/home/titaiwang/pytorch/torch/nn/modules/module.py", line 1800, in inner
    args_kwargs_result = hook(self, args, kwargs)  # type: ignore[misc]
                         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/titaiwang/pytorch/torch/_dynamo/eval_frame.py", line 895, in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/titaiwang/pytorch/torch/export/_unlift.py", line 83, in _check_input_constraints_pre_hook
    _check_input_constraints_for_graph(
  File "/home/titaiwang/pytorch/torch/_export/utils.py", line 426, in _check_input_constraints_for_graph
    _check_symint(
  File "/home/titaiwang/pytorch/torch/_export/utils.py", line 338, in _check_symint
    raise RuntimeError(
RuntimeError: Expected input at *args[2].shape[0] to be equal to 2, but got 1
"""
```

The explanation (from @pianpwk):

In the model we have `return torch.cat((x, y), axis=1) + z`.

Before this add is executed, the LHS has shape `[s7, s16 + s43]`, while z has shape, say, `[s8, s16 + s43]` (we don't know `s7 == s8` yet). When we execute this add, the compiler is making a decision: does broadcasting apply or not? The choices are:

1) Yes -> then we must specialize `s8` to 1
2) No -> then this element-wise op is only valid if the shapes match up, and we assume `s7 == s8`.

Unfortunately export can only follow one of these options, and in avoiding 0/1 specialization (because a dynamic dimension was requested), it assumed case 2).

For an operation like a + b, in eager semantics it's possible to have all options (either a == 1 OR b == 1 OR a == b), but with export we need to make a decision on what the output shape of this operation is, and keeping all branches alive requires expressing the output shape with a conditional (e.g. output shape = `a if b == 1 else b`), which is pretty hard for the compiler to reason about.
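
The contrast between the eager rule and the single branch export commits to can be sketched for one dimension in plain Python. The helper names are hypothetical and the error text is made up; this is not how torch.export implements its guards.

```python
def eager_broadcast_dim(a: int, b: int) -> int:
    """Eager rule for one dimension: sizes must match, or one of them must be 1."""
    if a == b or b == 1:
        return a
    if a == 1:
        return b
    raise ValueError(f"cannot broadcast {a} with {b}")


def export_guarded_dim(a: int, b: int) -> int:
    """Option 2 from the explanation: assume equality and check it at run time."""
    if a != b:
        raise RuntimeError(f"expected size {a}, but got {b}")
    return a


# Dim 0 of cat((x, y)) is 2 and dim 0 of z is 1, as in the example above.
print(eager_broadcast_dim(2, 1))  # 2
try:
    export_guarded_dim(2, 1)
except RuntimeError as err:
    print("export-style guard:", err)
```

Eager mode accepts the size-1 input by broadcasting, while the equality-only guard rejects it, which mirrors the RuntimeError in the traceback.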

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155717
Approved by: https://github.com/justinchuby
2025-06-14 04:04:47 +00:00
Oguz Ulgen
d1947a8707 Migrate from lru_cache to cache (#155613)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155613
Approved by: https://github.com/ezyang
ghstack dependencies: #155612
2025-06-11 19:44:18 +00:00
Ti-Tai Wang
1e373d02d5 [ONNX] Change deprecation message from 2.8 to 2.9 (#155580)
~~The PR: https://github.com/pytorch/pytorch/pull/152478 did not respect the release policy that the deprecation should happen after the deprecation message has been set for 2 releases. This PR postpone 2.8 to the rightful version 2.10.~~

~~NOTE: "as early as" 2.10 shall give ONNX users more time to adapt and provide feedback.~~

To follow the upcoming torchscript deprecation, `torch.onnx.export` is expected to switch to dynamo=True (with fallback=True also turned on for BC) in torch 2.9.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155580
Approved by: https://github.com/justinchuby, https://github.com/tugsbayasgalan
2025-06-11 19:31:29 +00:00
Justin Chu
79aef14169 [ONNX] Set the name of the producing node using the value name (#155413)
When comparing two graphs exported using different opset versions, even though the value names are the same in both graphs, the node names did not match, causing model-explorer to not be able to sync the two graphs. This change updates the names of the nodes that directly produce the output values, for better correspondence across exported graphs.

![image](https://github.com/user-attachments/assets/3c00ca18-221f-4add-8429-4bcf12069036)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155413
Approved by: https://github.com/cyyever, https://github.com/xadupre
2025-06-09 13:03:58 +00:00
Aaron Gokaslan
6b1211df29 [BE]: Backport runtime_checkable perf improvements/behavior from 3.12 (#155130)
Backports some behavior changes and performance improvements with runtime_checkable in 3.12 to older versions of Python. This should be a free performance improvement when type-checking protocols, since everything works on Python 3.12.

The difference between the two versions of runtime_checkable is [these lines](40e22ebb2c/src/typing_extensions.py (L800-L823)).
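
For readers unfamiliar with the decorator, here is a minimal sketch of what `runtime_checkable` enables. It uses the standard-library `typing` version rather than the backported one, and the `SupportsClose` and `Resource` classes are made up for illustration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    def close(self) -> None:
        pass


# isinstance() works structurally: Resource never subclasses SupportsClose.
print(isinstance(Resource(), SupportsClose))  # True
print(isinstance(object(), SupportsClose))    # False
```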

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155130
Approved by: https://github.com/rec, https://github.com/aorenste
2025-06-06 13:28:05 +00:00
Kshitij Khode
ca0c2985d3 [ONNX] Allow exporter to export SDPA to Attention onnx operator (#154596)
Fixes [#149662](https://github.com/pytorch/pytorch/issues/149662)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154596
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-06-04 14:29:44 +00:00
Justin Chu
3e57de1251 [ONNX] Create support for rotary embeddings (#154745)
This PR registers the RotaryEmbedding op in the `torch.ops.onnx` namespace and allows the exporter to recognize and export onnx operators.

## Design

ONNX operators for their respective opset versions are implemented in torch/onnx/ops/_impl.py, and are registered in the torch.ops.onnx namespace according to the following rule:

`OpType-version => torch.ops.onnx.OpType.opset{version}`

For example, `RotaryEmbedding-23` becomes `torch.ops.onnx.RotaryEmbedding.opset23`

This name is parsed by the exporter to create an onnx node in the graph without having to go through translation.
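
The naming rule can be expressed as a pair of string helpers. This is a sketch under stated assumptions: `torch_op_name` and `parse_torch_op_name` are hypothetical names, and the regular expression is only one way to read the name back; the message does not show how the exporter actually parses it.

```python
import re


def torch_op_name(op_type: str, version: int) -> str:
    """Apply the rule OpType-version => torch.ops.onnx.OpType.opset{version}."""
    return f"torch.ops.onnx.{op_type}.opset{version}"


# Hypothetical parser for the qualified name; not the exporter's implementation.
_OP_NAME = re.compile(r"^torch\.ops\.onnx\.(?P<op_type>\w+)\.opset(?P<version>\d+)$")


def parse_torch_op_name(name: str) -> tuple[str, int]:
    match = _OP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a torch.ops.onnx operator name: {name!r}")
    return match["op_type"], int(match["version"])


name = torch_op_name("RotaryEmbedding", 23)
print(name)                       # torch.ops.onnx.RotaryEmbedding.opset23
print(parse_torch_op_name(name))  # ('RotaryEmbedding', 23)
```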

When users use the ops in the model, we provide more convenient, unversioned functions under `torch.onnx.ops` that will dispatch to the implementations based on user input (type and provided attributes). For example, users can directly call `torch.onnx.ops.rotary_embedding()` to use the op natively in their pytorch models. I chose snake case naming to make the functions more pythonic and aligned with other torch apis.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154745
Approved by: https://github.com/titaiwangms
2025-06-04 03:07:43 +00:00
Aaron Gokaslan
bbda22e648 [BE][Ez]: Optimize unnecessary lambda with operator (#154722)
Automated edits performed by FURB118. Functions from `operator` are implemented in C and are much faster than a lambda when passed as a `key=` to another C-implemented function such as `sorted` or `max`.
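
A minimal sketch of the kind of rewrite this describes, using made-up data rather than code from the PR: a lambda that only indexes its argument is replaced by `operator.itemgetter`.

```python
import operator

pairs = [("b", 2), ("a", 3), ("c", 1)]

# Before: a lambda that only indexes its argument.
by_lambda = sorted(pairs, key=lambda pair: pair[1])

# After: the equivalent callable from the operator module.
by_itemgetter = sorted(pairs, key=operator.itemgetter(1))
largest = max(pairs, key=operator.itemgetter(1))

assert by_lambda == by_itemgetter
print(by_itemgetter)  # [('c', 1), ('b', 2), ('a', 3)]
print(largest)        # ('a', 3)
```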

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154722
Approved by: https://github.com/jansel
2025-05-30 23:47:10 +00:00
Justin Chu
c3100067ae [ONNX] Update onnx to 1.18 (#153746)
Update onnx python package to 1.18.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153746
Approved by: https://github.com/titaiwangms, https://github.com/cyyever, https://github.com/malfet
2025-05-25 20:58:47 +00:00
Justin Chu
0e805aad7f [ONNX] Support float4 (#151069)
- Support exporting float4 models (note: the exporter currently uses IR version 10 universally, which does not include float4 support. Once ONNX Runtime and the ecosystem move to support the new IR version 11, we should bump the exporter's version to 11 as well)
- The shape of the type is set according to https://github.com/pytorch/pytorch/pull/148791#discussion_r2038704986 (added last dim with size 2)
- Use ml_dtypes types when converting to numpy for consistency with ONNX IR

Fix https://github.com/pytorch/pytorch/issues/150202

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151069
Approved by: https://github.com/titaiwangms
2025-05-18 03:19:35 +00:00
Aaron Gokaslan
3555ebb63d [BE]: Update ruff to 0.11.8 (#153249)
Fixes a ton of false negatives throughout the codebase. RUFF also properly validates NOQA comments now and most of the changes are fixing typos there or removing filewide flake8 suppressions that were also silencing ruff issues.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153249
Approved by: https://github.com/cyyever, https://github.com/albanD, https://github.com/seemethere
2025-05-12 18:30:52 +00:00
Ti-Tai Wang
90fde0dc09 [ONNX] Support sym_float (#153200)
Fixes #153115

Note: torch.sym_int is not supported in this PR because it does not appear in the exported program; it shows up as `torch.ops.aten.sym_size.int()` instead.

```
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, x: "f32[s35, s16]"):
             #
            sym_size_int_1: "Sym(s35)" = torch.ops.aten.sym_size.int(x, 0);  x = None
            return (sym_size_int_1,)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153200
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-05-09 19:10:17 +00:00
Ti-Tai Wang
773a91c775 [ONNX] dynamic_shapes uses DYNAMIC (#153065)
Although Dim.AUTO covers cases where a user marks more axes as dynamic than the model actually needs, it silently falls back to STATIC when DYNAMIC fails, which makes debugging harder.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153065
Approved by: https://github.com/justinchuby
2025-05-07 21:48:41 +00:00
Ti-Tai Wang
5fa5017479 [ONNX] Suggest users setting dynamo=True when exporting (#152478)
Fixes #152025

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152478
Approved by: https://github.com/justinchuby
2025-05-06 23:18:11 +00:00
Ti-Tai Wang
a5dd7011a0 [ONNX] Delete JitTraceConvertStrategy (#152556)
Fixes #151703

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152556
Approved by: https://github.com/justinchuby
2025-05-02 00:26:43 +00:00
Anthony Shoumikhin
e2f9759bd0 Fix broken URLs (#152237)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152237
Approved by: https://github.com/huydhn, https://github.com/malfet
2025-04-27 09:56:42 +00:00
xadupre
91c590f048 [ONNX] add converters for sym_min, sym_max (#152196)
Conversion of Phi-4-multimodal-instruct fails because of missing converters for torch.sym_max and torch.sym_min.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152196
Approved by: https://github.com/justinchuby
2025-04-25 20:01:05 +00:00
Justin Chu
a811d3351b [ONNX] Implement sym_not (#152111)
Implement onnx support for sym_not. Replaces https://github.com/pytorch/pytorch/pull/147472

Fix https://github.com/pytorch/pytorch/issues/136572
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152111
Approved by: https://github.com/titaiwangms
2025-04-25 07:50:37 +00:00
Justin Chu
e2c7ae52d5 [ONNX] Add group_norm support from opset 21 (#152138)
I didn't run the model in test because ORT doesn't have the op yet. Nevertheless it should be leveraged for newer opset versions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152138
Approved by: https://github.com/titaiwangms, https://github.com/shubhambhokare1, https://github.com/cyyever
2025-04-25 03:30:07 +00:00
titaiwangms
6cd1741985 [ONNX] Update decomposition logic to loop over onnx registry (#151826)
Fixes #150367

This PR builds the decomposition table from the ONNX registry, which includes all registered ops, not only ATen and prim ops. This keeps the custom ops specified in the custom_translation table from being decomposed during ONNX export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151826
Approved by: https://github.com/justinchuby
2025-04-22 19:40:52 +00:00
Justin Chu
56d318bfac [ONNX][Eazy] Update onnx program doc formatting and improve robustness (#151623)
- Update docstring list formatting
- Use a try finally block to keep the model unmodified if save() fails.
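
The try/finally pattern from the second bullet can be sketched as follows. `ProgramSketch`, its `model` dict, and `failing_writer` are hypothetical stand-ins, not the actual `ONNXProgram.save()` implementation; the point is only that the `finally` clause restores state whether or not the write succeeds.

```python
class ProgramSketch:
    """Hypothetical stand-in for an object whose save() temporarily mutates state."""

    def __init__(self) -> None:
        self.model = {"weights": "in-memory"}

    def save(self, writer) -> None:
        original = dict(self.model)
        # Temporarily rewrite the model for serialization (illustrative only).
        self.model["weights"] = "external-file"
        try:
            writer(self.model)
        finally:
            # Runs on success and on failure, so the in-memory model is restored.
            self.model = original


def failing_writer(model) -> None:
    """Simulate a write that fails partway through."""
    raise OSError("disk full")


program = ProgramSketch()
try:
    program.save(failing_writer)
except OSError:
    pass

# The failed save left the model unmodified.
assert program.model == {"weights": "in-memory"}
```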

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151623
Approved by: https://github.com/titaiwangms
2025-04-18 21:31:31 +00:00
Justin Chu
8780d18f64 [ONNX] Add a comment for handling bf16/fp8 tensor to numpy conversion (#151371)
Follow up of https://github.com/pytorch/pytorch/pull/151259
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151371
Approved by: https://github.com/titaiwangms
2025-04-16 00:49:38 +00:00
Justin Chu
9917feff50 [ONNX] Produce correct dtypes for bf16/f8 in IR TorchTensor (#151259)
Split the changes from https://github.com/pytorch/pytorch/pull/151069 to address https://github.com/microsoft/onnxscript/issues/2187, where the output np arrays do not have the correct ml_dtypes types as expected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151259
Approved by: https://github.com/titaiwangms
2025-04-15 23:21:04 +00:00
Justin Chu
901e37515f [ONNX] Fix bfloat16 support in onnx_program callable (#151121)
- Added a test to guard bfloat16. The optimizer incorrectly turns bfloat16 initializers into uint16, but this is not relevant to export logic.
- Fix bfloat16 support in onnx_program callable

Tested with the following on CUDA:

```py
import torch

class BfloatModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.param = torch.nn.Parameter(torch.tensor(2.0, dtype=torch.bfloat16))

    def forward(self, x):
        return x * torch.tensor(1.0, dtype=torch.bfloat16) * self.param

input = torch.randn(1, 10, dtype=torch.bfloat16)
model = BfloatModel()
onnx_program = torch.onnx.export(model, (input,), dynamo=True, optimize=False, verify=True)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151121
Approved by: https://github.com/titaiwangms

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-14 19:27:29 +00:00
Thomas Adams
8494d5582a Propagate callable parameter types using ParamSpec (#142306) (#151014)
Partially addresses #142306

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151014
Approved by: https://github.com/Skylion007
2025-04-13 20:38:11 +00:00
Justin Chu
75162aa7de [ONNX] Support running bfloat16 models with ONNX Runtime (#149646)
Use ORTValue objects to support bfloat16 and other dtypes as inputs. This only supports cuda as ort only implements bfloat16 on cuda.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149646
Approved by: https://github.com/titaiwangms
2025-04-11 03:38:26 +00:00
Justin Chu
f304483e95 [ONNX] Add asdict method to VerificationInfo class (#151024)
This pull request introduces a new method to convert `VerificationInfo` objects to dictionaries and includes a corresponding test to ensure the method works correctly.
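
As a rough illustration of what such a method can look like, here is a sketch built on `dataclasses.asdict`. `InfoSketch` and its fields are hypothetical; the real `VerificationInfo` defines its own fields and may implement the conversion differently.

```python
import dataclasses


@dataclasses.dataclass
class InfoSketch:
    """Hypothetical record; the real VerificationInfo has its own fields."""

    name: str
    max_abs_diff: float
    max_rel_diff: float

    def asdict(self) -> dict:
        # Delegate to the standard library, which recurses into nested dataclasses.
        return dataclasses.asdict(self)


info = InfoSketch(name="output_0", max_abs_diff=1e-6, max_rel_diff=1e-5)
print(info.asdict())
```

A plain dictionary like this is convenient for serializing results to JSON or loading them into a table.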

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151024
Approved by: https://github.com/titaiwangms
2025-04-10 22:23:33 +00:00
shubhambhokare1
1a56609e75 [ONNX] Supporting different opset versions for torchlib registry (#149901)
- Allows opset_version to determine which onnx decomposition to choose
- Adds a cleanup function to modify the registry after it is built
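
One way to picture opset-based selection is the sketch below. It assumes the rule is "use the newest implementation whose opset does not exceed the requested `opset_version`", which the commit message does not spell out; `pick_implementation` and the registry contents are hypothetical.

```python
def pick_implementation(implementations: dict[int, str], opset_version: int) -> str:
    """Pick the newest implementation introduced at or before opset_version.

    Assumed selection rule, for illustration only.
    """
    eligible = [since for since in implementations if since <= opset_version]
    if not eligible:
        raise LookupError(f"no implementation available for opset {opset_version}")
    return implementations[max(eligible)]


# Hypothetical registry entry: opset introduced -> implementation name.
group_norm = {18: "group_norm_decomposed", 21: "group_norm_native"}

print(pick_implementation(group_norm, 18))  # group_norm_decomposed
print(pick_implementation(group_norm, 23))  # group_norm_native
```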

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149901
Approved by: https://github.com/justinchuby, https://github.com/titaiwangms
2025-04-09 16:03:46 +00:00
Pian Pawakapan
103bf64a3c [export] refactor _Dim into Dim (#149891)
Summary: forward fix T218515233

Test Plan: test_export

Differential Revision: D71769231

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149891
Approved by: https://github.com/jingsh, https://github.com/angelayi
2025-03-28 06:19:03 +00:00
Justin Chu
3efa211e48 [ONNX] Annotate None inputs in symbolic ops (#150038)
Add `None` to type annotations of `torch.onnx.ops.symbolic*` ops and improve tests to test support for optional inputs. Previously it was omitted mistakenly even though the implementation supports it.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150038
Approved by: https://github.com/titaiwangms
2025-03-27 00:01:09 +00:00
Justin Chu
6ae8eb881c [ONNX] Clean up the diagnostics module (#149864)
Remove the diagnostics/SARIF module from ONNX exporter because it is obsolete and unused.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149864
Approved by: https://github.com/titaiwangms
2025-03-26 05:58:32 +00:00
PyTorch MergeBot
30e8be599f Revert "[ONNX] Clean up the diagnostics module (#149864)"
This reverts commit cc6e300fe2.

Reverted https://github.com/pytorch/pytorch/pull/149864 on behalf of https://github.com/malfet due to This indeed broke Mac testing see 1c98dc3664/1 ([comment](https://github.com/pytorch/pytorch/pull/149864#issuecomment-2752317873))
2025-03-25 19:31:50 +00:00
Justin Chu
cc6e300fe2 [ONNX] Clean up the diagnostics module (#149864)
Remove the diagnostics/SARIF module from ONNX exporter because it is obsolete and unused.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149864
Approved by: https://github.com/titaiwangms
2025-03-25 16:58:46 +00:00
titaiwangms
280e48739a [ONNX] Set is_in_onnx_export for dynamo=True (#149678)
Fixes #149141

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149678
Approved by: https://github.com/justinchuby
2025-03-25 03:16:23 +00:00
Justin Chu
2dccd70ef0 [ONNX] Clean up legacy dynamo export code (#149745)
Clean up code that is unused and obsolete. The public `torch.onnx.dynamo_export` is kept for now but the legacy implementation is removed.

Remove public option classes and OnnxRegistry that have been deprecated.

Users: use torch.onnx.export(…, dynamo=True).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149745
Approved by: https://github.com/titaiwangms, https://github.com/cyyever
2025-03-23 19:35:16 +00:00
Justin Chu
a39bf846f5 [ONNX] Add draft_export as a strategy (#147529)
Create draft_export strategy.

The strategy is added before jit and after strict=True, as the third fallback. Since it is specializing tensors it should not be less robust than the jit trace strategy.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147529
Approved by: https://github.com/titaiwangms
2025-03-21 03:05:17 +00:00
Justin Chu
362b40939d [ONNX] Improve docstring of onnx symbolic ops (#149668)
Better examples
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149668
Approved by: https://github.com/titaiwangms
2025-03-21 01:57:39 +00:00
Pian Pawakapan
96828a2155 [export] refactor DimHints for type errors (#149424)
Differential Revision: D71414367

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149424
Approved by: https://github.com/justinchuby, https://github.com/avikchaudhuri
2025-03-19 18:51:07 +00:00
Justin Chu
010963032c [ONNX] Create onnx_symbolic (#148905)
In the old exporter we allow users to define a symbolic() method to bypass JIT tracing for a block of logic. We can allow users to do similar things by creating symbolic ops at export.

This PR implements `torch.onnx.ops.symbolic` and `torch.onnx.ops.symbolic_multi_out` to allow users to create onnx nodes symbolically with pt2 & fx. The custom pytorch ops are designed so that the attributes are encoded as part of a valid fx op. Users provide the shape and dtype so that the meta function can produce the correct fake tensor during export.

An example is

![image](https://github.com/user-attachments/assets/c62f5f21-e038-456e-a71d-b9a5d0a7cd9d)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148905
Approved by: https://github.com/titaiwangms
2025-03-18 21:32:06 +00:00
Aleksei Nikiforov
d5b1d99f78 Enable more nightly tests on s390x (#148452)
Also enable some tests which probably were accidentally disabled.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148452
Approved by: https://github.com/seemethere, https://github.com/malfet
2025-03-18 16:09:39 +00:00
Justin Chu
fdacf3c920 [ONNX] Update types in VerificationInfo (#149377)
torch.types.Number was rendered as is in the documentation and can be confusing. We write the original types instead to reduce confusion for users.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149377
Approved by: https://github.com/titaiwangms
2025-03-18 15:37:39 +00:00
Justin Chu
ebabd0efdd [ONNX] Expose verification utilities (#148603)
Expose verification utilities to public documentation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148603
Approved by: https://github.com/titaiwangms
2025-03-18 02:10:34 +00:00
Aaron Gokaslan
a0ac63cbd9 [BE]: Apply ruff PERF403 to use dict comprehensions more often (#149257)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149257
Approved by: https://github.com/jansel
2025-03-18 00:46:07 +00:00
PyTorch MergeBot
24cfeec2c7 Revert "[BE]: Apply ruff PERF403 to use dict comprehensions more often (#149257)"
This reverts commit bfee141666.

Reverted https://github.com/pytorch/pytorch/pull/149257 on behalf of https://github.com/malfet due to Let's see if it helps restore compiler benchmark sanity, see 8bc7bd94a5/1 ([comment](https://github.com/pytorch/pytorch/pull/149257#issuecomment-2731133812))
2025-03-17 22:57:00 +00:00
Aaron Gokaslan
bfee141666 [BE]: Apply ruff PERF403 to use dict comprehensions more often (#149257)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149257
Approved by: https://github.com/jansel
2025-03-16 23:52:58 +00:00
Justin Chu
d96c85558a [ONNX] Use torch export to get dynamic shapes for JIT convert strategy (#148627)
Use torch export to get dynamic shapes for JIT converted graph. I just realized we can retrace a converted jit graph with `torch.export` and produce dynamic shapes using `torch.export`.

- **Prior:** The exporter will produce a **static graph silently** even when `dynamic_shapes` is provided.
- **Proposed:** When `dynamic_shapes` is provided and the strategy is able to handle it, the export succeeds with dynamic shapes.

## Why are we still keeping the JIT strategy?

It is useful when users want to convert JIT modules or `.pt` files into ONNX via the new path. It is sometimes also useful when there are JIT scripted modules in the nn module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148627
Approved by: https://github.com/titaiwangms
2025-03-07 23:41:50 +00:00
Justin Chu
d36391307f [ONNX] Handle error in verification interpreter (#148730)
Use a simple try/except to handle ONNX Runtime errors raised in the verification interpreter. For example, ORT will sometimes produce a list of None for some nodes; I am not sure how that happens yet.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148730
Approved by: https://github.com/titaiwangms
ghstack dependencies: #148706
2025-03-07 20:24:49 +00:00
Justin Chu
e3087f6d76 [ONNX] Improve verify_onnx_program to use VerificationInterpreter (#148706)
I realized we can just extend `verify_onnx_program` to return intermediate values. There is no need for us to expose the VerificationInterpreter to users.

I added a `compare_intermediates` option to `verify_onnx_program`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148706
Approved by: https://github.com/titaiwangms
2025-03-07 00:40:54 +00:00
titaiwangms
e7bc1d1791 [ONNX] Update saved exported program in debugging report if the exporting passes run_decomposition() (#148617)
Prior to this PR, if the export got past run_decomposition(), the report still showed the exported_program from before decomposition, which made it harder for users to inspect the exported program that is actually translated to the ONNX graph.

The following example is what we see before this PR:

```
# PyTorch ONNX Conversion Report

```
 Obtain model graph with `torch.export.export(..., strict=False)`
 Obtain model graph with `torch.export.export(..., strict=True)`
 Obtain model graph with `torch.jit.trace`
 Decompose operators for ONNX compatibility
 Translate the graph into ONNX
 Run `onnx.checker` on the ONNX model
 Execute the model with ONNX Runtime
 Validate model output accuracy
```

## Error messages

```pytb

Traceback (most recent call last):

  File "/home/titaiwang/pytorch/torch/onnx/_internal/exporter/_core.py", line 707, in _translate_fx_graph
    _handle_call_function_node_with_lowering(

  File "/home/titaiwang/pytorch/torch/onnx/_internal/exporter/_core.py", line 486, in _handle_call_function_node_with_lowering
    raise _errors.DispatchError(

torch.onnx._internal.exporter._errors.DispatchError: No ONNX function found for <OpOverload(op='aten.slice', overload='Tensor')>. Failure message: No decompositions registered for the complex-valued input

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/home/titaiwang/pytorch/torch/onnx/_internal/exporter/_core.py", line 1371, in export
    onnx_program = _exported_program_to_onnx_program(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "/home/titaiwang/pytorch/torch/onnx/_internal/exporter/_core.py", line 1007, in _exported_program_to_onnx_program
    values = _translate_fx_graph(
             ^^^^^^^^^^^^^^^^^^^^

  File "/home/titaiwang/pytorch/torch/onnx/_internal/exporter/_core.py", line 733, in _translate_fx_graph
    raise _errors.ConversionError(

torch.onnx._internal.exporter._errors.ConversionError: Error when translating node %slice_1 : [num_users=1] = call_function[target=torch.ops.aten.slice.Tensor](args = (%_to_copy, 0, 0, 9223372036854775807), kwargs = {}). See the stack trace for more information.

```

## Exported program

```python
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, x: "f32[3, 4]"):
             # File: /home/titaiwang/pytorch/test_slice_complex.py:6 in forward, code: x_complex = x.to(torch.complex64)
            to: "c64[3, 4]" = torch.ops.aten.to.dtype(x, torch.complex64);  x = None

             # File: /home/titaiwang/pytorch/test_slice_complex.py:8 in forward, code: return x_complex[:, :2]
            slice_1: "c64[3, 4]" = torch.ops.aten.slice.Tensor(to, 0, 0, 9223372036854775807);  to = None
            slice_2: "c64[3, 2]" = torch.ops.aten.slice.Tensor(slice_1, 1, 0, 2);  slice_1 = None
            return (slice_2,)

Graph signature: ExportGraphSignature(input_specs=[InputSpec(kind=<InputKind.USER_INPUT: 1>, arg=TensorArgument(name='x'), target=None, persistent=None)], output_specs=[OutputSpec(kind=<OutputKind.USER_OUTPUT: 1>, arg=TensorArgument(name='slice_2'), target=None)])
Range constraints: {}

```

## Analysis

PyTorch ONNX Conversion Analysis

## Model Information

The model has 0 parameters and 0 buffers (non-trainable parameters).
Number of parameters per dtype:
```python
defaultdict(<class 'int'>, {})
```
Number of buffers per dtype:
```python
defaultdict(<class 'int'>, {})
```

Inputs:
- `x`: `TensorMetadata(shape=torch.Size([3, 4]), dtype=torch.float32, requires_grad=False, stride=(4, 1), memory_format=torch.contiguous_format, is_quantized=False, qparams={})`

Outputs:
- `slice_2`: `TensorMetadata(shape=torch.Size([3, 2]), dtype=torch.complex64, requires_grad=False, stride=(4, 1), memory_format=None, is_quantized=False, qparams={})`

The FX graph has 5 nodes in total. Number of FX nodes per op:
- `placeholder`: 1
- `call_function`: 3
- `output`: 1

Of the call_function nodes, the counts of operators used are:

- `aten.slice.Tensor`: 2
- `aten.to.dtype`: 1

## ONNX Conversion Information

The model contains operators the dispatcher could not find registered ONNX decompositions for. This may be due to missing implementations, decompositions not registered correctly, or a bug in the dispatcher.

Errors grouped by operator:

- `aten.to.dtype`:     No decompositions registered for the real-valued input. Example node: `%to : [num_users=1] = call_function[target=torch.ops.aten.to.dtype](args = (%x, torch.complex64), kwargs = {})`. All nodes: `[to]`
- `aten.slice.Tensor`:     No decompositions registered for the complex-valued input. Example node: `%slice_1 : [num_users=1] = call_function[target=torch.ops.aten.slice.Tensor](args = (%to, 0, 0, 9223372036854775807), kwargs = {})`. All nodes: `[slice_1, slice_2]`

## Decomposition comparison

Ops exist only in the ExportedProgram before decomposition: `['aten.to.dtype']`

Ops exist only in the ExportedProgram after decomposition: `['aten._to_copy.default']`

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148617
Approved by: https://github.com/justinchuby
2025-03-06 07:03:45 +00:00
titaiwangms
f057206fca [ONNX] Support complex comparison when verify=True (#148619)
Previously, the comparison of complex numbers was not supported when `verify=True`.

NOTE: This PR can be extended to support more complex comparison cases if other places in the onnx codebase need to change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148619
Approved by: https://github.com/justinchuby
2025-03-06 04:38:43 +00:00
Justin Chu
e1dee4ccb3 [ONNX] Assert capture strategy in tests (#148348)
Previously the strategy used for obtaining the exported program was not asserted. This led to silent errors if torch.export broke something and a fallback strategy was used. This change adds a _capture_strategy field to ONNXProgram and enables unit tests to assert the strategy used, preventing silent fallbacks.
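
The failure mode can be shown with a small, self-contained sketch. The `capture` function and the two strategies here are hypothetical and much simpler than the exporter's real capture strategies; they only show why recording the strategy name lets a test catch a silent fallback.

```python
def capture(model, strategies):
    """Try each (name, strategy) pair in order and report which one succeeded."""
    failures = []
    for name, strategy in strategies:
        try:
            return name, strategy(model)
        except Exception as exc:  # any failure moves on to the next strategy
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all capture strategies failed: " + "; ".join(failures))


def strict_capture(model):
    """Hypothetical preferred strategy that has regressed."""
    raise ValueError("unsupported construct")


def fallback_capture(model):
    """Hypothetical fallback that still produces a graph."""
    return f"graph({model})"


used, graph = capture(
    "model", [("strict", strict_capture), ("fallback", fallback_capture)]
)

# A test that only inspects `graph` would pass here.
# Asserting on `used` is what exposes that the preferred strategy failed.
assert used == "fallback"
```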

Fixes #147674

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148348
Approved by: https://github.com/titaiwangms, https://github.com/shubhambhokare1
2025-03-05 22:31:54 +00:00
Justin Chu
c6a05df174 [ONNX] Use onnxscript apis for 2.7 (#148453)
Use onnxscript apis for 2.7.

Remove reference to `torchlib_opset()` and `torchlib_opset_version()` which were removed in the onnxscript 2.7 apis. These apis were removed because torchlib in onnxscript will always stay on opset 18. Future opset version bumps will happen in pytorch core after the migration of torchlib.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148453
Approved by: https://github.com/titaiwangms, https://github.com/shubhambhokare1
2025-03-05 20:10:00 +00:00
Justin Chu
50e827b3df [ONNX] Create VerificationInterpreter (#148396)
An fx interpreter for comparing ONNX values with pytorch ones.

```py
import torch
from torch.onnx._internal.exporter._verification import VerificationInterpreter

class Model(torch.nn.Module):
    def forward(self, query, key, value):
        res = torch.nn.functional.scaled_dot_product_attention(
            query, key, value
        )
        rest = res.transpose(0, 1)
        return rest.view(8, 32, 128 * 64)

model = Model()

query = torch.rand(32, 8, 128, 64, dtype=torch.float16)
key = torch.rand(32, 8, 128, 64, dtype=torch.float16)
value = torch.rand(32, 8, 128, 64, dtype=torch.float16)

onnx_program = torch.onnx.export(model, (query, key, value), dynamo=True)
interpreter = VerificationInterpreter(onnx_program)
interpreter.run(query, key, value)
for info in interpreter.verification_infos:
    print(info)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148396
Approved by: https://github.com/titaiwangms
2025-03-05 19:18:52 +00:00
Xuehai Pan
c73a92fbf5 [BE][CI] bump ruff to 0.9.2: multiline assert statements (#144546)
Reference: https://docs.astral.sh/ruff/formatter/black/#assert-statements

> Unlike Black, Ruff prefers breaking the message over breaking the assertion, similar to how both Ruff and Black prefer breaking the assignment value over breaking the assignment target:
>
> ```python
> # Input
> assert (
>     len(policy_types) >= priority + num_duplicates
> ), f"This tests needs at least {priority+num_duplicates} many types."
>
>
> # Black
> assert (
>     len(policy_types) >= priority + num_duplicates
> ), f"This tests needs at least {priority+num_duplicates} many types."
>
> # Ruff
> assert len(policy_types) >= priority + num_duplicates, (
>     f"This tests needs at least {priority + num_duplicates} many types."
> )
> ```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144546
Approved by: https://github.com/malfet
2025-02-27 20:46:16 +00:00
Ti-Tai Wang
8ee84aa703 [ONNX] Fix missed None type support in dyamic shapes string cases (#148025)
In `_any_str_or_dim_in_dynamic_shapes`, we strictly guard `dynamic_shapes` to make sure the flattened shapes are valid, but the code failed to consider that None could appear in the shapes.
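
A simplified sketch of such a guard follows. `any_str_in_dynamic_shapes` and `_leaves` are hypothetical helpers, not the real `_any_str_or_dim_in_dynamic_shapes`; `Dim` objects are left out so the sketch needs only the standard library. It accepts `None` as a valid leaf instead of rejecting it.

```python
def _leaves(spec):
    """Yield the leaf entries of a nested dict/list/tuple shape spec."""
    if isinstance(spec, dict):
        for value in spec.values():
            yield from _leaves(value)
    elif isinstance(spec, (list, tuple)):
        for value in spec:
            yield from _leaves(value)
    else:
        yield spec


def any_str_in_dynamic_shapes(dynamic_shapes) -> bool:
    """Return True if any axis is named by a string; reject unknown leaf types."""
    found = False
    for leaf in _leaves(dynamic_shapes):
        if leaf is None or isinstance(leaf, int):
            # None means "no dynamic axis here" and must be accepted.
            continue
        if isinstance(leaf, str):
            found = True
            continue
        raise TypeError(f"unsupported entry in dynamic_shapes: {leaf!r}")
    return found


shapes = {"x": {0: "batch", 1: None}, "mask": None}
assert any_str_in_dynamic_shapes(shapes) is True
assert any_str_in_dynamic_shapes({"x": None}) is False
```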

NOTE: Found in benchmarking with Olive.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148025
Approved by: https://github.com/justinchuby

Co-authored-by: Justin Chu <justinchuby@users.noreply.github.com>
2025-02-27 07:57:47 +00:00
Xuehai Pan
754fb834db [BE][CI] bump ruff to 0.9.0: string quote styles (#144569)
Reference: https://docs.astral.sh/ruff/formatter/#f-string-formatting

- Change the outer quotes to double quotes for nested f-strings

```diff
- f'{", ".join(args)}'
+ f"{', '.join(args)}"
```

- Change the inner quotes to double quotes for triple f-strings

```diff
  string = """
-     {', '.join(args)}
+     {", ".join(args)}
  """
```

- Join implicitly concatenated strings

```diff
- string = "short string " "short string " f"{var}"
+ string = f"short string short string {var}"
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144569
Approved by: https://github.com/Skylion007
ghstack dependencies: #146509
2025-02-24 19:56:09 +00:00
Xuehai Pan
52f6d4aa30 [BE][CI][Easy] bump ruff to 0.9.0: long statements in docstrings (#146509)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146509
Approved by: https://github.com/justinchuby, https://github.com/Skylion007
2025-02-24 19:56:08 +00:00