Commit Graph

732 Commits

angelayi
fc340d0ca3 [export] Allow comparing device w/o index with device w/ index (#159665)
In the case where we have the expected device "cuda" and the given device "cuda:0", I think we should succeed.
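
A minimal sketch of the relaxed comparison described above (the helper name is hypothetical; only the torch.device semantics are assumed):

```
import torch

def devices_match(expected: torch.device, given: torch.device) -> bool:
    # Hypothetical helper illustrating the relaxed check: when either device
    # has no index (e.g. plain "cuda"), compare only the device type, so
    # "cuda" matches "cuda:0".
    if expected.index is None or given.index is None:
        return expected.type == given.type
    return expected == given

assert devices_match(torch.device("cuda"), torch.device("cuda:0"))
```
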
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159665
Approved by: https://github.com/yushangdi
2025-08-04 17:00:07 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
2ac45c2752 Fix autocast context manager when there is exception (#159565)
Summary: When an exception occurs inside a context manager, we need to either return False OR properly propagate the exception via __exit__(exc_type, exc_val, exc_tb). But previously, while tracing, we didn't actually run the exit node, so we ended up swallowing the exception in a very weird way, as outlined in https://github.com/pytorch/pytorch/issues/153202. This PR fixes it.
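
A minimal repro-style sketch of the expected behavior (not the PR's actual test case): an exception raised inside torch.autocast should reach the caller instead of being swallowed.

```
import torch

def f(x):
    with torch.autocast("cpu"):
        # The exception raised here reaches autocast's __exit__, which must
        # propagate it (by returning False / re-raising) rather than swallow
        # it while tracing.
        raise RuntimeError("boom")

try:
    f(torch.randn(2))
except RuntimeError as e:
    print("propagated:", e)
```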

Test Plan:
new test case

Rollback Plan:

Differential Revision: D79348382

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159565
Approved by: https://github.com/zou3519, https://github.com/yushangdi
2025-08-01 02:12:24 +00:00
PyTorch MergeBot
b1fb552974 Revert "Fix ep deepcopy when there is python builtin name (#159478)"
This reverts commit de7376537f.

Reverted https://github.com/pytorch/pytorch/pull/159478 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/159478#issuecomment-3141228423))
2025-07-31 20:20:53 +00:00
Pian Pawakapan
8fedcfa59a [export] _ccode for PythonMod (#158851)
Summary: Adds ccode impl to PythonMod

Test Plan:
test_export

Rollback Plan:

Differential Revision: D76463347

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158851
Approved by: https://github.com/kalpit-meta-1
2025-07-31 16:46:51 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
de7376537f Fix ep deepcopy when there is python builtin name (#159478)
Summary: title

Test Plan:
CI

Rollback Plan:

Differential Revision: D79261007

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159478
Approved by: https://github.com/pianpwk
2025-07-30 23:14:31 +00:00
Avik Chaudhuri
ea5369113a unflatten closure (#159418)
Summary: Sometimes the call history recorded in a `nn_module_stack` does not have the stack property, where each FQN is a prefix of the next FQN. This can cause errors during `unflatten`. Instead of erroring we now drop entries from such a `nn_module_stack` to restore the stack property. This effectively leads to less unflattening: the last FQN in the call history before the stack property was broken keeps the entire flat subgraph of its call.
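
A rough sketch of the idea, assuming a hypothetical helper (the real unflatten logic lives elsewhere and may differ):

```
def restore_stack_property(fqns):
    # Hypothetical helper (not the actual unflatten code): keep only a call
    # history in which each FQN is a prefix of the next, dropping entries
    # that break the stack property.
    kept = []
    for fqn in fqns:
        if not kept or fqn.startswith(kept[-1]):
            kept.append(fqn)
    return kept

# "outer.a.b" breaks the prefix chain after "outer.x", so it is dropped.
print(restore_stack_property(["outer", "outer.x", "outer.a.b", "outer.x.y"]))
```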

Test Plan:
added test, updated another

Rollback Plan:

Differential Revision: D79204669

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159418
Approved by: https://github.com/angelayi
2025-07-30 15:42:18 +00:00
Colin Peppler
46d34d6766 (should_fold) gso to guard_or_false when checking whether to fold 3d bmm into 2d mm (#159184)
Switch from guard_size_oblivious to guard_or_false; if a DDE is encountered, this avoids folding the 3d bmm into a 2d mm.

806d9e3fe7/torch/_decomp/decompositions.py (L4506-L4512)

## DDE
```
  File "/data/users/colinpeppler/pytorch/torch/_decomp/decompositions.py", line 4506, in matmul
    elif should_fold(tensor1, tensor2, is_out):
  File "/data/users/colinpeppler/pytorch/torch/_decomp/decompositions.py", line 4472, in should_fold
    if guard_size_oblivious(t1.numel() == 0):
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(12*((u0//2)), 0) (unhinted: Eq(12*((u0//2)), 0)).  (Size-like symbols: none)

Caused by: (_decomp/decompositions.py:4472 in should_fold)
```

```
  File "/data/users/colinpeppler/pytorch/torch/_decomp/decompositions.py", line 4506, in matmul
    elif should_fold(tensor1, tensor2, is_out):
  File "/data/users/colinpeppler/pytorch/torch/_decomp/decompositions.py", line 4483, in should_fold
    return all(
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(3*((u0//2)), 3) (unhinted: Eq(3*((u0//2)), 3)).  (Size-like symbols: none)

Caused by: (_decomp/decompositions.py:4483 in should_fold)
```
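
A minimal sketch of the switch described above (the function name is hypothetical; guard_or_false is assumed to be importable from torch.fx.experimental.symbolic_shapes):

```
from torch.fx.experimental.symbolic_shapes import guard_or_false

def can_fold_empty_check(t1):
    # Sketch of the change: where guard_size_oblivious raised
    # GuardOnDataDependentSymNode for an unbacked numel, guard_or_false just
    # answers False, so matmul declines to fold the 3d bmm into a 2d mm and
    # tracing continues.
    return guard_or_false(t1.numel() == 0)
```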

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159184
Approved by: https://github.com/ezyang
ghstack dependencies: #158894
2025-07-30 03:12:14 +00:00
Zhengxu Chen
8460131087 [nativert] Add OSS version of ModelRunner (#159268)
Summary: Implement a ModelRunner from scratch with the minimum features for OSS only

Test Plan:
test_export -r NativeRT

Rollback Plan:

Differential Revision: D78979812

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159268
Approved by: https://github.com/dolpm
2025-07-29 21:08:14 +00:00
FFFrog
6fc0ad22f0 Using the latest torch.library.register_fake API instead of torch.library.impl_abstract (#158839)
As the title stated.

`torch.library.impl_abstract` has been deprecated since PyTorch 2.4, so switch to the new API.
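
A hedged before/after sketch; "mylib::my_op" is a made-up op used only to keep the example self-contained:

```
import torch

# A toy custom op so the example is self-contained; "mylib::my_op" is made up.
torch.library.define("mylib::my_op", "(Tensor x) -> Tensor")

# Old, deprecated since PyTorch 2.4:
#   @torch.library.impl_abstract("mylib::my_op")
# New API:
@torch.library.register_fake("mylib::my_op")
def _(x):
    return torch.empty_like(x)
```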

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158839
Approved by: https://github.com/jingsh, https://github.com/zou3519
ghstack dependencies: #158838
2025-07-25 02:37:30 +00:00
Chuan Jiang
64cb349b81 Extract a method that filters frames in the captured stack trace (#158266)
Summary: The subclass can override the filtering logic to customize which frames to keep or drop.
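
A rough sketch of the pattern (the hook name _filter_frames is hypothetical; the PR's actual method name may differ):

```
import traceback

class BaseTracer:
    def _filter_frames(self, frames):
        # Hypothetical hook: the PR factors the filtering into a method like
        # this so subclasses can decide which frames end up in node.stack_trace.
        return [f for f in frames if "torch/fx" not in f.filename]

class MyTracer(BaseTracer):
    def _filter_frames(self, frames):
        # Keep only frames from user code under "my_project/".
        return [f for f in frames if "my_project/" in f.filename]

frames = traceback.extract_stack()
print(len(MyTracer()._filter_frames(frames)))
```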

Test Plan:
```
buck run caffe2/test:test_export -- -r  test_stack_trace
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:others -- -r test_constant_random
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:test_export  -- -r test_custom_obj_list_out
buck2 run 'fbcode//mode/dev-nosan' fbcode//caffe2/test:fx  -- -r class_member_back_compat
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158266
Approved by: https://github.com/ezyang, https://github.com/yushangdi
2025-07-25 02:22:03 +00:00
Laith Sakka
0b2ef76e85 DDE-Free select with unbacked index. (#157605)
When select has a data-dependent input, we can't tell whether the actual index should be index + size or index.
To avoid throwing a DDE, we allocate a new unbacked symbol to represent the storage offset of the
output view and compute its value dynamically at runtime during Inductor lowering.
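
A sketch of the kind of program this targets (whether it exports cleanly also depends on surrounding bound checks):

```
import torch

class M(torch.nn.Module):
    def forward(self, x, idx):
        i = idx.item()         # unbacked SymInt under export
        # Whether the storage offset is i or i + size(0) depends on the sign
        # of i, which is unknown at trace time.
        return x.select(0, i)

ep = torch.export.export(M(), (torch.randn(4, 3), torch.tensor(-1)), strict=False)
```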

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
2025-07-24 20:08:05 +00:00
Pian Pawakapan
48fe4ff247 [export] set enable_gqa in export flash->math decomp (#158604)
Differential Revision: D78524147

For `scaled_dot_product_attention(..., enable_gqa=True)`:
- the Math backend passes the flag through, performing the extra [KV broadcast](6e07d6a0ff/aten/src/ATen/native/transformers/attention.cpp (L902)) if set to True
- the Flash backend has no flag, and relies on correct indexing in the C++ kernel
- Export used to default to Math for `enable_gqa=True`, but https://github.com/pytorch/pytorch/pull/157893 landed and enabled Flash. At the same time, there's an export-only [decomp](6e07d6a0ff/torch/_decomp/decompositions.py (L4968)) redirecting flash -> math, calling with `enable_gqa` unset, because that info isn't available. This led to https://fb.workplace.com/groups/1028545332188949/posts/1264609398582540 crashing, calling the Math non-GQA variant, with GQA inputs.

This assumes GQA for seqlen mismatches in the export decomp, setting `enable_gqa = <q seqlen> != <kv seqlen>`, relying on prior backend checks to raise on invalid input shapes.
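
A minimal sketch of the inference rule quoted above (names are illustrative, not the actual decomp code):

```
def infer_enable_gqa(q_seqlen, kv_seqlen):
    # Sketch of the rule described above: assume GQA whenever the lengths
    # differ, and rely on earlier backend checks to reject invalid shapes.
    return q_seqlen != kv_seqlen
```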

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158604
Approved by: https://github.com/angelayi, https://github.com/drisspg
2025-07-24 14:46:13 +00:00
Pian Pawakapan
dec0d3101c [export] fix unbacked range deserialization (#158681)
Fixes https://github.com/pytorch/pytorch/issues/151809 by reading shape assertion nodes into the ShapeEnv and deferring instantiation of node example values to be done node-by-node.

Differential Revision: D78588406

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158681
Approved by: https://github.com/ydwu4, https://github.com/avikchaudhuri
2025-07-23 02:13:11 +00:00
Pian Pawakapan
39b54b78d7 [export] runtime asserts for while HOP subgraphs (#158467)
Differential Revision: D78431075

For #158366
- Calls runtime asserts pass for HOP subgraphs (in reenter_make_fx)
- For while_loop only (can be expanded), clones input tensors for subgraph tracing, so unbacked memos (item, nonzero, etc.) aren't reused

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158467
Approved by: https://github.com/ydwu4
2025-07-23 00:34:18 +00:00
Yidi Wu
fda3f3b2ec [while_loop] fix constant tensor used as carried inputs (#158381)
Addresses the second part of #158366, where torch.tensor(0) is treated as a constant tensor and its .item() gets specialized to 0, which causes a silent specialization. The fix is to unspecialize the constant carries and make them non-constant.
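
A rough sketch of a while_loop with a torch.tensor(0) carry (the import path is an assumption; the real test lives in the PR):

```
import torch
from torch._higher_order_ops import while_loop  # assumed import path

class Counter(torch.nn.Module):
    def forward(self, x):
        def cond_fn(i, acc):
            return i < x.shape[0]

        def body_fn(i, acc):
            return i + 1, acc + x.sum()

        # torch.tensor(0) is a constant carried input; before the fix its
        # .item() could silently specialize to 0 during tracing.
        return while_loop(cond_fn, body_fn, (torch.tensor(0), torch.zeros(())))

print(Counter()(torch.randn(3)))
```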

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158381
Approved by: https://github.com/zou3519
2025-07-18 07:08:11 +00:00
PyTorch MergeBot
23550ab735 Revert "DDE-Free select with unbacked index. (#157605)"
This reverts commit 79d7c754ab.

Reverted https://github.com/pytorch/pytorch/pull/157605 on behalf of https://github.com/laithsakka due to fail pr time benchmarks  ([comment](https://github.com/pytorch/pytorch/pull/157605#issuecomment-3084663020))
2025-07-17 16:20:02 +00:00
Xuehai Pan
c8d43cbc6e [BE][3/6] fix typos in test/ (#157637)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157637
Approved by: https://github.com/yewentao256, https://github.com/albanD
ghstack dependencies: #156605
2025-07-17 12:08:33 +00:00
Laith Sakka
79d7c754ab DDE-Free select with unbacked index. (#157605)
When select has a data-dependent input, we can't tell whether the actual index should be index + size or index.
To avoid throwing a DDE, we allocate a new unbacked symbol to represent the storage offset of the
output view and compute its value dynamically at runtime during Inductor lowering.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
2025-07-17 05:08:11 +00:00
Yidi Wu
651b4a68f2 [hop][dynamo] track run-ahead sym variables in side effects (#158273)
Before the PR, for code like this:
```
        class Example2(torch.nn.Module):
            def forward(self, x, trigger, target):
                return torch.cond(
                    trigger == 1,
                    lambda: x + target,
                    lambda: x * target,
                    (),
                )

        m = Example2()
        x = torch.randn(2)
        trigger = 0
        target = 2
        args = (x, trigger, target)
        ep = torch.export.export(
            m, args, dynamic_shapes=(None, Dim.DYNAMIC, Dim.DYNAMIC)
        )
```
Dynamo will wrap "target" (i.e. a SymInt) twice: once when we speculate the first lambda, find that target is a symint, and decide to wrap it, creating a new SymNodeVariable and a placeholder input to the top-level graph.

The second time happens when we speculate the second lambda. Tensors are de-duplicated by checking tracked side effects to make sure an object with the same id (though different sources) is mapped to the same TensorVariable. For symints, two things are missing:
1. they're not in the _can_lift_attrs_to_input list (the change in builder.py)
2. they're not tracked by runahead_side_effects, so when speculate_subgraph finishes they're discarded (the change in side_effects.py)

Note: the auto-lifting mechanism for HOPs happens at the proxy level when we trace the subgraph, which is after the SymNodeVariables are created (they're created when realizing the args and binding them to the subgraph). At that point, the builder has created two unique SymNodeVariables for the same symint, so the auto-lifting in HOPs cannot de-dup them.

Differential Revision: [D78298163](https://our.internmc.facebook.com/intern/diff/D78298163)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158273
Approved by: https://github.com/avikchaudhuri, https://github.com/zou3519
2025-07-15 23:48:20 +00:00
Xuehai Pan
7f14b42adf [BE][2/16] fix typos in torch/ (torch/_*/) (#156312)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312
Approved by: https://github.com/albanD
2025-07-12 05:47:06 +00:00
PyTorch MergeBot
e15f4248ad Revert "[BE][2/16] fix typos in torch/ (torch/_*/) (#156312)"
This reverts commit 7a92b51196.

Reverted https://github.com/pytorch/pytorch/pull/156312 on behalf of https://github.com/XuehaiPan due to landrace ([comment](https://github.com/pytorch/pytorch/pull/156312#issuecomment-3064672250))
2025-07-12 04:40:52 +00:00
Xuehai Pan
7a92b51196 [BE][2/16] fix typos in torch/ (torch/_*/) (#156312)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156312
Approved by: https://github.com/albanD
2025-07-12 01:47:22 +00:00
Pian Pawakapan
4cc13c4af6 [dynamic shapes] avoid unnecessary slices (#157528)
Fixes #157289, by extending optimization to slices where the end index exceeds the size.
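
A minimal sketch of the pattern this extends to (concrete sizes here; the interesting case is an unbacked end index):

```
import torch

x = torch.randn(5, 4)
n = 100  # end index larger than the dimension size
# With the optimization, a slice whose end exceeds the size is treated as a
# no-op view instead of emitting extra guards/slice ops on (possibly unbacked)
# end indices.
y = x[:, :n]
assert y.shape == x.shape
```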

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157528
Approved by: https://github.com/angelayi
2025-07-10 06:34:46 +00:00
Avik Chaudhuri
9222552572 [non-strict export] uncovered cases of select and slice (#157821)
Summary:
`None` and `Ellipsis` in multi-dimensional indexing was previously not covered.

Moreover, we introduce a small optimization for `slice(None)` and a passthrough when symints do not appear in the indexing.

The remaining case is where indexing is by tensor, which is fairly complicated; we pass through in that case.
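
A hedged sketch of the kind of indexing expression now covered in non-strict export (whether a given program exports without extra torch._check calls depends on the shapes involved):

```
import torch
from torch.export import export

class M(torch.nn.Module):
    def forward(self, x, n):
        u = n.item()
        # Multi-dimensional indexing that mixes None, Ellipsis and a
        # symint-bounded slice -- the kind of expression covered here.
        return x[None, ..., :u]

ep = export(M(), (torch.randn(2, 3, 8), torch.tensor(4)), strict=False)
```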

Test Plan:
added tests

Rollback Plan:

Differential Revision: D77943929

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157821
Approved by: https://github.com/pianpwk
2025-07-10 05:48:12 +00:00
Shangdi Yu
0a624c2dc5 Fix from_node's graph_id in unlift() (#157943)
Summary: We should use the node before deepcopy in NodeSource

Test Plan:
```
buck run fbcode//caffe2/test:test_export -- -r test_from_node_metadata_export
```

Rollback Plan:

Differential Revision: D78022070

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157943
Approved by: https://github.com/angelayi, https://github.com/Gasoonjia
2025-07-10 03:23:55 +00:00
angelayi
391473cca0 [export] Fix lift constants bug (#157719)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157719
Approved by: https://github.com/yushangdi
2025-07-08 20:33:53 +00:00
Laith Sakka
7cfd054075 [attempt 2] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous (#157472)
Summary:
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.

sym_is_contiguous returns a SymBool that is then either evaluated, or guard_or_false can be called
on it to avoid data-dependent errors.

ex:
 bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.

In this PR I only handle default contiguity; changes for other formats like channels_last will follow.
We use this pattern in several locations in this PR to avoid DDEs.

Test Plan:
contbuild & OSS CI,

Rollback Plan:

Reviewed By: malfet

Differential Revision: D77639021

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157472
Approved by: https://github.com/aorenste
2025-07-02 23:12:29 +00:00
PyTorch MergeBot
c6a27bae36 Revert "[do not revert] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous (#155590)"
This reverts commit d0a9629435.

Reverted https://github.com/pytorch/pytorch/pull/155590 on behalf of https://github.com/laithsakka due to was asked to land this internally ([comment](https://github.com/pytorch/pytorch/pull/155590#issuecomment-3025796794))
2025-07-01 22:58:14 +00:00
Laith Sakka
d0a9629435 [do not revert] Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous (#155590)
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.

sym_is_contiguous returns a SymBool that is then either evaluated, or guard_or_false can be called
on it to avoid data-dependent errors.

ex:
 bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.

In this PR I only handle default contiguity; changes for other formats like channels_last will follow.
We use this pattern in several locations in this PR to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
2025-07-01 21:39:38 +00:00
angelayi
ffac0de07e [export] Remove stack trace from input/output (#157302)
Fixes https://github.com/pytorch/pytorch/issues/157183

https://github.com/pytorch/pytorch/pull/156257 consolidated the path for saving stack traces, but missed the part where stacktraces are not added to placeholder/output nodes in proxy_tensor tracing [(code)](https://github.com/pytorch/pytorch/pull/156257/files#diff-6960ce90e7162c0953b1ca07e92e7f0f2f6ba63b427b42df593e20cc6a096bb7L1107).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157302
Approved by: https://github.com/yushangdi
2025-07-01 19:16:28 +00:00
PyTorch MergeBot
1586521461 Revert "Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous (#155590)"
This reverts commit 2c76f31221.

Reverted https://github.com/pytorch/pytorch/pull/155590 on behalf of https://github.com/jeanschmidt due to Breaking 1000s of internal builds, it can't be properly landed internally, there are no options except revert and codev. ([comment](https://github.com/pytorch/pytorch/pull/155590#issuecomment-3023503929))
2025-07-01 11:23:00 +00:00
Laith Sakka
2c76f31221 Compute contiguity symbolically to avoid dde, and introduce c++ sym_is_contiguous (#155590)
When we compute contiguity for a tensor with dynamic shapes, we:
1) Try to compute it without guarding.
2) If all shapes are hinted, compute it, potentially adding guards.
3) If any input is not hinted, compute it symbolically.

sym_is_contiguous returns a SymBool that is then either evaluated, or guard_or_false can be called
on it to avoid data-dependent errors.

ex:
 bool is_contiguous = input.sym_is_contiguous().guard_or_false(__FILE__, __LINE__);
is_contiguous_or_false is a helper function that does that.

In this PR I only handle default contiguity; changes for other formats like channels_last will follow.
We use this pattern in several locations in this PR to avoid DDEs.
Differential Revision: [D77183032](https://our.internmc.facebook.com/intern/diff/D77183032)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155590
Approved by: https://github.com/ezyang
2025-06-27 04:59:52 +00:00
Shangdi Yu
204db27a0c Consolidate stack trace in Tracer (#156257)
Summary:
- Consolidate the stack trace recording code in TracerBase and PythonKeyTracer
- Change `make_fx`'s arg name to be consistent with TracerBase member name `record_stack_traces`

We move the stack trace logic from `create_proxy` to `create_node` so all inherited classes of TracerBase and re-use the same stack trace logic.

Test Plan:
```
buck run caffe2/test:test_export -- -r  test_stack_trace
```

Rollback Plan:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156257
Approved by: https://github.com/angelayi, https://github.com/zou3519
2025-06-25 23:07:10 +00:00
Xuehai Pan
6d5c789ad5 [BE][PYFMT] migrate PYFMT for test/[a-h]*/ to ruff format (#144555)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555
Approved by: https://github.com/ezyang
ghstack dependencies: #144551, #144554
2025-06-24 04:53:54 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
d290fe7690 Remove legacy export testing path (#156093)
Summary: After this diff stack lands, we are pretty much done with the training IR migration, so there is no need to run the extensive legacy export tests.

Test Plan:
CI

Rollback Plan:

Differential Revision: D76734378

Pull Request resolved: https://github.com/pytorch/pytorch/pull/156093
Approved by: https://github.com/desertfire
2025-06-18 15:36:44 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
370fc49dde Handle aten.to at submodule boundaries (#153972)
Summary: #buildall

Test Plan: CI

Differential Revision: D74582970

When we decompose to inference IR, aten.to can sometimes disappear. As a result, the export module call graph tree will start containing dead nodes because the previous provenance tracking is insufficient. This PR fixes that. The caveat is that this won't work in general for tensor subclass inputs to a submodule whose signature the user wants to preserve, because we always desugar the tensor subclass into its constituent tensors in inference IR, making it impossible to preserve the original calling convention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153972
Approved by: https://github.com/avikchaudhuri
2025-06-14 16:13:29 +00:00
PyTorch MergeBot
06408dae49 Revert "Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757)"
This reverts commit 0029259bdf.

Reverted https://github.com/pytorch/pytorch/pull/154757 on behalf of https://github.com/laithsakka due to post land issue ([comment](https://github.com/pytorch/pytorch/pull/154757#issuecomment-2971385787))
2025-06-13 19:11:43 +00:00
Avik Chaudhuri
463fe36532 fix error message on specialization with Dim.DYNAMIC (#155738)
Previously specialization error messages would render sources that were pretty far from source-code names. E.g., given args named `x, y, zs`, the source for `y.size()[0]` would be rendered as `args[0][1].size()[0]`.

This is because we created artificial local names following `(args, kwargs)` structure instead of reusing signatures. This PR fixes that situation.

Basically we map prefixes of key paths that correspond to original arg names to root sources corresponding to those names; the rest of the key paths hang from these root sources.

Differential Revision: [D76461391](https://our.internmc.facebook.com/intern/diff/D76461391/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155738
Approved by: https://github.com/bobrenjc93
2025-06-13 10:33:46 +00:00
Pian Pawakapan
75824035d3 [dynamic shapes] skip fused linear path if not definitely contiguous (#155051)
Falls back to non-fused linear -> add bias path for non-contiguous tensors with unbacked sizes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155051
Approved by: https://github.com/laithsakka
2025-06-12 15:55:21 +00:00
Laith Sakka
0029259bdf Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757)
address https://github.com/pytorch/pytorch/issues/153303

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154757
Approved by: https://github.com/bobrenjc93, https://github.com/leslie-fang-intel
2025-06-12 09:58:15 +00:00
Shangdi Yu
bc3972b80a [reland] Add stack_trace on make_fx (#155486)
Summary:
Previously, we only added stack traces in class _ModuleStackTracer(PythonKeyTracer) for non-strict export. I moved this stack trace logic to the parent class PythonKeyTracer, so that the graph traced from a Module using make_fx will have stack_trace as well.

Motivation: we've observed some use cases where users first use make_fx on the Module, and then run export on the resulting graph. If the result of make_fx doesn't have stack traces, the stack trace information is lost.

**The user needs to turn this on by passing `stack_trace=True` to make_fx. We don't make this the default option since it might increase Inductor compilation time (`make_fx` is used in Inductor to trace graph patterns for pattern matching). It's also turned on if `_inductor.config.trace.enabled` is True.**

**Preserving stack traces is on by default for ModuleStackTracer, which is used for non-strict export.**
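
A minimal usage sketch (this commit message calls the flag `stack_trace`; a later commit in this log aligns the make_fx argument name with `record_stack_traces`, which is what the sketch uses):

```
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    return x.sin() + x.cos()

# Argument name per the later consolidation commit in this log; the reland
# commit above refers to it as `stack_trace`.
gm = make_fx(f, record_stack_traces=True)(torch.randn(3))
for node in gm.graph.nodes:
    print(node.op, node.meta.get("stack_trace", "")[:60])
```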

Test Plan:
```
buck run test:test_export -- -r  test_stack_trace
buck run fbcode//caffe2/test/dynamo:test_dynamo -- -k test_autocast_ordering
```

Rollback Plan:

Differential Revision: D76298692

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155486
Approved by: https://github.com/angelayi, https://github.com/zou3519
2025-06-11 21:27:43 +00:00
Pian Pawakapan
247f83e0a4 [dynamic shapes] guard individual terms in sym_and; user-code-friendly sym_and/sym_or (#154737)
Previously when processing `sym_and(a, b, c)`, symbolic shapes wouldn't individually process a, b, and c and store their implications. This would lead to data-dependent errors on individual checks; e.g., we stored `u0 >= 0 & u0 <= 10`, but then couldn't figure out `u0 <= 10`.

This handles that, and also makes `sym_and`/`sym_or` user-code friendly, for testing.
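
A hedged sketch of the user-code-friendly usage (the import path for sym_and is an assumption):

```
import torch
from torch.fx.experimental.symbolic_shapes import sym_and  # assumed import path

class M(torch.nn.Module):
    def forward(self, x, n):
        u0 = n.item()
        # Each term of the conjunction is recorded individually, so a later
        # check on just `u0 <= 10` no longer hits a data-dependent error.
        torch._check(sym_and(u0 >= 0, u0 <= 10))
        return x.narrow(0, 0, u0)

ep = torch.export.export(M(), (torch.randn(12), torch.tensor(4)), strict=False)
```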

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154737
Approved by: https://github.com/laithsakka
2025-06-11 18:08:06 +00:00
Colin Peppler
7b7cd56f5e [export] support linear & layer_norm unbacked (#155260)
## What
- use `definitely_contiguous_for_memory_format` instead of `is_contiguous` when the non-contiguous case is fine if we encounter a DDE.
- use refs' contiguous over ATen's contiguous, because ATen's version will DDE and stop tracing; the refs version will use `definitely_contiguous_for_memory_format` and clone if there's a DDE.

## Example DDEs

- Fixed with `definitely_contiguous_for_memory_format` in `fast_binary_impl`
```
torch._dynamo.exc.UserError: Could not guard on data-dependent expression Eq((u0//387), 0) (unhinted: Eq((u0//387), 0)).  (Size-like symbols: u0)

Caused by: layer_norm = self.layer_norm(linear)  # caffe2/test/export/test_export.py:4566 in forward (_subclasses/fake_impls.py:1022 in fast_binary_impl)
```

- Fixed with `refs.contiguous` instead of calling ATen's contiguous (that would require a bigger rewrite in ATen)
```
  File "c10/core/TensorImpl.h", line 825, in torch::autograd::THPVariable_contiguous(_object*, _object*, _object*)
  File "c10/core/SymbolicShapeMeta.h", line 87, in c10::TensorImpl::is_contiguous_default(c10::MemoryFormat) const
  File "c10/core/SymbolicShapeMeta.cpp", line 250, in c10::SymbolicShapeMeta::init_is_contiguous() const

torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression Eq(128*((u0//387)), 0) (unhinted: Eq(128*((u0//387)), 0)).  (Size-like symbols: u0)

Caused by: (_refs/__init__.py:3302 in native_layer_norm)
```

- Fixed with `definitely_contiguous_for_memory_format` in ref's contiguous
```
torch.fx.experimental.symbolic_shapes.GuardOnDataDependentSymNode: Could not guard on data-dependent expression 387*((u0//387)) < 2 (unhinted: 387*((u0//387)) < 2).  (Size-like symbols: u0)

Caused by: (_prims_common/__init__.py:279 in is_contiguous)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155260
Approved by: https://github.com/laithsakka
ghstack dependencies: #155499
2025-06-11 16:47:34 +00:00
Yidi Wu
545fbd58dc [export] inline jit.scripted function in export (#155180)
When we export a scripted function, we inline the original callable stored in "_torchdynamo_inline"; this is the same strategy as the torch.compile path.

We do the same thing for scripted methods, where a `__wrapped__` attribute points to the original callable in most cases. There are some corner cases we identified: a top-level jit.scripted module's methods don't have a `__wrapped__`. In this case, we fall back to the original scripted approach. There may be more such cases, but that needs verification.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155180
Approved by: https://github.com/zou3519
2025-06-10 20:34:12 +00:00
PyTorch MergeBot
620415e018 Revert "Add stack_trace on make_fx (#155155)"
This reverts commit d4d0ede6ba.

Reverted https://github.com/pytorch/pytorch/pull/155155 on behalf of https://github.com/malfet due to Not sure why it was merged, it indeed breaks those tests in CI ([comment](https://github.com/pytorch/pytorch/pull/155155#issuecomment-2956973633))
2025-06-09 20:40:13 +00:00
Shangdi Yu
d4d0ede6ba Add stack_trace on make_fx (#155155)
Summary:
Previously, we only added stack traces in `class _ModuleStackTracer(PythonKeyTracer)` for non-strict export. I moved this stack trace logic to the parent class `PythonKeyTracer`, so that the graph traced from a Module using make_fx will have stack_trace as well.

Motivation: we've observed some use cases where users first use `make_fx` on the Module, and then run `export` on the resulting graph. If the result of `make_fx` doesn't have stack traces, the stack trace information is lost.

Test Plan:
```
buck run test:test_export -- -r  test_stack_trace
```

Rollback Plan:

Differential Revision: D75985427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155155
Approved by: https://github.com/angelayi, https://github.com/zou3519
2025-06-09 18:31:57 +00:00
Shangdi Yu
606d73bde4 Adding from_node for nodes in gm.module() (#155053)
Summary:
Adding "from_node" information that indicates which nodes are unlifted in `.module()` call.
The lifted nodes will have "ExportedProgram.module().unlift()" passname in the last entry of from_node.

Test Plan:
```
buck run fbcode//caffe2/test:test_export -- -r test_from_node_metadata_export
```

Rollback Plan:

Reviewed By: angelayi

Differential Revision: D75837494

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155053
Approved by: https://github.com/angelayi
2025-06-05 20:11:56 +00:00
angelayi
c8566a0b98 [export] Use patching in test (#155132)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155132
Approved by: https://github.com/pianpwk
2025-06-04 21:41:26 +00:00
angelayi
77d85a4629 Symintify baddbmm (#154656)
Previously we would specialize on the shape in this if-statement
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154656
Approved by: https://github.com/pianpwk
2025-06-02 15:23:14 +00:00
angelayi
e22be781b7 Symintify repeat_interleave (#154660)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154660
Approved by: https://github.com/pianpwk
2025-06-02 15:19:39 +00:00