Commit Graph

2431 Commits

Anthony Shoumikhin
9e50c21e27 Fix xrefs (#151888)
Fix existing cross-references and remove old ones

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151888
Approved by: https://github.com/eqy, https://github.com/huydhn, https://github.com/svekars
2025-04-25 21:27:27 +00:00
Yuanhao Ji
d7eb3a492c [Typing] Enable torch.types.IntLikeType / FloatLikeType / BoolLikeType (#152157)
### Changes

Replace `Union[SymInt, int]` and `Union[int, SymInt]` with `IntLikeType`.

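A minimal sketch of the annotation change described above, assuming the alias is importable from `torch.types` as the title suggests; the function itself is illustrative only:

```python
from torch.types import IntLikeType  # assumed: Union[int, SymInt] per this PR


def pad_to_multiple(size: IntLikeType, multiple: int) -> IntLikeType:
    # Previously this would have been annotated as Union[SymInt, int];
    # the body works for both plain ints and SymInts.
    return size + (-size % multiple)
```
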
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152157
Approved by: https://github.com/Skylion007
2025-04-25 19:00:10 +00:00
Jim Wan
7f28c03fac Adding fbgemm to whitelist (#152079)
Adding `torch.ops.fbgemm` to GraphPickler's allowlist. Otherwise, an fx graph module containing an `fbgemm` node fails with an "Unable to pickle non-standard op" error.

The validation is done on the model, and the difference appears only in the graph name, not the nodes.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152079
Approved by: https://github.com/aorenste
2025-04-25 01:13:51 +00:00
Pian Pawakapan
2ee8de54b1 [dynamic shapes] user-code friendly statically_known_true, has_static_value (#151601)
Fixes #151480

Allows `statically_known_true` in user code, and introduces `has_static_value`, which returns True if the input has a static bool/float/int value.

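A hedged usage sketch: `statically_known_true` already lives in `torch.fx.experimental.symbolic_shapes`, and `has_static_value` is the new helper added here (assumed to live alongside it).

```python
import torch
from torch.fx.experimental.symbolic_shapes import statically_known_true


def maybe_fold_reshape(x: torch.Tensor) -> torch.Tensor:
    # Unlike a plain `if`, this never installs a new guard and never raises a
    # data-dependent error: it returns True only when the condition is provable
    # from existing constraints (on plain bools it just returns the value).
    if statically_known_true(x.shape[0] % 2 == 0):
        return x.reshape(2, -1)
    return x
```
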
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151601
Approved by: https://github.com/laithsakka, https://github.com/zou3519, https://github.com/jingsh
2025-04-24 02:53:59 +00:00
Pian Pawakapan
fd3d339e17 [dynamic shapes] be less aggressive with runtime assert CSE for bounds (#151590)
Fixes #150540
Fixes #147772

Stops trying to CSE bound expressions and only does exact deduplication for runtime asserts. Adds test cases checking that AOTAutograd doesn't raise data-dependent errors when retracing due to not seeing the asserts.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151590
Approved by: https://github.com/laithsakka
2025-04-23 23:07:00 +00:00
Pian Pawakapan
13339ce086 [dynamic shapes] bound_sympy for size-oblivious min/max reasoning (#151242)
Differential Revision: D72978020

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151242
Approved by: https://github.com/bobrenjc93
2025-04-23 02:14:05 +00:00
Riley Dulin
cd576fdce5 [torch][fx] Add support for EXIR dialect overload ops in normalize_function (#143689)
Summary:
I had a minor annoyance when debugging graphs using EXIR dialect ops: all of the function normalization went away. For functions with > 5 arguments,
some of which are just simple bools and ints, it's very helpful to have
the kwarg names attached.

Enhance `normalize_target` to handle EdgeOpOverload targets. To avoid
a circular dependency on Executorch from pytorch core, I just use a `hasattr`
check for "_op". This only happens if the target is not already a recognized
torch function.
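
A minimal sketch of the duck-typed dispatch described above (illustrative names, not the actual implementation):

```python
def resolve_normalization_target(target):
    # EdgeOpOverload-like wrappers carry an `_op` attribute pointing at the
    # underlying ATen overload; duck typing via hasattr avoids importing
    # Executorch (and thus a circular dependency) from pytorch core.
    if not callable(target):
        return target
    if hasattr(target, "_op"):
        return target._op
    return target
```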

Also, I noticed that the new `fx.Node.normalized_arguments` function
didn't forward an important kwarg to `normalize_target`, so I fixed that too.

Test Plan: Tested with FxGraphDrawer and an fx Graph containing EXIR nodes.

Differential Revision: D67545909

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143689
Approved by: https://github.com/angelayi
2025-04-22 23:36:02 +00:00
angelayi
f4ac9a160d [fx] Filter stacktrace (#151029)
Filter the stack trace so that the stack trace on nodes looks nicer when using fx.Tracer. I just copied the filtering we have in [proxy_tensor.py](6720d23969/torch/fx/experimental/proxy_tensor.py (L1903-L1931)).

Previously the stacktrace looked like:
```
File "/data/users/angelayi/pytorch/moo.py", line 3964, in <module>
    run_tests()
  File "/data/users/angelayi/pytorch/torch/testing/_internal/common_utils.py", line 1342, in run_tests
    unittest.main(argv=argv)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/main.py", line 101, in __init__
    self.runTests()
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/main.py", line 271, in runTests
    self.result = testRunner.run(self.test)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/runner.py", line 184, in run
    test(result)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/suite.py", line 122, in run
    test(result)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/suite.py", line 122, in run
    test(result)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/case.py", line 650, in __call__
    return self.run(*args, **kwds)
  File "/data/users/angelayi/pytorch/torch/testing/_internal/common_utils.py", line 3324, in run
    self._run_custom(
  File "/data/users/angelayi/pytorch/torch/testing/_internal/common_utils.py", line 3296, in _run_custom
    super_run(result=result)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/case.py", line 591, in run
    self._callTestMethod(testMethod)
  File "/home/angelayi/.conda/envs/pytorch-3.10/lib/python3.10/unittest/case.py", line 549, in _callTestMethod
    method()
  File "/data/users/angelayi/pytorch/torch/testing/_internal/common_utils.py", line 3156, in wrapper
    method(*args, **kwargs)
  File "/data/users/angelayi/pytorch/moo.py", line 1495, in test_stack_trace
    gm = torch.fx.GraphModule(m, tracer.trace(m))
  File "/data/users/angelayi/pytorch/torch/fx/_symbolic_trace.py", line 837, in trace
    (self.create_arg(fn(*args)),),
  File "/data/users/angelayi/pytorch/moo.py", line 1485, in forward
    x = x * 2
  File "/data/users/angelayi/pytorch/torch/fx/proxy.py", line 716, in impl
    return tracer.create_proxy("call_function", target, args, kwargs)
  File "/data/users/angelayi/pytorch/torch/fx/proxy.py", line 248, in create_proxy
    proxy.node.stack_trace = "".join(CapturedTraceback.extract().format())
```
Now it looks like:
```
File "/data/users/angelayi/pytorch/moo.py", line 1485, in forward
    x = x * 2
```

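A rough sketch of the kind of frame filtering described (not the actual `proxy_tensor.py` code): drop frames from torch/fx internals and unittest plumbing, keeping only user frames.

```python
import traceback


def keep_user_frames(stack: traceback.StackSummary) -> traceback.StackSummary:
    # Frames whose filename contains any of these fragments are considered
    # internal plumbing and removed from the captured trace.
    noisy = ("/torch/fx/", "/torch/testing/", "/unittest/")
    return traceback.StackSummary.from_list(
        [f for f in stack if not any(part in f.filename for part in noisy)]
    )
```
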
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151029
Approved by: https://github.com/jfix71, https://github.com/zou3519, https://github.com/jingsh
2025-04-22 22:50:36 +00:00
PyTorch MergeBot
bc6c0bc344 Revert "Do not generate long log messages for suppressed data dependent errors. (#151023)"
This reverts commit dfdf731579.

Reverted https://github.com/pytorch/pytorch/pull/151023 on behalf of https://github.com/laithsakka due to breaking other PRs ([comment](https://github.com/pytorch/pytorch/pull/151023#issuecomment-2822483635))
2025-04-22 21:08:30 +00:00
PyTorch MergeBot
459c62ee1d Revert "Do not log exception when recording is disabled or already recording (#151038)"
This reverts commit 73d95893a2.

Reverted https://github.com/pytorch/pytorch/pull/151038 on behalf of https://github.com/laithsakka due to breaking other PRs ([comment](https://github.com/pytorch/pytorch/pull/151023#issuecomment-2822483635))
2025-04-22 21:08:30 +00:00
PyTorch MergeBot
aaf71a481b Revert "Log information about suppressed data dependent errors (#151041)"
This reverts commit ccd00359da.

Reverted https://github.com/pytorch/pytorch/pull/151041 on behalf of https://github.com/laithsakka due to breaking other PRs ([comment](https://github.com/pytorch/pytorch/pull/151023#issuecomment-2822483635))
2025-04-22 21:08:30 +00:00
PyTorch MergeBot
4504910843 Revert "[ez] Make relaxed constraint error message more user friendly (#151407)"
This reverts commit e0f05229e9.

Reverted https://github.com/pytorch/pytorch/pull/151407 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally (see D73198095). To validate your fixes internally, you can follow the instructions here: https://fburl.com/fixing-ghfirst-reverts. ([comment](https://github.com/pytorch/pytorch/pull/151407#issuecomment-2821819654))
2025-04-22 16:12:42 +00:00
Laith Sakka
ccd00359da Log information about suppressed data dependent errors (#151041)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151041
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #151023, #151038
2025-04-22 06:07:57 +00:00
Laith Sakka
73d95893a2 Do not log exception when recording is disabled or already recording (#151038)
I am not sure why we log all exceptions here and re-raise them, but at least when recording is disabled this should be transparent; in particular, logging DDEs can be spammy.

before:
<img width="995" alt="Screenshot 2025-04-10 at 12 47 31 PM" src="https://github.com/user-attachments/assets/f90d4557-d958-4558-a917-0d687366cad1" />

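A hedged sketch of the behavior described (names are illustrative): keep re-raising as before, but only emit the log when recording is actually active.

```python
import logging

log = logging.getLogger(__name__)


def run_while_recording(fn, *, is_recording: bool):
    try:
        return fn()
    except Exception:
        if is_recording:
            # Stay transparent when recording is disabled; unconditional
            # logging here is what made suppressed DDEs spammy.
            log.exception("error during recording")
        raise
```
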
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151038
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #151023
2025-04-22 06:07:57 +00:00
Laith Sakka
dfdf731579 Do not generate long log messages for suppressed data dependent errors. (#151023)
TORCH_LOGS="all" python test/test_dynamic_shapes.py -k test_guard_or_true

 before:
<img width="1065" alt="Screenshot 2025-04-10 at 9 55 27 AM" src="https://github.com/user-attachments/assets/3ee20de0-2902-4eb1-8ab0-80f1b974fb78" />

after:
<img width="1124" alt="Screenshot 2025-04-10 at 9 54 35 AM" src="https://github.com/user-attachments/assets/4e7e1f0c-856c-417f-8763-bfe183e2450d" />

Note: we actually do not expect to see a log at all; there is an orthogonal issue in recording where it logs each error seen even when recording is not enabled. I will follow up with a PR for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151023
Approved by: https://github.com/bobrenjc93
2025-04-22 06:07:57 +00:00
Laith Sakka
6ea2e6a2d2 Do not do proper const fold during tensorify_python_scalars (#151494)
After chatting with Bob: the goal of this is to const-fold the floats that were tensorified, by calling
guard_scalar(val) on them and then replacing their usages with their values.
Hence we do not need to do this for nodes with no float symbols.

We do not want to do proper const folding because we need to preserve statements that deferred
runtime asserts depend on (see the added test).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151494
Approved by: https://github.com/bobrenjc93
2025-04-21 22:39:50 +00:00
Tugsbayasgalan Manlaibaatar
adf5f38eae Don't specialize min/max (#151347)
address https://github.com/pytorch/pytorch/issues/149635
Differential Revision: [D73041489](https://our.internmc.facebook.com/intern/diff/D73041489/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151347
Approved by: https://github.com/bobrenjc93
2025-04-19 00:11:15 +00:00
Tugsbayasgalan Manlaibaatar
eb1f85a2a0 Support C++ statically_known_true (#151346)
Differential Revision: [D73040543](https://our.internmc.facebook.com/intern/diff/D73040543/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151346
Approved by: https://github.com/laithsakka
2025-04-18 06:42:12 +00:00
Laith Sakka
b434322075 Fix has_free_symbols (#151492)
It used to fail for:

    self.assertFalse(has_free_symbols(sympy.S.true))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151492
Approved by: https://github.com/bobrenjc93
ghstack dependencies: #151170, #151171
2025-04-18 01:19:01 +00:00
bobrenjc93
e0f05229e9 [ez] Make relaxed constraint error message more user friendly (#151407)
Fixes #151356

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151407
Approved by: https://github.com/Skylion007
2025-04-17 06:43:10 +00:00
PyTorch MergeBot
a582f04608 Revert "[ez] Make relaxed constraint error message more user friendly (#151407)"
This reverts commit bc934f57d7.

Reverted https://github.com/pytorch/pytorch/pull/151407 on behalf of https://github.com/izaitsevfb due to breaks export tests ([comment](https://github.com/pytorch/pytorch/pull/151407#issuecomment-2810716135))
2025-04-16 20:40:22 +00:00
bobrenjc93
bc934f57d7 [ez] Make relaxed constraint error message more user friendly (#151407)
Fixes #151356

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151407
Approved by: https://github.com/Skylion007
2025-04-16 17:00:06 +00:00
Oguz Ulgen
3cf0e2d8ec Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

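A usage sketch following the signature quoted above; the import path (`torch._inductor`) and the save location are assumptions, and the `options` argument is omitted.

```python
import torch
from torch._inductor import standalone_compile  # import path assumed


def f(x):
    return (x @ x.T).relu()


gm = torch.fx.symbolic_trace(f)
example_inputs = [torch.randn(8, 8)]

artifact = standalone_compile(gm, example_inputs)
artifact.save(path="/tmp/compiled_artifact", format="unpacked")
# Later (e.g. in a fresh process), per the signature above:
# CompiledArtifact.load(path="/tmp/compiled_artifact", format="unpacked")
```
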
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-15 23:38:15 +00:00
sandishkumarhn
61f127aac5 [Export] fix automatically convert instances of _check(u>=0) to check_is_size() (#148844)
Fixes #148826

Understanding:

1. PyTorch should automatically convert instances of _check(u>=0) to check_is_size()
2. The export mechanism should suggest using check_is_size() instead of _check(u>=0) when applicable

Changes made:
1. Added a helper function to detect non-negative checks: is_non_negative_check
2. Modified the suggestion logic in _suggest_torch_checks to detect and handle non-negative checks
3. Added unit tests: test_is_non_negative_check_function, test_suggest_torch_checks_with_non_negative_check, and test_suggest_torch_checks_with_regular_check

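For illustration, the two spellings involved (both existing private torch helpers; the toy function is made up):

```python
import torch


def repeat_head(x: torch.Tensor) -> torch.Tensor:
    u = x[0].to(torch.int64).item()  # data-dependent (unbacked under export)
    torch._check(u >= 0)             # the generic non-negativity check...
    torch._check_is_size(u)          # ...and the size-specific form export should suggest
    return x[:1].repeat(u)
```
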
unit tests:

(base) sany@sandishs-Laptop pytorch % pytest test/export/test_export.py::TestExport::test_suggest_torch_checks_with_non_negative_check
=================================== test session starts ==================
platform darwin -- Python 3.9.19, pytest-7.3.2, pluggy-1.5.0
rootdir: /Users/sany/git/pytorch
configfile: pytest.ini
plugins: xdoctest-1.1.0, cpp-2.3.0, flakefinder-1.1.0, anyio-4.6.0, rerunfailures-14.0, hypothesis-5.35.1, xdist-3.3.1, subtests-0.13.1, typeguard-4.3.0
collected 1 item
Running 1 items in this shard

test/export/test_export.py .                                                                                           [100%]

======================== 1 passed in 1.67s =======================
(base) sany@sandishs-Laptop pytorch % pytest test/export/test_export.py::TestExport::test_suggest_torch_checks_with_regular_check
======================= test session starts =================
platform darwin -- Python 3.9.19, pytest-7.3.2, pluggy-1.5.0
rootdir: /Users/sany/git/pytorch
configfile: pytest.ini
plugins: xdoctest-1.1.0, cpp-2.3.0, flakefinder-1.1.0, anyio-4.6.0, rerunfailures-14.0, hypothesis-5.35.1, xdist-3.3.1, subtests-0.13.1, typeguard-4.3.0
collected 1 item
Running 1 items in this shard

test/export/test_export.py .                                                                                           [100%]

================================= 1 passed in 1.61s ================
(base) sany@sandishs-Laptop pytorch % pytest test/export/test_export.py::TestExport::test_is_non_negative_check_function
================================ test session starts =============
platform darwin -- Python 3.9.19, pytest-7.3.2, pluggy-1.5.0
rootdir: /Users/sany/git/pytorch
configfile: pytest.ini
plugins: xdoctest-1.1.0, cpp-2.3.0, flakefinder-1.1.0, anyio-4.6.0, rerunfailures-14.0, hypothesis-5.35.1, xdist-3.3.1, subtests-0.13.1, typeguard-4.3.0
collected 1 item
Running 1 items in this shard

test/export/test_export.py .                                                                                           [100%]

======================= 1 passed in 1.62s =========================
(base) sany@sandishs-Laptop pytorch %

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148844
Approved by: https://github.com/laithsakka
2025-04-15 17:41:11 +00:00
PyTorch MergeBot
74f6bc28a7 Revert "Add inductor standalone_compile API (#150670)"
This reverts commit c9aef50898.

Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/Camyll due to breaking internal builds with torch module not found error ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2806975267))
2025-04-15 17:35:59 +00:00
Oguz Ulgen
c9aef50898 Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-14 22:00:09 +00:00
Pian Pawakapan
6dddd6520d [dynamic shapes] add sym_and, sym_or (#150456)
This has been pretty helpful for the size-oblivious rewrite. Wanted the variadic args version to avoid `sym_or(a, sym_or(b, sym_or(c, d)))` in favor of `sym_or(a, b, c, d)`. Happy to change this to ban the 1-arg version.

This is better than plain and/or because the whole symbolic expression gets preserved, and if we guard on it or defer as a runtime assert, we preserve all branches.

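A hedged usage sketch; the import path is assumed, as is the ability to pass plain bools outside of tracing.

```python
from torch.fx.experimental.symbolic_shapes import sym_and, sym_or


def is_square_or_empty(h, w):
    # Unlike chained Python `and`/`or`, the whole symbolic expression is kept,
    # so guarding on it (or deferring it as a runtime assert) preserves all branches.
    return sym_or(sym_and(h == w, h > 0), sym_and(h == 0, w == 0))
```
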
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150456
Approved by: https://github.com/laithsakka
2025-04-14 18:18:06 +00:00
PyTorch MergeBot
24b3ab9255 Revert "Add inductor standalone_compile API (#150670)"
This reverts commit bbc5fe8504.

Reverted https://github.com/pytorch/pytorch/pull/150670 on behalf of https://github.com/albanD due to Broke profiler test ([comment](https://github.com/pytorch/pytorch/pull/150670#issuecomment-2802067144))
2025-04-14 15:22:33 +00:00
Oguz Ulgen
bbc5fe8504 Add inductor standalone_compile API (#150670)
This PR adds standalone_compile API that does precompilation via caching to support vLLM use case in the short term while we work on the longer term precompilation solution.

```
standalone_compile(gm, example_inputs, options) -> CompiledArtifact
CompiledArtifact.save(path, format: binary|unpacked = binary)
CompiledArtifact.load(path, format: binary|unpacked = binary)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150670
Approved by: https://github.com/jamesjwu, https://github.com/zou3519
2025-04-14 07:07:10 +00:00
Thomas Adams
8494d5582a Propagate callable parameter types using ParamSpec (#142306) (#151014)
Partially addresses #142306

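A generic illustration of the typing pattern being applied (the decorator below is a stand-in, not PyTorch code): `ParamSpec` lets a wrapper preserve the wrapped callable's parameter types instead of erasing them to `Callable[..., R]`.

```python
from typing import Callable, ParamSpec, TypeVar  # typing_extensions on Python < 3.10

P = ParamSpec("P")
R = TypeVar("R")


def with_logging(fn: Callable[P, R]) -> Callable[P, R]:
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"calling {fn.__name__}")
        return fn(*args, **kwargs)

    return wrapper
```
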
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151014
Approved by: https://github.com/Skylion007
2025-04-13 20:38:11 +00:00
Laith Sakka
5471e80fb4 Remove guard_size_oblivious from vector_norm decomposition. (#148809)
This PR removes the usage of guard_size_oblivious in the vector_norm decomposition by inlining it into the runtime check;
this prevents any data-dependent error from ever appearing at the locations where guard_size_oblivious
used to exist. Before this PR it could potentially break. This is NOT BC-breaking and does not change semantics from eager.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148809
Approved by: https://github.com/bobrenjc93
2025-04-10 16:19:00 +00:00
angelayi
e6969c1bd8 [export] Symint support (nonstrict, Dim.DYNAMIC) (#150198)
Fixes https://github.com/pytorch/pytorch/issues/113682 only in the non-strict export case. Also we only support Dim.DYNAMIC/AUTO, not named-Dims

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150198
Approved by: https://github.com/pianpwk
2025-04-10 15:06:23 +00:00
Laith Sakka
cd80778ac8 Fix issue in optimized_add issue: make_optimized should be called on non args only (#150955)
PR https://github.com/pytorch/pytorch/pull/149665 made a change to optimized_add that is causing an issue internally.
In general, make_optimized should only be called with valid new_args; new_args can become None
when the element already exists, and we should break out of the loop in that case.

Note that I also only kept the optimized summation when both lhs and rhs lengths are <= 2.
This is OK because the optimization is based on the inductive property of adding one symbol at a time;
the [2]+[2] case here serves as the base case (I feel we could also remove it).

Note that while keeping it for all sizes would be correct, I am not sure it is as efficient (we would do N log(N) insertions);
there is no current justification for that.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150955
Approved by: https://github.com/Mingming-Ding, https://github.com/atalman, https://github.com/bobrenjc93
2025-04-10 03:00:21 +00:00
Laith Sakka
087e8587cd support backed_size_oblivious in guard_or_false/guard_or_true (#150231)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150231
Approved by: https://github.com/pianpwk
2025-04-09 21:47:20 +00:00
PyTorch MergeBot
01568cb17a Revert "Refactor layout constraint selection logic (#148104)"
This reverts commit 2e7c9d33e7.

Reverted https://github.com/pytorch/pytorch/pull/148104 on behalf of https://github.com/atalman due to [GH job link](https://github.com/pytorch/pytorch/actions/runs/14357056427/job/40251630946) [HUD commit link](2e7c9d33e7) ([comment](https://github.com/pytorch/pytorch/pull/148104#issuecomment-2790369493))
2025-04-09 16:49:48 +00:00
rzou
2e7c9d33e7 Refactor layout constraint selection logic (#148104)
This PR:

- cleans up some existing comments that don't make sense anymore
- hooks the "custom_op_default_layout_constraint" back up (it seems to have broken)
- cleans up the "lazy registration path", which never seems to get hit anymore
- adds dislike_padding to nodes that require exact strides

Test Plan:
- tests + CI

disable padding

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148104
Approved by: https://github.com/shunting314, https://github.com/eellison
ghstack dependencies: #150495
2025-04-09 02:09:18 +00:00
Klint Qinami
2d98a1caf5 [MTIA] Map names to operand indices when folding submodules (#150692)
When replacing placeholders with getattrs during constant folding, we can have an argument and parameter name mismatch. In fact, there is no guarantee that the parameter name is equivalent to the argument name used in the module call.

Differential Revision: D72415970

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150692
Approved by: https://github.com/jfix71
2025-04-06 03:11:14 +00:00
Jakub Grzybek
73358d37da Fix codegen, change str comparison operator to == for proper equality … (#150611)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150611
Approved by: https://github.com/Skylion007, https://github.com/cyyever
2025-04-04 09:59:59 +00:00
Pian Pawakapan
c6d79c163c [dynamic shapes] allow duck typing for 0/1 (#150222)
Fixes #150184

e.g. for config.backed_size_oblivious=True and compile

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150222
Approved by: https://github.com/laithsakka
2025-04-04 03:24:46 +00:00
Henry Hu
5cf3029503 Remove unused rand call if not fallback to eager for rand (#147790)
Fixes #147171

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147790
Approved by: https://github.com/eellison
2025-04-03 23:27:03 +00:00
Jason Ansel
d41c22b578 Revert "[fx] Move Node._prepend/Node._remove_from_list to C++ (#148261)" (#150542)
Reverts #148261 due to possible memory leak

This reverts commit 5d4e7d58b4.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150542
Approved by: https://github.com/clee2000
2025-04-03 21:15:38 +00:00
rzou
aae36929ed Rename node.meta["arg_kwarg_vals"] to node.meta["eager_input_vals"] (#148092)
And added a comment about it. Otherwise it might be confusing

Test Plan:
- wait for CI

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148092
Approved by: https://github.com/eellison
ghstack dependencies: #148046, #148063, #148091
2025-04-02 13:18:04 +00:00
rzou
4d121d2b02 Implement needs_exact_strides for mutable custom operators (#148091)
Mutable custom operators get wrapped into an auto_functionalized HOP, so
we need to store the arg_kwarg_vals on the auto_functionalized HOP
itself.

When Inductor does the re-inplacing, it'll use the pattern matcher to
decompose the auto_functionalized HOP back into the original op (and
0+ other view or clone operations). The pattern matcher uses the
arg_kwarg_vals to trace the subgraph to do the decomposition, so it
ultimately sets arg_kwarg_vals on the original op's node correctly.

Test Plan:
- new test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148091
Approved by: https://github.com/eellison
ghstack dependencies: #148046, #148063
2025-04-02 13:18:04 +00:00
rzou
c69c3c885e Add needs_exact_strides operator tag for Inductor to force exact strides (#148063)
Inductor will force exact strides on a custom operator tagged with
needs_exact_strides. I'll make this the default in a follow-up PR.

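A heavily hedged sketch of how an op author might opt in, assuming the tag is exposed as `torch.Tag.needs_exact_strides` once this lands; the op itself is hypothetical.

```python
import torch

torch.library.define(
    "mylib::row_scale",
    "(Tensor x, Tensor scale) -> Tensor",
    tags=(torch.Tag.needs_exact_strides,),  # assumed tag name; asks Inductor for exact input strides
)


@torch.library.impl("mylib::row_scale", "CompositeExplicitAutograd")
def row_scale(x, scale):
    return x * scale.unsqueeze(-1)
```
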
Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148063
Approved by: https://github.com/eellison
ghstack dependencies: #148046
2025-04-02 13:17:58 +00:00
rzou
c41fbb4f78 Change arg_kwarg_vals propagation strategy (#148046)
Instead of always propagating arg_kwarg_vals in _COPY_META_FIELDS, we
special-case the pattern matcher to propagate arg_kwarg_vals when
it sees triton_kernel_wrapper_functional.

The strategy is:
1) trace out the replacement graph with arg_kwarg_vals (which have accurate eager-mode metadata)
2) trace out the replacement graph with vals (which have the accurate Inductor metadata)
3) Propagate the arg_kwarg_vals from the first graph to the second.
4) Use the second graph as the replacement graph.

The strategy is this because we want to extend this to handle
auto_functionalized later up in the stack.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148046
Approved by: https://github.com/eellison
2025-04-02 13:17:52 +00:00
William Wen
3ac5a499dd [dynamo] add dynamo disable reasons to codebase (#150440)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150440
Approved by: https://github.com/jansel, https://github.com/zou3519
ghstack dependencies: #150341
2025-04-02 04:26:48 +00:00
vasiliy
c974b5322a enable torch.compile for torch._scaled_mm nvfp4 recipe (#150462)
Summary:

Updates the meta registration for `torch._scaled_mm` to work for the
nvfp4 recipe.

Test Plan:

```bash
pytest test/test_matmul_cuda.py -s -k test_blockwise_nvfp4
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150462
Approved by: https://github.com/eellison
2025-04-02 01:08:40 +00:00
Tugsbayasgalan Manlaibaatar
7e7e5698cc Suppress more warnings (#149833)
Differential Revision: [D71702307](https://our.internmc.facebook.com/intern/diff/D71702307)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149833
Approved by: https://github.com/malfet, https://github.com/Skylion007
2025-04-01 05:33:04 +00:00
Pian Pawakapan
284b766898 [dynamic shapes] C++ bindings for guard_or_false/true (#150148)
C++ version. Would like to add it in one place to prove it works, but couldn't find one that doesn't expose a chain of data-dependent changes... so just gonna put up the base implementation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150148
Approved by: https://github.com/laithsakka, https://github.com/jingsh
2025-03-31 17:04:25 +00:00
Pian Pawakapan
103bf64a3c [export] refactor _Dim into Dim (#149891)
Summary: forward fix T218515233

Test Plan: test_export

Differential Revision: D71769231

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149891
Approved by: https://github.com/jingsh, https://github.com/angelayi
2025-03-28 06:19:03 +00:00
bobrenjc93
f649ee73ce Use source hashing to generate consistent symbolic ids (#149665)
This PR was inspired by internal models that were cache missing due to PGO. At a high level the problem looks as follows

Run 1, Invocation 1: We do static compile, save some example values in PGO/automatic dynamic

Run 1, Invocation 2: We detect varying inputs, do dynamic compile, get a dynamic graph and save to PGO. Crucially what we save to PGO is actually a superset of what is actually dynamic. If we notice an input was varying, we mark it as dynamic in PGO even if later on that value gets specialized. When a value gets specialized, we actually remove the symbol from the graph. This results in an interesting conundrum where although we are producing the same isomorphic graph, PGO makes the second run cache miss. Let's see how....

Run 2, Invocation 1: We fetch the PGO, over-mark things as dynamic, get a fx graph, look it up in the cache and... whoops! cache miss! This is because of the aforementioned behavior where the PGO profile will cause us to over-allocate symbols. In practice this means we end up saving a graph in cache with symbols x:s1, y:s3 and on second attempt we cache miss with x:s1, y:s6 where symbols s3,s4,s5 were all optimistically marked dynamic by PGO and subsequently specialized.

We solve this problem by hashing the source names. This ensures somewhat stable assignment. To prevent catastrophic symbol collisions, we use linear probing to ensure no collisions.

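A toy sketch of the idea (not the PyTorch implementation): derive the id from a hash of the input's source name so ids stay stable across runs, and linear-probe on collisions.

```python
import hashlib


def assign_symbol_ids(source_names, table_size=1 << 16):
    taken, ids = set(), {}
    for name in source_names:
        slot = int(hashlib.sha256(name.encode()).hexdigest(), 16) % table_size
        while slot in taken:  # linear probing to avoid catastrophic collisions
            slot = (slot + 1) % table_size
        taken.add(slot)
        ids[name] = slot
    return ids


# e.g. assign_symbol_ids(["L['x'].size()[0]", "L['y'].size()[0]"]) gives the same
# ids regardless of how many other symbols were speculatively allocated by PGO.
```
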
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149665
Approved by: https://github.com/Mingming-Ding, https://github.com/laithsakka
2025-03-28 05:36:32 +00:00
PyTorch MergeBot
af7719a2fa Revert "Use source hashing to generate consistent symbolic ids (#149665)"
This reverts commit 1f92348dc6.

Reverted https://github.com/pytorch/pytorch/pull/149665 on behalf of https://github.com/malfet due to Broke trunk, see 6eb3c2e282/1 ([comment](https://github.com/pytorch/pytorch/pull/149665#issuecomment-2758578187))
2025-03-27 16:02:27 +00:00
Laith Sakka
6cbcdee944 Introduce guard_or_true, guard_or_false (#148430)
some context in this document:
https://docs.google.com/document/d/18nJsj-F2C_QXO7ClwzPcAUENQ-B440B43W7DdDnlDt4/edit?tab=t.0#heading=h.pgebnyi7pocj

But TL;DR:
`guard_or_true` and `guard_or_false` are better than `guard_size_oblivious` because:
- It is easier to reason about what assumptions we are making while reading the code.
- They avoid size-oblivious complexity that is not needed.
- They avoid unsoundness that could make `guard_size_oblivious(a==1)` be true when it is not true for some value `a` during runtime.
- Fewer data-dependent errors in some cases: e.g., when doing `guard_size_oblivious(a==1)` and we know `a` is a tensor size, if it's traced with `a=u1-u2`, `guard_size_oblivious(a==1)` will throw a data-dependent error but `guard_or_false` will just return `False`.

### How is it different from statically_known_true?
**`if(cond)`:** (normal guarding) will try to evaluate the condition statically and guard on it, willing to restrict the input space to evaluate cond. If it fails to evaluate due to a data-dependent error, it will throw an exception (which could be converted to a graph break in some situations).

**`statically_known_true(cond)`:** would be used when you never want to add a guard (restrict your input space), but just want to do a best-effort check to see if you can infer that something is true/false ONLY based on existing constraints.

**`guard_or_true(cond)`/`guard_or_false(cond)`:** These would be used in situations where you prefer to guard and know the result of the expression over not guarding, but in case you hit a data-dependent error you are OK with just returning true or false.
Some reasons you might be OK with returning true/false instead could be:
1. It's an optimization: I do not want to fail just because the optimization could not be performed.
2. I am willing to deviate from the normal semantics when I have unbacked symbols, for the benefit of not failing (see the doc above for more details).

**`definitely_true(cond)`**: same as `guard_or_false(cond)` except it does not try to do static eval for unbacked symbols (planning to deprecate it and replace uses with `guard_or_false` or make it an alias of `guard_or_false`).

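A hedged side-by-side of the behaviors described above; both helpers live in `torch.fx.experimental.symbolic_shapes`, and on plain bools they are assumed to simply return the value, so the sketch also runs in eager.

```python
import torch
from torch.fx.experimental.symbolic_shapes import guard_or_false, statically_known_true


def maybe_squeeze(x: torch.Tensor) -> torch.Tensor:
    dim0 = x.shape[0]
    if statically_known_true(dim0 == 1):
        # Best effort only: never adds a guard, never raises a data-dependent error.
        return x.squeeze(0)
    if guard_or_false(dim0 == 1):
        # Prefers to guard and know the answer; falls back to False on a DDE.
        return x.squeeze(0)
    return x
```
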
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148430
Approved by: https://github.com/bobrenjc93
2025-03-27 09:34:05 +00:00
PyTorch MergeBot
e080bac533 Revert "Introduce guard_or_true, guard_or_false (#148430)"
This reverts commit d5593ea31c.

Reverted https://github.com/pytorch/pytorch/pull/148430 on behalf of https://github.com/laithsakka due to need to fix stuff ([comment](https://github.com/pytorch/pytorch/pull/148430#issuecomment-2756701436))
2025-03-27 05:10:20 +00:00
bobrenjc93
1f92348dc6 Use source hashing to generate consistent symbolic ids (#149665)
This PR was inspired by internal models that were cache missing due to PGO. At a high level the problem looks as follows

Run 1, Invocation 1: We do static compile, save some example values in PGO/automatic dynamic

Run 1, Invocation 2: We detect varying inputs, do dynamic compile, get a dynamic graph and save to PGO. Crucially what we save to PGO is actually a superset of what is actually dynamic. If we notice an input was varying, we mark it as dynamic in PGO even if later on that value gets specialized. When a value gets specialized, we actually remove the symbol from the graph. This results in an interesting conundrum where although we are producing the same isomorphic graph, PGO makes the second run cache miss. Let's see how....

Run 2, Invocation 1: We fetch the PGO, over-mark things as dynamic, get a fx graph, look it up in the cache and... whoops! cache miss! This is because of the aforementioned behavior where the PGO profile will cause us to over-allocate symbols. In practice this means we end up saving a graph in cache with symbols x:s1, y:s3 and on second attempt we cache miss with x:s1, y:s6 where symbols s3,s4,s5 were all optimistically marked dynamic by PGO and subsequently specialized.

We solve this problem by hashing the source names. This ensures somewhat stable assignment. To prevent catastrophic symbol collisions, we use linear probing to ensure no collisions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149665
Approved by: https://github.com/Mingming-Ding, https://github.com/laithsakka
2025-03-27 03:39:27 +00:00
Laith Sakka
d5593ea31c Introduce guard_or_true, guard_or_false (#148430)
some context in this document:
https://docs.google.com/document/d/18nJsj-F2C_QXO7ClwzPcAUENQ-B440B43W7DdDnlDt4/edit?tab=t.0#heading=h.pgebnyi7pocj

But TL;DR:
`guard_or_true` and `guard_or_false` are better than `guard_size_oblivious` because:
- It is easier to reason about what assumptions we are making while reading the code.
- They avoid size-oblivious complexity that is not needed.
- They avoid unsoundness that could make `guard_size_oblivious(a==1)` be true when it is not true for some value `a` during runtime.
- Fewer data-dependent errors in some cases: e.g., when doing `guard_size_oblivious(a==1)` and we know `a` is a tensor size, if it's traced with `a=u1-u2`, `guard_size_oblivious(a==1)` will throw a data-dependent error but `guard_or_false` will just return `False`.

### How is it different from statically_known_true?
**`if(cond)`:** (normal guarding) will try to evaluate the condition statically and guard on it, willing to restrict the input space to evaluate cond. If it fails to evaluate due to a data-dependent error, it will throw an exception (which could be converted to a graph break in some situations).

**`statically_known_true(cond)`:** would be used when you never want to add a guard (restrict your input space), but just want to do a best-effort check to see if you can infer that something is true/false ONLY based on existing constraints.

**`guard_or_true(cond)`/`guard_or_false(cond)`:** These would be used in situations where you prefer to guard and know the result of the expression over not guarding, but in case you hit a data-dependent error you are OK with just returning true or false.
Some reasons you might be OK with returning true/false instead could be:
1. It's an optimization: I do not want to fail just because the optimization could not be performed.
2. I am willing to deviate from the normal semantics when I have unbacked symbols, for the benefit of not failing (see the doc above for more details).

**`definitely_true(cond)`**: same as `guard_or_false(cond)` except it does not try to do static eval for unbacked symbols (planning to deprecate it and replace uses with `guard_or_false` or make it an alias of `guard_or_false`).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148430
Approved by: https://github.com/bobrenjc93
2025-03-27 02:22:20 +00:00
Tugsbayasgalan Manlaibaatar
27657a00d9 Demote logger of runtime_asserts_frozen to be fired only on debug mode (#149832)
Differential Revision: [D71702305](https://our.internmc.facebook.com/intern/diff/D71702305)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149832
Approved by: https://github.com/malfet
2025-03-25 02:29:13 +00:00
bobrenjc93
60f31f551e Only print dde partial fx graph for export (#149831)
Lazos correctly pointed out this doesn't make sense for compile, since
we graph break in compile. This results in tons of unwanted user log
spew. We do want this in export though, since it has drastically reduced
the support load for DDEs. This PR does the refactor to keep it in
export but remove it from compile.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149831
Approved by: https://github.com/mlazos
2025-03-24 17:46:18 +00:00
Tugsbayasgalan Manlaibaatar
021b3e23ec Fix is_nonzero for more than one elem tensors (#149637)
Differential Revision: [D71560442](https://our.internmc.facebook.com/intern/diff/D71560442)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149637
Approved by: https://github.com/pianpwk
2025-03-22 02:08:28 +00:00
Zhengxu Chen
f47aa08130 [export] Support python assertion with symints. (#149444)
Summary: This diff ports a technique from torch.fx symbolic trace to trace through Python asserts when we run into data-dependent symbolic shape assertions, so that we can achieve the same effect as torch dynamo and automatically turn asserts into torch._check()s.

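For illustration, the kind of rewrite described (the model is made up; `torch._check` is the existing helper the assert is converted into):

```python
import torch


class M(torch.nn.Module):
    def forward(self, x):
        n = x.sum().to(torch.int64).item()  # unbacked SymInt under export
        assert n >= 0, "expected a non-negative count"
        # ...which export now captures roughly as:
        # torch._check(n >= 0, lambda: "expected a non-negative count")
        return torch.ones(n)
```
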
Test Plan: buck test mode/opt caffe2/test:test_export -- -r test_python_asserts_with_sym_int
Differential Revision: D71425360

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149444
Approved by: https://github.com/tugsbayasgalan
2025-03-20 23:07:45 +00:00
angelayi
bf34e228c5 [export] Beef up guard_added logs (#149465)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149465
Approved by: https://github.com/pianpwk
2025-03-20 23:02:07 +00:00
Avik Chaudhuri
6237495fcf torch.Size input (#149414)
Summary: Support for `torch.Size` inputs was patchy before because `unflatten_fn` for this type returned a tuple. This PR cleans this up.

Fixes #149158

Test Plan: added test

Differential Revision: D71403635

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149414
Approved by: https://github.com/yushangdi
2025-03-20 16:23:13 +00:00
Ze Sheng
e98afa0f89 [Sigmoid] Remove magic method in CapabilityBasedPartitioner (#149400)
Summary: As title.

Test Plan: CI

Differential Revision: D70575197

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149400
Approved by: https://github.com/jfix71
2025-03-19 16:02:43 +00:00
Shangdi Yu
1bf443e2f2 [aoti x with_effect token] Unbacked symint and register lowering (#147656)
Differential Revision: D70022208

- When resolving unbacked symints in ExternKernel for with_effect, we need to ignore the first item in the binding path, because the `example_output` doesn't contain the effect token, but the binding paths do.
- Similarly, `node.meta["val"]` contains the effect token, so when we compute_unbacked_bindings, we need to remove that effect token

- For `torch.ops.higher_order.with_effects`'s lowering, we should not extract the items out of a list (i.e. `*result` vs `result`). The `get_attr` nodes expect the result to be in list format.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147656
Approved by: https://github.com/angelayi, https://github.com/zou3519
2025-03-19 14:38:30 +00:00
bobrenjc93
eb7bf4202d Make dynamism code robust to NotImplementedException (#148823)
In prod many models have `@property` methods that raise
NotImplementedError. This PR updates our dynamism code to be more robust
to these types of models.

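A toy sketch of the failure mode and the guard described (names are made up):

```python
class Config:
    @property
    def vocab_size(self):
        raise NotImplementedError  # common in prod model configs


def safe_getattr(obj, name, default=None):
    # getattr's own default does not help here: the property getter itself raises.
    try:
        return getattr(obj, name, default)
    except NotImplementedError:
        return default
```
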
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148823
Approved by: https://github.com/laithsakka
2025-03-14 23:38:19 +00:00
Zhou, Lingzhi
4a12777ffe [Partitioner] Remove unnecessary upstream nodes in dependency viewer (#146580)
We iterate over upstream nodes to update the partition map, but this actually did nothing because we iterate nodes in reversed topological order (https://github.com/pytorch/pytorch/pull/136608/files#diff-f2f9dd3903fd99955732eb694941fea0cb7301a58d59554787f3311d417e5615L193), so there are no upstream nodes in the assignment. Remove it to reduce for-loop overhead, which is up to O(N * N) complexity.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146580
Approved by: https://github.com/Skylion007, https://github.com/jerome-habana
2025-03-13 01:42:10 +00:00
PyTorch MergeBot
b1980b2405 Revert "Make dynamism code robust to NotImplementedException (#148823)"
This reverts commit 60576419a2.

Reverted https://github.com/pytorch/pytorch/pull/148823 on behalf of https://github.com/ZainRizvi due to Sorry but this is breaking internally, see D71042206 for details. To validate your fixes internally before relanding, you can follow the instructions here: https://fburl.com/fixing-ghfirst-reverts ([comment](https://github.com/pytorch/pytorch/pull/148823#issuecomment-2719287467))
2025-03-12 22:45:39 +00:00
Tugsbayasgalan Manlaibaatar
fb0e9cb0a0 Remove warnings on non-buffer tensor constants (#148483)
Export already registers tensor constants directly in the graph, and this is also true for Torchbind objects. This removes a warning that pollutes the output.

Differential Revision: [D70577856](https://our.internmc.facebook.com/intern/diff/D70577856)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148483
Approved by: https://github.com/zhxchen17, https://github.com/zou3519
ghstack dependencies: #148364
2025-03-12 18:20:04 +00:00
lingzhi98
81aee3c9c4 [Partitioner] Reduce time consuming of partitions merger (#146582)
This patch optimizes the maybe_merge_partition func in three ways:

1. Remove an unnecessary copy (https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/infra/partitioner.py#L99). The number of copied nodes is large if we can merge all of the nodes of the graph into one partition.
2. Record the users of each partition to avoid duplicate iteration over nodes (https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/infra/partitioner.py#L133). The trip count of this loop may be very large.
3. The number of nodes in each partition may not be balanced (https://github.com/pytorch/pytorch/blob/main/torch/fx/passes/infra/partitioner.py#L145). We always encounter one issue: one partition has n nodes, but the other has one node. Merging the smaller partition into the larger one helps reduce the time spent.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146582
Approved by: https://github.com/jerome-habana, https://github.com/Skylion007
2025-03-12 09:24:38 +00:00
bobrenjc93
60576419a2 Make dynamism code robust to NotImplementedException (#148823)
In prod many models have `@property` methods that raise
NotImplementedError. This PR updates our dynamism code to be more robust
to these types of models.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148823
Approved by: https://github.com/laithsakka
2025-03-12 01:01:57 +00:00
Pian Pawakapan
a6459afb0e [dynamic shapes] add backed_size_oblivious option (#148696)
Adds option `torch.fx.experimental._config.backed_size_oblivious = True` to allocate `[0, inf]` instead of `[2, inf]` ranges for size backed symbols, opting into size-oblivious semantics for them.

Helps in a number of cases like
- Keeps `[0, inf]` bounds for unbacked symbols, when we make a unbacked -> backed replacement
- More sound handling for 0/1 inputs at runtime when we lower from export
- Avoids ends-of-bounds, sys.maxsize constraint violations for exporting with named Dims (https://github.com/pytorch/pytorch/issues/146315, https://github.com/pytorch/pytorch/issues/146046)

May look towards turning this on globally for export.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148696
Approved by: https://github.com/bobrenjc93
2025-03-11 21:52:34 +00:00
Guilherme Leobas
4e7d264cf8 Introduce UserDefinedExceptionClassVariable (#146504)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146504
Approved by: https://github.com/anijain2305
2025-03-11 18:55:45 +00:00
Brian Hirsh
621dadd4ca partitioner: when materializing unbacked tensor intermediates, apply hint to symbol, not expr (#144097)
Fixes https://github.com/pytorch/pytorch/issues/144095

open to suggestions: the `hint_int(..., fallback=...)` API feels like a bit of a footgun, because:

(1) we use the same guess for every unbacked symint (both symbols, and compound expressions)
(2) the user may have established some relationship between some unbacked symints that we are not taking into account.

I'm not sure how real of an issue (2) is - is it common to e.g. generate two unbacked symints, and then add a runtime assert that they are unequal?

Instead I did something simpler that's just enough to fix the linked issue: if we have a sympy expression containing an unbacked symbol (e.g. `u0 + 1`), then the partitioner will now fill in the symbol with our guess instead of the expression (plugging in `u0=4096` gets us 4097). This was important for an internal custom op, that had some logic like this:
```
def custom_op(x: [u0], y: [u0 + 1]):
    assert x.shape[0] == y.shape[0] - 1
    ...
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144097
Approved by: https://github.com/laithsakka
2025-03-11 02:11:57 +00:00
Qiaochu Yuan
12a95390ae [Minimizer] allow overriding of ShapeProp logic by subclasses of _MinimizerBase (#148784)
Summary:
The changes contained in this diff:
- allow subclass Minimizer implementations to override the default shape propagation logic with custom logic
- copy over the meta attribute on get_attr graph nodes during the graph splitting step
- for both changes, the behavior of existing classes does not change

Test Plan: CI

Differential Revision: D70799942

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148784
Approved by: https://github.com/blaine-rister
2025-03-10 22:22:16 +00:00
eellison
4c13a859e5 Workaround no triton float8_e8m0fnu support in inductor (#148722)
Triton doesn't support actual float8_e8m0fnu yet, so we can't currently codegen any arithmetic on them. But we can support bitcasting and view/memory operators and treat them as uint8 for now. Fix for https://github.com/pytorch/pytorch/issues/147873.

The one question I'm not sure of is whether or not we need to explicitly disable Triton template fusion, since it would fuse these dtypes in as uint8.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148722
Approved by: https://github.com/vkuzo
ghstack dependencies: #148450
2025-03-10 17:37:39 +00:00
Jason Ansel
a60b4ed623 [fx] Optimize TracerBase.create_arg and Graph._gen_python_code (#148292)
Before: 19502951 function calls (18702776 primitive calls) in 8.533 seconds
After: 16402551 function calls (15602452 primitive calls) in 7.701 seconds

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148292
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260, #148261, #148288
2025-03-10 16:06:19 +00:00
Jason Ansel
8f858e226b [fx] Optimizations for node name generation (#148288)
Before:
![image](https://github.com/user-attachments/assets/3a9ed22b-ae33-41ec-a0db-01f4f3ca2ffe)

After:
![image](https://github.com/user-attachments/assets/44c6e578-c63e-4a43-b3e0-d11d4bdbb6db)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148288
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260, #148261
2025-03-10 16:06:19 +00:00
Jason Ansel
5d4e7d58b4 [fx] Move Node._prepend/Node._remove_from_list to C++ (#148261)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
24303536 function calls (23503339 primitive calls) in 10.726 seconds
```
after:
```
20003454 function calls (19203257 primitive calls) in 8.936 seconds
```

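A sketch of how the quoted microbenchmark could be reproduced; cProfile prints the "N function calls ... in S seconds" summary shown above.

```python
import cProfile
import functools
import operator
import pstats

import torch.fx as fx

with cProfile.Profile() as prof:
    fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))

pstats.Stats(prof).sort_stats("cumulative").print_stats(10)
```
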
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148261
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260
2025-03-10 16:06:11 +00:00
Jason Ansel
bf752c36da [fx] Move Node._update_args_kwargs to C++ (#148260)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
25203549 function calls (24403352 primitive calls) in 12.090 seconds
```
after:
```
24303536 function calls (23503339 primitive calls) in 10.726 seconds
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148260
Approved by: https://github.com/oulgen
ghstack dependencies: #148243
2025-03-10 16:06:02 +00:00
Jason Ansel
bec7bdad47 [fx] Move map_aggregate to C++ (#148243)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
30603618 function calls (29403419 primitive calls) in 13.744 seconds
```
after:
```
25203549 function calls (24403352 primitive calls) in 12.090 seconds
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148243
Approved by: https://github.com/oulgen
2025-03-10 16:05:53 +00:00
Aaron Gokaslan
edd640a95a [BE][Ez]: Use itertools.chain.from_iterable when possible (#148190)
Often makes the code more readable, more efficient, and adds support for infinite iterables.

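A tiny illustration of the pattern being standardized on:

```python
import itertools

groups = [[1, 2], [3], [4, 5, 6]]
flat = list(itertools.chain.from_iterable(groups))  # [1, 2, 3, 4, 5, 6]
# Lazy, flattens exactly one level, and also works when `groups` is an
# infinite iterable (unlike building an intermediate list first).
```
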
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148190
Approved by: https://github.com/jansel, https://github.com/malfet
2025-03-06 20:37:06 +00:00
angelayi
ed9624ee60 [export] Fix AttrProxy slicing (#148507)
Fixes https://fb.workplace.com/groups/1028545332188949/permalink/1159599265750221/

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148507
Approved by: https://github.com/zhxchen17
2025-03-05 21:03:15 +00:00
Laith Sakka
913356fb41 Fix recent regression in evaluate_expr that effect cache lookups (#147836)
PR https://github.com/pytorch/pytorch/pull/146939/ added an argument to evaluate_expr for the purpose of logging.
This caused a regression that we thought was due to calling id() on the symnode.

I dug deeper and found that although adding that argument does not affect the results of evaluate_expr, it messes up the cache lookups.
I refactored the code to avoid using expr_sym_node_id in the cache lookup; I also introduced evaluate_sym_node and simplified the calls to evaluate_expr.

#suppress-bc-linter

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147836
Approved by: https://github.com/oulgen
2025-03-05 04:11:41 +00:00
Pian Pawakapan
c677f3251f [export] don't use unbacked_renamings in export (#147574)
Plan: avoid the use of unbacked renamings, and introduce a pass run in `_produce_aten_artifact` that recomputes unbacked bindings. Decided to do this because we don't serialize unbacked renamings (or any ShapeEnv state), so this used to compose poorly with de/serialization. This hopefully establishes the invariant that the unbacked binding keys are always in sync with the example values (i.e. same indices, and removed if the symbol is replaced / specialized).

For de/serialization, we don't store unbacked bindings and just rerun the pass.

Involved a refactor of compute_unbacked_bindings.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147574
Approved by: https://github.com/avikchaudhuri
2025-03-04 21:43:49 +00:00
Angela Yi
60205b0eb2 [export] Fix logging so that it doesn't result in max recursion error (#148231)
Test Plan:
buck2 run mode/dev-nosan sigmoid/inference/ts_migration:pt2i_readiness_main -- --model_id=487493491 --test_suite ads_all --mode test_full_model

Produces https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmp2wsjQH/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=100

Differential Revision: D70416613

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148231
Approved by: https://github.com/yiming0416
2025-03-04 20:47:25 +00:00
PyTorch MergeBot
92beda54c8 Revert "[fx] Move map_aggregate to C++ (#148243)"
This reverts commit edaff88f69.

Reverted https://github.com/pytorch/pytorch/pull/148243 on behalf of https://github.com/jovianjaison due to breaking internal builds [T216910920] ([comment](https://github.com/pytorch/pytorch/pull/148243#issuecomment-2698724058))
2025-03-04 19:40:21 +00:00
PyTorch MergeBot
17d003fe75 Revert "[fx] Move Node._update_args_kwargs to C++ (#148260)"
This reverts commit 0135f57f4a.

Reverted https://github.com/pytorch/pytorch/pull/148260 on behalf of https://github.com/jovianjaison due to breaking internal builds [T216910920] ([comment](https://github.com/pytorch/pytorch/pull/148243#issuecomment-2698724058))
2025-03-04 19:40:21 +00:00
PyTorch MergeBot
97b9e68bc6 Revert "[fx] Move Node._prepend/Node._remove_from_list to C++ (#148261)"
This reverts commit 29c2de9ae1.

Reverted https://github.com/pytorch/pytorch/pull/148261 on behalf of https://github.com/jovianjaison due to breaking internal builds [T216910920] ([comment](https://github.com/pytorch/pytorch/pull/148243#issuecomment-2698724058))
2025-03-04 19:40:21 +00:00
PyTorch MergeBot
611b0e9bc4 Revert "[fx] Optimizations for node name generation (#148288)"
This reverts commit 5eb0337cfd.

Reverted https://github.com/pytorch/pytorch/pull/148288 on behalf of https://github.com/clee2000 due to something in this stack broke some dynamo and higher order ops tests like higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dedupe [GH job link](https://github.com/pytorch/pytorch/actions/runs/13645082540/job/38149882002) [HUD commit link](8531d247ba).   dynamo/test_graph_deduplication did run on the PR but the higher_order_ops one didn't, probably combo of landrace and bad TD ([comment](https://github.com/pytorch/pytorch/pull/148288#issuecomment-2698365172))
2025-03-04 17:10:12 +00:00
PyTorch MergeBot
ed9055c303 Revert "[fx] Optimize TracerBase.create_arg and Graph._gen_python_code (#148292)"
This reverts commit 8531d247ba.

Reverted https://github.com/pytorch/pytorch/pull/148292 on behalf of https://github.com/clee2000 due to something in this stack broke some dynamo and higher order ops tests like higher_order_ops/test_invoke_subgraph.py::TestInvokeSubgraphCompile::test_dedupe [GH job link](https://github.com/pytorch/pytorch/actions/runs/13645082540/job/38149882002) [HUD commit link](8531d247ba).   dynamo/test_graph_deduplication did run on the PR but the higher_order_ops one didn't, probably combo of landrace and bad TD ([comment](https://github.com/pytorch/pytorch/pull/148288#issuecomment-2698365172))
2025-03-04 17:10:12 +00:00
Jason Ansel
8531d247ba [fx] Optimize TracerBase.create_arg and Graph._gen_python_code (#148292)
Before: 19502951 function calls (18702776 primitive calls) in 8.533 seconds
After: 16402551 function calls (15602452 primitive calls) in 7.701 seconds

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148292
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260, #148261, #148303, #148288
2025-03-04 02:42:23 +00:00
Jason Ansel
5eb0337cfd [fx] Optimizations for node name generation (#148288)
Before:
![image](https://github.com/user-attachments/assets/3a9ed22b-ae33-41ec-a0db-01f4f3ca2ffe)

After:
![image](https://github.com/user-attachments/assets/44c6e578-c63e-4a43-b3e0-d11d4bdbb6db)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148288
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260, #148261, #148303
2025-03-04 02:42:23 +00:00
Jason Ansel
29c2de9ae1 [fx] Move Node._prepend/Node._remove_from_list to C++ (#148261)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
24303536 function calls (23503339 primitive calls) in 10.726 seconds
```
after:
```
20003454 function calls (19203257 primitive calls) in 8.936 seconds
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148261
Approved by: https://github.com/oulgen
ghstack dependencies: #148243, #148260
2025-03-02 22:42:31 +00:00
Jason Ansel
0135f57f4a [fx] Move Node._update_args_kwargs to C++ (#148260)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
25203549 function calls (24403352 primitive calls) in 12.090 seconds
```
after:
```
24303536 function calls (23503339 primitive calls) in 10.726 seconds
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148260
Approved by: https://github.com/oulgen
ghstack dependencies: #148243
2025-03-02 22:42:31 +00:00
Jason Ansel
edaff88f69 [fx] Move map_aggregate to C++ (#148243)
Microbenchmarking `fx.symbolic_trace(lambda x: functools.reduce(operator.add, [x, *range(100000)]))`, before:
```
30603618 function calls (29403419 primitive calls) in 13.744 seconds
```
after:
```
25203549 function calls (24403352 primitive calls) in 12.090 seconds
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148243
Approved by: https://github.com/oulgen
2025-03-02 22:42:31 +00:00
bobrenjc93
82603fd7d2 introduce dynamism library (#147981)
This is the first step in supporting delayed compile. This library takes in example inputs and outputs a dict of dynamism across the inputs. We will use this to detect dynamism across multiple inputs in delayed compile. We will also use this to make shape collections more ergonomic by providing an affordance to generate a shape collection using example inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147981
Approved by: https://github.com/pianpwk, https://github.com/wdvr
2025-03-01 19:57:54 +00:00
PyTorch MergeBot
8e004865dd Revert "introduce dynamism library (#147981)"
This reverts commit 1c1bf410ec.

Reverted https://github.com/pytorch/pytorch/pull/147981 on behalf of https://github.com/malfet due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/147981#issuecomment-2692351017))
2025-03-01 18:16:52 +00:00
bobrenjc93
1c1bf410ec introduce dynamism library (#147981)
This is the first step in supporting delayed compile. This library takes in example inputs and outputs a dict of dynamism across the inputs. We will use this to detect dynamism across multiple inputs in delayed compile. We will also use this to make shape collections more ergonomic by providing an affordance to generate a shape collection using example inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147981
Approved by: https://github.com/pianpwk
2025-03-01 14:57:06 +00:00
PyTorch MergeBot
baf1c8fcdc Revert "introduce dynamism library (#147981)"
This reverts commit 6eff6b28e4.

Reverted https://github.com/pytorch/pytorch/pull/147981 on behalf of https://github.com/wdvr due to lint failure ([comment](https://github.com/pytorch/pytorch/pull/147981#issuecomment-2691906065))
2025-03-01 03:43:01 +00:00
bobrenjc93
6eff6b28e4 introduce dynamism library (#147981)
This is the first step in supporting delayed compile. This library takes in example inputs and outputs a dict of dynamism across the inputs. We will use this to detect dynamism across multiple inputs in delayed compile. We will also use this to make shape collections more ergonomic by providing an affordance to generate a shape collection using example inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147981
Approved by: https://github.com/pianpwk
2025-03-01 02:49:16 +00:00