Summary:
I am really skeptical about Inductor SizeVars creating an empty ShapeEnv when not provided with one.
I think we should fail there if the graph has dynamic shapes and no ShapeEnv is provided.
However, I wonder whether there are actually use cases that depend on the ShapeEnv not being there?
Reasoning APIs depend on facts in the ShapeEnv and assume that certain entries exist for specific symbols.
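A minimal sketch (not the actual patch) of the kind of check being argued for here; names are illustrative:
```
import sympy
from torch.fx.experimental.symbolic_shapes import ShapeEnv

def resolve_shape_env(shape_env, sizes):
    # Refuse to silently fabricate an empty ShapeEnv when the sizes we are asked
    # to reason about are already symbolic.
    has_dynamic = any(isinstance(s, sympy.Expr) and not s.is_number for s in sizes)
    if shape_env is None and has_dynamic:
        raise RuntimeError("graph has dynamic shapes but no ShapeEnv was provided")
    # Only safe to create a fresh (empty) ShapeEnv when everything is static.
    return shape_env if shape_env is not None else ShapeEnv()
```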
Test Plan:
Fix the bug reported in https://www.internalfb.com/diff/D82337184 (creating a simple e2e unit test is not trivial).
Rollback Plan:
Differential Revision: D82412384
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162927
Approved by: https://github.com/ezyang, https://github.com/eellison, https://github.com/jansel
Summary:
When rewriting sympy expressions in the compiler codebase, we want to generate
FloorDiv(a, b) / CleanDiv(a, b) directly and not a//b, since the latter becomes floor(a*pow(b, -1)).
For SymNodes we handle that conversion automatically in the SymNode op dispatch.
I will follow up with an issue to track all other usages of //.
This was blocking an internal model.
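To illustrate the difference (a small sketch; FloorDiv here is the structured node from `torch.utils._sympy.functions`, assuming that import path):
```
import sympy
from torch.utils._sympy.functions import FloorDiv

a, b = sympy.symbols("a b", integer=True, positive=True)

print(a // b)          # sympy rewrites this to floor(a/b), i.e. floor(a*pow(b, -1))
print(FloorDiv(a, b))  # stays as FloorDiv(a, b), which compiler reasoning can pattern-match
```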
Test Plan:
add test
run existing tests.
dakechen1993 testing on the model.
Rollback Plan:
Differential Revision: D82362241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162869
Approved by: https://github.com/ezyang
Summary:
Sometimes `ShapeEnv.create_symbol` can return a `sympy.Integer`. This messes up our phantom symbol infra for derived dims.
Fixes #161902
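A minimal illustration of the defensive pattern this implies (hypothetical helper, not the actual fix):
```
import sympy

def as_phantom_symbol(expr):
    # create_symbol may hand back a concrete sympy.Integer (e.g. when the dim gets
    # specialized) rather than a sympy.Symbol, so don't assume a Symbol here.
    return expr if isinstance(expr, sympy.Symbol) else None
```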
Test Plan:
added test based on repro
Rollback Plan:
Differential Revision: D81960709
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162416
Approved by: https://github.com/tugsbayasgalan
Summary:
[reland]
Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was an export-only concept).
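For reference, a sketch of the surviving knob (kwarg name as it exists on ShapeEnv; usage here is illustrative):
```
from torch.fx.experimental.symbolic_shapes import ShapeEnv

# Callers that used to toggle allow_complex_guards_as_runtime_asserts now rely
# on the deferred-runtime-assert preference alone.
env = ShapeEnv(prefer_deferred_runtime_asserts_over_guards=True)
```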
Test Plan:
updated tests
Rollback Plan:
Differential Revision: D81334984
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161794
Approved by: https://github.com/zhxchen17
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was an export-only concept).
Test Plan:
updated tests
Rollback Plan:
Differential Revision: D79903317
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
The motivation for this change can be seen through the following example:
```
import torch

GPU_TYPE = "cuda"

@torch.compile
def no_override(x):
    return x.sum(dim=0)

@torch.compile
def override(x):
    return x.sum(dim=0)

x_small = torch.randn(4096, 512, device=GPU_TYPE)
no_override(x_small)
torch._dynamo.decorators.mark_dynamic(x_small, 0, hint_override=4096 * 1000)
override(x_small)
```
Previously, when reductions were split, codegen relied only on the first observed shape. With a small input, this resulted in a small split size:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, ks0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
    xnumel = 16384
    rnumel = r0_numel
```
With the new scheme, inductor honors hint_override during codegen, producing larger and more appropriate split sizes:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, ks0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
    xnumel = 1024000
    rnumel = r0_numel
```
This addresses a broader problem with dynamism: performance and numerics previously depended on whichever shape was seen first. For example:
```
f(s0) -> f(s2)
f(s1) -> f(s2)
```
could generate different kernels. With the new approach, an explicit override pins the chosen configuration:
```
f(s0, hint_override=s0) -> f(s2)
f(s1, hint_override=s0) -> f(s2)
```
ensuring consistent kernel generation regardless of input order.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161007
Approved by: https://github.com/jansel
Summary: The ONNX team and a recent transformers upgrade ran into this error, and we also hit it during our export benchmarking. This diff makes it possible to trace through the vmap implementation in pre-dispatch IR. Note that we don't support serializing functorch ops in pre-dispatch IR; in the future, we should desugar them to post-grad ops.
The implementation strategy is:
1. We add Python wrappers around the vmap APIs so that we can attach a custom torch_function handler that is only active during non-strict export. The reason we don't add this to the default torch_function handler is that it would break BC.
2. Some dynamo changes to make sure it picks up the new Python wrapper APIs. The reason is that when we do strict export, we need to re-materialize these APIs in pre-dispatch IR from torch IR. We could avoid this by special-casing in dynamo for export to proxy different API calls, but I feel that is too much chaos, because you need to be able to proxy two different variants of the same vmap API.
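A small sketch of the kind of program this unlocks (module and shapes made up for illustration):
```
import torch

class VmapModule(torch.nn.Module):
    def forward(self, x):
        # torch.vmap is now traced through in pre-dispatch export instead of erroring.
        return torch.vmap(torch.sin)(x)

ep = torch.export.export(VmapModule(), (torch.randn(4, 3),))
print(ep.graph)
```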
Test Plan: CI
Differential Revision: D75623875
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154650
Approved by: https://github.com/ezyang, https://github.com/zou3519
The motivation for this change can be seen through the following example:
```
import torch

GPU_TYPE = "cuda"

@torch.compile
def no_override(x):
    return x.sum(dim=0)

@torch.compile
def override(x):
    return x.sum(dim=0)

x_small = torch.randn(4096, 512, device=GPU_TYPE)
no_override(x_small)
torch._dynamo.decorators.mark_dynamic(x_small, 0, hint_override=4096 * 1000)
override(x_small)
```
Previously, when reductions were split, codegen relied only on the first observed shape. With a small input, this resulted in a small split size:
```
def triton_per_fused_sum_1(in_ptr0, out_ptr0, xnumel, r0_numel, XBLOCK : tl.constexpr):
    xnumel = 512
    r0_numel = 32
```
With the new scheme, inductor honors hint_override during codegen, producing larger and more appropriate split sizes:
```
def triton_red_fused_sum_0(in_ptr0, out_ptr0, xnumel, r0_numel, XBLOCK : tl.constexpr, R0_BLOCK : tl.constexpr):
    xnumel = 16384
    r0_numel = 128
```
This addresses a broader problem with dynamism: performance and numerics previously depended on whichever shape was seen first. For example:
```
f(s0) -> f(s2)
f(s1) -> f(s2)
```
could generate different kernels. With the new approach, an explicit override pins the chosen configuration:
```
f(s0, hint_override=s0) -> f(s2)
f(s1, hint_override=s0) -> f(s2)
```
ensuring consistent kernel generation regardless of input order.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161007
Approved by: https://github.com/jansel
When select has a data-dependent index, we can't tell whether the actual index should be index + size or index.
To avoid throwing a data-dependent error (DDE), we allocate a new unbacked symbol to represent the storage offset of the
output view, and we compute its value dynamically at runtime when Inductor lowers it.
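A hedged sketch of the scenario (illustrative repro, not taken from the PR):
```
import torch

torch._dynamo.config.capture_scalar_outputs = True

@torch.compile(fullgraph=True, dynamic=True)
def f(x, idx):
    i = idx.item()       # unbacked SymInt: sign unknown at trace time
    # The storage offset of the view is i*stride or (i + size)*stride depending on
    # the sign of i; a fresh unbacked symbol now carries it until runtime.
    return x.select(0, i)

print(f(torch.randn(8, 4), torch.tensor(-2)))
```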
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157605
Approved by: https://github.com/ColinPeppler
As part of Better Engineering week, we would like to improve our type support to improve the dev experience in dynamo.
This PR adds strict typing support to a critical set of files for dynamo: `source.py` and the base `_guards.py`.
Running
```
mypy torch/_dynamo/source.py torch/_guards.py --linecount-report /tmp/coverage_log
```
| | Lines Annotated | Lines Total | % lines covered | Funcs Annotated | Funcs Total | % funcs covered |
| -------- | ------- | -------- | ------- | ------- | ------- | ------- |
| Main | 1227 | 2208 | 55.57% | 207 | 362 | 57.18% |
| This PR | 2217 | 2217 | 100.00% | 362 | 362 | 100.00% |
| Delta | +990 | +9 | +44.43% | +155 | 0 | +42.82% |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/158397
Approved by: https://github.com/anijain2305
Creating contiguous strides creates an expression max(1, x). Often we know that x >= 1, in
which case we should simplify max(1, x) to x.
This appeared in two situations:
1) An internal user complained about statically_known_true(x == max(1, x)) failing (internal link: https://fb.workplace.com/groups/1028545332188949/permalink/1232958568414290).
With this change, https://github.com/pytorch/pytorch/pull/155938 won't be needed.
2) Not simplifying the above could result in wrong ConstraintViolationErrors,
because we assume non-trivial single-arg guards shall evaporate; see the logic in the function
issue_guard in symbolic_shapes.py.
With this change we no longer throw ConstraintViolationErrors for the program below.
This was blocking this [PR](https://github.com/pytorch/pytorch/pull/155590) from landing
internally, due to internal export tests throwing ConstraintViolationErrors
like:
```
Constraints violated (width)!
- Not all values of width = L['x'].size()[3] in the specified range 224 <= width <= 455 satisfy the generated guard max(1, 1 + (((-1) + L['x'].size()[3]) // 2)) == (1 + (((-1) + L['x'].size()[3]) // 2)).
```
```
import torch

x = torch.rand(10)
torch._dynamo.mark_dynamic(x, 0, max=20, min=5)

@torch.compile(fullgraph=True, dynamic=True)
def func(x):
    if max(1, (-1 + x.size()[0] // 2)) == (-1 + x.size()[0] // 2):
        return x * 400
    else:
        return (x * 10) * 100

func(x)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157189
Approved by: https://github.com/pianpwk
Differential Revision: D77249427
Due to memoization and graph order updates, it can happen that a backed symbol is passed into compute_unbacked_bindings, leading to a failure. An example follows:
- There are 2 boolean indexing operators (e.g. op1 and op2) with the same mask.
- An unbacked symint is generated from op1, and then op2 reuses the unbacked symint due to nonzero_memo in nonzero's fake implementation, so no rebinding is needed for op2.
- Since op1 generated the unbacked symint, its meta has the "unbacked_bindings" field filled, while op2's meta doesn't.
- The outputs of op1 and op2 are later concatenated with other tensors that have backed symints, so the unbacked symint can be replaced by a backed symint.
- In Inductor, during fake tensor prop, there is no memoization because a new fake tensor is always generated (for the same node). op1 generates an unbacked symint, which can be rebound successfully to the backed symint. Since there is no memoization, op2 also generates a new unbacked symint, but no rebinding can happen because op2's meta doesn't have "unbacked_bindings", and "compute_unbacked_bindings/_rename_unbacked_to" fails when asserting that op2's old symbol is unbacked.
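A hedged code sketch of the pattern described above (illustrative, not the exact repro):
```
import torch

torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile
def f(x, mask, y):
    a = x[mask]                  # op1: allocates an unbacked symint for the result size
    b = x[mask]                  # op2: reuses op1's unbacked symint via nonzero_memo
    return torch.cat([a, b, y])  # concat with backed sizes can replace the unbacked symint

print(f(torch.randn(8), torch.rand(8) > 0.5, torch.randn(3)))
```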
From discussion with [@ezyang](https://www.internalfb.com/intern/profile/?id=503862770), there is no easy way to fix this issue.
- We could try to enable memoization for fake tensor prop in Inductor; however, we would need to ensure that op1 is visited before op2 during Inductor fake tensor prop for this to work (op2's meta doesn't have "unbacked_bindings", so no rebinding can happen there and rebinding must come from op1). But passes such as reorder_for_locality can change the graph order, so this doesn't work.
- A simple hack is to just replace the unbacked symbol in op2 by the backed symbol.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156911
Approved by: https://github.com/ezyang
Summary:
- Consolidate the stack trace recording code in TracerBase and PythonKeyTracer
- Change `make_fx`'s arg name to be consistent with TracerBase member name `record_stack_traces`
We move the stack trace logic from `create_proxy` to `create_node` so that all classes inheriting from TracerBase re-use the same stack trace logic.
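A small usage sketch, assuming the renamed kwarg on `make_fx`:
```
import torch
from torch.fx.experimental.proxy_tensor import make_fx

gm = make_fx(lambda x: x.sin() + 1, record_stack_traces=True)(torch.randn(3))
for node in gm.graph.nodes:
    # Stack traces are now recorded in create_node for all TracerBase subclasses.
    print(node.op, node.target, node.stack_trace)
```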
Test Plan:
```
buck run caffe2/test:test_export -- -r test_stack_trace
```
Rollback Plan:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/156257
Approved by: https://github.com/angelayi, https://github.com/zou3519
Previously specialization error messages would render sources that were pretty far from source-code names. E.g., given args named `x, y, zs`, the source for `y.size()[0]` would be rendered as `args[0][1].size()[0]`.
This is because we created artificial local names following `(args, kwargs)` structure instead of reusing signatures. This PR fixes that situation.
Basically we map prefixes of key paths that correspond to original arg names to root sources corresponding to those names; the rest of the key paths hang from these root sources.
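A hedged illustration of the rendering change (model, dims, and messages are made up for this example):
```
import torch
from torch.export import Dim, export

class M(torch.nn.Module):
    def forward(self, x, y, zs):
        return x + y.sum() + zs[0].sum()

args = (torch.randn(2), torch.randn(5), [torch.randn(3)])
# If the dynamic dim on y later specializes, the violation message used to cite
# args[0][1].size()[0]; with this change it cites y.size()[0].
ep = export(M(), args, dynamic_shapes={"x": None, "y": {0: Dim("dy")}, "zs": None})
```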
Differential Revision: [D76461391](https://our.internmc.facebook.com/intern/diff/D76461391/)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155738
Approved by: https://github.com/bobrenjc93
Previously, when processing `sym_and(a, b, c)`, symbolic shapes wouldn't individually process a, b, and c and store their implications. This would lead to data-dependent errors on individual checks: e.g., we stored `u0 >= 0 & u0 <= 10`, but then couldn't figure out `u0 <= 10` on its own.
This handles that, and also makes `sym_and`/`sym_or` user-code friendly, for testing.
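A hedged sketch of what now works (using `sym_and` from `torch.fx.experimental.symbolic_shapes`):
```
import torch
from torch.fx.experimental.symbolic_shapes import sym_and

torch._dynamo.config.capture_scalar_outputs = True

@torch.compile(fullgraph=True)
def f(x):
    u0 = x.item()
    torch._check(sym_and(u0 >= 0, u0 <= 10))
    # Each conjunct is stored individually, so this follow-up check is resolved
    # statically instead of raising a data-dependent error.
    if u0 <= 10:
        return torch.zeros(u0)
    return x

print(f(torch.tensor(7)))
```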
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154737
Approved by: https://github.com/laithsakka
Example of the new error message:
```
torch.fx.experimental.symbolic_shapes.ConstraintViolationError: Constraints violated (L['x'].size()[0])! For more information, run with TORCH_LOGS="+dynamic".
- You marked L['x'].size()[0] as dynamic but your code specialized it to be a constant (5). Either remove the mark_dynamic or use a less strict API such as maybe_mark_dynamic or Dim.AUTO.
Framework stack:
File "??", line 0, in _start
File "", line 0, in __libc_start_main_alias_2
File "??", line 0, in __libc_start_call_main
File "/usr/local/src/conda/python-3.10.16/Modules/main.c", line 1094, in Py_BytesMain
File "/usr/local/src/conda/python-3.10.16/Modules/main.c", line 357, in pymain_run_file_obj
File "/usr/local/src/conda/python-3.10.16/Python/pythonrun.c", line 90, in _PyRun_AnyFileObject
File "/usr/local/src/conda/python-3.10.16/Python/pythonrun.c", line 456, in _PyRun_SimpleFileObject
File "/usr/local/src/conda/python-3.10.16/Python/pythonrun.c", line 1208, in pyrun_file
File "/usr/local/src/conda/python-3.10.16/Python/pythonrun.c", line 1312, in run_mod
File "/usr/local/src/conda/python-3.10.16/Python/pythonrun.c", line 1291, in run_eval_code_obj
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 1134, in PyEval_EvalCode
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/scratch/repro.py", line 9, in <module>
foo(x)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/eval_frame.py", line 699, in compile_wrapper
return fn(*args, **kwargs)
File "offloadstuff.c", line 0, in dynamo__custom_eval_frame
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 305, in _PyObject_Call
File "/usr/local/src/conda/python-3.10.16/Objects/typeobject.c", line 7494, in slot_tp_call
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 431, in _PyObject_Call_Prepend
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 1469, in __call__
return self._torchdynamo_orig_callable(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Objects/typeobject.c", line 7494, in slot_tp_call
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 431, in _PyObject_Call_Prepend
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 153, in _PyObject_FastCallDictTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 1248, in __call__
result = self._inner_convert(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Objects/typeobject.c", line 7494, in slot_tp_call
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 431, in _PyObject_Call_Prepend
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 153, in _PyObject_FastCallDictTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 625, in __call__
return _compile(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 1092, in _compile
guarded_code = compile_inner(code, one_graph, hooks, transform)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_utils_internal.py", line 97, in wrapper_function
return function(*args, **kwargs)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 779, in compile_inner
return _compile_inner(code, one_graph, hooks, transform)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 818, in _compile_inner
out_code = transform_code_object(code, transform)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/bytecode_transformation.py", line 1424, in transform_code_object
transformations(instructions, code_options)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 265, in _fn
return fn(*args, **kwargs)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/convert_frame.py", line 743, in transform
tracer.run()
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/symbolic_convert.py", line 3531, in run
super().run()
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1359, in run
while self.step():
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/symbolic_convert.py", line 1263, in step
self.dispatch_table[inst.opcode](self, inst)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/symbolic_convert.py", line 422, in impl
self.push(fn_var.call_function(self, self.popn(nargs), {}))
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builtin.py", line 1160, in call_function
return handler(tx, args, kwargs)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builtin.py", line 792, in <lambda>
return lambda tx, args, kwargs: obj.call_function(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builtin.py", line 1160, in call_function
return handler(tx, args, kwargs)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builtin.py", line 1120, in _handle_insert_op_in_graph
return wrap_fx_proxy(tx, proxy)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builder.py", line 2500, in wrap_fx_proxy
return wrap_fx_proxy_cls(target_cls=TensorVariable, **kwargs)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 267, in PyVectorcall_Call
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builder.py", line 2566, in wrap_fx_proxy_cls
return _wrap_fx_proxy(
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/variables/builder.py", line 2664, in _wrap_fx_proxy
example_value = get_fake_value(proxy.node, tx, allow_non_graph_fake=True)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/utils.py", line 3205, in get_fake_value
ret_val = wrap_fake_exception(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/utils.py", line 2705, in wrap_fake_exception
return fn()
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/utils.py", line 3206, in <lambda>
lambda: run_node(tx.output, node, args, kwargs, nnmodule)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_dynamo/utils.py", line 3373, in run_node
return node.target(*args, **kwargs)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5917, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Objects/methodobject.c", line 430, in cfunction_vectorcall_FASTCALL
File "/usr/local/src/conda/python-3.10.16/Objects/abstract.c", line 891, in binary_op1
File "/usr/local/src/conda/python-3.10.16/Objects/typeobject.c", line 7284, in slot_nb_multiply
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/descrobject.c", line 344, in method_vectorcall_VARARGS_KEYWORDS
File "python_variable_methods.cpp", line 0, in _object* torch::autograd::TypeError_to_NotImplemented_<&torch::autograd::THPVariable_mul>(_object*, _object*, _object*)
File "python_variable_methods.cpp", line 0, in torch::autograd::THPVariable_mul(_object*, _object*, _object*)
File "??", line 0, in at::_ops::mul_Tensor::call(at::Tensor const&, at::Tensor const&)
File "offloadstuff.c", line 0, in c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&, at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
File "PyInterpreter.cpp", line 0, in torch::detail::(anonymous namespace)::ConcretePyInterpreterVTable::python_dispatcher(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
File "offloadstuff.c", line 0, in c10::OperatorHandle::callBoxedForDispatchKey(c10::DispatchKey, std::vector<c10::IValue, std::allocator<c10::IValue> >&) const
File "PythonFallbackKernel.cpp", line 0, in void c10::BoxedKernel::make_boxed_function<&(anonymous namespace)::pythonTLSSnapshotFallback>(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
File "PyInterpreter.cpp", line 0, in torch::detail::(anonymous namespace)::ConcretePyInterpreterVTable::python_dispatcher(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
File "offloadstuff.c", line 0, in c10::OperatorHandle::callBoxedForDispatchKey(c10::DispatchKey, std::vector<c10::IValue, std::allocator<c10::IValue> >&) const
File "VariableType_0.cpp", line 0, in c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, at::Tensor const&), &torch::autograd::VariableType::(anonymous namespace)::mul_Tensor>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, at::Tensor const&> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
File "VariableType_0.cpp", line 0, in torch::autograd::VariableType::(anonymous namespace)::mul_Tensor(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
File "??", line 0, in at::_ops::mul_Tensor::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
File "offloadstuff.c", line 0, in c10::impl::BoxedKernelWrapper<at::Tensor (at::Tensor const&, at::Tensor const&), void>::call(c10::BoxedKernel const&, c10::OperatorHandle const&, c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)
File "PyInterpreter.cpp", line 0, in torch::detail::(anonymous namespace)::ConcretePyInterpreterVTable::python_dispatcher(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
File "offloadstuff.c", line 0, in c10::OperatorHandle::callBoxedForDispatchKey(c10::DispatchKey, std::vector<c10::IValue, std::allocator<c10::IValue> >&) const
File "PythonFallbackKernel.cpp", line 0, in (anonymous namespace)::pythonFallback(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)
File "PyInterpreter.cpp", line 0, in torch::detail::(anonymous namespace)::ConcretePyInterpreterVTable::dispatch(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
File "??", line 0, in torch::handle_torch_function_no_python_arg_parser(c10::ArrayRef<_object*>, _object*, _object*, char const*, _object*, char const*, torch::TorchFunctionName)
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 577, in PyObject_CallMethod
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/utils/_stats.py", line 27, in wrapper
return fn(*args, **kwargs)
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_tensor.py", line 1346, in __torch_dispatch__
return self.dispatch(func, types, args, kwargs)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_tensor.py", line 2029, in dispatch
return self._cached_dispatch_impl(func, types, args, kwargs)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_tensor.py", line 1442, in _cached_dispatch_impl
return self._dispatch_impl(func, types, args, kwargs)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_tensor.py", line 2552, in _dispatch_impl
return maybe_propagate_real_tensors(fast_impl(self, *args, **kwargs))
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_impls.py", line 956, in fast_binary_impl
final_shape = infer_size(final_shape, shape)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/_subclasses/fake_impls.py", line 916, in infer_size
torch._check(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/__init__.py", line 1669, in _check
_check_with(RuntimeError, cond, message)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/__init__.py", line 1632, in _check_with
if expect_true(cond):
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 1686, in expect_true
return a.node.expect_true(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/sym_node.py", line 552, in expect_true
return self.guard_bool(file, line)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/sym_node.py", line 536, in guard_bool
r = self.evaluate()
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/sym_node.py", line 510, in evaluate
return self.shape_env.evaluate_sym_node(self, size_oblivious)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 7113, in evaluate_sym_node
return self.evaluate_expr(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Modules/_functoolsmodule.c", line 1020, in bounded_lru_cache_wrapper
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 267, in PyVectorcall_Call
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/recording.py", line 272, in wrapper
return retlog(fn(*args, **kwargs))
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 267, in PyVectorcall_Call
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 7215, in evaluate_expr
return self._inner_evaluate_expr(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Modules/_functoolsmodule.c", line 1020, in bounded_lru_cache_wrapper
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/recording.py", line 272, in wrapper
return retlog(fn(*args, **kwargs))
File "/usr/local/src/conda/python-3.10.16/Python/ceval.c", line 5945, in do_call_core
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 7238, in _inner_evaluate_expr
return self._evaluate_expr(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 7505, in _evaluate_expr
self._maybe_guard_rel(g)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Modules/_functoolsmodule.c", line 1020, in bounded_lru_cache_wrapper
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 6758, in _maybe_guard_rel
self._refine_ranges(expr)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 7709, in _refine_ranges
self._set_replacement(
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/fx/experimental/symbolic_shapes.py", line 6667, in _set_replacement
self.framework_specialization_stacks[source] = CapturedTraceback.extract(cpp=True)
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 114, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Include/internal/pycore_ceval.h", line 46, in _PyEval_EvalFrame
File "/home/bobren/local/a/pytorch/torch/utils/_traceback.py", line 207, in extract
torch._C._profiler.gather_traceback(python=True, script=script, cpp=cpp),
File "/usr/local/src/conda/python-3.10.16/Include/cpython/abstract.h", line 112, in _PyObject_VectorcallTstate
File "/usr/local/src/conda/python-3.10.16/Objects/call.c", line 215, in _PyObject_MakeTpCall
File "/usr/local/src/conda/python-3.10.16/Objects/methodobject.c", line 543, in cfunction_call
File "offloadstuff.c", line 0, in pybind11::cpp_function::dispatcher(_object*, _object*, _object*)
File "offloadstuff.c", line 0, in pybind11::cpp_function::initialize<std::shared_ptr<torch::CapturedTraceback> (*&)(bool, bool, bool), std::shared_ptr<torch::CapturedTraceback>, bool, bool, bool, pybind11::name, pybind11::scope, pybind11::sibling, pybind11::arg_v, pybind11::arg_v, pybind11::arg_v>(std::shared_ptr<torch::CapturedTraceback> (*&)(bool, bool, bool), std::shared_ptr<torch::CapturedTraceback> (*)(bool, bool, bool), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, pybind11::arg_v const&, pybind11::arg_v const&, pybind11::arg_v const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call&)
File "??", line 0, in torch::CapturedTraceback::gather(bool, bool, bool)
File "??", line 0, in torch::unwind::unwind()
User stack:
File "/home/bobren/local/a/pytorch/scratch/repro.py", line 5, in foo
return torch.randn(5) * x
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/155603
Approved by: https://github.com/zou3519, https://github.com/cyyever
ghstack dependencies: #155133
Summary:
This backs out D60320595 which itself turned off FakeTensor caching when a SymInt was present.
There have been a lot of dynamic shape fixes done this year, and tests pass, so I'm assuming some of that work fixed what was breaking previously.
Test Plan: Reran the tests listed in T196779132 and they pass.
## Perf
### Instruction Counter Benchmark:
- 26% win on add_loop_eager_dynamic
- 13% win on add_loop_inductor_dynamic_gpu
### Perf Dashboard
Compilation Latency wins across the board but especially strong on the dynamic tests (like cudagraphs_dynamic) - for example MobileBertForMaskedLM went from 66s -> 50s.
Differential Revision: [D75467694](https://our.internmc.facebook.com/intern/diff/D75467694)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152662
Approved by: https://github.com/anijain2305
The goal of this multigraph work is to enable a compiled region that has a single dynamo trace but multiple backend specializations. This work was inspired by vLLM which does this in a somewhat hacky way where they use a custom backend to capture a dynamo graph and then manually invoke compile_fx multiple times to get specialized graphs.
There's really two parts of this work:
**The frontend changes:**
1) we introduce an optional kwarg `specialize_on` to mark_{dynamic,unbacked} that takes in a list of specializations. I debated other methods including specifying specializations via decorators, but ultimately decided this approach was more harmonious. The big issue with decorators is the difficulty of composing well with the rest of the torch.compile ecosystem including graph breaks, lazy initialization of variable trackers and symbolic variables, etc.
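A hedged sketch of that frontend (kwarg and callable form as described here; details may differ):
```
import torch

@torch.compile
def f(x):
    return x * 2

x = torch.randn(16)
# One dynamo trace; the backend may additionally compile graphs specialized to each
# predicate, dispatching lazily to whichever matches at runtime.
torch._dynamo.mark_dynamic(
    x,
    0,
    specialize_on=[lambda s: s == 16, lambda s: s % 16 == 0],
)
f(x)
```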
**The backend changes (this PR):**
1) We capture the backend_specialization specified in the mark_{dynamic,unbacked} API into a SymbolicContext. See changes in `/_dynamo/variables/builder.py`
2) After we are done dynamo tracing, we will lazily (more on this later) invoke `call_user_compiler` up to N + 1 times for N specializations and 1 generic graph. Under the hood this will call compile_fx, which composes nicely with both Async Compile and AOTAutogradCache. We do this by using a context manager to patch in specialization specific axioms into the ShapeEnv before invoking the user compiler.
3) When we have specializations, we install a lazy specialized dispatch function that checks each specialization and dispatches to the first one that matches. Instead of doing all of the specialization compiles up front, we do the compiles lazily. The first time a specialization is invoked, we will do the compilation and save it in a cache so subsequent invocations are fast. If none of the specializations match, we dispatch to the generic graph. I decided to do this over returning N different GuardedCodes since 1) it doesn't pollute the dynamo cache (eg. if you have 8 specializations, you would hit the cache limit) 2) it naturally incorporates the hierarchical lattice structure of the guards since the specializations are always necessarily stricter than the generic region's guards.
I benchmarked this PR stack with #152596 and found around a 50% reduction when dispatching to the specialized regions:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153449
Approved by: https://github.com/zou3519
ghstack dependencies: #153433