Previously, we would stash a single stream value constructed at trace time in a global and return that same value from repeated calls to the graph.
With this PR, we construct the stream value in advance, reference the constructed value in the graph via the lookup table, and, if that value is returned as an output, read it from the lookup table and return it (in bytecode, not as a graph output, since we don't support arbitrary stream outputs).
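As a hedged usage sketch (not taken verbatim from the PR; the exact set of supported patterns may vary), the shape of program this enables is a compiled function that constructs a stream and returns it:
```python
import torch

@torch.compile(fullgraph=True)
def f(x):
    s = torch.cuda.Stream()  # stream value constructed ahead of time at trace time
    return x + 1, s          # returning s reads it back from the lookup table in bytecode

if torch.cuda.is_available():
    y, s = f(torch.ones(4, device="cuda"))
    assert isinstance(s, torch.cuda.Stream)
```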
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164819
Approved by: https://github.com/anijain2305
ghstack dependencies: #164304, #164522
At a high level, after this fix we get the following nice tlparse: https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/bobren/54a57665-7dcc-41e0-8ca7-df01393cd4aa/custom/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000
As seen in this doc, previously we were simply dropping asserts post-dynamo: https://docs.google.com/document/d/1nRQwvw_gWL0_9T3VKb5Ly3_tNI1fgqG9WtryeD6qaZI/edit?tab=t.0
The fix consists of a couple of things:
1) Actually run the runtime assertion fx graph pass on subgraphs
2) Reset fake mode unbacked memos across speculate_subgraph invocations,
since the memos break the runtime assertion insertion: calls
like nonzero end up not allocating new unbacked symints and
hence not populating pending_unbacked, which then results in incorrect
unbacked_bindings on fx nodes in subgraphs (a sketch of the kind of assert involved follows below).
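For reference, here is a hedged standalone sketch (not the test case added in this PR) of the kind of data-dependent runtime assert these passes have to materialize; the failing case in the error below wraps a similar pattern in torch.cond. The config flag and backend choice are assumptions for a minimal repro.
```python
import torch

torch._dynamo.config.capture_dynamic_output_shape_ops = True

@torch.compile(backend="aot_eager", fullgraph=True)
def f(x):
    nz = x.nonzero()                    # nz's size is a fresh unbacked symint
    torch._check(nz.shape[0] % 2 == 0)  # data-dependent check, recorded as a deferred runtime assert
    return nz.sum()

print(f(torch.tensor([1.0, 2.0, 0.0, 3.0, 0.0, 4.0])))  # 4 nonzeros, assert holds
```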
This is a first step in hardening runtime asserts across all phases of
the compiler (eager, aot_eager, inductor, etc.). I will continue kicking
the tires and fixing bugs until we get runtime assert generation in a good
place. One obvious next step: the added test case in this PR fails
when compiled with inductor with the following error (NB: it fails before this PR as well):
```
File "/data/users/bobren/a/pytorch/torch/_inductor/ir.py", line 659, in get_dtype
return self.dtype
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
LoweringException: AttributeError: 'ShapeAsConstantBuffer' object has no attribute 'dtype'
target: cond
args[0]: Eq(Mod(s77, 4), 0)
args[1]: Subgraph(name='true_graph_0', graph_module=<lambda>(), graph=<torch._inductor.graph.SubgraphLowering object at 0x7fbcbb11e110>)
args[2]: Subgraph(name='false_graph_0', graph_module=<lambda>(), graph=<torch._inductor.graph.SubgraphLowering object at 0x7fbcbb21cf70>)
args[3]: (s77, TensorBox(StorageBox(
ComputedBuffer(name='buf0', layout=FlexibleLayout('cuda:0', torch.float32, size=[s77, s77], stride=[s77, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.float32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x7fbcbb2f37f0>, ranges=[s77, s77]))
)))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165893
Approved by: https://github.com/zou3519
Summary:
_dynamo_graph_capture_for_export in its current form has compatibility issues
with the main torch.compile() path, even though we reuse fullgraph_capture as the
bytecode tracer. The reason is that we flip on many export-specific flags
and even trace a wrapped function, which causes divergence from
torch.compile() again.
This PR instead creates a new implementation of dynamo_graph_capture_for_export
which relies 100% on fullgraph capture plus post-processing of CaptureOutput, so
that we can avoid the inversion of phases in the PT2 compiler stack.
This also benefits the precompile workflow, since we want a feature that
only accepts pytree inputs and ships portable Python wrappers in the package. In
other words, I think the code here is sharable between export and precompile
for exporting portable graphs.
Test Plan:
===================================================================== test session starts =====================================================================
platform linux -- Python 3.12.11, pytest-7.3.2, pluggy-1.6.0
rootdir: /data/users/zhxchen17/pytorch
configfile: pytest.ini
plugins: xdoctest-1.1.0, hypothesis-5.35.1, xdist-3.3.1, subtests-0.13.1, rerunfailures-14.0, flakefinder-1.1.0, cpp-2.3.0, anyio-4.10.0
collected 9 items
Running 9 items in this shard
test/distributed/tensor/test_dtensor_export.py ........x [100%]
================================================================ 8 passed, 1 xfailed in 11.42s ================================================================
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165562
Approved by: https://github.com/tugsbayasgalan
For a custom op
```
@torch.library.custom_op("my_lib::foo", mutates_args={})
def foo(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
return x + y
```
People could call `torch.ops.my_lib.foo()` or directly call `foo()` in the `forward` of an `nn.Module`.
These two calling conventions lead to the same node in the output graph, but different stack traces.
When directly calling `foo()`, the displayed stack_trace in the graph will be
```
# File: .../pytorch/torch/_library/custom_ops.py:687 in __call__, code: return self._opoverload(*args, **kwargs)
```
This is not useful, so we filter it out.
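As a minimal sketch, the two call sites look like this, reusing the `foo` custom op defined above (the module and inputs are illustrative); both produce the same node in the captured graph but different Python stack traces:
```python
import torch

class M(torch.nn.Module):
    def forward(self, x, y):
        a = foo(x, y)                   # direct call through the custom-op wrapper's __call__
        b = torch.ops.my_lib.foo(x, y)  # call through the registered OpOverload
        return a + b

m = M()
out = m(torch.randn(3), torch.randn(3))
```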
```
python test/functorch/test_aot_joint_with_descriptors.py -k test_custom_op_stack_trace
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165693
Approved by: https://github.com/SherlockNoMad, https://github.com/williamwen42
Stores streams in a global object lookup table that maps a Dynamo-selected index to objects. This index is generated during tracing, and at runtime, a helper function is called from the bytecode to populate this map.
This differs from the previous implementation, which simply mapped IDs to the associated objects. That required specializing on the IDs of the specific objects, while the new approach does not.
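A minimal sketch of the idea, with hypothetical helper names (the real Dynamo helpers, their module, and how the generated bytecode calls them differ):
```python
# Hypothetical names throughout; only the index-based table idea is from the PR text.
import torch

_dynamo_object_table: dict[int, object] = {}

def _populate_object_index(index: int, obj: object) -> None:
    # Called from the generated bytecode at runtime: store the runtime object
    # (e.g. a torch.Stream) under the index Dynamo picked while tracing.
    _dynamo_object_table[index] = obj

def _lookup_object(index: int) -> object:
    # The graph/bytecode refers to the index rather than the object's id(),
    # so no specialization on object identity is needed.
    return _dynamo_object_table[index]

# Usage sketch: suppose index 0 was allocated for this stream at trace time.
s = torch.cuda.Stream() if torch.cuda.is_available() else object()
_populate_object_index(0, s)
assert _lookup_object(0) is s
```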
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162899
Approved by: https://github.com/anijain2305
ghstack dependencies: #163027
This is needed because if we codegen cells for nested frames AFTER side effects, then reconstruction could get messed up. From below:
>The added test case demonstrates the reconstruction failure if we kept cell codegen at the original place (only happens with nested graph breaks since we reconstruct nested frame cells from VariableTracker rather than directly using LOAD_CLOSURE).
>At a high level, what happened before this change was that side_effects was pruning the cells (I don't recall exactly why this happens), and because cells were codegen'd after the side effects were applied, we were unable to properly reconstruct the cell. The error I was seeing was a list/tuple IndexError.
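A hedged sketch of the shape of program this affects (not the added test case): a nested frame whose closure cell must be reconstructed around a graph break:
```python
import torch

def inner(x):
    c = x + 1                    # `c` lives in a cell captured by the closure
    def use_cell():
        return c.sin()
    torch._dynamo.graph_break()  # nested graph break: resume code must rebuild the cell
    return use_cell()

@torch.compile
def f(x):
    return inner(x)

print(f(torch.randn(3)))
```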
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160601
Approved by: https://github.com/mlazos
Currently OutputGraphGuardsState is separated out as a serializable interface for OutputGraph, but some of the typing around it is incorrect in dynamo's guards.py and output_graph.py: more fields are used by the code than are declared on OutputGraphGuardsState, and it only works because either the full OutputGraph is passed in, or the parts that use those fields are dead when an OutputGraphGuardsState is passed in.
In this PR we try to further separate out the fields of OutputGraph that should be retained by a full graph capture mechanism, not limited to dynamo (as it is currently) but potentially also something like make_fx (in the future). Since these fields do not need to be serialized, the result is an intermediate "common" data structure that sits between OutputGraphGuardsState and OutputGraph in the inheritance hierarchy.
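A hedged sketch of the layering: OutputGraphGuardsState and OutputGraph are the real class names from the PR text, while the intermediate class name and all fields below are purely illustrative:
```python
from dataclasses import dataclass, field

@dataclass
class OutputGraphGuardsState:
    # Serializable subset consumed when rebuilding guards.
    example_guard_field: dict = field(default_factory=dict)

@dataclass
class OutputGraphCommon(OutputGraphGuardsState):  # illustrative name for the new layer
    # Capture-time state any full graph capture mechanism needs (dynamo today,
    # possibly make_fx-style capture later); not serialized.
    example_capture_field: list = field(default_factory=list)

@dataclass
class OutputGraph(OutputGraphCommon):
    # Dynamo-specific tracing state lives here.
    example_dynamo_field: object = None
```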
Differential Revision: D81718791
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162211
Approved by: https://github.com/zhxchen17
Initial prototype for dynamic int inputs; allows users to run `torch.compile(f)(DynamicInt(4))`, compiling dynamically and using the underlying hint at runtime.
Current behavior:
- Also works in eager (mostly by subclassing int), as scalar input to torch functions, or numpy/math/etc. For example, `x = DynamicInt(3); torch.randn(x); torch.add(y, z, alpha=x); np.arange(x)` all act as if x = 3.
- Behavior for arithmetic ops is to return new DynamicInts rather than static ints; `DynamicInt(3) * 2 = DynamicInt(6)`. This is via SymNode magic methods, but coverage might not be 100% - for example, I had to explicitly override floordiv to avoid int casting. This is not necessarily the case for non-magic method ops (e.g. `math.cos(x)`). The alternative here is to int cast on all operations, but I opted for this for dynamism propagation in non-compiled regions.
- Doesn't ban fullgraph=False; DynamicInt objects might be leaked back to the user, but I guess this is fine, because they can be cast to ints when needed?
- Dynamo only allocates one symbol per DynamicInt; specifying the same DynamicInt for multiple inputs leads to input deduplication, and a guard installed.
- We don't raise on int specialization (in allowlist/maybe_mark_dynamic style) - but an easy change if needed.
- DynamicInts as nn.Module attributes are handled.
- We don't guard on the DynamicInt id, e.g. users can do the following without recompiling (maybe we should guard?)
```python
x = DynamicInt(4)
f(x)
f(1)
f(DynamicInt(3)) # same as f(3)
```
Follow-up work:
- Specifying shape constraints, either at the int-level, e.g.
```python
DynamicInt(64, name="s0", constraints=["s0 % 32 == 0", "s0 <= 1024"])
```
or at the compilation level, e.g. something like
```python
s0 = DynamicInt(64, name="s0")
s1 = DynamicInt(128, name="s1")
with some_compiler_config.dynamic_int_constraints(["s1 == 2*s0", "s0 % 32 == 0"]):
f(s0, s1)
```
This should subsume the need for specifying derived SymInts?
- SymFloat support - currently it seems backed floats are specialized by the tensorify float pass, and there's no handling in inductor.
- Propagating dynamism in tensor constructors, e.g. `x = DynamicInt(4); torch.randn(x)` could annotate `_dynamo_dynamic_indices`.
Differential Revision: D81698719
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162194
Approved by: https://github.com/bobrenjc93
Summary:
Add some metadata to CompileArtifacts so that it contains source code information about the original code being traced.
For now, we will not provide a verification method to the end user; instead we just record which files are inlined. It's up to the user to verify that the contents of these files have not changed (validating source code changes is optional for many users in aot precompile anyway).
Test Plan:
buck run @mode/opt test/dynamo:test_dynamo -- -k test_file_change
buck run @mode/opt test/dynamo:test_dynamo -- -k test_aot_compile_source_info
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162983
Approved by: https://github.com/yushangdi
This PR is quite large in that it covers most of the rough edges in the new strict export flow:
1. Handle nn_module_stack correctly now that we are tracing a wrapper module.
2. module_call_spec needs to be queried from the source directly because we are not running the bytecode anymore.
3. Correct input and output handling.
@diff-train-skip-merge
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
This PR is quite large in that it covers most of the rough edges in the new strict export flow:
1. Handle nn_module_stack correctly now that we are tracing a wrapper module.
2. module_call_spec needs to be queried from the source directly because we are not running the bytecode anymore.
3. Correct input and output handling.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162183
Approved by: https://github.com/zhxchen17
ghstack dependencies: #162167
Summary:
[reland]
Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was an export-only concept).
Test Plan:
updated tests
Differential Revision: D81334984
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161794
Approved by: https://github.com/zhxchen17
Fixes #160437
Summary:
This PR avoids compiling empty FX graphs generated during graph breaks. If there are no calls in the graph, we can just return the empty list of instructions.
More precisely,
In compile_and_call_fx_graph, if the FX graph contains no calls (count_calls(self.graph) == 0) and the return value list is empty, we now return an empty instruction list immediately
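A hedged example of where an empty FX graph used to be produced: a graph break before any tensor operation leaves the first graph fragment with no calls and nothing to return, so it can now be skipped entirely:
```python
import torch

@torch.compile
def f(x):
    torch._dynamo.graph_break()  # the first fragment has no tensor ops and no outputs
    return x + 1

print(f(torch.randn(3)))
```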
Impact:
module: dynamo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160536
Approved by: https://github.com/Lucaskabela
Summary: Since `allow_complex_guards_as_runtime_asserts` is now sync'd with `prefer_deferred_runtime_asserts_over_guards`, we can kill the former (especially since it was an export-only concept).
Test Plan:
updated tests
Differential Revision: D79903317
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160198
Approved by: https://github.com/ezyang
Fix comments to reflect that we no longer codegen cells to be sent to resume functions as inputs; they are instead codegen'd after the unsupported instruction in order to build resume functions that are closures.
Also simplify some codegen.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160138
Approved by: https://github.com/anijain2305
ghstack dependencies: #159329, #159678, #159817
Summary:
convert_frame.compile_frame used to take a callback transform function that captured the frame object it had, so the frame information was not passed directly into the compile_frame function.
This PR changes the signature of compile_frame so that frame information is passed directly into the function without taking a callback. This makes it easier to build a fullgraph capture API on top of compile_frame.
Test Plan:
CI
Differential Revision: D81041296
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161514
Approved by: https://github.com/tugsbayasgalan
```python
import torch
torch._dynamo.config.capture_scalar_outputs = True
class M(torch.nn.Module):
def forward(self, idx, x):
u0 = idx.item()
x0 = x.select(0, u0)
def fn():
return x0.sin()
return torch.cond(x0.sum() > 0, fn, fn)
m = M()
out = torch.compile(m, fullgraph=True)(torch.tensor(0, dtype=torch.int64, device="cuda"), torch.randn(3, 3, device="cuda"))
print(out)
```
Before this PR, we didn't track the storage_offset symbol of a tensor. After https://github.com/pytorch/pytorch/pull/157605, we create an unbacked symint for the storage_offset of the result of select. So when we try to lift the free basic symbols of x0 while speculating fn, we find a free symbol that's not bound to a proxy.
This PR tracks storage_offset symbols and associates them with a proxy via torch.ops.aten.storage_offset.
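For intuition, a small eager-mode illustration of why select() introduces a data-dependent storage offset: the result is a view whose offset is the index times the stride of the selected dimension, so an unbacked index gives an unbacked offset:
```python
import torch

x = torch.randn(3, 3)
x0 = x.select(0, 2)
assert x0.storage_offset() == 2 * x.stride(0)  # 2 * 3 == 6
```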
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161199
Approved by: https://github.com/zou3519
ghstack dependencies: #161198
Before the change in this PR, we hit an error for the following code:
```python
import torch
torch._dynamo.config.capture_scalar_outputs = True
class M(torch.nn.Module):
def forward(self, idx, x):
u0 = idx.item()
x0 = x.select(0, u0)
def fn():
return x0.sin()
return torch.cond(x0.sum() > 0, fn, fn)
m = M()
out = torch.compile(m, fullgraph=True)(torch.tensor(0, dtype=torch.int64), torch.randn(3, 3))
```
The error is caused when speculating fn: we try to lift the symbol of x0.storage_offset() but find that the symbol doesn't have a source associated with it.
What really happens is that when the input tensor is a scalar tensor of int type residing on CPU, we have a shortcut that creates a normal (backed) symint when .item() is called; see https://github.com/pytorch/pytorch/pull/126245.
However, previously we only tracked the unbacked symint output of an operation, because we believed every backed symint must have a source associated with it and must already have been lifted as an input at the top level. Now this invariant no longer holds, so we end up with an error saying the symbol doesn't have a source (because only inputs and symbols derived from inputs have sources, and the result of .item() doesn't have one).
In this PR, we start to also track the normal symint with the proxy that created it (i.e., in this case the .item() proxy).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161198
Approved by: https://github.com/zou3519