pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-06 12:20:52 +01:00

Author	SHA1	Message	Date
Lucas Kabela	979e10f7d6	[Bugfix] Match eager stride semantics for cloned tensors with preserve_format in compile (#163017 ) Fixes #161010 by making `clone_meta` match the semantics of strides for eager mode. This is: * Case 1: Tensor is_non_overlapping_and_dense; in this case, stride should match input tensor stride * Case 2: Otherwise, stride should be contiguous computed from input tensor using `compute_elementwise_output_strides` Pull Request resolved: https://github.com/pytorch/pytorch/pull/163017 Approved by: https://github.com/williamwen42, https://github.com/xmfan Co-authored-by: morrison-turnansky <mturnans@redhat.com>	2025-09-19 19:41:33 +00:00
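The two stride cases described for `clone_meta` can be modeled in plain Python. This is a simplified sketch, not PyTorch's implementation. The helper names are hypothetical and all sizes are assumed to be non-zero. Case 2 here falls back to row-major contiguous strides, whereas the real `compute_elementwise_output_strides` may also account for the input's dimension ordering.

```python
def contiguous_strides(shape):
    """Row-major (C-contiguous) strides for a shape, in elements."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= size
    return tuple(reversed(strides))


def is_non_overlapping_and_dense(shape, strides):
    """True if the layout covers a gap-free block with no aliasing.

    Simplified model: size-1 dimensions are ignored because their
    stride never affects addressing.
    """
    dims = sorted((st, sz) for sz, st in zip(shape, strides) if sz != 1)
    expected = 1
    for stride, size in dims:
        if stride != expected:
            return False
        expected *= size
    return True


def clone_strides(shape, strides):
    """Strides of a preserve_format clone under this simplified model."""
    if is_non_overlapping_and_dense(shape, strides):
        # Case 1: dense input, so the clone keeps the input strides.
        return tuple(strides)
    # Case 2 (simplified assumption): fall back to contiguous strides.
    return contiguous_strides(shape)


# A transposed 2x3 tensor is dense, so its strides are preserved.
transposed = clone_strides((3, 2), (1, 3))
# A column-sliced view (every other column of a 4x4) has gaps.
sliced = clone_strides((4, 2), (4, 2))
```

In this model the transposed layout keeps `(1, 3)` while the sliced view is compacted to `(2, 1)`.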
Prachi Gupta	c0142f5c06	[ROCm] Enabling several UTs (#161715 ) All these UTs are working as is, just removing the skip - test_p2p_ipc - test_repros.py: working, added fp8 support - test_activation_checkpointing.py - test_content_store.py - test_cuda_multigpu.py - test_compute_comm_reordering.py - test_segment_reductions.py - test_dataloader.py - test_math_ops.py - test_loop_ordering.py - test_control_flow.py - distributed_test.py - test_mem_tracker.py - test_fsdp_optim_state.py - test_fully_shard_mixed_precision.py: skipped for < ROCm7.0 - test_aot_inductor_custom_ops.py - test_c10d_ops_nccl.py - test_eager_transforms.py - test_sparse_csr.py - test_inductor_collectives.py - test_fake_tensor.py - test_cupy_as_tensor.py - test_cuda.py: enable UTs that are working - test_matmul_cuda.py: enable UTs that are working Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715 Approved by: https://github.com/msaroufim Co-authored-by: Mark Saroufim <marksaroufim@fb.com>	2025-09-09 15:49:21 +00:00
William Wen	26a1b9cce2	[dynamo] fix resume_execution.py KeyError in Python 3.11+ (#162318 ) Fixes https://github.com/pytorch/pytorch/issues/162313 Differential Revision: [D81938289](https://our.internmc.facebook.com/intern/diff/D81938289) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162318 Approved by: https://github.com/Lucaskabela, https://github.com/mlazos, https://github.com/anijain2305	2025-09-08 20:26:24 +00:00
PyTorch MergeBot	8235c4f65d	Revert "[ROCm] Enabling several UTs (#161715 )" This reverts commit `b9ba612f7a`. Reverted https://github.com/pytorch/pytorch/pull/161715 on behalf of https://github.com/jeanschmidt due to Need to revert in order to revert https://github.com/pytorch/pytorch/pull/159473, feel free to merge it back once conflicts are cleared ([comment](https://github.com/pytorch/pytorch/pull/161715#issuecomment-3264040604))	2025-09-07 21:03:17 +00:00
Prachi Gupta	b9ba612f7a	[ROCm] Enabling several UTs (#161715 ) All these UTs are working as is, just removing the skip - test_p2p_ipc - test_repros.py: working, added fp8 support - test_activation_checkpointing.py - test_content_store.py - test_cuda_multigpu.py - test_compute_comm_reordering.py - test_segment_reductions.py - test_dataloader.py - test_math_ops.py - test_loop_ordering.py - test_control_flow.py - distributed_test.py - test_mem_tracker.py - test_fsdp_optim_state.py - test_fully_shard_mixed_precision.py: skipped for < ROCm7.0 - test_aot_inductor_custom_ops.py - test_c10d_ops_nccl.py - test_eager_transforms.py - test_sparse_csr.py - test_inductor_collectives.py - test_fake_tensor.py - test_cupy_as_tensor.py - test_cuda.py: enable UTs that are working - test_matmul_cuda.py: enable UTs that are working Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715 Approved by: https://github.com/pruthvistony, https://github.com/jeffdaily	2025-09-04 20:43:03 +00:00
Animesh Jain	600c25e9a1	[dynamo] Graph break on torch.cuda.synchronize (#161925 ) Today, AOTDispatcher ignores cuda.synchronize. Even if we wrap it in some HOP, we need it to be a barrier op to prevent any inductor reordering, so we graph break instead. Fixes https://github.com/pytorch/pytorch/issues/160751 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161925 Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/mlazos	2025-09-02 19:00:21 +00:00
can-gaa-hou	c0ed87c82d	[Dynamo] Fix weakref.proxy error when `torch.compile` (#161508 ) Fixes #159258 The error occurs when we attempt to create a weak reference from a weak reference proxy. `e9d42b3880/torch/_dynamo/guards.py (L2910-L2915)` In fact, we shouldn't create a weak reference from another reference or proxy, as it would check in CPython. `f60f8225ed/Objects/weakrefobject.c (L410-L418)` However, `__weakrefoffset__` is not equal to 0 when the `guarded_object` is in `weakref.ProxyTypes`, and it will wrongly create a weak reference for the `weakref.ProxyTypes`. I think this could be a bug from CPython, but we can prevent it by adding more weakref type checks (`weakref.ProxyTypes` contains `weakref.ProxyType` and `weakref.CallableProxyType`) here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161508 Approved by: https://github.com/Lucaskabela, https://github.com/anijain2305, https://github.com/malfet	2025-08-28 22:34:18 +00:00
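The extra type check described in this commit can be illustrated with the standard library alone. The sketch below is not the code from `torch/_dynamo/guards.py`. `safe_weakref` and `Node` are hypothetical names, and returning `None` for unsupported objects is an assumption made for the example.

```python
import weakref


def safe_weakref(obj):
    """Return a weak reference to obj, or None if one should not be made."""
    # Never wrap an existing weak reference or proxy in another reference.
    # weakref.ProxyTypes holds ProxyType and CallableProxyType.
    if isinstance(obj, (weakref.ReferenceType, *weakref.ProxyTypes)):
        return None
    try:
        return weakref.ref(obj)
    except TypeError:
        # Some built-in objects (for example int) cannot be weakly referenced.
        return None


class Node:
    """Plain class whose instances support weak references."""


node = Node()
proxy = weakref.proxy(node)

ref_to_node = safe_weakref(node)    # a live weakref.ref
ref_to_proxy = safe_weakref(proxy)  # None: proxies are filtered out
```

Checking `isinstance` against the proxy types avoids relying on attributes such as `__weakrefoffset__`, which a proxy can forward to the object it wraps.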
Animesh Jain	3d406429b0	[dynamo][vllm] Support typing.get_type_hints (#161362 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161362 Approved by: https://github.com/Skylion007, https://github.com/StrongerXi, https://github.com/jansel	2025-08-27 09:55:31 +00:00
Michael Lazos	be55d7ac9e	Revert "[Dynamo] Allow inlining into AO quantization modules (#152934 )" (#161567 ) This reverts commit `20e2ca3e29`. Fixes https://github.com/pytorch/pytorch/issues/157434 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161567 Approved by: https://github.com/Lucaskabela	2025-08-27 03:33:04 +00:00
William Wen	b074cbaedd	[dynamo] allow resume functions to have name in both freevars and varnames (#161544 ) fixes https://github.com/pytorch/pytorch/issues/161542 Differential Revision: [D81073109](https://our.internmc.facebook.com/intern/diff/D81073109) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161544 Approved by: https://github.com/StrongerXi, https://github.com/anijain2305	2025-08-27 00:25:16 +00:00
Simon Fan	8aad3a60ce	[dynamo] propagate tensor metadata on Tensor.__setitem__(tensor) (#161036 ) Fixes silent incorrectness for autograd function tracing, where we rely on FakeTensor metadata (requires_grad) to determine whether to HOP or not: `5ee464db5c/torch/_dynamo/variables/misc.py (L671)` Stared at this with @anijain2305 yesterday, `Tensor.__setitem__` can update tensor metadata, and we can just run the fake prop and extract the output metadata from the updated FakeTensor. FIXES https://github.com/pytorch/pytorch/issues/160901 It should also be the root cause behind the issue in https://github.com/pytorch/torchtitan/pull/1604 @bdhirsh @ruisizhang123 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161036 Approved by: https://github.com/anijain2305 ghstack dependencies: #160805	2025-08-22 04:43:22 +00:00
Peter Y. Yeh	e389a08dcd	AMD/ROCm OCP Micro-scaling Format (mx-fp8/mx-fp4) Support (#151360 ) - This pull request introduces support for the [OCP Micro-scaling (MX) format](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf), with a focus on compatibility with AMD ROCm 7.0 and the gfx950 architecture. This PR also establishes the foundation for enabling MX-FPX features in [TorchAO](https://github.com/pytorch/ao/issues/2229) on the AMD platform. - Validation (ROCm 7.0 + gfx950 required): `111 relevant tests passing.` > PYTORCH_TEST_WITH_ROCM=1 python test/test_matmul_cuda.py -k test_blockwise -v Co-author: @jagadish-amd — Thank you for the efforts leading validation on gfx950 with ROCm 7.0. ----------------------------------- This pull request introduces support for new scalar types and scaling methods, particularly for ROCm 7.0 and gfx950, and refines testing for these features. Key changes include adding constraints for matrix dimensions, enabling block-wise scaling, and updating tests to accommodate new data types. ### Support for new scalar types and scaling methods: * [`aten/src/ATen/cuda/CUDABlas.cpp`](diffhunk://#diff-74fcb26047c1df4024105d36ce22a36b77cf8cc93c28631d743e639b3d6066aeR1876-R1885): Added constraints for matrix dimensions when using `Float8_e8m0fnu` with block-wise scaling, ensuring dimensions are multiples of 32. Updated compatibility checks to support ROCm 7.0 for `Float8_e8m0fnu` and `Float8_e4m3fn`. [[1]](diffhunk://#diff-74fcb26047c1df4024105d36ce22a36b77cf8cc93c28631d743e639b3d6066aeR1876-R1885) [[2]](diffhunk://#diff-74fcb26047c1df4024105d36ce22a36b77cf8cc93c28631d743e639b3d6066aeL1913-R1934) * [`aten/src/ATen/native/cuda/Blas.cpp`](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abR1276-R1290): Introduced block-wise scaling for `Float8_e8m0fnu`, with checks for ROCm 7.0 and GPU architecture `gfx950`. Added validation for supported scalar types and matrix dimensions. 
[[1]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abR1276-R1290) [[2]](diffhunk://#diff-e8a569efee1e650172f120a0fdcda024fe3e4703a4ee3336425c8f685af6b3abR1349-R1364) ### Updates to scalar type mappings: * [`aten/src/ATen/cuda/CUDADataType.h`](diffhunk://#diff-9188bb13b1a49f459141f5f9b875593d1c5ce2beb5ad711fdbaf5bc7089ec015L93-R93): Extended scalar type mappings to support `Float4_e2m1fn_x2` for ROCm 7.0. * [`aten/src/ATen/cuda/tunable/GemmHipblaslt.h`](diffhunk://#diff-bfa1a3b5d4bef1892bf50338775f3b0fd8cd31fc1868148f3968b98aefb68e3fR88-R96): Added a constexpr mapping for `Float4_e2m1fn_x2` based on ROCm version. ### Enhancements to testing(@jagadish-amd): * [`test/test_matmul_cuda.py`](diffhunk://#diff-3f31c52b48cfddf8f4617d809f7695b2e4a1c78656f8c4b5143a4b45d01fcf23R765-R766): Updated tests to include new scalar types (`Float4_e2m1fn_x2`) and recipes (`mxfp4`). Added logic to handle different scaling recipes and validate compatibility with ROCm and CUDA versions. [[1]](diffhunk://#diff-3f31c52b48cfddf8f4617d809f7695b2e4a1c78656f8c4b5143a4b45d01fcf23R765-R766) [[2]](diffhunk://#diff-3f31c52b48cfddf8f4617d809f7695b2e4a1c78656f8c4b5143a4b45d01fcf23L1331-R1356) These changes improve compatibility with newer hardware and software versions, enhance functionality for matrix operations, and ensure robust testing for the added features. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151360 Approved by: https://github.com/drisspg, https://github.com/malfet	2025-08-18 16:43:09 +00:00
Simon Fan	c8205cb354	[autograd] match 0-dim gradients device type regardless of subclassness (#160165 ) Not sure if there are some subclasses where the outer.dim() == 0 but you wouldn't want to move it? FIXES https://github.com/pytorch/pytorch/issues/160084 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160165 Approved by: https://github.com/ezyang, https://github.com/albanD	2025-08-11 17:57:32 +00:00
William Wen	fd606a3a91	[dynamo] update pytorch-labs -> meta-pytorch in graph break URLs (#159975 ) Related PR: https://github.com/meta-pytorch/compile-graph-break-site/pull/30 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159975 Approved by: https://github.com/Lucaskabela	2025-08-06 23:57:31 +00:00
Lucas Kabela	5d89634ca8	Graph break with error message (#158800 ) Fixes #157452 Test with ``` python test/dynamo/test_repros.py ReproTests.test_nn_parameter_ctor_graph_breaks ``` ### Release Notes Change to nn.Parameter Constructor Behavior in Dynamo Semantic change introduced in the nn.Parameter constructor; previously, if the constructor lacked a clean source, the system would attempt to infer arguments to construct a clone and lift this synthetic proxy in the computation graph. This approach had many potential edge cases and was difficult to reason about. The new behavior defaults to graph breaking when the nn.Parameter constructor does not have a clean source. Users are now advised to manually move the constructor out of the graph in such cases. This change improves clarity and reduces complexity in graph construction and debugging. If this cannot be done, users can fall back to the old semantics with `torch._dynamo.config.graph_break_on_nn_param_ctor=False`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158800 Approved by: https://github.com/anijain2305	2025-07-29 17:34:49 +00:00
Xu Han	dfcb07bdfa	[Inductor] disable windows failed UTs temporarily. (#159163 ) Disable the failing Windows UTs temporarily. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159163 Approved by: https://github.com/desertfire	2025-07-25 20:25:36 +00:00
PyTorch MergeBot	8d2a1d6e18	Revert "Graph break with error message (#158800 )" This reverts commit `cae4746952`. Reverted https://github.com/pytorch/pytorch/pull/158800 on behalf of https://github.com/clee2000 due to broke some tests on main inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return4 [GH job link](https://github.com/pytorch/pytorch/actions/runs/16507837934/job/46685704688) [HUD commit link](`cae4746952`), note to self: bad TD, but also dynamo/test_repros failed but didn't get skipped by TD so maybe a landrace, or I just blaming the wrong commit entirely.. ([comment](https://github.com/pytorch/pytorch/pull/158800#issuecomment-3115224608))	2025-07-24 22:45:58 +00:00
Lucas Kabela	cae4746952	Graph break with error message (#158800 ) Fixes #157452 Test with ``` python test/dynamo/test_repros.py ReproTests.test_nn_parameter_ctor_graph_breaks ``` ### Release Notes Change to nn.Parameter Constructor Behavior in Dynamo Semantic change introduced in the nn.Parameter constructor; previously, if the constructor lacked a clean source, the system would attempt to infer arguments to construct a clone and lift this synthetic proxy in the computation graph. This approach had many potential edge cases and was difficult to reason about. The new behavior defaults to graph breaking when the nn.Parameter constructor does not have a clean source. Users are now advised to manually move the constructor out of the graph in such cases. This change improves clarity and reduces complexity in graph construction and debugging. If this cannot be done, users can fall back to the old semantics with `torch._dynamo.config.graph_break_on_nn_param_ctor=False`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158800 Approved by: https://github.com/anijain2305	2025-07-24 21:05:17 +00:00
Animesh Jain	1b456c580d	[dynamo][guards] Add type info of the guarded value in guard managers (#158765 ) The type info now appears in the tlparse output (screenshot omitted). This will aid in reading guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158765 Approved by: https://github.com/Lucaskabela, https://github.com/StrongerXi	2025-07-23 16:59:15 +00:00
William Wen	7d2ceaff21	[dynamo] skip tracing functions registered in sys.monitoring (#158171 ) Fixes https://github.com/pytorch/pytorch/issues/158164 This was fixed by applying `skip_code_recursive` to any function registered to `sys.monitoring` (via `PyThreadState_GET()->interp->monitoring_callables`). This check is done whenever we attempt to set the eval frame callback from Python. Microbenchmark: `benchmarks/dynamo/microbenchmarks/overheads.py`: BEFORE: ``` requires_grad=False eager 7.1us (warmup=0.0s) compiled 24.6us (warmup=10.0s) requires_grad=True eager 8.9us (warmup=0.0s) compiled 57.8us (warmup=0.1s) inference_mode() eager 6.5us (warmup=0.0s) compiled 23.4us (warmup=0.1s) ``` AFTER: ``` requires_grad=False eager 7.0us (warmup=0.0s) compiled 23.2us (warmup=15.2s) requires_grad=True eager 9.0us (warmup=0.0s) compiled 55.1us (warmup=0.1s) inference_mode() eager 6.4us (warmup=0.0s) compiled 22.2us (warmup=0.1s) ``` Followup thought: how do we let users know that a frame is skipped because the code object is a callable registered to sys.monitoring? (or any other reason?) Differential Revision: [D78530528](https://our.internmc.facebook.com/intern/diff/D78530528) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158171 Approved by: https://github.com/jansel	2025-07-22 18:02:30 +00:00
Simon Fan	07c4c2a792	[dynamo][be] hide warnings without invalidating warnings cache (#158520 ) I feel uneasy about touching `__warningregistry__` since it is undocumented and private surface. The only public API hook that doesn't increment warnings version seems to be https://docs.python.org/3/library/warnings.html#warnings.showwarning. So we could wack a mole all the warnings muters in compile to just not display warnings, and we wouldn't invalidate warnings cache. This PR adds it for torch/_dynamo, and I didn't find any warnings versioning mutation from torch/_inductor. There is a behavior change if someone calls a compiled graph with simplefilter("error"): ```python # e.g. test/dynamo_expected_failures/TestAutogradFallback.test_no_autograd_kernel_inplace_mode_nothing with warnings.catch_warnings(): warnings.simplefilter("error") # turns all warnings into errors compiled_fn() # will throw if any of the muted warnings fire ``` FIXES https://github.com/pytorch/pytorch/issues/128427 A note for the future: The warnings module doesn't offer a thread safe way of using it. Even regular filters have this problem, directly editing `__warningregistry__` would be very bad, and this PR would mute all threads. Someone will need to build a thread safe warnings interface. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158520 Approved by: https://github.com/anijain2305, https://github.com/zou3519	2025-07-18 22:02:31 +00:00
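The approach of hiding warnings through the `warnings.showwarning` hook, rather than by editing filters, can be sketched with the standard library. This is an illustrative sketch, not the code added to `torch/_dynamo`. `hide_warnings_display` is a hypothetical name. As the commit message notes, the swap is process-wide and not thread safe.

```python
import contextlib
import warnings


@contextlib.contextmanager
def hide_warnings_display():
    """Temporarily discard displayed warnings without touching the filters."""
    original = warnings.showwarning

    def _discard(message, category, filename, lineno, file=None, line=None):
        # Same signature as warnings.showwarning, but prints nothing.
        pass

    warnings.showwarning = _discard
    try:
        yield
    finally:
        # Always restore the previous hook, even if the body raised.
        warnings.showwarning = original


filters_before = list(warnings.filters)
with hide_warnings_display():
    warnings.warn("this message is not displayed")
filters_after = list(warnings.filters)
```

Because the filter list is left alone, a caller that has set `simplefilter("error")` still gets an exception from a muted warning, which matches the behavior change described in the commit.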
Sam Larsen	bff69f25c2	[BE][testing] fix test/dynamo/test_repros:test_longtensor_list (#158458 ) Summary: This test is failing internally because the number of underlying calls to the rng differ by virtue of various library initializations that get sucked in with an internal build. Test Plan: `buck test '@fbcode//mode/opt' fbcode//caffe2/test/dynamo:test_dynamo -- --exact 'caffe2/test/dynamo:test_dynamo - test_repros.py::ReproTests::test_longtensor_list' --run-disabled` Pull Request resolved: https://github.com/pytorch/pytorch/pull/158458 Approved by: https://github.com/jansel	2025-07-17 17:27:00 +00:00
Simon Fan	7cf31b4a42	[dynamo] fix NamedTupleVariable cloning (#158190 ) FIXES https://github.com/pytorch/pytorch/issues/157945 ## Explanation 1. Some VTs add additional attrs e.g. NamedTupleVariable has "dynamic_attributes" `a0308edb6c/torch/_dynamo/variables/lists.py (L1048-L1051)` 2. VT.clone passes everything by dict, includes "dynamic_attributes" `a0308edb6c/torch/_dynamo/variables/base.py (L255-L259)` 3. Non-handled args become kwargs in VT's `__init__`, `super().__init__()` passes kwargs to Base VT `a0308edb6c/torch/_dynamo/variables/lists.py (L1048-L1051)` 4. Base VT's `__init__` gets unexpected "dynamic_attributes" kwarg `a0308edb6c/torch/_dynamo/variables/base.py (L609-L613)` You could also let Base VT's `__init__` ignore additional kwargs, but that seemed a bit too permissive, and I don't think many VT's add these derived class only attrs. ## After fix ```python ===== __compiled_fn_1_7f9541ed_e166_43fe_8322_c5225ce4207f ===== /home/xmfan/core/miniconda3/envs/0712/lib/python3.12/site-packages/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, L_x_: "f32[4, 8, 6][48, 6, 1]cpu"): l_x_ = L_x_ # File: /home/xmfan/core/a/torchtitan/wtf.py:10 in forward, code: U, S = torch.linalg.svd(x)[:2] linalg_svd = torch._C._linalg.linalg_svd(l_x_); l_x_ = None U: "f32[4, 8, 8][64, 1, 8]cpu" = linalg_svd[0] S: "f32[4, 6][6, 1]cpu" = linalg_svd[1]; linalg_svd = None # File: /home/xmfan/core/a/torchtitan/wtf.py:11 in forward, code: reduced = U[:, :, :self.k] @ torch.diag_embed(S[:, :self.k]) getitem_3: "f32[4, 8, 5][64, 1, 8]cpu" = U[(slice(None, None, None), slice(None, None, None), slice(None, 5, None))]; U = None getitem_4: "f32[4, 5][6, 1]cpu" = S[(slice(None, None, None), slice(None, 5, None))]; S = None diag_embed: "f32[4, 5, 5][25, 5, 1]cpu" = torch.diag_embed(getitem_4); getitem_4 = None reduced: "f32[4, 8, 5][40, 5, 1]cpu" = getitem_3 @ diag_embed; getitem_3 = diag_embed = None return (reduced,) ``` Pull Request resolved: 
https://github.com/pytorch/pytorch/pull/158190 Approved by: https://github.com/StrongerXi	2025-07-14 23:39:25 +00:00
Huy Do	2af7c67e48	Mitigate some flaky tests in trunk (#157756 ) (not really fix these issues, but we should be able to close them. This also allows CI from the PR to test them) Fixes https://github.com/pytorch/pytorch/issues/156579 Fixes https://github.com/pytorch/pytorch/issues/156580 Fixes https://github.com/pytorch/pytorch/issues/126867 Pull Request resolved: https://github.com/pytorch/pytorch/pull/157756 Approved by: https://github.com/clee2000	2025-07-08 07:07:11 +00:00
Xuehai Pan	02715d0876	[BE][5/6] fix typos in test/ (test/dynamo/) (#157639 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157639 Approved by: https://github.com/yewentao256, https://github.com/jansel ghstack dependencies: #157638	2025-07-06 06:34:25 +00:00
William Wen	52e4e41cbc	[dynamo] do not issue lru_cache warning for functions in the top-level torch namespace (#157598 ) `lru_cache` usage warning was being raised for `torch.get_device_module()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157598 Approved by: https://github.com/Sidharth123-cpu	2025-07-04 08:17:50 +00:00
Nikita Shulga	5e636d664a	[BE] `@serialTest` decorator must be called (#157388 ) Otherwise it turns test into a trivial one(that always succeeds), as following example demonstrates ```python import torch from torch.testing._internal.common_utils import serialTest, run_tests, TestCase class MegaTest(TestCase): @serialTest def test_foo(self): if hasattr(self.test_foo, "pytestmark"): print("foo has attr and it is", self.test_foo.pytestmark) print("foo") @serialTest() def test_bar(self): if hasattr(self.test_bar, "pytestmark"): print("bar has attr and it is", self.test_bar.pytestmark) print("bar") if __name__ == "__main__": run_tests() ``` That will print ``` test_bar (__main__.MegaTest.test_bar) ... bar has attr and it is [Mark(name='serial', args=(), kwargs={})] bar ok test_foo (__main__.MegaTest.test_foo) ... ok ---------------------------------------------------------------------- Ran 2 tests in 0.013s ``` Added assert that arg is boolean in the decorator to prevent such silent skips in the future Pull Request resolved: https://github.com/pytorch/pytorch/pull/157388 Approved by: https://github.com/clee2000	2025-07-02 19:15:19 +00:00
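Why a bare `@serialTest` silently neutralizes a test can be shown with a stand-in decorator factory. The sketch below is not the implementation in `torch.testing._internal.common_utils`. The names `serial`, `is_serial`, and `sample_test` are hypothetical, and the factory shape (a boolean `condition` argument) is assumed from the commit message.

```python
def serial(condition=True):
    """Hypothetical decorator factory: must be used as @serial(), not @serial."""
    # With a bare @serial, the test function itself arrives here as
    # `condition`. Without this assert, the test would be replaced by
    # `decorator`, which returns its argument and never runs the test body.
    assert isinstance(condition, bool), "use @serial(), not @serial"

    def decorator(fn):
        fn.is_serial = condition  # tag the function; leave behavior unchanged
        return fn

    return decorator


def sample_test():
    return "ran"


# Correct usage: call the factory, then apply the returned decorator.
decorated = serial()(sample_test)

# Incorrect usage: equivalent to writing a bare @serial above the function.
try:
    serial(sample_test)
    misuse_detected = False
except AssertionError:
    misuse_detected = True
```

With the assertion in place, the mistake fails at decoration time instead of producing a test that always passes.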
William Wen	bdb7819166	[dynamo, nested graph breaks] remove recursive cell/freevar in instruction tx (#154078 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154078 Approved by: https://github.com/StrongerXi, https://github.com/jansel	2025-07-02 13:36:14 +00:00
Ryan Guo	a4b59498c5	Fix fake kernel for the `out=...` variant of `unbind_copy` (#156643 ) `unbind_copy(..., out=...)` returns None rather than the `out` argument (see https://github.com/pytorch/pytorch/issues/130829#issuecomment-2283936222), but the old fake kernel didn't account for that and caused an assertion failure in `pushPyOutToStack`. This patch fixes that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156643 Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/bdhirsh ghstack dependencies: #156642	2025-06-27 01:34:07 +00:00
Ryan Guo	89aa708b39	[core] Dispatch to `at::nansum_out` rather than `at::native::nansum_out` (#156642 ) Calling `at::native::nansum_out` causes the fake kernel to dispatch to a `make_reduction` call and then segfaults later due to the `mutable_data_ptr` call in `TensorIteratorBase::build`. It also causes fake tensor propagation issue in Dynamo. The added tests demonstrate the aforementioned 2 issues. This patch fixes it by dispatching to `at::nansum_out` instead. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156642 Approved by: https://github.com/zou3519	2025-06-27 01:34:07 +00:00
William Wen	6089ebcf6d	[dynamo] fix segfault due to dangling CacheEntry backend pointer (#156527 ) Fixes https://github.com/pytorch/pytorch/issues/155057 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156527 Approved by: https://github.com/anijain2305, https://github.com/jansel	2025-06-26 23:51:08 +00:00
PyTorch MergeBot	9fe2d156a9	Revert "[dynamo] fix segfault due to dangling CacheEntry backend pointer (#156527 )" This reverts commit `5ad2bee2c8`. Reverted https://github.com/pytorch/pytorch/pull/156527 on behalf of https://github.com/Camyll due to failing test assertions ([comment](https://github.com/pytorch/pytorch/pull/156527#issuecomment-3009231797))	2025-06-26 17:32:34 +00:00
Ryan Guo	d06a406656	[dynamo] Graph break on `torch.Tensor.data` assignment with mismatched dtype (#156623 ) Fixes #152162. Discussed with @bdhirsh and decided this is the easiest workaround for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156623 Approved by: https://github.com/bdhirsh	2025-06-25 02:03:04 +00:00
PyTorch MergeBot	1dc1eedd43	Revert "[dynamo] Graph break on `torch.Tensor.data` assignment with mismatched dtype (#156623 )" This reverts commit `c1ad4b8e7a`. Reverted https://github.com/pytorch/pytorch/pull/156623 on behalf of https://github.com/albanD due to Breaks Dynamo tests in trunk ([comment](https://github.com/pytorch/pytorch/pull/156623#issuecomment-3001806841))	2025-06-24 20:44:42 +00:00
Ryan Guo	c1ad4b8e7a	[dynamo] Graph break on `torch.Tensor.data` assignment with mismatched dtype (#156623 ) Fixes #152162. Discussed with @bdhirsh and decided this is the easiest workaround for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156623 Approved by: https://github.com/bdhirsh	2025-06-24 19:33:11 +00:00
William Wen	5ad2bee2c8	[dynamo] fix segfault due to dangling CacheEntry backend pointer (#156527 ) Fixes https://github.com/pytorch/pytorch/issues/155057 Pull Request resolved: https://github.com/pytorch/pytorch/pull/156527 Approved by: https://github.com/anijain2305, https://github.com/jansel	2025-06-24 17:57:14 +00:00
Xuehai Pan	6d5c789ad5	[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format` (#144555 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555 Approved by: https://github.com/ezyang ghstack dependencies: #144551, #144554	2025-06-24 04:53:54 +00:00
Sidharth	a9ef7c4d04	[dynamo] update to lru_cache message and updated user stack trace in debug mode (#156639 ) I had to create a new PR for this because of @atalman request of temporary reverting the previous PR to restore diff train sync. Nothing has changed from this PR and the original one. Pull Request resolved: https://github.com/pytorch/pytorch/pull/156639 Approved by: https://github.com/atalman	2025-06-24 01:52:13 +00:00
atalman	ee4d343499	Revert "[dynamo] handle fullgraph toggle using nested torch.compile (#155166 )" (#156624 ) This reverts changes to [test/dynamo/test_repros.py](https://github.com/pytorch/pytorch/compare/main...atalman:revert_only_portion_of_file?expand=1#diff-4c82a5798a61d4cceb176b2700ba6fdd7c3e72d575b8e7e22458589139459caa) Missed by: `ee3d9969cc (diff-036cb21341ff8e390cc250e74fe9e3f0f15f259ea4bec4abcce49d95febf1553)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/156624 Approved by: https://github.com/Camyll	2025-06-23 19:30:08 +00:00
PyTorch MergeBot	55ef7b15e0	Revert "[dynamo] fixes to lru_cache message and adding user stack trace in debug mode (#156463 )" This reverts commit `afbf5420b8`. Reverted https://github.com/pytorch/pytorch/pull/156463 on behalf of https://github.com/atalman due to This is temporary revert, to restore diff train sync. We should be good to reland this change ([comment](https://github.com/pytorch/pytorch/pull/156463#issuecomment-2997335541))	2025-06-23 17:44:36 +00:00
Sidharth	afbf5420b8	[dynamo] fixes to lru_cache message and adding user stack trace in debug mode (#156463 ) This PR refers to the issue: https://github.com/pytorch/pytorch/issues/155352 This PR uses torch._dynamo.utils.warn_once so that this warning only emits once, clarifies in the warning that silent incorrectness is potential, not observed, Doesn't warn for functions that come from torch.* As of right now with this code change the terminal outputs: if the code came from torch.* : Nothing, as we shouldn't warn for functions that come from torch.* else: /data/users/ssubbarao8/pytorch/torch/_dynamo/variables/functions.py:1565: UserWarning: Dynamo detected a call to a `functools.lru_cache`-wrapped function. Dynamo ignores the cache wrapper and directly traces the wrapped function. Silent incorrectness is only a potential risk, not something we have observed. Enable TORCH_LOGS="+dynamo" for a DEBUG stack trace. torch._dynamo.utils.warn_once(msg) If the user runs the command 'TORCH_LOGS="+dynamo" python foo4.py', in the debug logs it shows(this log below is based on chillee's repro: /data/users/ssubbarao8/pytorch/torch/_dynamo/variables/functions.py:1565: UserWarning: Dynamo detected a call to a `functools.lru_cache`-wrapped function. Dynamo ignores the cache wrapper and directly traces the wrapped function. Silent incorrectness is only a potential risk, not something we have observed. Enable TORCH_LOGS="+dynamo" for a DEBUG stack trace. 
torch._dynamo.utils.warn_once(msg) V0619 21:00:16.504000 956424 torch/_dynamo/variables/functions.py:1575] [0/0] call to a lru_cache` wrapped function from user code at: /data/users/ssubbarao8/pytorch/foo4.py:9 V0619 21:00:16.504000 956424 torch/_dynamo/variables/functions.py:1575] [0/0] File "/data/users/ssubbarao8/pytorch/foo4.py", line 9, in <module> V0619 21:00:16.504000 956424 torch/_dynamo/variables/functions.py:1575] [0/0] torch.compile(foo, backend="eager")(torch.randn(4)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156463 Approved by: https://github.com/williamwen42	2025-06-22 11:40:28 +00:00
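The warning text says Dynamo ignores the cache wrapper and traces the wrapped function directly. The plain-Python sketch below shows how calling through the cache can differ from calling the wrapped function. No Dynamo is involved. `config` and `scaled` are illustrative names only.

```python
import functools

config = {"scale": 2}


@functools.lru_cache(maxsize=None)
def scaled(x):
    # Depends on mutable state that is not part of the cache key.
    return x * config["scale"]


first = scaled(3)                 # computed and cached
config["scale"] = 10
cached = scaled(3)                # served from the cache, state change ignored
retraced = scaled.__wrapped__(3)  # bypasses the cache and sees the new state
```

When the wrapped function depends only on its arguments, both paths agree, which is consistent with the warning describing the risk as potential rather than observed.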
Arsh Zahed	c09b054878	Add runtime profiler info for AOTDispatcher prologue (#155785 ) Fixes #155721 Pull Request resolved: https://github.com/pytorch/pytorch/pull/155785 Approved by: https://github.com/bdhirsh	2025-06-21 03:34:07 +00:00
William Wen	24dc33b37b	[dynamo] handle fullgraph toggle using nested torch.compile (#155166 ) See added test for the case that this PR handles. In particular, the semantics for nested torch.compile with toggled fullgraph settings was strange before - `@torch.compile(fullgraph=True)` overrides the existing fullgraph setting, while `@torch.compile(fullgraph=False)` does not. Note that this change will add an extra frame to any inlined torch.compile'd function (which I don't expect to happen frequently). Pull Request resolved: https://github.com/pytorch/pytorch/pull/155166 Approved by: https://github.com/jansel ghstack dependencies: #154283, #154289, #154782	2025-06-20 07:03:29 +00:00
PyTorch MergeBot	6201981f48	Revert "[dynamo] handle fullgraph toggle using nested torch.compile (#155166 )" This reverts commit `614a415145`. Reverted https://github.com/pytorch/pytorch/pull/155166 on behalf of https://github.com/atalman due to inductor/test_flex_decoding.py::TestFlexDecodingCUDA::test_do_not_trigger_dynamic_shapes_on_empty_block_mask_cuda [GH job link](https://github.com/pytorch/pytorch/actions/runs/15726606697/job/44333233942) [HUD commit link](`a6a3a44144`) ([comment](https://github.com/pytorch/pytorch/pull/155166#issuecomment-2984751600))	2025-06-18 15:43:22 +00:00
William Wen	614a415145	[dynamo] handle fullgraph toggle using nested torch.compile (#155166 ) See added test for the case that this PR handles. In particular, the semantics for nested torch.compile with toggled fullgraph settings was strange before - `@torch.compile(fullgraph=True)` overrides the existing fullgraph setting, while `@torch.compile(fullgraph=False)` does not. Note that this change will add an extra frame to any inlined torch.compile'd function (which I don't expect to happen frequently). Pull Request resolved: https://github.com/pytorch/pytorch/pull/155166 Approved by: https://github.com/jansel ghstack dependencies: #154283, #154289, #154782	2025-06-18 07:27:20 +00:00
Oguz Ulgen	a2a75be0f8	Rename inductor cache (#156128 ) Requested by Simon on a different PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/156128 Approved by: https://github.com/xmfan	2025-06-17 03:57:18 +00:00
William Wen	1f0eb79e3e	[dynamo] fix KeyError in LOAD_FAST_CHECK (#155763 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155763 Approved by: https://github.com/StrongerXi, https://github.com/jansel ghstack dependencies: #155761	2025-06-17 00:54:16 +00:00
William Wen	4e833c2005	[dynamo] support tracing weakref callback (#155761 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155761 Approved by: https://github.com/StrongerXi, https://github.com/jansel	2025-06-17 00:54:16 +00:00
cz2h	dabb55baff	Add resolve in add decomp to enable view (#153945 ) Fixes #148950. During construction of the graph and while running the add node under the [interpreter](/github.com/pytorch/pytorch/blob/d68d4d31f4824f1d1e0d1d6899e9879ad19b0754/torch/fx/interpreter.py#L301 ), the functional argument of a conj complex tensor gets cloned. This results in .is_conj() always evaluating to false in the decomposition function. The proposed fix calls resolve_conj() in the decomposition of complex tensor add. Test as below `python test/dynamo/test_repros.py ReproTests.test_add_complex_conj` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153945 Approved by: https://github.com/jansel	2025-06-14 00:41:50 +00:00
PyTorch MergeBot	06408dae49	Revert "Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757 )" This reverts commit `0029259bdf`. Reverted https://github.com/pytorch/pytorch/pull/154757 on behalf of https://github.com/laithsakka due to post land issue ([comment](https://github.com/pytorch/pytorch/pull/154757#issuecomment-2971385787))	2025-06-13 19:11:43 +00:00


519 Commits