pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Yuanyuan Chen	fc8ac1216c	[4/N] Remove unused loop variables in tests (#166690 ) This PR removes unused loop variables in tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/166690 Approved by: https://github.com/justinchuby, https://github.com/mlazos	2025-10-31 10:20:48 +00:00
linhaifeng	695cb0d342	[2/N][Fix] Fix typo in test folder (#166374 ) Fix typo in test folder. _typos.toml ```bash [default.extend-words] nd = "nd" arange = "arange" Nd = "Nd" GLOBALs = "GLOBALs" hte = "hte" iy = "iy" PN = "PN" Dout = "Dout" optin = "optin" gam = "gam" PTD = "PTD" ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/166374 Approved by: https://github.com/cyyever, https://github.com/ezyang	2025-10-29 03:02:07 +00:00
Sherlock Huang	34d6ef7022	Update gm.print_readable to include Annotation (#165397 ) Sample output ``` [rank0]: # Annotation: {'compile_with_inductor': 'flex_attention'} File: /data/users/bahuang/pytorch/torch/nn/attention/flex_attention.py:1490 in flex_attention, code: out, lse, max_scores = flex_attention_hop( [rank0]: score_mod_2 = self.score_mod_2 [rank0]: mask_fn_2 = self.mask_fn_2 [rank0]: flex_attention_1 = torch.ops.higher_order.flex_attention(xq_5, xk_5, xv_3, score_mod_2, (2048, 2048, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_indices, 128, 128, mask_fn_2), 0.25, {'PRESCALE_QK': False, 'ROWS_GUARANTEED_SAFE': False, 'BLOCKS_ARE_CONTIGUOUS': False, 'WRITE_DQ': True, 'OUTPUT_LOGSUMEXP': True, 'OUTPUT_MAX': False}, (), (g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___mask_mod___closure___0_cell_contents,)); xq_5 = xk_5 = xv_3 = score_mod_2 = mask_fn_2 = None [rank0]: out_2: "bf16[8, 4, 2048, 16]" = flex_attention_1[0]; flex_attention_1 = None ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165397 Approved by: https://github.com/yushangdi, https://github.com/anijain2305, https://github.com/mlazos	2025-10-28 13:54:38 +00:00
bobrenjc93	1d58d5fe25	[hops] fix unbacked runtime asserts for cond higher order op (#165893 ) At a high level after this fix we get the following nice tlparse https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/bobren/54a57665-7dcc-41e0-8ca7-df01393cd4aa/custom/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 As seen in this doc, previously we were simply dropping assert post dynamo: https://docs.google.com/document/d/1nRQwvw_gWL0_9T3VKb5Ly3_tNI1fgqG9WtryeD6qaZI/edit?tab=t.0 The fixes are a couple things: 1) Actually run the runtime assertion fx graph pass on subgraphs 2) Reset fake mode unbacked memo across speculate subgraph invocations since the memos actually break the runtime assertion insertions since calls like nonzero end up not allocating new unbacked symints and hence not populating pending_unbacked which then results in incorrect unbacked_bindings on fx_nodes in subgraphs. This is a first step in hardening runtime asserts across all phases of the compiler (eager, aot_eager, inductor, etc.). I will continue kicking tires and fixing bugs until we get runtime assert generations in a good place. One obvious next step is the added test case in this PR fails when compiled with inductor with the following error (NB: it fails before this PR as well): ``` File "/data/users/bobren/a/pytorch/torch/_inductor/ir.py", line 659, in get_dtype return self.dtype torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised: LoweringException: AttributeError: 'ShapeAsConstantBuffer' object has no attribute 'dtype' target: cond args[0]: Eq(Mod(s77, 4), 0) args[1]: Subgraph(name='true_graph_0', graph_module=<lambda>(), graph=<torch._inductor.graph.SubgraphLowering object at 0x7fbcbb11e110>) args[2]: Subgraph(name='false_graph_0', graph_module=<lambda>(), graph=<torch._inductor.graph.SubgraphLowering object at 0x7fbcbb21cf70>) args[3]: (s77, TensorBox(StorageBox( ComputedBuffer(name='buf0', layout=FlexibleLayout('cuda:0', torch.float32, size=[s77, s77], stride=[s77, 1]), data=Pointwise(device=device(type='cuda', index=0), dtype=torch.float32, inner_fn=<function make_pointwise.<locals>.inner.<locals>.inner_fn at 0x7fbcbb2f37f0>, ranges=[s77, s77])) ))) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165893 Approved by: https://github.com/zou3519	2025-10-25 03:25:36 +00:00
Yidi Wu	af4ba78543	[scan x vmap] support scan in vmap (#165580 ) This is required by the chunked_with_scan work where two nested vmap(vmap) with chunk sizes > 1 are invoked, which produces a scan -> vmap -> scan -> vmap chain, so we need to handle both vmap(scan) and scan(vmap). The way we handle vmap(scan) is to turn it into scan(vmap(combine_fn)). The idea is that combine_fn no longer runs on a single slice; instead, we vmap over combine_fn so that multiple combine_fn calls happen in one step. We need to know how combine_fn propagates the batched tensor and what the batched dims of the output are. For this purpose, we use restore_vmap to give us the out_dims information. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165580 Approved by: https://github.com/zou3519 ghstack dependencies: #165675	2025-10-22 09:46:00 +00:00
PyTorch MergeBot	e50dc40d28	Revert "Update gm.print_readable to include Annotation (#165397 )" This reverts commit `7a65770013`. Reverted https://github.com/pytorch/pytorch/pull/165397 on behalf of https://github.com/malfet due to I don't know how/why, but it breaks windows tests, see `2e22b1a61e/1` ([comment](https://github.com/pytorch/pytorch/pull/165397#issuecomment-3417428128))	2025-10-17 22:35:50 +00:00
Sherlock Huang	7a65770013	Update gm.print_readable to include Annotation (#165397 ) Sample output ``` [rank0]: # Annotation: {'compile_with_inductor': 'flex_attention'} File: /data/users/bahuang/pytorch/torch/nn/attention/flex_attention.py:1490 in flex_attention, code: out, lse, max_scores = flex_attention_hop( [rank0]: score_mod_2 = self.score_mod_2 [rank0]: mask_fn_2 = self.mask_fn_2 [rank0]: flex_attention_1 = torch.ops.higher_order.flex_attention(xq_5, xk_5, xv_3, score_mod_2, (2048, 2048, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_kv_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___q_indices, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_num_blocks, g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___full_q_indices, 128, 128, mask_fn_2), 0.25, {'PRESCALE_QK': False, 'ROWS_GUARANTEED_SAFE': False, 'BLOCKS_ARE_CONTIGUOUS': False, 'WRITE_DQ': True, 'OUTPUT_LOGSUMEXP': True, 'OUTPUT_MAX': False}, (), (g____import_torchtitan_dot_models_dot_attention___flex_attention_block_masks___block_causal___none___mask_mod___closure___0_cell_contents,)); xq_5 = xk_5 = xv_3 = score_mod_2 = mask_fn_2 = None [rank0]: out_2: "bf16[8, 4, 2048, 16]" = flex_attention_1[0]; flex_attention_1 = None ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/165397 Approved by: https://github.com/yushangdi, https://github.com/anijain2305	2025-10-17 18:35:18 +00:00
Yuanyuan Chen	e925dfcc6b	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang, https://github.com/mlazos	2025-10-17 07:27:11 +00:00
Yuanyuan Chen	8de85896e0	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-13 01:48:55 +00:00
PyTorch MergeBot	816fb7f48d	Revert "Enable ruff rule E721 (#165162 )" This reverts commit `9e7c19f72b`. Reverted https://github.com/pytorch/pytorch/pull/165162 on behalf of https://github.com/pytorch-auto-revert due to Reverted automatically by pytorch's autorevert, to avoid this behaviour add the tag autorevert: disable ([comment](https://github.com/pytorch/pytorch/pull/165162#issuecomment-3393328271))	2025-10-11 13:25:40 +00:00
Yuanyuan Chen	9e7c19f72b	Enable ruff rule E721 (#165162 ) `E721` checks for object type comparisons using == and other comparison operators. This is useful because it is recommended to use `is` for type comparisons. Pull Request resolved: https://github.com/pytorch/pytorch/pull/165162 Approved by: https://github.com/Skylion007	2025-10-11 06:43:53 +00:00
Laith Sakka	7f2a902ea2	more sizelike deprecation (#164889 ) remove expect_size c++ bindings and usages Pull Request resolved: https://github.com/pytorch/pytorch/pull/164889 Approved by: https://github.com/mlazos ghstack dependencies: #164884, #164885, #164886, #164887, #164888	2025-10-10 03:45:06 +00:00
PyTorch MergeBot	5d7360bb03	Revert "Enable all SIM rules except disabled ones (#164645 )" This reverts commit `321e602692`. Reverted https://github.com/pytorch/pytorch/pull/164645 on behalf of https://github.com/izaitsevfb due to causes lint failures ([comment](https://github.com/pytorch/pytorch/pull/164645#issuecomment-3369274351))	2025-10-05 19:32:21 +00:00
Yuanyuan Chen	321e602692	Enable all SIM rules except disabled ones (#164645 ) `SIM` rules are useful for simplifying boolean expressions and enhancing code readability. Pull Request resolved: https://github.com/pytorch/pytorch/pull/164645 Approved by: https://github.com/ezyang	2025-10-05 07:38:25 +00:00
Tugsbayasgalan Manlaibaatar	f6537d9616	Move control flow export tests to new tracer (#163259 ) Differential Revision: [D82732614](https://our.internmc.facebook.com/intern/diff/D82732614) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163259 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #163136, #163137, #163258	2025-09-28 19:56:09 +00:00
Yidi Wu	8f6dbc0ba8	[scan] create fw and bw graphs via partitioning (#162754 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162754 Approved by: https://github.com/zou3519 ghstack dependencies: #161557, #161664, #161808, #162025, #161732	2025-09-27 18:13:15 +00:00
Yidi Wu	b85bee3bbb	[hop] refactor check input alias and mutation to be a graph pass (#162025 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/162025 Approved by: https://github.com/zou3519 ghstack dependencies: #161557, #161664, #161808	2025-09-27 18:13:15 +00:00
Yidi Wu	66dbf2c9f5	[scan][autograd] clone outputs that's aliasing with inputs or outputs in bw (#161808 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/161808 Approved by: https://github.com/zou3519 ghstack dependencies: #161557, #161664	2025-09-27 18:13:15 +00:00
bobrenjc93	7dcb568c8f	Turn on capture_scalar_outputs when fullgraph=True (#163121 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/163121 Approved by: https://github.com/laithsakka	2025-09-18 21:24:15 +00:00
Thomas Bohnstingl	87cc126457	[associative_scan] partial gradient support (#162388 ) This PR tests the partial gradient support of the `associative_scan` operation. It replaces https://github.com/bohnstingl/pytorch/pull/6 Pull Request resolved: https://github.com/pytorch/pytorch/pull/162388 Approved by: https://github.com/ydwu4	2025-09-09 23:52:29 +00:00
Prachi Gupta	c0142f5c06	[ROCm] Enabling several UTs (#161715 ) All these UTs are working as is, just removing the skip - test_p2p_ipc - test_repros.py: working, added fp8 support - test_activation_checkpointing.py - test_content_store.py - test_cuda_multigpu.py - test_compute_comm_reordering.py - test_segment_reductions.py - test_dataloader.py - test_math_ops.py - test_loop_ordering.py - test_control_flow.py - distributed_test.py - test_mem_tracker.py - test_fsdp_optim_state.py - test_fully_shard_mixed_precision.py: skipped for < ROCm7.0 - test_aot_inductor_custom_ops.py - test_c10d_ops_nccl.py - test_eager_transforms.py - test_sparse_csr.py - test_inductor_collectives.py - test_fake_tensor.py - test_cupy_as_tensor.py - test_cuda.py: enable UTs that are working - test_matmul_cuda.py: enable UTs that are working Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715 Approved by: https://github.com/msaroufim Co-authored-by: Mark Saroufim <marksaroufim@fb.com>	2025-09-09 15:49:21 +00:00
Thomas Bohnstingl	07f07309c6	[associative_scan] Autograd separated (#139939 ) This PR implements the Autograd feature of the associative_scan. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139939 Approved by: https://github.com/huydhn	2025-09-08 23:30:11 +00:00
Avik Chaudhuri	711c8c821e	shape guards (#161178 ) Summary: This PR introduces shape guards to export. Previously only value ranges, equalities, and specializations would be tracked for symbolic expressions, and we had a forward hook to check them. Instead now we create a function to check shape guards and call it in the exported program. Test Plan: updated several tests Rollback Plan: Differential Revision: D80713603 Pull Request resolved: https://github.com/pytorch/pytorch/pull/161178 Approved by: https://github.com/tugsbayasgalan	2025-09-08 22:44:09 +00:00
PyTorch MergeBot	5d819f3faf	Revert "[associative_scan] Autograd separated (#139939 )" This reverts commit `103f725afa`. Reverted https://github.com/pytorch/pytorch/pull/139939 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I am seeing a weird failure after this lands in trunk ([comment](https://github.com/pytorch/pytorch/pull/139939#issuecomment-3267945657))	2025-09-08 20:42:47 +00:00
Thomas Bohnstingl	103f725afa	[associative_scan] Autograd separated (#139939 ) This PR implements the Autograd feature of the associative_scan. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139939 Approved by: https://github.com/ydwu4	2025-09-08 03:21:17 +00:00
Yidi Wu	ec2e3687c7	[while_loop][autograd] support autograd_key of while_loop (#160483 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160483 Approved by: https://github.com/zou3519	2025-09-07 21:55:29 +00:00
PyTorch MergeBot	8235c4f65d	Revert "[ROCm] Enabling several UTs (#161715 )" This reverts commit `b9ba612f7a`. Reverted https://github.com/pytorch/pytorch/pull/161715 on behalf of https://github.com/jeanschmidt due to Need to revert in order to revert https://github.com/pytorch/pytorch/pull/159473, feel free to merge it back once conflicts are cleared ([comment](https://github.com/pytorch/pytorch/pull/161715#issuecomment-3264040604))	2025-09-07 21:03:17 +00:00
PyTorch MergeBot	7a83cf430e	Revert " [while_loop][autograd] support autograd_key of while_loop (#160483 )" This reverts commit `2b8a83901c`. Reverted https://github.com/pytorch/pytorch/pull/160483 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but some trunk tests are failing either from this PR or the previous one in the stack ([comment](https://github.com/pytorch/pytorch/pull/160483#issuecomment-3263597325))	2025-09-07 08:50:49 +00:00
Yidi Wu	2b8a83901c	[while_loop][autograd] support autograd_key of while_loop (#160483 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160483 Approved by: https://github.com/zou3519 ghstack dependencies: #160548, #160467	2025-09-06 21:26:33 +00:00
Yidi Wu	48e3be3ab6	[while_loop][autograd] add hop while_loop_stack_output (#160467 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/160467 Approved by: https://github.com/zou3519 ghstack dependencies: #160548	2025-09-06 21:26:33 +00:00
Prachi Gupta	b9ba612f7a	[ROCm] Enabling several UTs (#161715 ) All these UTs are working as is, just removing the skip - test_p2p_ipc - test_repros.py: working, added fp8 support - test_activation_checkpointing.py - test_content_store.py - test_cuda_multigpu.py - test_compute_comm_reordering.py - test_segment_reductions.py - test_dataloader.py - test_math_ops.py - test_loop_ordering.py - test_control_flow.py - distributed_test.py - test_mem_tracker.py - test_fsdp_optim_state.py - test_fully_shard_mixed_precision.py: skipped for < ROCm7.0 - test_aot_inductor_custom_ops.py - test_c10d_ops_nccl.py - test_eager_transforms.py - test_sparse_csr.py - test_inductor_collectives.py - test_fake_tensor.py - test_cupy_as_tensor.py - test_cuda.py: enable UTs that are working - test_matmul_cuda.py: enable UTs that are working Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/161715 Approved by: https://github.com/pruthvistony, https://github.com/jeffdaily	2025-09-04 20:43:03 +00:00
Yidi Wu	266784ec6a	remove old while_loop_schema_gen test (#161202 ) Fixes https://github.com/pytorch/pytorch/issues/141202. This test is flaky for mysterious reasons and we have created a new way of creating schemas for hops. So delete the test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/161202 Approved by: https://github.com/zou3519	2025-08-22 18:22:29 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar	dbef606631	Add support for tracing vmap in pre-dispatch export (#154650 ) Summary: The ONNX team and the recent transformer upgrade ran into this error, and we also ran into it during our export benchmarking. This diff makes it possible to trace through the vmap implementation in pre-dispatch IR. Note that we don't support serializing functorch ops in pre-dispatch IR and in the future, we should desugar them to post-grad ops. The implementation strategy is: 1. We add python wrappers around vmap APIs so that we attach a custom torch function handler that is only on during non-strict export. The reason is we don't want to add this to the default torch_function handler because it will break BC. 2. Some dynamo changes to make sure it picks up the new python wrapper APIs. The reason is that when we do strict export, we need to re-materialize these APIs in pre-dispatch IR from torch IR. We can avoid this by special casing in dynamo for export to proxy different API calls, but I feel that is too much chaos because you need to be able to proxy 2 different variants of the same vmap API. Test Plan: CI Differential Revision: D75623875 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154650 Approved by: https://github.com/ezyang, https://github.com/zou3519	2025-08-20 19:31:07 +00:00
Yidi Wu	209143ddeb	[while_loop][inductor] fix aliased inputs by cloning (#160668 ) [fx_graph_cse](https://github.com/pytorch/pytorch/blob/main/torch/_functorch/compile_utils.py#L46) is executed in the min_cut partitioner, which accidentally creates aliasing for empty buffers, and we could see the following graph node for the joint graph with cmd: "pytest test/functorch/test_control_flow.py -k test_scan_multiple_layers_gradient_layers_2_device_cpu" ```python while_loop = torch.ops.higher_order.while_loop(while_loop_cond_graph_0_0, while_loop_body_graph_0_0, (full_default_4, empty_strided_default, full_default_2, full_default_3, full_default_2, full_default_3, full_default, full_default, rev, rev_1, rev_2, rev_3), (primals_4, primals_5, primals_6, primals_7)); ``` Notice the operand sequence "full_default_2, full_default_3, full_default_2, full_default_3, full_default, full_default", which indicates that the gradients of different layers now share the same buffer, which creates silent incorrectness. Fixes https://github.com/pytorch/pytorch/pull/158168. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160668 Approved by: https://github.com/zou3519 ghstack dependencies: #160548, #160374	2025-08-19 02:33:59 +00:00
Yidi Wu	ff86509a06	[map] filter none gradients and add autograd inductor tests (#160548 ) Will filter the none outputs in autograd backward for other hops as follow ups Pull Request resolved: https://github.com/pytorch/pytorch/pull/160548 Approved by: https://github.com/zou3519	2025-08-15 20:13:12 +00:00
Yidi Wu	da8f48d88f	[associative_scan] support gen_schema for associative_scan (#158883 ) In-place mutation may create inter-loop dependency that breaks the parallelism we have for associative_scan so we ban input mutations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158883 Approved by: https://github.com/zou3519 ghstack dependencies: #154193, #158965, #158863, #158864	2025-08-15 17:28:44 +00:00
Yidi Wu	cb9e2092a8	[scan] support gen_schema for scan (#158864 ) We don't want to allow scan's combine_fn to mutate its inputs. The semantics of the mutation can be confusing. For example: ```python def combine_fn(init, x): ``` If combine_fn mutates init, only the first iteration mutates init; the rest of the iterations mutate the previous carry, which is an intermediate result. This is a somewhat weird semantics because the only observable mutation is for init, which can be done outside of the combine_fn. If combine_fn mutates x, where x is a slice of scanned inputs (i.e. xs), this pattern is more meaningful but we've not seen any use case yet. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158864 Approved by: https://github.com/zou3519 ghstack dependencies: #154193, #158965, #158863	2025-08-15 17:28:44 +00:00
Yidi Wu	f6bf1573fc	[while_loop] support gen_schema for while_loop (#158863 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158863 Approved by: https://github.com/zou3519 ghstack dependencies: #154193, #158965	2025-08-15 17:28:34 +00:00
Yidi Wu	3fe3c23d4e	[cond] support gen_schema for cond (#154193 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154193 Approved by: https://github.com/zou3519	2025-08-15 17:28:13 +00:00
Michael Lazos	182975e01a	[Dynamo] Enable torch function dispatch on HOPs (#159708 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159708 Approved by: https://github.com/zou3519, https://github.com/XilunWu ghstack dependencies: #159707	2025-08-05 01:43:22 +00:00
Pian Pawakapan	39b54b78d7	[export] runtime asserts for while HOP subgraphs (#158467 ) Differential Revision: D78431075 For #158366 - Calls runtime asserts pass for HOP subgraphs (in reenter_make_fx) - For while_loop only (can be expanded), clones input tensors for subgraph tracing, so unbacked memos (item, nonzero, etc.) aren't reused Pull Request resolved: https://github.com/pytorch/pytorch/pull/158467 Approved by: https://github.com/ydwu4	2025-07-23 00:34:18 +00:00
Yidi Wu	a3396a9b85	[hop] set capture_scalar_outputs=True by default for compiled hops (#158480 ) We want to do it for two reasons: 1. It's tedious for users to manually turn on capture_scalar_outputs=True when compiling map and scan with inductor, where we decompose them into while_loop and use the idx tensor.item() to select a slice of the output buffer and write into it. This PR turns on the flag by default. 2. A graph break caused by capture_scalar_outputs=False would cause the hop to fail, and we should turn it on by default so that the error message is more meaningful. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158480 Approved by: https://github.com/zou3519	2025-07-18 07:16:50 +00:00
Xuehai Pan	c8d43cbc6e	[BE][3/6] fix typos in test/ (#157637 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157637 Approved by: https://github.com/yewentao256, https://github.com/albanD ghstack dependencies: #156605	2025-07-17 12:08:33 +00:00
PyTorch MergeBot	21990fbad9	Revert "[cond] support gen_schema for cond (#154193 )" This reverts commit `6de41ce0f8`. Reverted https://github.com/pytorch/pytorch/pull/154193 on behalf of https://github.com/Camyll due to issue landing internally, discussed with Yidi offline ([comment](https://github.com/pytorch/pytorch/pull/154193#issuecomment-3009160081))	2025-06-26 17:10:00 +00:00
Yidi Wu	6de41ce0f8	[cond] support gen_schema for cond (#154193 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154193 Approved by: https://github.com/zou3519 ghstack dependencies: #155644	2025-06-25 21:19:58 +00:00
Yidi Wu	3257c8f74c	[cond] preserve merged phs meta for subgraph (#155644 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155644 Approved by: https://github.com/zou3519	2025-06-25 21:19:58 +00:00
Avik Chaudhuri	463fe36532	fix error message on specialization with Dim.DYNAMIC (#155738 ) Previously specialization error messages would render sources that were pretty far from source-code names. E.g., given args named `x, y, zs`, the source for `y.size()[0]` would be rendered as `args[0][1].size()[0]`. This is because we created artificial local names following `(args, kwargs)` structure instead of reusing signatures. This PR fixes that situation. Basically we map prefixes of key paths that correspond to original arg names to root sources corresponding to those names; the rest of the key paths hang from these root sources. Differential Revision: [D76461391](https://our.internmc.facebook.com/intern/diff/D76461391/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155738 Approved by: https://github.com/bobrenjc93	2025-06-13 10:33:46 +00:00
Yidi Wu	d6be87648f	[hop schema] add schema.tree_spec to support pytree inputs (#154191 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154191 Approved by: https://github.com/zou3519 ghstack dependencies: #155261, #154072	2025-06-11 22:52:37 +00:00
Thomas Bohnstingl	fb5a787a8f	[HOP] Added clone for outputs of create_bw_fn that are aliasing the inputs (#153932 ) This PR fixes an issue with the new way of creating the bw graph introduced for cond. In particular, there is an issue if the bw function simply aliases the inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153932 Approved by: https://github.com/ydwu4	2025-06-04 23:52:52 +00:00
Animesh Jain	c881f2ddf3	[reland][dynamo] Mark a vt unspecialized nn module variable source earlier (#155099 ) Reland of https://github.com/pytorch/pytorch/pull/154780 Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/155099 Approved by: https://github.com/williamwen42	2025-06-04 23:05:36 +00:00
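
Note on cb9e2092a8 ([scan] support gen_schema for scan, #158864): the mutation semantics described in that message can be sketched in plain Python. This is only an illustration, not the torch higher-order op: `reference_scan` and `mutating_combine_fn` are hypothetical names, and Python lists stand in for tensors.

```python
# Hypothetical pure-Python stand-in for scan; this is NOT torch's implementation.
def reference_scan(combine_fn, init, xs):
    """Thread a carry through xs, collecting one output per element."""
    carry = init
    ys = []
    for x in xs:
        carry, y = combine_fn(carry, x)
        ys.append(y)
    return carry, ys


def mutating_combine_fn(carry, x):
    # In-place mutation of whichever object is the current carry.
    carry[0] += x
    # Return a fresh list as the next carry, plus the per-step output.
    return [carry[0]], carry[0]


init = [0]
final_carry, ys = reference_scan(mutating_combine_fn, init, [1, 2, 3])

# Only the first iteration touched `init`; later iterations mutated
# intermediate carries that the caller never sees.
print(init, final_carry, ys)
```

In this sketch the caller's `init` ends up reflecting only the first step, while the remaining in-place updates land on intermediate carries, which is the confusing behavior the commit message gives as the reason for banning input mutation.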


246 Commits