pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Laith Sakka	0029259bdf	Add view_simple as meta function for view, and avoid calling reshape_view_helper. (#154757 ) address https://github.com/pytorch/pytorch/issues/153303 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154757 Approved by: https://github.com/bobrenjc93, https://github.com/leslie-fang-intel	2025-06-12 09:58:15 +00:00
Pian Pawakapan	8ad6197b46	[draft export] avoid storing intermediate real tensors in proxies (#154630 ) Handles GC for non-strict draft export; GPU memory usage shouldn't be much more than eager mode + input tensors now. While trying to do draft export CPU offloading, I found out GC is feasible, because in non-strict, there's 2 places holding references to a `.real_tensor` attribute: 1) the FakeTensors in fake tensor prop, but these are held by the actual variables in the model's forward call, and so the real tensor gets gc-ed along with the fake one when the variable goes out of scope. 2) A clone of the fake tensor in 1) stored in `proxy.node.meta["val"]`, which was added in https://github.com/pytorch/pytorch/pull/150948. But we didn't actually need to store them on intermediate values; the placeholders are enough for retracing/lowering. Avoiding storing the intermediate values in 2), the values in 1) should be naturally GC-ed, and the real-tensor memory usage for non-strict should be pretty similar to eager computation? Strict still OOMs; dynamo still holds these in variable tracking, and not sure how to GC those. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154630 Approved by: https://github.com/angelayi, https://github.com/yushangdi	2025-06-12 01:18:57 +00:00
PyTorch MergeBot	0fab32290a	Revert "[draft export] avoid storing intermediate real tensors in proxies (#154630 )" This reverts commit `5acb8d5080`. Reverted https://github.com/pytorch/pytorch/pull/154630 on behalf of https://github.com/malfet due to This still ooms, at least occasionally see `78624679a8/1` ([comment](https://github.com/pytorch/pytorch/pull/154630#issuecomment-2923759745))	2025-05-31 00:07:56 +00:00
Pian Pawakapan	5acb8d5080	[draft export] avoid storing intermediate real tensors in proxies (#154630 ) Handles GC for non-strict draft export; GPU memory usage shouldn't be much more than eager mode + input tensors now. While trying to do draft export CPU offloading, I found out GC is feasible, because in non-strict, there's 2 places holding references to a `.real_tensor` attribute: 1) the FakeTensors in fake tensor prop, but these are held by the actual variables in the model's forward call, and so the real tensor gets gc-ed along with the fake one when the variable goes out of scope. 2) A clone of the fake tensor in 1) stored in `proxy.node.meta["val"]`, which was added in https://github.com/pytorch/pytorch/pull/150948. But we didn't actually need to store them on intermediate values; the placeholders are enough for retracing/lowering. Avoiding storing the intermediate values in 2), the values in 1) should be naturally GC-ed, and the real-tensor memory usage for non-strict should be pretty similar to eager computation? Strict still OOMs; dynamo still holds these in variable tracking, and not sure how to GC those. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154630 Approved by: https://github.com/angelayi, https://github.com/yushangdi	2025-05-30 21:06:55 +00:00
Laith Sakka	0ec8fe46d7	cleanup, refactor and add missing self._dde_suppressed checks (#152657 ) so two things other than cleanups and refactoring 1) do not use propagate_real_tensors to resolve eval under guard_or_true/guard_or_false . 2) do not guard for dimensions of type DimDynamic.OBLIVIOUS_SIZE under guard_or_true/guard_or_false . Pull Request resolved: https://github.com/pytorch/pytorch/pull/152657 Approved by: https://github.com/pianpwk	2025-05-19 16:15:14 +00:00
PyTorch MergeBot	1748fa529a	Revert "cleanup, refactor and add missing self._dde_suppressed checks (#152657 )" This reverts commit `f7fb2f66e3`. Reverted https://github.com/pytorch/pytorch/pull/152657 on behalf of https://github.com/malfet due to Broke lint ([comment](https://github.com/pytorch/pytorch/pull/152657#issuecomment-2887539146))	2025-05-16 19:42:20 +00:00
Laith Sakka	f7fb2f66e3	cleanup, refactor and add missing self._dde_suppressed checks (#152657 ) so two things other than cleanups and refactoring 1) do not use propagate_real_tensors to resolve eval under guard_or_true/guard_or_false . 2) do not guard for dimensions of type DimDynamic.OBLIVIOUS_SIZE under guard_or_true/guard_or_false . Pull Request resolved: https://github.com/pytorch/pytorch/pull/152657 Approved by: https://github.com/pianpwk	2025-05-16 19:10:04 +00:00
angelayi	d51bc27378	[export] Make draft_export public (#153219 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153219 Approved by: https://github.com/pianpwk	2025-05-14 02:18:36 +00:00
Pian Pawakapan	fd3d339e17	[dynamic shapes] be less aggressive with runtime assert CSE for bounds (#151590 ) Fixes #150540 Fixes #147772 Stops trying to CSE bound expressions, only does exact deduplication for runtime asserts. Adds the test cases to check that AOTAutograd doesn't data-dependent error out when retracing due to not seeing the asserts. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151590 Approved by: https://github.com/laithsakka	2025-04-23 23:07:00 +00:00
Pian Pawakapan	54f736155b	[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims (#150127 ) For reshape/view: removes fast paths for 0 elements, checking dimensions to skip. Modifies the loop accumulating input elements, to raise a UserError if we run out of dimensions, graph breaking for compile and erroring out for export. For infer_size: assumes if user passes us an unbacked, it's probably not -1 Will think about changes in https://docs.google.com/document/d/1WYx6EZwVDXtBnWyrzoecgGWdiK0V3XZKftfpWwQ5i3E/edit?tab=t.0#heading=h.22k54zym11qp in a later PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/150127 Approved by: https://github.com/laithsakka	2025-04-23 05:42:30 +00:00
PyTorch MergeBot	e76c0b159a	Revert "[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims (#150127 )" This reverts commit `a02eae8142`. Reverted https://github.com/pytorch/pytorch/pull/150127 on behalf of https://github.com/malfet due to Caused TestDynamoTimed.test_dynamo_timed to fail on macOS, see https://github.com/pytorch/pytorch/actions/runs/14584536979/job/40908019050 ([comment](https://github.com/pytorch/pytorch/pull/150127#issuecomment-2820081721))	2025-04-22 05:05:50 +00:00
Pian Pawakapan	a02eae8142	[dynamic shapes] guard_or_false for _reshape_view_helper, utils._infer_size for wildcard dims (#150127 ) For reshape/view: removes fast paths for 0 elements, checking dimensions to skip. Modifies the loop accumulating input elements, to raise a UserError if we run out of dimensions, graph breaking for compile and erroring out for export. For infer_size: assumes if user passes us an unbacked, it's probably not -1 Will think about changes in https://docs.google.com/document/d/1WYx6EZwVDXtBnWyrzoecgGWdiK0V3XZKftfpWwQ5i3E/edit?tab=t.0#heading=h.22k54zym11qp in a later PR Pull Request resolved: https://github.com/pytorch/pytorch/pull/150127 Approved by: https://github.com/laithsakka	2025-04-22 01:14:15 +00:00
angelayi	01f1cc44cb	Rename register_fake_profile to unsafe_generate_fake_kernels (#151797 ) Fixes https://docs.google.com/document/d/1BZsuUR1zJ-52Y7wP4yWX8beB4dwYbgdu5o1qKam_iWg/edit?disco=AAABiJdX1XU Pull Request resolved: https://github.com/pytorch/pytorch/pull/151797 Approved by: https://github.com/zou3519	2025-04-21 23:08:15 +00:00
angelayi	d5dda82586	[export] Integrate meta kernel generation with draft-export (#150809 ) If a custom operator does not contain a fake impl, currently draft-export will use the real-tensor propagation to get an output for the operator and continue tracing. However if we retrace the exported model using `ep.run_decompositions`, or `export`, or run the exported program with fake tensors, we'll still fail because there's no fake impl. With this PR, after draft-export we will generate an operator profile for each operator call that we encounter, and store this on the report attached to the exported program `ep._report.op_profiles`. Users can then use `torch._library.fake_profile.register_fake_profile` to temporarily generate and register a fake impl based on these operator profiles. This way future fake tensor retracing will work. The workflow would look something like:
```python
class M(torch.nn.Module):
    def forward(self, a, b):
        res = torch.ops.mylib.foo8(a, b)  # no fake impl
        return res

ep = export(M(), (torch.ones(3, 4), torch.ones(3, 4)))  # this fails bc no fake impl

ep = draft_export(M(), (torch.ones(3, 4), torch.ones(3, 4)))
ep.run_decompositions()  # this fails bc no fake impl

# this registers fake impls based on the profiles
with torch._library.fake_profile.register_fake_profile(ep._report.op_profiles):
    decomp = ep.run_decompositions()  # this works

new_inp = (
    torch.ones(2, 3, 4),
    torch.ones(2, 3, 4),
)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150809 Approved by: https://github.com/zou3519	2025-04-17 20:52:31 +00:00
Shangdi Yu	92e81cf41a	Add real_tensor to the FakeTensor in node.meta["val"] (#150948 ) Summary: We need real_tensor on the FakeTensor in node.meta["val"] in order to aot_compile the draft exported programs. Otherwise, we cannot propagate real tensors even when fake_mode.propagate_real_tensors = True. This also fixes real tensor propagation in `run_decomposition()`. Test Plan: ``` buck2 run @mode/dev-nosan caffe2/test:test_export -- -r test_dedup_data_dependent_failure ``` Differential Revision: D72732714 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150948 Approved by: https://github.com/angelayi	2025-04-10 00:11:46 +00:00
angelayi	ea0cbba1fc	[export] Refine draft-export CVE with Dim.AUTO (#150876 ) Instead of using refine_dynamic_shapes_from_suggested_fixes to fix ConstraintViolationErrors in draft-export, we can just convert the dims to Dim.AUTO, which is less error prone Pull Request resolved: https://github.com/pytorch/pytorch/pull/150876 Approved by: https://github.com/pianpwk	2025-04-09 19:44:30 +00:00
angelayi	bf34e228c5	[export] Beef up guard_added logs (#149465 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149465 Approved by: https://github.com/pianpwk	2025-03-20 23:02:07 +00:00
Shangdi Yu	2a7d583452	Consolidate torchbind fake class registration (#149063 ) Summary: Remove duplicated fake class registration Test Plan: CI Differential Revision: D71052419 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149063 Approved by: https://github.com/angelayi	2025-03-13 06:57:13 +00:00
Pian Pawakapan	a929e11e4f	[dynamic shapes][export] ignore when real-tensor fallback fails (#147779 ) Summary: uninspired solution to https://github.com/pytorch/pytorch/issues/147402 Test Plan: test_draft_export Differential Revision: D70132269 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147779 Approved by: https://github.com/bobrenjc93	2025-03-03 19:09:56 +00:00
Catherine Lee	f47573f70d	Add super().setUp() to some test cases (#147651 ) I saw that their disabled issues were getting spammed with comments, meaning that they were still running in CI despite having a disable issue, so I added the super().setUp() call to check if there's a disable issue for them since they were missing it Pull Request resolved: https://github.com/pytorch/pytorch/pull/147651 Approved by: https://github.com/huydhn	2025-02-23 18:21:17 +00:00
Angela Yi	6e0b09728a	[export] Remove report from draft-export output (#147558 ) Summary: This matches the export API. To print the report, people can just do `print(ep._report)`. This information is also displayed in the terminal after the draft_export call. Test Plan: CI Reviewed By: SherlockNoMad Differential Revision: D69689154 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147558 Approved by: https://github.com/pianpwk	2025-02-22 00:54:29 +00:00
Pian Pawakapan	1e94c7aaa4	[draft_export] only clear pending unbacked symbols for overwritten kernels (#147427 ) This was wrong, we were doing this in all cases Pull Request resolved: https://github.com/pytorch/pytorch/pull/147427 Approved by: https://github.com/angelayi	2025-02-20 00:07:54 +00:00
angelayi	57060bebf3	[symbolic shapes] Add replacement for backed symints (#147240 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147240 Approved by: https://github.com/pianpwk ghstack dependencies: #146939	2025-02-18 18:49:51 +00:00
angelayi	84abeaad5c	[export] Log evaluate_expr (#146939 ) We want to log each symnode created so that we can do provenance tracking in the tlparse report generated for draft export. To do this, we want to assign a unique id to every symnode, which python's `id` function already does, and then for every expression created, we can find the provenance by tracing back through its arguments' ids. This logging only happens when dtrace_structured is enabled, which is only when running draft export. An example output is as follows: [image: https://github.com/user-attachments/assets/88bb31b4-8c31-43fb-aa88-08b573b9f71d] For the increase in the compile_time_instruction_count benchmark, this seems unavoidable because I need to call `id` to get the unique identifier for each symnode. But I believe `id` is an inexpensive operation, so hopefully it should be ok? I tried doing the following: * Originally I was passing around `self`, which is a SymNode, which caused the compile time to be ~6.36M * I changed it to pass around `id(self)` instead, which reduced the compile time to ~6.33M * Then I changed it to be passed as a positional arg instead of a kwarg, which reduced the compile time to ~6.22M, but this doesn't seem to be a super worthwhile fix? #suppress-bc-linter Pull Request resolved: https://github.com/pytorch/pytorch/pull/146939 Approved by: https://github.com/oulgen	2025-02-18 18:49:51 +00:00
angelayi	67cbbb29e0	[export] Dedup expression_created logs (#146859 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146859 Approved by: https://github.com/pianpwk ghstack dependencies: #146532, #146533, #146534, #146858	2025-02-13 00:21:34 +00:00
angelayi	59bc5d0d71	[tlparse] Add stacktrace filter utility (#146858 ) Added a utility function for capturing the user stack and framework stacktrace. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146858 Approved by: https://github.com/bobrenjc93 ghstack dependencies: #146532, #146533, #146534	2025-02-13 00:21:34 +00:00
angelayi	b4bdbce1ac	[export] Use custom stream logger in draft-export (#146533 ) Using a custom logger so that we can store our own buffer to dedup logs that look the same. The schema for deduping is as follows:
```python
if key == "missing_fake_kernel":
    return hash((key, data["op"]))  # Same ops get deduped
elif key == "mismatched_fake_kernel":
    return hash((key, data["op"], data["reason"]))  # Same op and reason for errors get deduped
elif key == "propagate_real_tensors":
    return hash((key, json.dumps(data["stack"])))  # Guards appearing on the same stacktrace get deduped
elif key == "create_unbacked_symbol":
    return hash((key, json.dumps(data["stack"])))  # Unbacked symbols appearing on the same stacktrace get deduped
```
Notably, guards appearing on the same stacktrace get deduped. This is because there are some cases in PT2I models where a piece of code which creates a new unbacked symint + runs into a DDE gets called 800 times, causing 800 new symints to be created, and 800 propagate_real_tensor errors that are all the same expression. This is hard to look at, so we should just deduplicate this. The con of this is that if multiple DDEs exist on the same stacktrace, we will only show the first issue. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146533 Approved by: https://github.com/avikchaudhuri ghstack dependencies: #146532	2025-02-13 00:21:34 +00:00
angelayi	be387f57b1	[symbolic shapes] Log SymNode id for provenance (#146532 ) We can use the SymNode id to point us back to how previous expressions were created, and construct this nice tree in tlparse: [image: https://github.com/user-attachments/assets/531b03e8-4398-4d0a-bd11-16078256041c] Pull Request resolved: https://github.com/pytorch/pytorch/pull/146532 Approved by: https://github.com/bobrenjc93	2025-02-13 00:21:34 +00:00
Pian Pawakapan	3a6a203b98	[dynamic shapes][real tensor tracing] propagate unbacked hint when creating mod replacement (#146381 ) Fixes data-dependent errors for 2 PT2I models in draft export Pull Request resolved: https://github.com/pytorch/pytorch/pull/146381 Approved by: https://github.com/angelayi	2025-02-06 01:48:40 +00:00
Yiming Zhou	549e230c33	[draft_export] Clear pending unbacked symbols when overriding mismatched fake kernels (#146089 ) Summary: When encountering a mismatched fake kernel that also creates unbacked symbols, draft export will fail with `PendingUnbackedSymbolNotFound` error. Clearing `shape_env.pending_fresh_unbacked_symbols` fixes this issue. Test Plan: ``` buck2 run mode/dev-nosan caffe2/test:test_export -- -r test_override_mismatched_fake_kernel_with_unbacked_symbols ``` Differential Revision: D68920990 Pull Request resolved: https://github.com/pytorch/pytorch/pull/146089 Approved by: https://github.com/pianpwk	2025-02-01 03:32:50 +00:00
angelayi	1c9014a135	[export] Add tlparse to draft-export (#145810 ) Dependent on https://github.com/ezyang/tlparse/pull/87/files Pull Request resolved: https://github.com/pytorch/pytorch/pull/145810 Approved by: https://github.com/pianpwk	2025-01-29 19:26:00 +00:00
Pian Pawakapan	4be831ba2d	[draft_export] fix dense-in-memory check for inferring fakes (#145653 ) Test Plan: fixes check for dense tensors with size-1 dimensions Differential Revision: D68644028 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145653 Approved by: https://github.com/zou3519	2025-01-28 02:52:14 +00:00
Aaron Orenstein	99dbc5b0e2	PEP585 update - test (#145176 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145176 Approved by: https://github.com/bobrenjc93	2025-01-22 04:48:28 +00:00
Angela Yi	a94ec0a9a5	[aoti] Remove example inputs from aoti_compile_and_package (#144520 ) Summary: The args were removed in https://github.com/pytorch/pytorch/pull/140991 Test Plan: CI Differential Revision: D67998954 Pull Request resolved: https://github.com/pytorch/pytorch/pull/144520 Approved by: https://github.com/yushangdi	2025-01-10 21:56:23 +00:00
Yanan Cao (PyTorch)	ba5cacbc17	[Codemod][AddExplicitStrictExportArg] caffe2/test (#143688 ) Reviewed By: avikchaudhuri Differential Revision: D67530154 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143688 Approved by: https://github.com/tugsbayasgalan	2024-12-27 07:58:44 +00:00
Tom Ritchford	d8c8ba2440	Fix unused Python variables in test/[e-z]* (#136964 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964 Approved by: https://github.com/justinchuby, https://github.com/albanD	2024-12-18 23:02:30 +00:00
Edward Z. Yang	a87925cc7e	Fix AttributeError: 'int' object has no attribute 'node' due to constant prop (#141250 ) Fixes https://github.com/pytorch/pytorch/issues/140625 Signed-off-by: Edward Z. Yang <ezyang@meta.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/141250 Approved by: https://github.com/bobrenjc93	2024-11-24 08:20:04 +00:00
angelayi	53df1c11cd	[export] Add custom op guards (#141072 ) For custom ops that do not have a meta kernel, draft export automatically creates a meta kernel based on the tracing example inputs. To ensure that the assumptions made during tracing are clear to the user, we add assertions into the traced exported program. An example graph:
```
ExportedProgram:
    class GraphModule(torch.nn.Module):
        def forward(self, a: "f32[s0, s1]", b: "f32[s2, s3]"):
            # File: /data/users/angelayi/pytorch/test/export/test_draft_export.py:172 in forward, code: res1 = torch.ops.mylib.foo4(a, b)
            _assert_tensor_metadata = torch.ops.aten._assert_tensor_metadata(a, dtype = torch.float32, device = device(type='cpu')); _assert_tensor_metadata = None
            _assert_tensor_metadata_1 = torch.ops.aten._assert_tensor_metadata(b, dtype = torch.float32, device = device(type='cpu')); _assert_tensor_metadata_1 = None
            foo4: "f32[u2, u3]" = torch.ops.mylib.foo4.default(a, b); a = b = None
            return (foo4,)
```
Differential Revision: [D66321129](https://our.internmc.facebook.com/intern/diff/D66321129) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141072 Approved by: https://github.com/pianpwk ghstack dependencies: #141071	2024-11-22 20:55:04 +00:00
Pian Pawakapan	1132b6764a	[draft export] generate fake outputs when real tensor prop finds mismatches (#139766 ) Currently real tensor tracing raises MetadataMismatchErrors if registered fake kernels don't match the real kernels (e.g. shape, aliasing, dtype, etc.). This adds an option to use fake kernel inference to bypass mismatches - this option defaults to False for real tensor tracing, but is on for draft export. Pull Request resolved: https://github.com/pytorch/pytorch/pull/139766 Approved by: https://github.com/angelayi, https://github.com/zou3519	2024-11-21 08:01:09 +00:00
Angela Yi	de509abe1c	[export] Dedup data-dependent errors based on stacktrace (#139540 ) Summary: Dedup the data-dependent errors based on the stacktrace they point to. Right now we just display every propagate-real-tensor log that shows up, but we can actually dedup them if they are due to the same piece of code (e.g. there could be multiple calls to a piece of code that does some data-dependent computation). This occurred when trying out draft export on the PT2I model zoo. For a specific model, previously we would get ~3k data-dependent errors, but after deduping based on the stacktrace we now only get 4 errors. Test Plan: CI Differential Revision: D65374254 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139540 Approved by: https://github.com/pianpwk, https://github.com/zou3519	2024-11-05 18:16:05 +00:00
angelayi	86db2cd194	[export] Initial draft export (#139383 ) Differential Revision: [D65288590](https://our.internmc.facebook.com/intern/diff/D65288590) Pull Request resolved: https://github.com/pytorch/pytorch/pull/139383 Approved by: https://github.com/zou3519	2024-11-01 06:25:44 +00:00

41 Commits