pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
James Wu	63e1b58a13	[easy] [Precompile] Refactor guards, improve typing (#160530 ) Purely a refactor, improve typing and get rid of some type errors. Make certain fields as nonnull, since in general it's not empty. The goal of this stack of PRs is to move the save/load logic of guard serialization into separate, flat phases, instead of being embedded in guard creation. This way, we can put a try/catch around it and fail safely if certain guards are not serializable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160530 Approved by: https://github.com/Lucaskabela, https://github.com/Skylion007	2025-08-17 17:54:55 +00:00
Prajesh Praveen Anchalia	052c441cf4	Add logging for when inbuilt_inline_nn_modules will help with ID_MATCH guard triggered recompiles (#160592 ) We add a logging around when an ID_MATCH guard is added at a place where inbuilt_inline_nn_modules would inline it. This is done with the aim of tagging recompiles that could be avoided by setting inbuilt_inline_nn_modules flag. It will help us log and track the flag's adoption and potentially quantify saving in the the number of recompiles. Differential Revision: D80075975 Pull Request resolved: https://github.com/pytorch/pytorch/pull/160592 Approved by: https://github.com/anijain2305	2025-08-15 17:09:39 +00:00
Animesh Jain	8d3d1c8443	[dynamo] fixes to propagate tag safeness (#159807 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159807 Approved by: https://github.com/jansel	2025-08-12 04:50:13 +00:00
Animesh Jain	a4f69a5da0	[dynamo][guards] Remove guards on stdlib modules (#159913 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159913 Approved by: https://github.com/StrongerXi	2025-08-08 16:26:04 +00:00
Lucas Kabela	40c4d61f9a	[Dynamo][Better Engineering] Typing `torch/_dynamo/guards.py` (#159315 ) As part of better engineering effort, we would like to improve out type support to improve dev experience in dynamo This PR adds strict typing support to `torch/_dynamo/guards.py` Running ``` mypy torch/_dynamo/guards.py --linecount-report /tmp/coverage_log ``` \| -------- \| Lines Annotated \| Lines Total \| % lines covered \| Funcs Annotated \| Funcs Total \| % funcs covered \| \| -------- \| ------- \| -------- \| ------- \| ------- \| ------- \| ------- \| \| Main \| 2030 \| 3945 \| 51.46% \| 70 \| 138 \| 50.72% \| \| This PR \| 4055 \| 4055 \| 100.00% \| 138 \| 138 \| 100.00% \| \| Delta \| +2025 \| +90 \| +48.54% \| +68 \| 0 \| +49.28% \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/159315 Approved by: https://github.com/williamwen42, https://github.com/Skylion007	2025-08-06 21:52:14 +00:00
Zhengxu Chen	79eca4677b	[precompile] Skip serializing unnecesssary objects for guards. (#158926 ) Summary: The following type of objects don't need to be serialized for precompile: 1. PyCapsule because we don't guard on C binding objects in meaningful ways. 2. Code object because we only id matching on these but id matches will always be dropped for precompile. 3. Nested function objects since we also ban CLOSURE_MATCH. Test Plan: buck run mode/opt test/dynamo:test_dynamo -- -k test_skipped_objects Rollback Plan: Differential Revision: D78816888 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158926 Approved by: https://github.com/jamesjwu	2025-08-06 15:00:28 +00:00
Animesh Jain	66ad881fc7	[dynamo][guards][refactor] Simplify type extraction from GuardManager (#159752 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159752 Approved by: https://github.com/jansel	2025-08-04 16:51:27 +00:00
Animesh Jain	64cbaa876c	[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534 Approved by: https://github.com/jansel	2025-08-04 05:12:44 +00:00
Animesh Jain	4516c59f5f	[dynamo][source] Add special source for __code__ and __closure__ (#159722 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159722 Approved by: https://github.com/jansel	2025-08-04 05:02:05 +00:00
PyTorch MergeBot	805a102beb	Revert "[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 )" This reverts commit `1616777cd2`. Reverted https://github.com/pytorch/pytorch/pull/159534 on behalf of https://github.com/malfet due to Broke some inductor test and lint among other things, see `9c18901bfd/1` ([comment](https://github.com/pytorch/pytorch/pull/159534#issuecomment-3146983186))	2025-08-03 04:58:32 +00:00
Animesh Jain	1616777cd2	[dynamo][guards] Make class members go through obj.__class__.__dict__ (#159534 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159534 Approved by: https://github.com/jansel ghstack dependencies: #159186	2025-08-02 18:04:35 +00:00
Lucas Kabela	2b1ae29960	[Dynamo][Better Engineering] Add typing annotations to guard and source (#158397 ) (#159491 ) Summary: X-link: https://github.com/pytorch/executorch/pull/12986 As part of better engineering week, we would like to improve out type support to improve dev experience in dynamo This PR adds strict typing support to a critical set of files for dynamo, `source.py` and the base `_guards.py` Running ``` mypy torch/_dynamo/source.py torch/_guards.py --linecount-report /tmp/coverage_log ``` \| -------- \| Lines Unannotated \| Lines Total \| % lines covered \| Funcs Unannotated \| Funcs Total \| % funcs covered \| \| -------- \| ------- \| -------- \| ------- \| ------- \| ------- \| ------- \| \| Main \| 1227 \| 2208 \| 55.57% \| 207 \| 362 \| 57.18% \| \| This PR \| 2217 \| 2217 \| 100.00% \| 362 \| 362 \| 100.00% \| \| Delta \| +990 \| +9 \| +44.43% \| +155 \| 0 \| +42.82% \| cc jgong5 mingfeima XiaobingSuper sanchitintel ashokei jingxu10 jerryzh168 voznesenskym penguinwu EikanWang Guobing-Chen zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov coconutruben Test Plan: Imported from GitHub, without a `Test Plan:` line. Rollback Plan: Reviewed By: JacobSzwejbka, yangw-dev Differential Revision: D79199389 Pulled By: Lucaskabela Pull Request resolved: https://github.com/pytorch/pytorch/pull/159491 Approved by: https://github.com/anijain2305, https://github.com/yangw-dev	2025-07-30 22:57:50 +00:00
James Wu	90fd06be71	Various bugfixes for running NanoGPT training (#159166 ) Fix various small bugs with running nanogpt on torchbenchmark in OSS under python 3.10. After these changes, the following now succeeds: ``` tlp python benchmarks/dynamo/torchbench.py --only nanogpt --performance --training --backend inductor --caching-precompile --warm-start-latency ``` Cold start: https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmp12LuZ5/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 Warm start (we are invesigating the recompile): https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpT5YTB2/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159166 Approved by: https://github.com/zhxchen17	2025-07-30 16:30:22 +00:00
Animesh Jain	52a52d1b78	[dynamo][guards] Skip no tensor aliasing guard on inbuilt nn module buffers (#159453 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159453 Approved by: https://github.com/jansel	2025-07-30 15:31:07 +00:00
Animesh Jain	7eb5fdb358	[dynamo][guards] Recursive dict tag optimization (#159183 ) Design doc here - https://docs.google.com/document/d/1W29DrWID5miGWlZXspsQVN5U0zydE3kjZpziOXrhuaY/edit?tab=t.0#bookmark=id.sba04iw9sp68 Pull Request resolved: https://github.com/pytorch/pytorch/pull/159183 Approved by: https://github.com/jansel	2025-07-30 06:01:32 +00:00
PyTorch MergeBot	d987a6f7f0	Revert "[Dynamo][Better Engineering] Add typing annotations to guard and source (#158397 )" This reverts commit `abcb24f4de`. Reverted https://github.com/pytorch/pytorch/pull/158397 on behalf of https://github.com/yangw-dev due to Suggested to fix failing internal signals on D78911890 ([comment](https://github.com/pytorch/pytorch/pull/158397#issuecomment-3133823766))	2025-07-29 19:49:40 +00:00
Animesh Jain	e43e09e6c1	[dynamo][guards] Use lambda guards for object aliasing to improve object aliasing guards (#159288 ) # Note - On Lambda guarding of object aliasing # We previously installed object‑aliasing guards as relational guards, # but that undermined the recursive‑dict guard optimization: placing the # aliasing guard at a leaf prevented the parent dict node from # qualifying as a recursive‑dict guard root. Because aliasing guards are # rare, we now emit them as epilogue guards via a small Python lambda. # This repeats the access in Python—adding a bit of work—but the # overhead is outweighed by the gains from enabling recursive‑dict guard # optimization. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159288 Approved by: https://github.com/StrongerXi	2025-07-29 18:36:49 +00:00
zhxchen17	90c241dedd	[precompile] Support user defined function calls from bytecode. (#158947 ) Previously precompile was implemented under the assumption that dynamo always inlines the user code and generate resume functions when a graph break is hit. In cases like nanogpt training, there exists nontrivial amount of code causing dynamo to fail the speculation and stop inlining certain type of user function. This results in more code objects to be tracked by CompilePackage. Since these new code objects are user defined, we need to also serialize the location of these code so that we can load the precompile entries to the these code objects in another process. With this fix, we are able to run nanogpt inference+training with precompile under torchbench. Differential Revision: [D78691422](https://our.internmc.facebook.com/intern/diff/D78691422/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158947 Approved by: https://github.com/jamesjwu	2025-07-24 20:10:57 +00:00
Lucas Kabela	abcb24f4de	[Dynamo][Better Engineering] Add typing annotations to guard and source (#158397 ) As part of better engineering week, we would like to improve out type support to improve dev experience in dynamo This PR adds strict typing support to a critical set of files for dynamo, `source.py` and the base `_guards.py` Running ``` mypy torch/_dynamo/source.py torch/_guards.py --linecount-report /tmp/coverage_log ``` \| -------- \| Lines Unannotated \| Lines Total \| % lines covered \| Funcs Unannotated \| Funcs Total \| % funcs covered \| \| -------- \| ------- \| -------- \| ------- \| ------- \| ------- \| ------- \| \| Main \| 1227 \| 2208 \| 55.57% \| 207 \| 362 \| 57.18% \| \| This PR \| 2217 \| 2217 \| 100.00% \| 362 \| 362 \| 100.00% \| \| Delta \| +990 \| +9 \| +44.43% \| +155 \| 0 \| +42.82% \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/158397 Approved by: https://github.com/anijain2305	2025-07-24 15:55:18 +00:00
James Wu	f55c5d085e	[Precompile] Various small bugfixes, add CachingPrecompile to torchbench (#158847 ) This PR addresses a few small bugfixes needed to make NanoGPT inference work, and also adds a new `--caching-precompile` argument to torchbench. With `--caching-precompile`, after every benchmark we save precompile artifacts to DynamoCache, allowing us to test caching precompile on all existing benchmarks. The following bugfixes are in this PR to make all of this work: - Fix global variables being pruned with DUPLICATE_INPUT guards. DUPLICATE_INPUT guards have additional vars from the second input, which we track with additional_local_vars, but we never tracked additional global variables. This fixes the issue. (See torch/_dynamo/guards.py changes) - Return None from PRecompileContext.serialize() if no new dynamo compiles occurred. There's no reason to save artifacts (i.e. autotuning artifacts, etc) if no dynamo_compile occurred, so we return None early. We may later want to support editing existing dynamo artifacts as a TODO, but that's upcoming. - log `dynamo_start` on CompilePackage.load: This is only needed so that tlparse doesn't ignore TORCH_TRACE logs generated when caching precompile hits. If there are no actual compiles, we never log a "dynamo_start" entry, which makes internal tlparse ignore the TORCH_TRACE file. ## Test Plan After this PR, the following now works: ``` TORCH_LOGS=dynamo tlp python benchmarks/dynamo/torchbench.py --only nanogpt --performance --inference --backend inductor --caching-precompile --warm-start-latency ``` tlparse result (internal): Cold Start (6 seconds): https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpAWe0zD/dedicated_log_torch_trace_vk9nkp4m.log/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 Warm Start (~1 s): https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpAWe0zD/dedicated_log_torch_trace_5l4iwrpm.log/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 The 1 second of warm start here can be improved: the costs here are mostly in starting up workers and triton and initializing CUDA, a lot of which should not be included in the compile time cost in real world scenarios where these are already loaded before training begins. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158847 Approved by: https://github.com/zhxchen17	2025-07-24 14:09:54 +00:00
PyTorch MergeBot	76be282e3a	Revert "[Precompile] Various small bugfixes, add CachingPrecompile to torchbench (#158847 )" This reverts commit `d898d0d437`. Reverted https://github.com/pytorch/pytorch/pull/158847 on behalf of https://github.com/jithunnair-amd due to Broke ROCm CI jobs on MI200 and MI300 ([comment](https://github.com/pytorch/pytorch/pull/158847#issuecomment-3109664713))	2025-07-23 18:25:46 +00:00
Animesh Jain	1b456c580d	[dynamo][guards] Add type info of the guarded value in guard managers (#158765 ) tlparse looks like this <img width="1165" height="226" alt="image" src="https://github.com/user-attachments/assets/04c4e6b1-34a3-4d9d-8304-6eb6d9a94980" /> This will aid in reading guards. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158765 Approved by: https://github.com/Lucaskabela, https://github.com/StrongerXi	2025-07-23 16:59:15 +00:00
James Wu	d898d0d437	[Precompile] Various small bugfixes, add CachingPrecompile to torchbench (#158847 ) This PR addresses a few small bugfixes needed to make NanoGPT inference work, and also adds a new `--caching-precompile` argument to torchbench. With `--caching-precompile`, after every benchmark we save precompile artifacts to DynamoCache, allowing us to test caching precompile on all existing benchmarks. The following bugfixes are in this PR to make all of this work: - Fix global variables being pruned with DUPLICATE_INPUT guards. DUPLICATE_INPUT guards have additional vars from the second input, which we track with additional_local_vars, but we never tracked additional global variables. This fixes the issue. (See torch/_dynamo/guards.py changes) - Return None from PRecompileContext.serialize() if no new dynamo compiles occurred. There's no reason to save artifacts (i.e. autotuning artifacts, etc) if no dynamo_compile occurred, so we return None early. We may later want to support editing existing dynamo artifacts as a TODO, but that's upcoming. - log `dynamo_start` on CompilePackage.load: This is only needed so that tlparse doesn't ignore TORCH_TRACE logs generated when caching precompile hits. If there are no actual compiles, we never log a "dynamo_start" entry, which makes internal tlparse ignore the TORCH_TRACE file. ## Test Plan After this PR, the following now works: ``` TORCH_LOGS=dynamo tlp python benchmarks/dynamo/torchbench.py --only nanogpt --performance --inference --backend inductor --caching-precompile --warm-start-latency ``` tlparse result (internal): Cold Start (6 seconds): https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpAWe0zD/dedicated_log_torch_trace_vk9nkp4m.log/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 Warm Start (~1 s): https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpAWe0zD/dedicated_log_torch_trace_5l4iwrpm.log/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 The 1 second of warm start here can be improved: the costs here are mostly in starting up workers and triton and initializing CUDA, a lot of which should not be included in the compile time cost in real world scenarios where these are already loaded before training begins. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158847 Approved by: https://github.com/zhxchen17	2025-07-23 15:06:54 +00:00
Michael Lazos	89850bbc07	[Dynamo] Use proper sources for constructing dataclass defaults (#157993 ) Partially fixes https://github.com/pytorch/pytorch/issues/154009 Pull Request resolved: https://github.com/pytorch/pytorch/pull/157993 Approved by: https://github.com/williamwen42, https://github.com/anijain2305	2025-07-18 21:51:40 +00:00
Zhengxu Chen	036eb1f65d	[precompile] Filter out ID_MATCH family of guards with caching_precompile. (#158368 ) Summary: For case like caching_precompile, we almost always want to drop ID_MATCH-type guards since they will block serialization. This diff add this behavior when this global flag is toggled on so that ID_MATCH guards are excluded from compilation and serialization. Test Plan: test_dynamo -- -k test_id_match_with_config Rollback Plan: Differential Revision: D78363609 Pull Request resolved: https://github.com/pytorch/pytorch/pull/158368 Approved by: https://github.com/jamesjwu	2025-07-18 14:47:11 +00:00
Animesh Jain	af6624023e	[dynamo] Skip training flag check id already guarding on nn modules (#158492 ) This might help some legacy models that still have inline_inbuilt_nn_modules False for some reason. Pull Request resolved: https://github.com/pytorch/pytorch/pull/158492 Approved by: https://github.com/StrongerXi	2025-07-17 21:42:19 +00:00
William Wen	6b84cb29f9	[dynamo] trace through torch.get_device_module (#157980 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157980 Approved by: https://github.com/anijain2305	2025-07-12 06:25:46 +00:00
zhxchen17	7be862ab8f	[dynamo] Relax DUPLICATED_INPUT to be serializable. (#157492 ) Since we don't actually rely on any real data while building DUPLICATE_INPUT guard, we can safely serialize it with sources and it should be able to reconstruct the guard correctly in the new process. Therefore we don't really need to prevent serializing it. Differential Revision: [D77683302](https://our.internmc.facebook.com/intern/diff/D77683302/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157492 Approved by: https://github.com/jamesjwu, https://github.com/jansel	2025-07-04 15:19:34 +00:00
Guilherme Leobas	e7167dbacf	[Set] Support sets in VariableBuilder (#153150 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153150 Approved by: https://github.com/zou3519	2025-07-04 00:45:03 +00:00
zhxchen17	e20784f228	[dynamo] Support BUILTIN_MATCH serialization. (#157016 ) Serialize BUILTIN_MATCH since they are all stored in __builtin__ dict. Also fixed an issue that the wrong global scope is passed to CheckFunctionManager while loading guards. Previously we can always reuse the compile-time global scope for evaluating guards because the compile-time and runtime global scope are always the same. For precompile, we need to serialize the compile-time global scope for loading only. We need to point the CheckFunctionManager to the new global scope after loading is finished for evaluating guards. Differential Revision: [D77159313](https://our.internmc.facebook.com/intern/diff/D77159313/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157016 Approved by: https://github.com/jansel, https://github.com/jamesjwu	2025-07-02 20:24:24 +00:00
James Wu	bd6b5fddbf	[Precompile] [easy] Serialize requires_grad for tensors when serializing guards (#157372 ) Need to keep requires_grad on the tensor when serializing/deserializing guards. This matters when there's a TENSOR_MATCH guard on a tensor that requires_grad. Added a unit test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157372 Approved by: https://github.com/jansel, https://github.com/zhxchen17 ghstack dependencies: #156433	2025-07-02 16:34:37 +00:00
zhxchen17	0f9c1b374f	[dynamo] Ensure global state guard is preserved across serialization. (#157285 ) Currently, every time we construct a GLOBAL_STATE guard, we always create a fresh guard based on the current global state. For precompile, we want to create a GLOBAL_STATE guard always based on some external sources, e.g. serialized global states. This can also be applied with the normal case where we just pass in the global state guard from Python. Differential Revision: [D77400988](https://our.internmc.facebook.com/intern/diff/D77400988/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/157285 Approved by: https://github.com/jansel	2025-07-01 15:46:34 +00:00
Isuru Fernando	40a785103c	[dynamo] fix debugging code_parts for relational guards (#154753 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154753 Approved by: https://github.com/anijain2305 ghstack dependencies: #154772	2025-06-24 01:38:29 +00:00
Xuehai Pan	1b2146fc6d	[BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314 Approved by: https://github.com/jingsh ghstack dependencies: #156313	2025-06-23 02:57:19 +00:00
Xuehai Pan	6ff6630375	[BE][3/16] fix typos in torch/ (torch/_inductor/) (#156313 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156313 Approved by: https://github.com/jingsh	2025-06-23 02:57:12 +00:00
PyTorch MergeBot	5b427c92a8	Revert "[BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314 )" This reverts commit `ead741c5fb`. Reverted https://github.com/pytorch/pytorch/pull/156314 on behalf of https://github.com/atalman due to export/test_torchbind.py::TestCompileTorchbind::test_compile_error_on_input_aliasing_contents_backend_aot_eager [GH job link](https://github.com/pytorch/pytorch/actions/runs/15804799771/job/44548489912) [HUD commit link](`c95f7fa874`) ([comment](https://github.com/pytorch/pytorch/pull/156313#issuecomment-2994171213))	2025-06-22 12:31:57 +00:00
Xuehai Pan	ead741c5fb	[BE][4/16] fix typos in torch/ (torch/_dynamo/) (#156314 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/156314 Approved by: https://github.com/jingsh ghstack dependencies: #156313	2025-06-22 08:43:18 +00:00
James Wu	b2fc9cfea1	[precompile] Add CompilePackage to serialize dynamo states. (#155118 ) Adding a per torch.compile() object CompilePackage which tracks dynamo artifact. CompilePackage is considered a low level component and should not be directly exposed to end users. It has the following interface: 1. `CompilePackage.__init__()` which optionally takes previously serialized dynamo states. a. when `dynamo` argument is None, it will contruct a brand new CompilePackage object. b. when `dynamo` argument is not None, it will load a pre-compiled dynamo state. 2. `package.save()` which dumps the dynamo states into _DynamoCacheEntry. 3. `package.install(backends)` which will handle all the side-effectful global scope updates with compiled functions and resume functions. This diff focus on making the low level mechanism for precompile. It will be left to upper level interface to use these API to build more user-facing frontend. Differential Revision: [D75956538](https://our.internmc.facebook.com/intern/diff/D75956538/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155118 Approved by: https://github.com/jamesjwu Co-authored-by: James Wu <jjwu@meta.com>	2025-06-13 13:54:10 +00:00
Isuru Fernando	53d06e18d9	[dynamo] add missing algorithm header (#154754 ) Needed for `std::max(<initializer-list>)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/154754 Approved by: https://github.com/Skylion007, https://github.com/anijain2305	2025-06-13 06:56:11 +00:00
Animesh Jain	a9d5157e25	[dynamo] Use BINARY_SUBSCR for pre-graph bytecode for regular dict accesses (#155727 ) vLLM profiler sets with_stack=True that shows the dict_getitem on the profiler, both inflating the numbers and confusing compile users. This PR keeps BINARY_SUBSCR for regular dicts, while using `dict.__getitem__` only for dict subclasses. Using binary_subscr is little bit faster, but not enough to make any major latency improvements. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155727 Approved by: https://github.com/zou3519, https://github.com/StrongerXi, https://github.com/jansel	2025-06-12 04:02:29 +00:00
Oguz Ulgen	d1947a8707	Migrate from lru_cache to cache (#155613 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155613 Approved by: https://github.com/ezyang ghstack dependencies: #155612	2025-06-11 19:44:18 +00:00
Animesh Jain	13ea0f2c0a	[dynamo][dynamic] Recompilation hint for nn module integer attributes (#154867 ) For program like this ``` class Mod(torch.nn.Module): def __init__(self): super().__init__() self.c = 0 def forward(self, x): self.c += 1 return x * self.c ``` You can check the recompile reasons at https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpzv9z6Q/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 ![image](https://github.com/user-attachments/assets/856a95fd-0533-4abc-a213-1f73ae2cb766) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154867 Approved by: https://github.com/zou3519	2025-06-05 16:37:22 +00:00
PyTorch MergeBot	a0f2544502	Revert "[dynamo][dynamic] Recompilation hint for nn module integer attributes (#154867 )" This reverts commit `6c2f941e25`. Reverted https://github.com/pytorch/pytorch/pull/154867 on behalf of https://github.com/seemethere due to This fails internal testing see, https://fburl.com/diff/b0yuxk4w ([comment](https://github.com/pytorch/pytorch/pull/154780#issuecomment-2940381691))	2025-06-04 15:03:34 +00:00
Animesh Jain	6c2f941e25	[dynamo][dynamic] Recompilation hint for nn module integer attributes (#154867 ) For program like this ``` class Mod(torch.nn.Module): def __init__(self): super().__init__() self.c = 0 def forward(self, x): self.c += 1 return x * self.c ``` You can check the recompile reasons at https://manifold.edge.x2p.facebook.net/v0/read/tree/logs/.tmpzv9z6Q/index.html?bucketName=tlparse_reports&apiKey=tlparse_reports-key&withPayload=1&timeoutMsec=10000 ![image](https://github.com/user-attachments/assets/856a95fd-0533-4abc-a213-1f73ae2cb766) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154867 Approved by: https://github.com/zou3519 ghstack dependencies: #154780	2025-06-04 00:05:53 +00:00
Animesh Jain	635b73e697	[dynamo][guards] Flush cache to more accurately measure guard overhead (#154764 ) We observed that guard overhead at runtime using profiler traces was higher than reported in this profiling function at the compile time. After investigation, we found that f_locals are already in cache and that was causing the guard overhead to be way smaller while profiling during the compilation. To be more realistic, we flush the cache here. Profiling the guard overhead during compilation (in addition to at runtime) allows faster iteration time, and logging in tlparse and internal databases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154764 Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/StrongerXi	2025-06-03 11:50:57 +00:00
PyTorch MergeBot	b86aaaae0b	Revert "[dynamo][guards] Flush cache to more accurately measure guard overhead (#154764 )" This reverts commit `7dee899130`. Reverted https://github.com/pytorch/pytorch/pull/154764 on behalf of https://github.com/seemethere due to This fails internal tests see [fburl.com/diff/67gyp7gp](https://fburl.com/diff/67gyp7gp) ([comment](https://github.com/pytorch/pytorch/pull/154769#issuecomment-2933629894))	2025-06-03 06:13:49 +00:00
Isuru Fernando	7f44b589be	[dynamo] fix pruning locals with ShapeEnvSource (#154752 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154752 Approved by: https://github.com/zhxchen17	2025-06-03 00:35:11 +00:00
Animesh Jain	7dee899130	[dynamo][guards] Flush cache to more accurately measure guard overhead (#154764 ) We observed that guard overhead at runtime using profiler traces was higher than reported in this profiling function at the compile time. After investigation, we found that f_locals are already in cache and that was causing the guard overhead to be way smaller while profiling during the compilation. To be more realistic, we flush the cache here. Profiling the guard overhead during compilation (in addition to at runtime) allows faster iteration time, and logging in tlparse and internal databases. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154764 Approved by: https://github.com/zou3519, https://github.com/jansel, https://github.com/StrongerXi ghstack dependencies: #154769	2025-06-02 23:01:58 +00:00
Zhengxu Chen	0f56318152	[precompile] Add Exception type PackageError for unsupported precompile features. (#154430 ) Summary: Today when guard serialization fails, dynamo will raise an internal error like: ``` torch._dynamo.exc.InternalTorchDynamoError: RuntimeError: CLOSURE_MATCH guard cannot be serialized. ``` Adding a dedicated PackageError type to surface the error more clearly. Test Plan: CI Differential Revision: D75452124 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154430 Approved by: https://github.com/jamesjwu, https://github.com/jansel	2025-05-28 22:34:51 +00:00
Zhengxu Chen	5bf74753f6	[precompile] Prune local scope variables for guard serialization. (#154431 ) Summary: Prune unused local objects from serialized local scope if they are not used in guard reconstruction. This is helpful when a user program takes things like local callable functions or the function call is recursive. Test Plan: test/dynamo/test_guard_serialization.py -k test_function_locals Before pruning locals: ``` state = GuardsState(output_graph=OutputGraphGuardsState(local_scope={'x': tensor([ 0.0461, 0.4024, -1.0115]), 'g': <function ...aints=None, _guards=<torch._guards.GuardsSet object at 0x7fbccc7e9fc0>, _aotautograd_guards=[]), shape_code_parts=None) def pickle_guards_state(state: GuardsState) -> bytes: buf = io.BytesIO() pickler = GuardsStatePickler(buf) try: pickler.dump(state) except AttributeError as e: > raise torch._dynamo.exc.PackageError(str(e)) from e E torch._dynamo.exc.PackageError: Can't pickle local object 'TestGuardSerialization.test_function_locals.<locals>.foo' ``` After the diff ``` Tests finished: Pass 1. Fail 0. Fatal 0. Skip 0. Build failure 0 ``` Differential Revision: D75452123 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154431 Approved by: https://github.com/jansel	2025-05-28 16:03:02 +00:00
Ruisi Zhang	f74842d665	[DTensor] enable SimpleFSDP's composability with Tensor Parallel (#152286 ) This PR adds support for SimpleFSDP's composability with Tensor Parallel + torch.compile. `_StridedShard` is used in SimpleFSDP/FSDP2 to support correct distributed checkpointing when FSDP+TP is applied. Previously, `_StridedShard` is not guarded by torch.compile. This PR adds `_StridedShard` as an additional placement type to be guarded by torch.compile. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152286 Approved by: https://github.com/bdhirsh	2025-05-23 01:40:38 +00:00
bobrenjc93	413664b3c5	catch CSE recursion depth errors (#154039 ) Fixes #153777 CSE is an optimization and shouldn't block a compile if it hits recursion depth limits. Unfortunately we can't write this iteratively due to a dependency on `ast.unparse` which necessarily needs to do recursion. This PR catches opts out of CSE when we hit recursion depth errors. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154039 Approved by: https://github.com/Microve	2025-05-22 20:17:19 +00:00
IvanKobzarev	4439255148	[aotd] Support saved tensors hooks in aot_autograd (#150032 ) https://github.com/pytorch/pytorch/issues/148222 Goal: At the moment autograd saved tensors hooks are run in eager after compiled forward. They are executed at the same time for all saved tensors. Hooks can be used to reduce amout of memory used for saved tensors, doing quantization or offloading to cpu. This is suboptimal for optimization of peak memory. Better solution will be to put the hooks in the graph, as close as possible to the last usage of the tensor. To get user specified autograd saved tensors hooks in the graph. Logic: UX: If user specifies with torch.autograd.graph.saved_tensors_hooks(pack_gm, unpack_gm). Where pack_gm and unpack_gm are torch.fx.GraphModule. Then AotAutograd will retrace those graph modules, doing decompositions and functionalization in aot_autograd, inlining the result graphs in forward epilogue and backward prologue. User may want to use control logic in the hooks, for example applying quantization only for specific dtypes and sizes. This is also possible, user can put it into torch.fx.wrap function and use symbolic trace to make a GraphModule. In that case AotAutograd cahing will work only in case when user explicitly set to the torch.fx.wrap call_function node "user_cache_hash" metadata. If this metadata set - then aot_autograd cache can use saved cache artifact. If metadata is not set - then cache is bypassed. Dynamo: Dynamo traces pack and unpack hooks and installs them as subgraph and explicitly adds to the output_graph. (As those subgraphs are not used and will not be copied in the result by default). The complexity here is that at this moment we do not have example of inputs for the hooks. We trace pack_hook with some Tensor from the inputs. The result subgraphs are added to the hashing of AotAutograd Cache. In AotAutograd we retrace the graph with the true saved tensors coming from partitioner. Backwards Compatibility: As current hooks are executed in eager mode and not all of them will be traceable - we only try to put in the graph hooks, explicitly marked by user with annotation (@_inlineable_saved_tensors_hooks). For other hooks or if compiled autograd is enabled - keep the same logic. Recompilations: Hooks are guarded with lambda guard matching function id to cause recompilation if user reruns compiled function. Aot_autograd: After partitioner prepared forward and backward module - we trace prepared at Dynamo graphs for pack and unpack hooks and inline them in epilogue of forward and prologue of backward. Forward outputs and backward inputs are changed, transparently for user. We do not try to put it close the last usage etc., relying on inductor to do this optimization. ``` INFO: TRACED GRAPH ===== Forward graph pre saved_tensors_hooks inlining 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]) return (view, add, primals_1, primals_2) INFO: TRACED GRAPH ===== Backward graph pre saved_tensors_hooks inlining 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]) return (view, add, primals_1, primals_2) INFO: TRACED GRAPH ===== saved_tensors_pack_hook add 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class pack_float8(torch.nn.Module): def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn); x_1 = None return (torch.float32, _to_copy) INFO: TRACED GRAPH ===== saved_tensors_unpack_hook add 3 ===== <eval_with_key>.22 from /data/users/ivankobzarev/a/pytorch/torch/fx/experimental/proxy_tensor.py:1225 in wrapped class pack_float8(torch.nn.Module): def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn); x_1 = None return (torch.float32, _to_copy) INFO: TRACED GRAPH ===== Forward graph 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add, dtype = torch.float8_e4m3fn) # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]); add = None return (view, _to_copy, primals_1, primals_2) INFO: TRACED GRAPH ===== Backward graph 3 ===== <eval_with_key>.21 class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", add_packed_2: "f8e4m3fn[s0, s1][s1, 1]cuda:0", tangents_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add_packed_2, dtype = torch.float32); add_packed_2 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) add_7: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(tangents_1, _to_copy); tangents_1 = _to_copy = None return (None, None, add_7) ``` Differential Revision: [D72187044](https://our.internmc.facebook.com/intern/diff/D72187044) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150032 Approved by: https://github.com/bdhirsh	2025-05-22 14:09:38 +00:00
angelayi	3fe42d4d5d	[export] Dynamo symint support (#152677 ) Basically adds native _IntWrapper support to dynamo. Here's my process of trying to make symint input support work on dynamo, and how I ended up with this approach [(doc)](https://docs.google.com/document/d/1GvNRQd8BnxlMay_hrEVgEta6VUeUW_hcFeRuB7q1nDY/edit?tab=t.0). What I did was, before passing inputs to dynamo.export, I first wrap them with a class, `_IntWrapper`. When processing dynamic shapes, I will then add the corresponding dynamic shape specification to the `dynamism` field stored on the `_IntWrapper`. If there is no dynamism specified, then this will get unwrapped back to an integer. When dynamo tracing, when we encounter an `_IntWrapper`, we will convert this to a symint if the dynamism was specified as `Dim.DYNAMIC/AUTO`. Dynamo will then trace a graph that contains symint inputs, which will get passed to AOTAutograd and so on. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152677 Approved by: https://github.com/pianpwk	2025-05-16 07:51:50 +00:00
zhxchen17	a67dd2083c	[dynamo] Guard serialization for SHAPE_ENV (#153258 ) Differential Revision: [D74483150](https://our.internmc.facebook.com/intern/diff/D74483150/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153258 Approved by: https://github.com/jansel ghstack dependencies: #153255, #153256, #153257	2025-05-12 14:42:01 +00:00
zhxchen17	e2f6870c98	[dynamo] Guard serialization for DEFAULT_DEVICE (#153257 ) Differential Revision: [D74483147](https://our.internmc.facebook.com/intern/diff/D74483147/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153257 Approved by: https://github.com/jansel ghstack dependencies: #153255, #153256	2025-05-12 14:42:00 +00:00
Joel Schlosser	62b7ef06cc	[Dynamo] Remove unused guard PYMODULE_MATCH (#152961 ) Not used anywhere: https://www.internalfb.com/code/search?q=repo%3Afbcode%20PYMODULE_MATCH Pull Request resolved: https://github.com/pytorch/pytorch/pull/152961 Approved by: https://github.com/jansel ghstack dependencies: #152725, #152727, #152728, #152730, #152865, #152872	2025-05-07 18:58:18 +00:00
Joel Schlosser	42954ab28e	[Dynamo] Guard serialization for CLOSURE_MATCH (#152728 ) Unsupported because it uses unsupported FUNCTION_MATCH. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152728 Approved by: https://github.com/jansel ghstack dependencies: #152725, #152727	2025-05-07 18:58:18 +00:00
Joel Schlosser	a9186ec723	[Dynamo] Guard serialization for FUNCTION_MATCH (#152727 ) Unsupported because it uses unsupported ID_MATCH. Pull Request resolved: https://github.com/pytorch/pytorch/pull/152727 Approved by: https://github.com/jansel ghstack dependencies: #152725	2025-05-07 18:58:18 +00:00
Joel Schlosser	a6f51be2fd	[Dynamo] Guard serialization for NN_MODULE (#152725 ) Throws an error when attempting to serialize an NN_MODULE guard. It is not supported because it uses the unsupported ID_MATCH guard (#152330): `a6dd1c2208/torch/_dynamo/guards.py (L1738-L1739)` Pull Request resolved: https://github.com/pytorch/pytorch/pull/152725 Approved by: https://github.com/jansel	2025-05-07 18:58:17 +00:00
Zhengxu Chen	1965a2ca1e	[dynamo][ez] Remove unused guard OBJECT_MUTATION. (#152855 ) Summary: seems not used anywhere https://www.internalfb.com/code/search?q=case%3Ayes%20filepath%3Acaffe2%20OBJECT_MUTATION Test Plan: CI Differential Revision: D74196559 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152855 Approved by: https://github.com/jansel, https://github.com/jbschlosser	2025-05-07 02:32:32 +00:00
Joel Schlosser	b06cbd49f1	[Dynamo] Guard serialization for TENSOR_SUBCLASS_METADATA_MATCH (#152626 ) This PR updates `GuardsStatePickler.reducer_override()` in `torch/_dynamo/guards.py` to handle reconstruction of traceable wrapper subclasses. It's intended to work recursively and handle any level of subclass instance nesting (e.g. subclass instances that contain subclass instances, etc.) This PR tests the guard on several traceable wrapper tensor subclasses: * `LocalSubclass`: used to ensure the correct error message is thrown when the subclass is not defined globally * `torch.testing._internal.two_tensor.TwoTensor`: defines None for its extra metadata * `SubclassWithMeta`: stores non-trivial extra metadata * `SubclassWithCustomMetadataGuard`: stores non-trivial extra metadata and defines a custom `__metadata_guard__` classmethod * `SubclassWithSubclassInnerTensors`: used to test recursiveness; this subclass contains subclass inner tensor components Pull Request resolved: https://github.com/pytorch/pytorch/pull/152626 Approved by: https://github.com/jansel	2025-05-06 14:06:36 +00:00
bobrenjc93	e2eb845313	[ez] fix a bunch of typos in dynamo (#152886 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152886 Approved by: https://github.com/williamwen42	2025-05-06 05:13:56 +00:00
zhxchen17	2da9ab4b1c	[dynamo] Guard serialization for MAPPING_KEYS_CHECK (#152721 ) MappingProxyType Differential Revision: [D74091363](https://our.internmc.facebook.com/intern/diff/D74091363/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152721 Approved by: https://github.com/jansel ghstack dependencies: #152615, #152616, #152687, #152716	2025-05-05 18:05:56 +00:00
zhxchen17	24e1666b3a	[dynamo] Guard serialization for WEAKREF_ALIVE (#152716 ) Punt on WEAREF_ALIVE as weakref won't live across the process and users might need to drop them upfront. Differential Revision: [D74088735](https://our.internmc.facebook.com/intern/diff/D74088735/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152716 Approved by: https://github.com/jansel ghstack dependencies: #152615, #152616, #152687	2025-05-05 18:05:56 +00:00
zhxchen17	2cb16df6e2	[dynamo] Guard serialization for DUPLICATE_INPUT. (#152687 ) Seems this guard is not very active. Adding a test to detect error handling at least. Differential Revision: [D74074837](https://our.internmc.facebook.com/intern/diff/D74074837/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152687 Approved by: https://github.com/jansel ghstack dependencies: #152615, #152616	2025-05-05 18:05:56 +00:00
zhxchen17	ffd58293f7	[dynamo] Guard serialization for FUNCTORCH_STACK_MATCH (#152616 ) Make Functorch interpreters serializable most of the time, so that we can save the guards on functorch states. ## Test Cases: 0. torch.compile() without functorch layers present. Guard should fail with any layer being pushed. 1. torch.compile() nested in vmap. 2. torch.compile() nested in grad. 3. torch.compile() nested in jvp + vmap 4. torch.compile() nested functionalize 5. torch.compile() nested in vmap + grad Differential Revision: [D74008787](https://our.internmc.facebook.com/intern/diff/D74008787/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152616 Approved by: https://github.com/zou3519 ghstack dependencies: #152615	2025-05-05 18:05:56 +00:00
zhxchen17	1d1cbcd8a3	[dynamo] Guard serialization for DUAL LEVEL. (#152615 ) Seem dual level counter should be stored in OutputGraph so that the value can be preserved through roundtripping. Differential Revision: [D74008786](https://our.internmc.facebook.com/intern/diff/D74008786/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152615 Approved by: https://github.com/jansel, https://github.com/zou3519	2025-05-05 18:05:56 +00:00
zhxchen17	1d8cdf373b	[dynamo] Guard serialization for NAME_MATCH (#152332 ) Differential Revision: [D73780430](https://our.internmc.facebook.com/intern/diff/D73780430/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152332 Approved by: https://github.com/jansel ghstack dependencies: #152325, #152326, #152327, #152328, #152329, #152330, #152331	2025-04-29 20:16:00 +00:00
zhxchen17	5c297b2846	[dynamo] Guard serialization for DISPATCH_KEY_SET_MATCH (#152331 ) Differential Revision: [D73780433](https://our.internmc.facebook.com/intern/diff/D73780433/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152331 Approved by: https://github.com/jansel ghstack dependencies: #152325, #152326, #152327, #152328, #152329, #152330	2025-04-29 20:16:00 +00:00
zhxchen17	4cb75d7afc	[dynamo] Guard serialization for ID_MATCH (#152330 ) Differential Revision: [D73780431](https://our.internmc.facebook.com/intern/diff/D73780431/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152330 Approved by: https://github.com/jansel ghstack dependencies: #152325, #152326, #152327, #152328, #152329	2025-04-29 20:16:00 +00:00
zhxchen17	52202525b9	[dynamo] Guard serialization for DICT_VERSION (#152326 ) I think we shouldn't support DICT_VERSION for 2 reasons: 1. dict version is not well defined across processes 2. they are pretty rare (only with pytree calls) Differential Revision: [D73780437](https://our.internmc.facebook.com/intern/diff/D73780437/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152326 Approved by: https://github.com/jansel ghstack dependencies: #152325	2025-04-29 20:16:00 +00:00
zhxchen17	df663b9e72	[dynamo] Guard serialization for TYPE_MATCH (#152325 ) Adding guard serialization for TYPE_MATCH Differential Revision: [D73780438](https://our.internmc.facebook.com/intern/diff/D73780438/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/152325 Approved by: https://github.com/jansel	2025-04-29 20:16:00 +00:00
Zhengxu Chen	203201255f	[dynamo] remove dead code for DATA_PTR_MATCH (#152206 ) Summary: Seems this guard is not created anywhere Test Plan: CI Differential Revision: D73682084 Pull Request resolved: https://github.com/pytorch/pytorch/pull/152206 Approved by: https://github.com/anijain2305, https://github.com/jansel	2025-04-26 15:25:01 +00:00
zhxchen17	558f45190e	[dynamo] Guard serialization for NOT_PRESENT_IN_GENERIC_DICT (#151343 ) Adding guard serialization for type NOT_PRESENT_IN_GENERIC_DICT Differential Revision: [D73057304](https://our.internmc.facebook.com/intern/diff/D73057304/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151343 Approved by: https://github.com/jansel, https://github.com/anijain2305 ghstack dependencies: #151318	2025-04-25 14:16:30 +00:00
zhxchen17	a34c28e0d2	[dynamo] Add guard serialization for tensor matches. (#151318 ) This is a proof-of-concept of how we could serialize a guard and deserialize it back from the bytes. The main behavioral change introduced in this diff is on CheckFunctionManager: ``` check_fn_manager = CheckFunctionManager(code, output_graph, guards_serialization_mode="save") guards_state: bytes = check_fn_manager.guards_state ``` Once `guards_serialization_mode` is set to `save`, CheckFunctionManager will return an addtional `bytes` object called `guards_state` which should contain all the information needed for deserializing guards later. When we load back guards state, we will set `guards_serialization_mode` is set to `load`: ``` output_graph_state = pickle.loads(guards_state) check_fn_manager = CheckFunctionManager(code, output_graph_state, guards_serialization_mode="load") ``` # TENSOR_MATCH Since we have many types of guards to support, we will break the work into small diffs instead of a single diff to support every guards. We kick off the work from TENSOR_MATCH from this diff. # Testing For each type of guard we will test it like the following: 1. Use guard_filter_fn to select 1 type of guard each time. 2. Call InstructionTranslator directly on an example function to get OutputGraph and CheckFunctionManager (reference guard manager) 3. Serialize->deserialize the output graph state and re-build the guards with a new CheckFunctionManager (loaded guard manager) 4. Throw a set of example inputs to both reference and loaded guard manager to see if their behavior match. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151318 Approved by: https://github.com/jansel, https://github.com/anijain2305	2025-04-25 14:16:23 +00:00
PyTorch MergeBot	b1d055fd6a	Revert "[dynamo] Add guard serialization for tensor matches. (#151318 )" This reverts commit `81c4369d81`. Reverted https://github.com/pytorch/pytorch/pull/151318 on behalf of https://github.com/zhxchen17 due to macos test failing ([comment](https://github.com/pytorch/pytorch/pull/151318#issuecomment-2828638168))	2025-04-24 19:22:45 +00:00
zhxchen17	81c4369d81	[dynamo] Add guard serialization for tensor matches. (#151318 ) This is a proof-of-concept of how we could serialize a guard and deserialize it back from the bytes. The main behavioral change introduced in this diff is on CheckFunctionManager: ``` check_fn_manager = CheckFunctionManager(code, output_graph, guards_serialization_mode="save") guards_state: bytes = check_fn_manager.guards_state ``` Once `guards_serialization_mode` is set to `save`, CheckFunctionManager will return an addtional `bytes` object called `guards_state` which should contain all the information needed for deserializing guards later. When we load back guards state, we will set `guards_serialization_mode` is set to `load`: ``` output_graph_state = pickle.loads(guards_state) check_fn_manager = CheckFunctionManager(code, output_graph_state, guards_serialization_mode="load") ``` # TENSOR_MATCH Since we have many types of guards to support, we will break the work into small diffs instead of a single diff to support every guards. We kick off the work from TENSOR_MATCH from this diff. # Testing For each type of guard we will test it like the following: 1. Use guard_filter_fn to select 1 type of guard each time. 2. Call InstructionTranslator directly on an example function to get OutputGraph and CheckFunctionManager (reference guard manager) 3. Serialize->deserialize the output graph state and re-build the guards with a new CheckFunctionManager (loaded guard manager) 4. Throw a set of example inputs to both reference and loaded guard manager to see if their behavior match. Differential Revision: [D72987485](https://our.internmc.facebook.com/intern/diff/D72987485/) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151318 Approved by: https://github.com/jansel, https://github.com/anijain2305	2025-04-24 18:07:01 +00:00
Sam Larsen	585d03fa39	Record how many parameters we're parsing within dynamo (#148508 ) This allows us to track how many paramaters we have in compilations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148508 Approved by: https://github.com/jansel, https://github.com/anijain2305 Co-authored-by: Sam Larsen <slarsen@meta.com>	2025-04-16 06:15:11 +00:00
Animesh Jain	7b1a2373e8	[dynamo][super variable] Fix bug to use correct source (#151154 ) Fixes https://github.com/pytorch/pytorch/issues/150994 We should cherry-pick to 2.7 branch if possible, because this breaks torch.compile on some HF models. Look at the issue referenced here. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151154 Approved by: https://github.com/jansel	2025-04-13 04:48:52 +00:00
Zhengxu Chen	be24e7b4b4	[dynamo] Use sentinel value for guard filter. (#151131 ) Summary: `None` can collide with the real values in the scope, so we should use a separate value. Also added "has_value" to the struct so that it's more clear whether the value is absent or not. Test Plan: CI Differential Revision: D72881300 Pull Request resolved: https://github.com/pytorch/pytorch/pull/151131 Approved by: https://github.com/jansel, https://github.com/anijain2305	2025-04-12 15:29:57 +00:00
Bartlomiej Stemborowski	12281f9c18	[dynamo] Deprecate enable_cpp_framelocals_guard_eval config variable - default: True (#151008 ) [dynamo] Deprecate enable_cpp_framelocals_guard_eval config variable - default: True Reading the feature enabling param `enable_cpp_framelocals_guard_eval `at the CPP level is time consuming and slows down the operation of the dynamo as it is done every time the function using this param is called. Reading the value only once at init isn’t an option as it would disable the modification of this param at the runtime. Since this feature is enabled by default for some time and it doesn’t cause known issues, the `enable_cpp_framelocals_guard_eval `configuration param will be deprecated by this commit and its value is hardcoded to true. Local microbenchmark dynamo_guard_eval.py: - 931.9 us -> 538.9 us (3.10) @williamwen42 @jansel @anijain2305 Pull Request resolved: https://github.com/pytorch/pytorch/pull/151008 Approved by: https://github.com/williamwen42	2025-04-11 21:07:59 +00:00
Isuru Fernando	a22d3e778e	[dynamo][guards] Print relational guards only once (#150810 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150810 Approved by: https://github.com/anijain2305	2025-04-11 04:10:37 +00:00
Zhengxu Chen	86370fd658	[dynamo] Allow guards to be dropped with custom filter functions. (#150936 ) Summary: A follow up of https://github.com/pytorch/pytorch/pull/150689. Test Plan: test_dynamo -k test_guard_filter_fn Differential Revision: D72722322 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150936 Approved by: https://github.com/jansel	2025-04-11 03:06:34 +00:00
Isuru Fernando	a72b4eb806	Support windows in C++ shape guards (#149211 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149211 Approved by: https://github.com/anijain2305 ghstack dependencies: #149149, #149197	2025-04-03 14:42:08 +00:00
Isuru Fernando	f9a7eac718	use python fallback if there are overflows (#149197 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149197 Approved by: https://github.com/anijain2305 ghstack dependencies: #149149	2025-04-03 14:39:03 +00:00
Animesh Jain	f9a787224c	[dynamo][guards][serialization] Dont use ID_MATCH guard for bool and None (#149228 ) Doing this removes the need of collecting `id` and therefore facilitates serialization. It also improves readability with recompilations. Earlier, recompile message will just show the `id`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/149228 Approved by: https://github.com/jansel	2025-03-18 01:25:37 +00:00
PyTorch MergeBot	b52a8bef01	Revert "[dynamo][guards][serialization] Dont use ID_MATCH guard for bool and None (#149228 )" This reverts commit `5905bbe745`. Reverted https://github.com/pytorch/pytorch/pull/149228 on behalf of https://github.com/malfet due to I wonder if this will fix the pr-time-benchmark regressions ([comment](https://github.com/pytorch/pytorch/pull/149228#issuecomment-2731237949))	2025-03-18 00:10:50 +00:00
Animesh Jain	5905bbe745	[dynamo][guards][serialization] Dont use ID_MATCH guard for bool and None (#149228 ) Doing this removes the need of collecting `id` and therefore facilitates serialization. It also improves readability with recompilations. Earlier, recompile message will just show the `id`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/149228 Approved by: https://github.com/jansel	2025-03-16 15:56:17 +00:00
Animesh Jain	f1787ee0f7	[dynamo] Remove L scoping for recompilation messages (#148917 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/148917 Approved by: https://github.com/williamwen42	2025-03-11 14:26:26 +00:00
Animesh Jain	992838e702	[dynamo][guards] Do not ID_MATCH on numpy tensors (#148923 ) Might help with https://github.com/pytorch/pytorch/issues/148535 Pull Request resolved: https://github.com/pytorch/pytorch/pull/148923 Approved by: https://github.com/jansel	2025-03-11 14:20:26 +00:00
Ryan Guo	1d7fc0c681	[dynamo] Remove dead code path around `functools.partial` objects (#148683 ) This removes the code paths added in #98120, which has then been superceded by #108846. More importantly, it makes `EQUALS_MATCH`'s `ok_mutable_types` (added in #134016) easier to reason about, i.e., no need to worry about `dict` types, which was only needed for #98120. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148683 Approved by: https://github.com/yanboliang	2025-03-06 21:20:04 +00:00
Ryan Guo	ad9a10aff0	[dynamo] Make `nonstrict_trace` work with some `pytree.register_constant`-ed instances (#148007 ) As title, this enables `nonstrict_trace`-ed function to take in object whose type has been `pytree.register_constant`-ed, as long as the object existed outside the `torch.compile` region. This also forces Dynamo to emit a `EQUALS_MATCH` guard on the object. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148007 Approved by: https://github.com/zou3519 ghstack dependencies: #148385	2025-03-05 21:28:26 +00:00
Xuehai Pan	3ce352e389	[BE][PYFMT] migrate PYFMT for `torch._dynamo` to `ruff format` (#144549 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144549 Approved by: https://github.com/jansel	2025-02-28 03:03:53 +00:00
William Wen	cf6d1e6824	[dynamo] add generic graph break hints (#147429 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/147429 Approved by: https://github.com/jansel, https://github.com/zou3519 ghstack dependencies: #147385	2025-02-26 09:20:28 +00:00
William Wen	3fd68e4e2f	[dynamo] make some more graph break messages readable in English [2/N] (#147385 ) This is for "for some large number Z, make sure the error messages are readable English." - beginning to audit all `unimplemented` sites and making sure that all messages are at least English-readable. Hints may not necessarily be provided. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147385 Approved by: https://github.com/jansel	2025-02-26 09:20:28 +00:00
Animesh Jain	9dc702875d	[dynamo][mappingproxy][inspect] Support existing types.MappingProxyType (#147217 ) Fixes https://github.com/pytorch/pytorch/issues/147162 Pull Request resolved: https://github.com/pytorch/pytorch/pull/147217 Approved by: https://github.com/williamwen42	2025-02-15 07:59:33 +00:00
Raymond Li	21c2565f35	Document dynamo (#146736 ) Many files in dynamo are currently lacking file/module-level documentation, which makes it hard to know what they do at a glance and without digging into the code. This fixes that. Note: documentation was AI-generated and could be incorrect, please review carefully. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146736 Approved by: https://github.com/jansel, https://github.com/StrongerXi, https://github.com/anijain2305, https://github.com/zou3519	2025-02-13 00:02:21 +00:00
Animesh Jain	d6513f3246	[dynamo] Support list subclasses and fix dict subclasses mutation bugs (#146819 ) This PR adds support for list subclasses. Among other things are 1) Tracking the mutations on internal vts like `_dict_vt` and `_list_vt` using sources. This helps identify if there was a mutation in the underlying data structures, and we need to reconstruct it. 2) `UserDefinedObjectVariable` now has a new method - `is_modified` which `side_effect` infra relies upon to check mutations in the underlying vts (like `_dict_vt`). 3) `reconstruction` logic ensures that we use `dict.__getitem__` and `list.__getitem__` methods. This is super important because we don't want to call the overridden `__getitem__` methods. If this PR is hard to review, please let me know. I can break it into several small PRs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146819 Approved by: https://github.com/StrongerXi, https://github.com/jansel	2025-02-12 17:46:02 +00:00
Simon Fan	1d4adf4e1f	[dynamo] log recompile reason to dynamo_compile (#146117 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146117 Approved by: https://github.com/bobrenjc93	2025-02-03 21:04:04 +00:00

1 2 3 4 5 ...

549 Commits