pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
Jovian Anthony Jaison	9a0f7a3bb0	[retry-land][pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#160348 ) refer: https://github.com/pytorch/pytorch/pull/159655 Earlier pr failed on dynamo/test_utils.py::TestDynamoTimed::test_dynamo_timed. Updated test_dynamo_timed + re-ran locally to test. Pull Request resolved: https://github.com/pytorch/pytorch/pull/160348 Approved by: https://github.com/masnesral	2025-08-12 06:24:54 +00:00
PyTorch MergeBot	206c1eef65	Revert "[pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#159655 )" This reverts commit `2ee22e4351`. Reverted https://github.com/pytorch/pytorch/pull/159655 on behalf of https://github.com/clee2000 due to broke dynamo/test_utils.py::TestDynamoTimed::test_dynamo_timed [GH job link](https://github.com/pytorch/pytorch/actions/runs/16839294394/job/47711078667) [HUD commit link](`2ee22e4351`). Probably a landrace since it did run on the PR ([comment](https://github.com/pytorch/pytorch/pull/159655#issuecomment-3169400889))	2025-08-08 22:04:22 +00:00
Jovian Anthony Jaison	2ee22e4351	[pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#159655 ) This change logs the stack trace of the code being compiled by Dynamo, improving visibility into what is compiled. It adds a stack_trace field to compilation metrics. This helps with debugging and analysis of Dynamo compilation behavior. Ref [D79287964](https://www.internalfb.com/diff/D79287964) Test Plan: $ python -m test_utils Internal: ref [D79372519](https://www.internalfb.com/diff/D79372519) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159655 Approved by: https://github.com/c00w	2025-08-08 19:53:47 +00:00
Xu Han	06824f3c72	[inductor] fix test_dynamo_timed on Windows. (#159981 ) Fixed `test_dynamo_timed`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/159981 Approved by: https://github.com/angelayi	2025-08-07 16:37:52 +00:00
Ivan Zaitsev	e4b123b5e4	Revert direct updates (#159654 ) reverts: ``` commit `5711a8f069` (tag: trunk/5711a8f06948eeee56ed5f53f171fa519f78491c, origin/main, main) Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com> Date: Fri Aug 1 09:32:52 2025 -0700 Update test_utils.py commit `b4b71d011e` (tag: trunk/b4b71d011ed07a41c2086ff0dec2988a63662877) Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com> Date: Fri Aug 1 09:27:54 2025 -0700 Update utils.py commit `52376b9b6f` (tag: trunk/52376b9b6fbf9fe24f5d82038dc520f0c64b6f8d) Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com> Date: Fri Aug 1 09:26:05 2025 -0700 ``` (commits pushed directly to main by mistake) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159654 Approved by: https://github.com/atalman	2025-08-01 16:54:51 +00:00
Jovian Anthony Jaison	5711a8f069	Update test_utils.py	2025-08-01 09:32:52 -07:00
Boyuan Feng	94995eba07	[Log] add a hook for recompile user context (#157961 ) Users may want to log customized, compile-related info to dynamo_compile. One example is logging the current training iteration index when recompilation happens. In general, the current training iteration index is not available to the compiler, since the same compiled function may be called multiple times in the same training iteration. The user can provide the training iteration index in a user hook, which torch.compile logs when recompilation happens. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157961 Approved by: https://github.com/masnesral	2025-07-11 03:41:33 +00:00
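The hook idea in the commit above can be sketched in plain Python. This is only an illustration of the pattern, not the PyTorch API: `set_user_context_hook`, `build_recompile_record`, and the record keys are hypothetical names invented for this example.

```python
from typing import Callable, Dict, Optional

# Hypothetical module-level slot for a user-supplied hook (not a PyTorch name).
_user_context_hook: Optional[Callable[[], Dict[str, object]]] = None


def set_user_context_hook(hook: Optional[Callable[[], Dict[str, object]]]) -> None:
    """Register a callable that returns extra context to attach to recompile logs."""
    global _user_context_hook
    _user_context_hook = hook


def build_recompile_record(reason: str) -> Dict[str, object]:
    """Build the record a compiler could log when a recompilation happens."""
    record: Dict[str, object] = {"recompile_reason": reason}
    if _user_context_hook is not None:
        # The compiler cannot know the training iteration; the user hook can.
        record["user_context"] = _user_context_hook()
    return record


# The training loop owns the iteration counter and exposes it through the hook.
training_state = {"iteration": 0}
set_user_context_hook(lambda: {"iteration": training_state["iteration"]})

training_state["iteration"] = 7
record = build_recompile_record("tensor size changed")
```

The hook is called only at logging time, so it reports whatever iteration is current when the recompilation happens.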
Raymond Li	82765dad16	Fix logging of config_suppress_errors and config_inline_inbuilt_nn_modules (#157947 ) Currently ~50% of the time we fail or crash before logging metrics, so moving where this is logged will let us have more comprehensive (less-null) data. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157947 Approved by: https://github.com/masnesral, https://github.com/jovianjaison	2025-07-10 12:05:43 +00:00
Xuehai Pan	6d5c789ad5	[BE][PYFMT] migrate PYFMT for `test/[a-h]*/` to `ruff format` (#144555 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555 Approved by: https://github.com/ezyang ghstack dependencies: #144551, #144554	2025-06-24 04:53:54 +00:00
Joel Schlosser	c4b93e6579	Replace frame_traced_fn hook with get_traced_code() util (#155249 ) #153622 introduced a hook for getting the relevant code objects after frame tracing. The idea is to have vLLM use this instead of monkey-patching `inline_call_()` to determine the source code files to hash. Unfortunately, the hook runs too late; the vLLM backend needs access to the set of source code filenames while it's running. This PR replaces the newly-added hook with a utility function that a backend can call to get this information. I've made the change in vLLM and can verify that this allows the information to be queried at the right time. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155249 Approved by: https://github.com/zou3519	2025-06-10 22:40:58 +00:00
Joel Schlosser	43b18d098b	Forward fix for test_frame_traced_hook in internal testing (#154641 ) Summary: Fixes the newly-added dynamo test test_frame_traced_hook so it can run internally Test Plan: This is a test change Differential Revision: D75616787 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154641 Approved by: https://github.com/Skylion007	2025-05-29 23:02:01 +00:00
Joel Schlosser	9db7bcb3fe	[Dynamo] Introduce hook receiving list of traced code objects (#153622 ) This PR: * Expands `Hooks` with a new, optional `frame_traced_fn` field. It should be a callable receiving the list of traced code objects * Maintains a list of `traced_code` objects in the `TracingContext` of an `OutputGraph` * Whenever an `inline_call()` is encountered, the corresponding code object is added to this set * `OutputGraph`'s associated `f_code` is added to the list just before the hook is called I believe use of this hook should enable the source code hashing that vLLM does in a better way than monkey-patching `inline_call()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153622 Approved by: https://github.com/jansel	2025-05-28 15:40:09 +00:00
Nikita Shulga	acd0873d3b	[CI] Fix `TestDynamoTimed.test_ir_count` for 3.12 (#154268 ) Python-3.12 emits the same bytecode as 3.13 for code in question Pull Request resolved: https://github.com/pytorch/pytorch/pull/154268 Approved by: https://github.com/clee2000, https://github.com/atalman ghstack dependencies: #154237	2025-05-23 20:08:19 +00:00
IvanKobzarev	4439255148	[aotd] Support saved tensors hooks in aot_autograd (#150032 ) https://github.com/pytorch/pytorch/issues/148222 Goal: At the moment autograd saved tensors hooks are run in eager after compiled forward. They are executed at the same time for all saved tensors. Hooks can be used to reduce the amount of memory used for saved tensors, by doing quantization or offloading to cpu. Running them in eager is suboptimal for optimization of peak memory. A better solution is to put the hooks in the graph, as close as possible to the last usage of the tensor. The goal is to get user-specified autograd saved tensors hooks into the graph. Logic: UX: If the user specifies torch.autograd.graph.saved_tensors_hooks(pack_gm, unpack_gm), where pack_gm and unpack_gm are torch.fx.GraphModule, then AotAutograd will retrace those graph modules, doing decompositions and functionalization in aot_autograd, and inline the result graphs in the forward epilogue and backward prologue. The user may want to use control logic in the hooks, for example applying quantization only for specific dtypes and sizes. This is also possible: the user can put it into a torch.fx.wrap function and use symbolic trace to make a GraphModule. In that case AotAutograd caching will work only if the user explicitly sets the "user_cache_hash" metadata on the torch.fx.wrap call_function node. If this metadata is set, then the aot_autograd cache can use the saved cache artifact. If the metadata is not set, then the cache is bypassed. Dynamo: Dynamo traces pack and unpack hooks, installs them as subgraphs, and explicitly adds them to the output_graph (as those subgraphs are not used and would not be copied into the result by default). The complexity here is that at this moment we do not have example inputs for the hooks. We trace pack_hook with some Tensor from the inputs. The result subgraphs are added to the hashing of AotAutograd Cache. In AotAutograd we retrace the graph with the true saved tensors coming from the partitioner. 
Backwards Compatibility: As current hooks are executed in eager mode and not all of them will be traceable - we only try to put in the graph hooks, explicitly marked by user with annotation (@_inlineable_saved_tensors_hooks). For other hooks or if compiled autograd is enabled - keep the same logic. Recompilations: Hooks are guarded with lambda guard matching function id to cause recompilation if user reruns compiled function. Aot_autograd: After partitioner prepared forward and backward module - we trace prepared at Dynamo graphs for pack and unpack hooks and inline them in epilogue of forward and prologue of backward. Forward outputs and backward inputs are changed, transparently for user. We do not try to put it close the last usage etc., relying on inductor to do this optimization. ``` INFO: TRACED GRAPH ===== Forward graph pre saved_tensors_hooks inlining 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]) return (view, add, primals_1, primals_2) INFO: TRACED GRAPH ===== Backward graph pre saved_tensors_hooks inlining 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = 
torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]) return (view, add, primals_1, primals_2) INFO: TRACED GRAPH ===== saved_tensors_pack_hook add 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class pack_float8(torch.nn.Module): def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn); x_1 = None return (torch.float32, _to_copy) INFO: TRACED GRAPH ===== saved_tensors_unpack_hook add 3 ===== <eval_with_key>.22 from /data/users/ivankobzarev/a/pytorch/torch/fx/experimental/proxy_tensor.py:1225 in wrapped class pack_float8(torch.nn.Module): def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn); x_1 = None return (torch.float32, _to_copy) INFO: TRACED GRAPH ===== Forward graph 3 ===== /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"): # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1 add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1); primals_3 = None # No stacktrace found for following nodes _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add, dtype = torch.float8_e4m3fn) # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, 
primals_2]); add = None return (view, _to_copy, primals_1, primals_2) INFO: TRACED GRAPH ===== Backward graph 3 ===== <eval_with_key>.21 class GraphModule(torch.nn.Module): def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", add_packed_2: "f8e4m3fn[s0, s1][s1, 1]cuda:0", tangents_1: "f32[s0, s1][s1, 1]cuda:0"): # No stacktrace found for following nodes _to_copy: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add_packed_2, dtype = torch.float32); add_packed_2 = None # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x) add_7: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(tangents_1, _to_copy); tangents_1 = _to_copy = None return (None, None, add_7) ``` Differential Revision: [D72187044](https://our.internmc.facebook.com/intern/diff/D72187044) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150032 Approved by: https://github.com/bdhirsh	2025-05-22 14:09:38 +00:00
clr	a952f42bdb	dynamo: Log if we're using dynamic shapes via set_feature_usage (#153490 ) This makes it extremely clear if a specific model didn't use dynamic shapes and should have (except it had a bad config option). Pull Request resolved: https://github.com/pytorch/pytorch/pull/153490 Approved by: https://github.com/jansel	2025-05-16 23:59:00 +00:00
clr	85f97b5a8c	compile_fx: make a compile event that corresponds to the fx_compile waitcounter (#152983 ) This is a pretty minor change, but by having exact correspondence, we can easily confirm data differences between perfetto and wait counters Pull Request resolved: https://github.com/pytorch/pytorch/pull/152983 Approved by: https://github.com/jansel, https://github.com/masnesral	2025-05-14 01:54:42 +00:00
Sam Larsen	dde705864a	Fix test broken by D73809989 (#153413 ) Summary: I forgot to remove this unused field in D73809989. Test Plan: `buck test 'fbcode//mode/opt' fbcode//caffe2/test:fbonly -- --exact 'caffe2/test:fbonly - test_compilation_metrics_logger_in_sync (caffe2.test.fb.test_fb.TestFBOnly)'` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153413 Approved by: https://github.com/c00w	2025-05-13 16:44:30 +00:00
Animesh Jain	7fdd754136	[compile-time traces] Profile large missing gaps in compile time (#151256 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151256 Approved by: https://github.com/bdhirsh, https://github.com/masnesral, https://github.com/zou3519, https://github.com/jansel	2025-05-13 14:44:51 +00:00
Sam Larsen	e6e1ca1996	[easy] Fix test_dynamo_timed (#152387 ) Summary: I'm just trying to fix the test again. It's out of date because it's disabled and some dynamo_timed-related fields are gone now. Test Plan: `python test/dynamo/test_utils.py -k dynamo_timed` Pull Request resolved: https://github.com/pytorch/pytorch/pull/152387 Approved by: https://github.com/anijain2305	2025-04-29 19:22:56 +00:00
Animesh Jain	159e2f96e3	[dynamo][ci] Fix recently broken test (#151877 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/151877 Approved by: https://github.com/masnesral, https://github.com/jansel	2025-04-22 06:42:03 +00:00
Sam Larsen	80a3877b3d	[easy] Fix test_dynamo_timed (#151816 ) Summary: The structured logging counter is a global that might have been affected by earlier tests. Clear it explicitly. Fixes #148093 Test Plan: `pytest test/dynamo/test_utils.py` Pull Request resolved: https://github.com/pytorch/pytorch/pull/151816 Approved by: https://github.com/ppanchalia	2025-04-22 00:12:31 +00:00
Sam Larsen	f20a266512	[easy] Update test/dynamo/test_utils.py (#151599 ) Summary: test/dynamo/test_utils.py is out of date because of some new dynamo_timed fields. (I guess the test is disabled?). Bring it up to date Test Plan: `python test/dynamo/test_utils.py` Fixes #148093 Pull Request resolved: https://github.com/pytorch/pytorch/pull/151599 Approved by: https://github.com/Skylion007	2025-04-18 18:49:24 +00:00
Sam Larsen	585d03fa39	Record how many parameters we're parsing within dynamo (#148508 ) This allows us to track how many parameters we have in compilations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148508 Approved by: https://github.com/jansel, https://github.com/anijain2305 Co-authored-by: Sam Larsen <slarsen@meta.com>	2025-04-16 06:15:11 +00:00
Sam Larsen	2a1e2b88ed	[logging] Add pgo remote get/put timings to dynamo_compile (#150322 ) Test Plan: https://fburl.com/scuba/dynamo_compile/sandbox/xf950tw8 Pull Request resolved: https://github.com/pytorch/pytorch/pull/150322 Approved by: https://github.com/ppanchalia	2025-04-07 18:08:26 +00:00
Sam Larsen	90543e90a0	Fix broken dynamo_timed test due to python_version field (#149659 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/149659 Approved by: https://github.com/ppanchalia	2025-03-21 00:27:28 +00:00
Shunting Zhang	6c7d8419e3	fix two accuracy regression (#149172 ) There are 2 accuracy regressions in the 3/12 nightly perf run. I cannot repro them locally, so there is no effective way to bisect. Raise the tolerance to make them pass the accuracy check. - error log for HF MegatronBertForQuestionAnswering https://gist.github.com/shunting314/25322b66e15e98feed32e0d9a1e43316 - error log for TIMM gluon_inception_v3 https://gist.github.com/shunting314/df64ce22327df27a7057bbbd19ef5164 Pull Request resolved: https://github.com/pytorch/pytorch/pull/149172 Approved by: https://github.com/jansel, https://github.com/eellison	2025-03-17 19:34:00 +00:00
Sam Larsen	7cdbb913e7	[logging] Set compile_id in the CachingAutotuner during compilation so we have it for dynamo_timed logging (#148693 ) Summary: This is a simpler alternative to https://github.com/pytorch/pytorch/pull/146455, where we can stick the compileId (and forward/backward bool) in the CachingAutotuner so that we have it for logging `benchmark_all_configs`. Recall that the first attempt put the compileId in the inductor_meta and that interfered with caching. Test Plan: `python benchmarks/dynamo/torchbench.py --performance --training --amp --backend inductor --device cuda --print-compilation-time --repeat 5 --cold-start-latency --only nanogpt` * tlparse: https://fburl.com/e71yn6uc * dynamo_compile: https://fburl.com/scuba/dynamo_compile/sandbox/4ageghhv * pt2_compile_events: https://fburl.com/scuba/pt2_compile_events/4fgv1itq Pull Request resolved: https://github.com/pytorch/pytorch/pull/148693 Approved by: https://github.com/eellison	2025-03-13 03:50:58 +00:00
clr	2a7e997b3f	test/dynamo/test_utils: Fix one broken test on different python versions (#148987 ) We correctly handled different Python versions in the explicit ir_nodes test, but didn't handle them in the dynamo_timed test. Just explicitly deleting the fields there so the dynamo_timed test passes on all Python versions. (I noticed it breaking on 3.13). Pull Request resolved: https://github.com/pytorch/pytorch/pull/148987 Approved by: https://github.com/jansel	2025-03-12 02:11:08 +00:00
clr	6b0fd741d1	dynamo: Count number of opcodes processes (#147149 ) This gives us a decent proxy for how big of a graph we functionally had to parse. Note that this is a cumulative counter. If people feel strongly, I can either write into the dynamo_timed datasets with metrics contexts, or clear the counters / write a counter per frame id as well. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147149 Approved by: https://github.com/jansel	2025-03-10 19:20:09 +00:00
Sam Larsen	40c2505f16	[logging] Log individual Triton kernel compilation times to dynamo_compile (#147022 ) Summary: Gather the compilation time of individual triton kernels and log them to dynamo_compile: * Time compilation in `_worker_compile_triton`, pass the result back to the main process, and log it from `get_result()`. * Added a way to track the "top N" (or N most-expensive compiles) in the metrics_context. I did this because I doubt we really care to capture potentially thousands of kernel compile times. That would be problematic for scuba logging anyway, so let's limit the number we track from the beginning. Arbitrarily chose 25 for now. * Format the list of compile times as a json string before logging. Test Plan: `python benchmarks/dynamo/torchbench.py --performance --training --amp --backend inductor --device cuda --print-compilation-time --repeat 5 --cold-start-latency --only nanogpt` Scuba: https://fburl.com/scuba/dynamo_compile/sandbox/nc4dzm3r Pull Request resolved: https://github.com/pytorch/pytorch/pull/147022 Approved by: https://github.com/jamesjwu	2025-03-03 19:32:17 +00:00
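The "top N" tracking and JSON formatting described in the commit above could look roughly like the sketch below. It is an assumption-laden illustration using a bounded min-heap; the `TopN` class, its methods, and the kernel names are hypothetical and are not taken from the PyTorch source.

```python
import heapq
import json
from typing import List, Tuple


class TopN:
    """Keep only the N largest (value, key) pairs seen so far (hypothetical helper)."""

    def __init__(self, at_most: int = 25) -> None:
        self.at_most = at_most
        # Min-heap ordered by value: the cheapest tracked entry sits at index 0.
        self._heap: List[Tuple[int, str]] = []

    def add(self, key: str, value: int) -> None:
        if len(self._heap) < self.at_most:
            heapq.heappush(self._heap, (value, key))
        else:
            # Push the new entry, then drop whichever entry is now the smallest.
            heapq.heappushpop(self._heap, (value, key))

    def to_json(self) -> str:
        # Serialize most-expensive first as a list of [key, value] pairs.
        ordered = sorted(self._heap, reverse=True)
        return json.dumps([[key, value] for value, key in ordered])


# Track the two most expensive of three made-up kernel compile times (microseconds).
top = TopN(at_most=2)
for name, micros in [("kernel_a", 120), ("kernel_b", 900), ("kernel_c", 450)]:
    top.add(name, micros)
top_json = top.to_json()
```

Bounding the heap keeps memory and log size fixed no matter how many kernels are compiled, which matches the motivation given in the commit message.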
Raymond Li	c5bf9aaf1c	Log graph breaks (#146537 ) Graph breaks currently aren't logged to dynamo_compile and pt2_compile_events. We want to log them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/146537 Approved by: https://github.com/c00w	2025-02-27 11:06:33 +00:00
Simon Fan	1d4adf4e1f	[dynamo] log recompile reason to dynamo_compile (#146117 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/146117 Approved by: https://github.com/bobrenjc93	2025-02-03 21:04:04 +00:00
Colin L. Rice	c1161957a4	inductor_config_logging: Don't drop keys (#144700 ) This bit me while I was trying to debug some trace issues. In general this config is already quite large when dumping, so adding more fields doesn't make it significantly worse. Also a number of the items we are type checking for (except the test configs), don't even show up. Primarily this will help us when debugging rocm, halide, and trace configs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/144700 Approved by: https://github.com/ezyang	2025-01-27 23:47:25 +00:00
Animesh Jain	ef60de07a0	[dynamo] Log guard latency (#145132 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145132 Approved by: https://github.com/ezyang ghstack dependencies: #145509	2025-01-25 03:01:18 +00:00
PyTorch MergeBot	6f60c65a3a	Revert "[dynamo] Log guard latency (#145132 )" This reverts commit `0a310d7388`. Reverted https://github.com/pytorch/pytorch/pull/145132 on behalf of https://github.com/anijain2305 due to CI failures observed after PR was merged ([comment](https://github.com/pytorch/pytorch/pull/145132#issuecomment-2611268421))	2025-01-24 00:11:50 +00:00
Animesh Jain	0a310d7388	[dynamo] Log guard latency (#145132 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/145132 Approved by: https://github.com/ezyang ghstack dependencies: #145351, #145420	2025-01-23 23:30:07 +00:00
Colin L. Rice	73278e6a5d	easy: sort dictionary keys for inductor config when publishing (#143307 ) This means we should get consistent logging strings for the same config on different ranks Pull Request resolved: https://github.com/pytorch/pytorch/pull/143307 Approved by: https://github.com/xmfan	2025-01-09 18:01:20 +00:00
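A minimal sketch of why sorting keys gives consistent logging strings, assuming the config is serialized as JSON; `config_to_log_string` and the option names are hypothetical and not taken from the PyTorch source.

```python
import json
from typing import Dict


def config_to_log_string(config: Dict[str, object]) -> str:
    """Serialize a config dict with keys in sorted order (hypothetical helper)."""
    return json.dumps(config, sort_keys=True)


# Two ranks hold the same settings but built their dicts in different orders.
rank0_config = {"option_b": False, "option_a": 4}
rank1_config = {"option_a": 4, "option_b": False}

unsorted_differs = json.dumps(rank0_config) != json.dumps(rank1_config)
sorted_matches = config_to_log_string(rank0_config) == config_to_log_string(rank1_config)
```

Without sorting, the string follows dict insertion order, so equal configs can produce different log lines.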
Colin L. Rice	d79fbf6b6d	test/dynamo/test_utils: logging - Stop testing for impossible things. (#143535 ) We don't support assigning to objects or numeric constants at the top level in config modules, no need to test for them. (This specifically breaks later sorting refactoring, since it requires < to be implemented). Pull Request resolved: https://github.com/pytorch/pytorch/pull/143535 Approved by: https://github.com/ppanchalia	2024-12-20 17:21:49 +00:00
bobrenjc93	8850a7b62c	add some logging for tensorify (#143391 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143391 Approved by: https://github.com/jamesjwu	2024-12-19 20:06:26 +00:00
qiurc	90cc43f270	Support garbage collection after pt2 compilation (#143364 ) Summary: Support garbage collection after pt2 compilation. Add a jk to control the global rollout / rollback of this functionality. Add an env var to control an individual job's rollout. Test Plan: Test the model training job with and without this change. Reviewers: @yuxihu, @ezyang, @Yuzhen11 Pull Request resolved: https://github.com/pytorch/pytorch/pull/143364 Approved by: https://github.com/ezyang	2024-12-18 07:25:11 +00:00
Sam Larsen	60c54467db	[logging] Log runtime autotuning timing to scuba (#141919 ) See test plan in internal diff [D66679369](https://our.internmc.facebook.com/intern/diff/D66679369) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141919 Approved by: https://github.com/jamesjwu, https://github.com/ezyang	2024-12-13 21:22:13 +00:00
Sam Larsen	30b61e521c	[logging] Populate compile_time_autotune_time_us (#143104 ) See testing in attached diff Differential Revision: [D67128210](https://our.internmc.facebook.com/intern/diff/D67128210) Pull Request resolved: https://github.com/pytorch/pytorch/pull/143104 Approved by: https://github.com/ezyang	2024-12-12 17:08:43 +00:00
Sam Larsen	692b5e75ed	[logging] Add triton_compile_time_us column to dynamo_compile (#142068 ) Test Plan: See internal diff [D66799565](https://www.internalfb.com/diff/D66799565) Differential Revision: [D66799565](https://our.internmc.facebook.com/intern/diff/D66799565) Pull Request resolved: https://github.com/pytorch/pytorch/pull/142068 Approved by: https://github.com/c00w	2024-12-06 16:11:57 +00:00
Bob Ren	9286c21b22	Fix fbcode tests for automatic dynamic unspecialize float (#141975 ) Differential Revision: D66708552 Pull Request resolved: https://github.com/pytorch/pytorch/pull/141975 Approved by: https://github.com/bdhirsh, https://github.com/atalman	2024-12-03 23:59:06 +00:00
Bob Ren	2f72635a5c	automatic dynamic unspecialize float (#141647 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647 Approved by: https://github.com/ezyang	2024-11-29 22:36:53 +00:00
PyTorch MergeBot	9e98b3d73c	Revert "automatic dynamic unspecialize float (#141647 )" This reverts commit `1a32daeb17`. Reverted https://github.com/pytorch/pytorch/pull/141647 on behalf of https://github.com/atalman due to functorch/test_aotdispatch.py::TestAOTAutogradWithCache::test_inner_grad [GH job link](https://github.com/pytorch/pytorch/actions/runs/12080983316/job/33697901875) [HUD commit link](`1a32daeb17`) ([comment](https://github.com/pytorch/pytorch/pull/141647#issuecomment-2507980876))	2024-11-29 15:00:33 +00:00
Bob Ren	1a32daeb17	automatic dynamic unspecialize float (#141647 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/141647 Approved by: https://github.com/ezyang	2024-11-29 07:53:53 +00:00
PyTorch MergeBot	ad37afd590	Revert "Always unspecialize float in OSS (#138922 )" This reverts commit `ba5253da9b`. Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/yf225 due to perf regression on torchbench ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2499277511))	2024-11-26 00:03:03 +00:00
Sam Larsen	07906f2f2b	[logging] Move population of common MetricsContext fields to record_compilation_metrics (#141291 ) Summary: Fix outstanding TODOs related to logging of CompilationMetrics by moving the population of common fields to record_compilation_metrics() instead of populating those independently wherever we use the metrics_context contextmanager: * Keep track of start and end time in MetricsContext and pass those to record_compilation_metrics() and populate those fields in that function. * Pass exception info to record_compilation_metrics() and populate those fields in that function. * Add a new contextmanager, chromium_event_timed, to create the start/end "dynamo" event. This is important because I want this contextmanager to complete _after_ building the CompilationMetrics. * Populate the compile_id field centrally in record_compilation_metrics(). * Populate the structured_logging_overhead centrally in record_compilation_metrics(). * Add the CompilationMetrics to the current chromium event in record_compilation_metrics(), after all common fields have been added. In a future diff, I can also add _all_ compilation metrics to the chromium event. Test plan: Unit tests. Also see internal testing: * dynamo_compile: https://fburl.com/scuba/dynamo_compile/sandbox/jrascnf9 * pt2_compile_events: https://fburl.com/scuba/pt2_compile_events/l3jnla06 * tlparse: https://fburl.com/bq5a9nqs Pull Request resolved: https://github.com/pytorch/pytorch/pull/141291 Approved by: https://github.com/jamesjwu	2024-11-25 13:18:40 +00:00
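The centralization described in the commit above can be illustrated with a small context manager that records timing and exception info and hands them to a single recording function. This is a sketch of the pattern only; `TimedMetricsContext`, `record_metrics`, and the field names are hypothetical and do not reproduce the PyTorch implementation.

```python
import time
from typing import Callable, Dict, List, Optional

records: List[Dict[str, object]] = []


def record_metrics(start_ns, end_ns, metrics, exc_type, exc_value) -> None:
    """Single place that fills in the common fields (hypothetical names)."""
    records.append({
        **metrics,
        "duration_us": (end_ns - start_ns) // 1000,
        "fail_type": exc_type.__qualname__ if exc_type is not None else None,
        "fail_reason": str(exc_value) if exc_value is not None else None,
    })


class TimedMetricsContext:
    """Collects metrics and reports them, with timing and errors, on exit."""

    def __init__(self, on_exit: Callable[..., None]) -> None:
        self._on_exit = on_exit
        self._metrics: Dict[str, object] = {}
        self._start_ns: Optional[int] = None

    def __enter__(self) -> "TimedMetricsContext":
        self._start_ns = time.monotonic_ns()
        return self

    def set(self, key: str, value: object) -> None:
        self._metrics[key] = value

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        end_ns = time.monotonic_ns()
        # Callers never populate timing or failure fields themselves.
        self._on_exit(self._start_ns, end_ns, self._metrics, exc_type, exc_value)
        return False  # do not swallow the exception


with TimedMetricsContext(record_metrics) as ctx:
    ctx.set("graph_op_count", 3)

try:
    with TimedMetricsContext(record_metrics) as ctx:
        raise ValueError("boom")
except ValueError:
    pass
```

Because the exit handler runs for both successful and failing blocks, every use of the context manager produces a record with the same common fields.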
Bob Ren	ba5253da9b	Always unspecialize float in OSS (#138922 ) Fixes https://github.com/pytorch/pytorch/issues/107277 Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922 Approved by: https://github.com/ezyang Co-authored-by: Edward Z. Yang <ezyang@meta.com>	2024-11-24 01:58:13 +00:00


70 Commits