Commit Graph

78 Commits

Author SHA1 Message Date
William Wen
f452edd782 [dynamo, 3.14] fix misc. bugs to get most dynamo unittests passing locally in 3.14 (#164631)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164631
Approved by: https://github.com/Lucaskabela, https://github.com/mlazos
2025-10-28 03:24:22 +00:00
Colin L Reliability Rice
ca5b7f8ded torch.compile: populate compiler_config (#165581)
Summary: This starts writing the compiler_config metadata into logger

Test Plan:
Modified existing test case to make sure this is not null.
(Also eyeballed what we're logging to make sure it's reasonable.)

Reviewed By: masnesral

Differential Revision: D84014636

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165581
Approved by: https://github.com/masnesral
2025-10-17 18:21:18 +00:00
Colin L Reliability Rice
98a488c9aa Start recording inductor provenance (#162669)
Summary:
This stores information on where fx graphs come from, which makes it
significantly easier to debug.

One outstanding question

1) I only stored the kernel stack traces, do we also want the node mappings?

Test Plan:
I wrote an explicit logging test which makes a module, fx traces it, compiles it, and makes sure the logging information shows up.

```
clr@devvm17763 ~/fbsource/fbcode/caffe2/test/dynamo
 % buck2 test @//mode/opt fbcode//caffe2/test/dynamo:test_dynamo -- test_utils

File changed: fbsource//xplat/caffe2/test/dynamo/test_utils.py
File changed: fbcode//caffe2/test/dynamo/test_utils.py
Buck UI: https://www.internalfb.com/buck2/528dea32-2416-4a62-a1ec-39f3c0efdd2e
Test UI: https://www.internalfb.com/intern/testinfra/testrun/13229324015574003
Network: Up: 0B  Down: 0B
Executing actions. Remaining     0/2
Command: test.
Time elapsed: 17.3s
Tests finished: Pass 16. Fail 0. Fatal 0. Skip 0. Build failure 0
```

Rollback Plan:

Differential Revision: D82037582

Pull Request resolved: https://github.com/pytorch/pytorch/pull/162669
Approved by: https://github.com/yushangdi
2025-10-16 23:05:31 +00:00
Sam Larsen
a2e2e1d8c0 Add pytorch_version and mast_application_packages to pt2 compile scuba logging (#165018)
Summary: Two more fields requested for conda-on-mast jobs

Differential Revision: D84214442

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165018
Approved by: https://github.com/c00w
2025-10-10 17:57:40 +00:00
Jovian Anthony Jaison
2fdd4f918c Log exception_stack_trace to dynamo_compile (#161096)
Note: Adding a unit test for this is tricky, as having errors in the specific unit test would cause test_utils.py to crash altogether.

Tested as follows:
1. Added x = 1/0 after guarded_code = compile_inner(code, one_graph, hooks, transform) in convert_frame.py
2. Printed exception_stack_trace and got: ['Traceback (most recent call last):\n  File "/data/users/jovian/pytorch/torch/_dynamo/convert_frame.py", line 1207, in _compile\n    x = 1/0\n        ~^~\nZeroDivisionError: division by zero\n']
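
The shape of the printed value can be reproduced outside Dynamo. The following is a minimal sketch, assuming the field is a one-element list holding the output of `traceback.format_exc()` (an assumption based on the printed value above); `capture_exception_stack_trace` is a hypothetical helper, not a PyTorch function.

```python
import traceback


def capture_exception_stack_trace(fn):
    """Run fn; return the formatted traceback as a one-element list, or None."""
    try:
        fn()
    except Exception:
        # format_exc() returns the whole traceback as a single string.
        return [traceback.format_exc()]
    return None


def failing():
    x = 1 / 0  # same injected failure as in step 1
    return x


trace = capture_exception_stack_trace(failing)
print(trace)
```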

Pull Request resolved: https://github.com/pytorch/pytorch/pull/161096
Approved by: https://github.com/c00w
2025-08-22 03:29:15 +00:00
Jovian Anthony Jaison
c02e26bf31 Fix filename showing up as ints in dynamo_compile stack_trace column. (#160916)
Test plan:
$ python -m test_utils

Note:
Another way is adding the actual file_name to from_traceback, but since it's referenced in multiple places and may have associated tests this seems safer. Lmk if changes are needed @c00w

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160916
Approved by: https://github.com/c00w, https://github.com/masnesral
2025-08-20 18:38:38 +00:00
Prajesh Praveen Anchalia
052c441cf4 Add logging for when inbuilt_inline_nn_modules will help with ID_MATCH guard triggered recompiles (#160592)
We add logging for when an ID_MATCH guard is added at a place where inbuilt_inline_nn_modules would inline it. The aim is to tag recompiles that could be avoided by setting the inbuilt_inline_nn_modules flag.
This will help us log and track the flag's adoption and potentially quantify the savings in the number of recompiles.

Differential Revision: D80075975

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160592
Approved by: https://github.com/anijain2305
2025-08-15 17:09:39 +00:00
Jovian Anthony Jaison
cd8d8c18f5 [pytorch][dynamo_compile] Log graph_node_shape to dynamo_compile (#160556)
This PR adds the dynamo graph node shape logging to dynamo compile. Also added unit tests to check if correct graph node shape is being logged.

Test Plan:
$ python -m test_utils
Ran 12 tests in 36.447s
OK

Note: Will merge after D80185628 lands.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160556
Approved by: https://github.com/masnesral, https://github.com/jingsh
2025-08-14 16:42:35 +00:00
Jovian Anthony Jaison
9a0f7a3bb0 [retry-land][pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#160348)
refer: https://github.com/pytorch/pytorch/pull/159655

Earlier pr failed on dynamo/test_utils.py::TestDynamoTimed::test_dynamo_timed.
Updated test_dynamo_timed + re-ran locally to test.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/160348
Approved by: https://github.com/masnesral
2025-08-12 06:24:54 +00:00
PyTorch MergeBot
206c1eef65 Revert "[pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#159655)"
This reverts commit 2ee22e4351.

Reverted https://github.com/pytorch/pytorch/pull/159655 on behalf of https://github.com/clee2000 due to broke dynamo/test_utils.py::TestDynamoTimed::test_dynamo_timed [GH job link](https://github.com/pytorch/pytorch/actions/runs/16839294394/job/47711078667) [HUD commit link](2ee22e4351).  Probably a landrace since it did run on the PR ([comment](https://github.com/pytorch/pytorch/pull/159655#issuecomment-3169400889))
2025-08-08 22:04:22 +00:00
Jovian Anthony Jaison
2ee22e4351 [pytorch][dynamo_compile] Log stack_trace to dynamo_compile (#159655)
This change logs the stack trace of the code being compiled by Dynamo, improving visibility into what is compiled. It adds a stack_trace field to compilation metrics. This helps with debugging and analysis of Dynamo compilation behavior.
 Ref [D79287964](https://www.internalfb.com/diff/D79287964)

Test Plan:
$ python -m test_utils
Internal: ref [D79372519](https://www.internalfb.com/diff/D79372519)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159655
Approved by: https://github.com/c00w
2025-08-08 19:53:47 +00:00
Xu Han
06824f3c72 [inductor] fix test_dynamo_timed on Windows. (#159981)
Fixed `test_dynamo_timed`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159981
Approved by: https://github.com/angelayi
2025-08-07 16:37:52 +00:00
Ivan Zaitsev
e4b123b5e4 Revert direct updates (#159654)
reverts:
```

commit 5711a8f069 (tag: trunk/5711a8f06948eeee56ed5f53f171fa519f78491c, origin/main, main)
Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com>
Date:   Fri Aug 1 09:32:52 2025 -0700

    Update test_utils.py

commit b4b71d011e (tag: trunk/b4b71d011ed07a41c2086ff0dec2988a63662877)
Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com>
Date:   Fri Aug 1 09:27:54 2025 -0700

    Update utils.py

commit 52376b9b6f (tag: trunk/52376b9b6fbf9fe24f5d82038dc520f0c64b6f8d)
Author: Jovian Anthony Jaison <38627145+jovianjaison@users.noreply.github.com>
Date:   Fri Aug 1 09:26:05 2025 -0700
```

(commits pushed directly to main by mistake)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/159654
Approved by: https://github.com/atalman
2025-08-01 16:54:51 +00:00
Jovian Anthony Jaison
5711a8f069 Update test_utils.py
2025-08-01 09:32:52 -07:00
Boyuan Feng
94995eba07 [Log] add a hook for recompile user context (#157961)
Users may want to add custom, compile-related logging info to dynamo_compile. One example is logging the current training iteration index when recompilation happens. In general, the current training iteration index is not available to the compiler, since the same compiled function may be called multiple times in the same training iteration. The user can provide the training iteration index in a user hook, which torch.compile calls and logs when recompilation happens.
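
The hook pattern described here can be sketched in plain Python. This is an illustration only: the registry, `register_user_context_hook`, and `collect_user_context` are hypothetical names and do not reflect the actual torch.compile entry points.

```python
from typing import Callable, List

# Hypothetical registry; the real storage inside torch.compile may differ.
_user_context_hooks: List[Callable[[], str]] = []


def register_user_context_hook(hook: Callable[[], str]) -> None:
    """Register a zero-argument callable that returns a string to log."""
    _user_context_hooks.append(hook)


def collect_user_context() -> List[str]:
    """Simulates what the compiler would do when it logs a recompilation."""
    return [hook() for hook in _user_context_hooks]


class Trainer:
    def __init__(self) -> None:
        self.iteration = 0


trainer = Trainer()
# The hook reads the iteration lazily, so it reports the value at recompile time.
register_user_context_hook(lambda: f"iteration={trainer.iteration}")

trainer.iteration = 7
logged = collect_user_context()
print(logged)
```

Because the hook is evaluated only when a recompilation is logged, the compiler never needs to know about the training loop itself.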

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157961
Approved by: https://github.com/masnesral
2025-07-11 03:41:33 +00:00
Raymond Li
82765dad16 Fix logging of config_suppress_errors and config_inline_inbuilt_nn_modules (#157947)
Currently ~50% of the time we fail or crash before logging metrics, so moving where this is logged will let us have more comprehensive (less-null) data.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/157947
Approved by: https://github.com/masnesral, https://github.com/jovianjaison
2025-07-10 12:05:43 +00:00
Xuehai Pan
6d5c789ad5 [BE][PYFMT] migrate PYFMT for test/[a-h]*/ to ruff format (#144555)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/144555
Approved by: https://github.com/ezyang
ghstack dependencies: #144551, #144554
2025-06-24 04:53:54 +00:00
Joel Schlosser
c4b93e6579 Replace frame_traced_fn hook with get_traced_code() util (#155249)
#153622 introduced a hook for getting the relevant code objects after frame tracing. The idea is to have vLLM use this instead of monkey-patching `inline_call_()` to determine the source code files to hash. Unfortunately, the hook runs too late; the vLLM backend needs access to the set of source code filenames while it's running.

This PR replaces the newly-added hook with a utility function that a backend can call to get this information. I've made the change in vLLM and can verify that this allows the information to be queried at the right time.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/155249
Approved by: https://github.com/zou3519
2025-06-10 22:40:58 +00:00
Joel Schlosser
43b18d098b Forward fix for test_frame_traced_hook in internal testing (#154641)
Summary: Fixes the newly-added dynamo test test_frame_traced_hook so it can run internally

Test Plan: This is a test change

Differential Revision: D75616787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/154641
Approved by: https://github.com/Skylion007
2025-05-29 23:02:01 +00:00
Joel Schlosser
9db7bcb3fe [Dynamo] Introduce hook receiving list of traced code objects (#153622)
This PR:
* Expands `Hooks` with a new, optional `frame_traced_fn` field. It should be a callable receiving the list of traced code objects
* Maintains a list of `traced_code` objects in the `TracingContext` of an `OutputGraph`
    *  Whenever an `inline_call()` is encountered, the corresponding code object is added to this set
    * `OutputGraph`'s associated `f_code` is added to the list just before the hook is called

I believe use of this hook should enable the source code hashing that vLLM does in a better way than monkey-patching `inline_call()`.
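
As a rough sketch of how a backend might consume such a list, the snippet below fingerprints the files behind a set of code objects. It is a simplification under stated assumptions: `hash_traced_files` is a hypothetical helper, the list is built by hand rather than delivered by Dynamo, and only filenames are hashed (a real backend would likely hash file contents).

```python
import hashlib
from types import CodeType
from typing import Iterable, List


def hash_traced_files(traced_code: Iterable[CodeType]) -> str:
    """Hash the sorted, de-duplicated filenames of the traced code objects."""
    filenames = sorted({code.co_filename for code in traced_code})
    digest = hashlib.sha256()
    for name in filenames:
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")  # separator so adjacent names cannot run together
    return digest.hexdigest()


def helper(x):
    return x + 1


def entry(x):
    return helper(x) * 2


# Stand-in for the list a hook would receive: inlined callee first, frame code last.
traced: List[CodeType] = [helper.__code__, entry.__code__]
fingerprint = hash_traced_files(traced)
print(fingerprint)
```

Sorting and de-duplicating first makes the fingerprint independent of the order in which functions were inlined.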

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153622
Approved by: https://github.com/jansel
2025-05-28 15:40:09 +00:00
Nikita Shulga
acd0873d3b [CI] Fix TestDynamoTimed.test_ir_count for 3.12 (#154268)
Python 3.12 emits the same bytecode as 3.13 for the code in question.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154268
Approved by: https://github.com/clee2000, https://github.com/atalman
ghstack dependencies: #154237
2025-05-23 20:08:19 +00:00
IvanKobzarev
4439255148 [aotd] Support saved tensors hooks in aot_autograd (#150032)
https://github.com/pytorch/pytorch/issues/148222

Goal:

At the moment autograd saved tensors hooks are run in eager after compiled forward.
They are executed at the same time for all saved tensors.
Hooks can be used to reduce the amount of memory used for saved tensors, by quantizing them or offloading them to CPU.
This is suboptimal for optimization of peak memory.
Better solution will be to put the hooks in the graph, as close as possible to the last usage of the tensor.

To get user specified autograd saved tensors hooks in the graph.

Logic:

UX:
If user specifies with torch.autograd.graph.saved_tensors_hooks(pack_gm, unpack_gm).
Where pack_gm and unpack_gm are torch.fx.GraphModule.
Then AotAutograd will retrace those graph modules, doing decompositions and functionalization in aot_autograd, inlining the result graphs in forward epilogue and backward prologue.

User may want to use control logic in the hooks, for example applying quantization only for specific dtypes and sizes.

This is also possible, user can put it into torch.fx.wrap function and use symbolic trace to make a GraphModule.

In that case AotAutograd caching will work only when the user explicitly sets the "user_cache_hash" metadata on the torch.fx.wrap call_function node.

If this metadata is set, the aot_autograd cache can use the saved cache artifact.
If the metadata is not set, the cache is bypassed.

Dynamo:
Dynamo traces pack and unpack hooks and installs them as subgraph and explicitly adds to the output_graph. (As those subgraphs are not used and will not be copied in the result by default).

The complexity here is that at this moment we do not have example of inputs for the hooks.
We trace  pack_hook with some Tensor from the inputs.
The result subgraphs are added to the hashing of AotAutograd Cache.

In AotAutograd we retrace the graph with the true saved tensors coming from partitioner.

Backwards Compatibility:
As current hooks are executed in eager mode and not all of them will be traceable - we only try to put in the graph hooks, explicitly marked by user with annotation (@_inlineable_saved_tensors_hooks).
For other hooks or if compiled autograd is enabled - keep the same logic.

Recompilations:
Hooks are guarded with lambda guard matching function id to cause recompilation if user reruns compiled function.

Aot_autograd:
After partitioner prepared forward and backward module - we trace prepared at Dynamo graphs for pack and unpack hooks and inline them in epilogue of forward and prologue of backward. Forward outputs and backward inputs are changed, transparently for user.

We do not try to put it close the last usage etc., relying on inductor to do this optimization.

```
INFO: TRACED GRAPH
 ===== Forward graph pre saved_tensors_hooks inlining 3 =====
 /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
    def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"):
         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1
        add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1);  primals_3 = None

         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x)
        view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2])
        return (view, add, primals_1, primals_2)

INFO: TRACED GRAPH
 ===== Backward graph pre saved_tensors_hooks inlining 3 =====
 /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
    def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"):
         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1
        add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1);  primals_3 = None

         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x)
        view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2])
        return (view, add, primals_1, primals_2)

INFO: TRACED GRAPH
 ===== saved_tensors_pack_hook add 3 =====
 /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class pack_float8(torch.nn.Module):
    def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"):
        # No stacktrace found for following nodes
        _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn);  x_1 = None
        return (torch.float32, _to_copy)

INFO: TRACED GRAPH
 ===== saved_tensors_unpack_hook add 3 =====
 <eval_with_key>.22 from /data/users/ivankobzarev/a/pytorch/torch/fx/experimental/proxy_tensor.py:1225 in wrapped class pack_float8(torch.nn.Module):
    def forward(self, x_1: "f32[s0, s1][s1, 1]cuda:0"):
        # No stacktrace found for following nodes
        _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(x_1, dtype = torch.float8_e4m3fn);  x_1 = None
        return (torch.float32, _to_copy)

INFO: TRACED GRAPH
 ===== Forward graph 3 =====
 /data/users/ivankobzarev/a/pytorch/torch/fx/_lazy_graph_module.py class GraphModule(torch.nn.Module):
    def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", primals_3: "f32[s0, s1][s1, 1]cuda:0"):
         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6660 in simple_fn, code: x = x + 1
        add: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(primals_3, 1);  primals_3 = None

        # No stacktrace found for following nodes
        _to_copy: "f8e4m3fn[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add, dtype = torch.float8_e4m3fn)

         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x)
        view: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.view.default(add, [primals_1, primals_2]);  add = None
        return (view, _to_copy, primals_1, primals_2)

INFO: TRACED GRAPH
 ===== Backward graph 3 =====
 <eval_with_key>.21 class GraphModule(torch.nn.Module):
    def forward(self, primals_1: "Sym(s0)", primals_2: "Sym(s1)", add_packed_2: "f8e4m3fn[s0, s1][s1, 1]cuda:0", tangents_1: "f32[s0, s1][s1, 1]cuda:0"):
        # No stacktrace found for following nodes
        _to_copy: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten._to_copy.default(add_packed_2, dtype = torch.float32);  add_packed_2 = None

         # File: /data/users/ivankobzarev/a/pytorch/test/functorch/test_aotdispatch.py:6661 in simple_fn, code: x = SAF.apply(x)
        add_7: "f32[s0, s1][s1, 1]cuda:0" = torch.ops.aten.add.Tensor(tangents_1, _to_copy);  tangents_1 = _to_copy = None
        return (None, None, add_7)

```

Differential Revision: [D72187044](https://our.internmc.facebook.com/intern/diff/D72187044)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/150032
Approved by: https://github.com/bdhirsh
2025-05-22 14:09:38 +00:00
clr
a952f42bdb dynamo: Log if we're using dynamic shapes via set_feature_usage (#153490)
This makes it extremely clear if a specific model didn't use dynamic shapes and
should have (except it had a bad config option).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153490
Approved by: https://github.com/jansel
2025-05-16 23:59:00 +00:00
clr
85f97b5a8c compile_fx: make a compile event that corresponds to the fx_compile waitcounter (#152983)
This is a pretty minor change, but by having exact correspondence, we can
easily confirm data differences between perfetto and wait counters

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152983
Approved by: https://github.com/jansel, https://github.com/masnesral
2025-05-14 01:54:42 +00:00
Sam Larsen
dde705864a Fix test broken by D73809989 (#153413)
Summary: I forgot to remove this unused field in D73809989.

Test Plan: `buck test 'fbcode//mode/opt' fbcode//caffe2/test:fbonly -- --exact 'caffe2/test:fbonly - test_compilation_metrics_logger_in_sync (caffe2.test.fb.test_fb.TestFBOnly)'`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/153413
Approved by: https://github.com/c00w
2025-05-13 16:44:30 +00:00
Animesh Jain
7fdd754136 [compile-time traces] Profile large missing gaps in compile time (#151256)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151256
Approved by: https://github.com/bdhirsh, https://github.com/masnesral, https://github.com/zou3519, https://github.com/jansel
2025-05-13 14:44:51 +00:00
Sam Larsen
e6e1ca1996 [easy] Fix test_dynamo_timed (#152387)
Summary: I'm just trying to fix the test again. It's out of date because it's disabled and some dynamo_timed-related fields are gone now.

Test Plan: `python test/dynamo/test_utils.py -k dynamo_timed`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/152387
Approved by: https://github.com/anijain2305
2025-04-29 19:22:56 +00:00
Animesh Jain
159e2f96e3 [dynamo][ci] Fix recently broken test (#151877)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/151877
Approved by: https://github.com/masnesral, https://github.com/jansel
2025-04-22 06:42:03 +00:00
Sam Larsen
80a3877b3d [easy] Fix test_dynamo_timed (#151816)
Summary: The structured logging counter is a global that might have been affected by earlier tests. Clear it explicitly.
Fixes #148093
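
The failure mode and the fix can be illustrated with a small stand-alone test. This is a sketch only: `_log_counter`, `next_log_index`, and `reset_log_counter` are hypothetical stand-ins for the structured logging global, not the names used in PyTorch.

```python
import itertools
import unittest

# Hypothetical module-level counter standing in for the structured logging global.
_log_counter = itertools.count()


def next_log_index() -> int:
    return next(_log_counter)


def reset_log_counter() -> None:
    global _log_counter
    _log_counter = itertools.count()


class TestCounter(unittest.TestCase):
    def setUp(self) -> None:
        # Clear state explicitly so earlier tests cannot shift the expected indices.
        reset_log_counter()

    def test_starts_at_zero(self) -> None:
        self.assertEqual(next_log_index(), 0)


next_log_index()  # simulate an earlier test bumping the global
result = unittest.TextTestRunner().run(
    unittest.defaultTestLoader.loadTestsFromTestCase(TestCounter)
)
```

Without the reset in `setUp`, the assertion would see index 1 and fail depending on test ordering.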

Test Plan: `pytest test/dynamo/test_utils.py`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151816
Approved by: https://github.com/ppanchalia
2025-04-22 00:12:31 +00:00
Sam Larsen
f20a266512 [easy] Update test/dynamo/test_utils.py (#151599)
Summary: test/dynamo/test_utils.py is out of date because of some new dynamo_timed fields. (I guess the test is disabled?). Bring it up to date

Test Plan: `python test/dynamo/test_utils.py`

Fixes #148093

Pull Request resolved: https://github.com/pytorch/pytorch/pull/151599
Approved by: https://github.com/Skylion007
2025-04-18 18:49:24 +00:00
Sam Larsen
585d03fa39 Record how many parameters we're parsing within dynamo (#148508)
This allows us to track how many parameters we have in compilations.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148508
Approved by: https://github.com/jansel, https://github.com/anijain2305

Co-authored-by: Sam Larsen <slarsen@meta.com>
2025-04-16 06:15:11 +00:00
Sam Larsen
2a1e2b88ed [logging] Add pgo remote get/put timings to dynamo_compile (#150322)
Test Plan: https://fburl.com/scuba/dynamo_compile/sandbox/xf950tw8

Pull Request resolved: https://github.com/pytorch/pytorch/pull/150322
Approved by: https://github.com/ppanchalia
2025-04-07 18:08:26 +00:00
Sam Larsen
90543e90a0 Fix broken dynamo_timed test due to python_version field (#149659)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149659
Approved by: https://github.com/ppanchalia
2025-03-21 00:27:28 +00:00
Shunting Zhang
6c7d8419e3 fix two accuracy regression (#149172)
There are 2 accuracy regressions in the 3/12 nightly perf run. I cannot repro them locally, so there is no effective way to bisect. Raise the tolerance to make them pass the accuracy check.

- error log for HF MegatronBertForQuestionAnswering https://gist.github.com/shunting314/25322b66e15e98feed32e0d9a1e43316
- error log for TIMM gluon_inception_v3 https://gist.github.com/shunting314/df64ce22327df27a7057bbbd19ef5164

Pull Request resolved: https://github.com/pytorch/pytorch/pull/149172
Approved by: https://github.com/jansel, https://github.com/eellison
2025-03-17 19:34:00 +00:00
Sam Larsen
7cdbb913e7 [logging] Set compile_id in the CachingAutotuner during compilation so we have it for dynamo_timed logging (#148693)
Summary: This is a simpler alternative to https://github.com/pytorch/pytorch/pull/146455, where we can stick the compileId (and forward/backward bool) in the CachingAutotuner so that we have it for logging `benchmark_all_configs`. Recall that the first attempt put the compileId in the inductor_meta and that interfered with caching.

Test Plan:
`python benchmarks/dynamo/torchbench.py --performance --training --amp --backend inductor --device cuda --print-compilation-time --repeat 5 --cold-start-latency --only nanogpt`
* tlparse: https://fburl.com/e71yn6uc
* dynamo_compile: https://fburl.com/scuba/dynamo_compile/sandbox/4ageghhv
* pt2_compile_events: https://fburl.com/scuba/pt2_compile_events/4fgv1itq

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148693
Approved by: https://github.com/eellison
2025-03-13 03:50:58 +00:00
clr
2a7e997b3f test/dynamo/test_utils: Fix one broken test on different python versions (#148987)
We correctly handled different python versions in the explicit ir_nodes test, but
didn't handle them in the dynamo_timed test. Just explicitly deleting the fields
there so the dynamo_timed test passes on all python versions.

(I noticed it breaking on 3.13).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148987
Approved by: https://github.com/jansel
2025-03-12 02:11:08 +00:00
clr
6b0fd741d1 dynamo: Count number of opcodes processes (#147149)
This gives us a decent proxy for how big of a graph we functionally had to parse.

Note that this is a cumulative counter. If people feel strongly, I can either write into the dynamo_timed datasets with metrics contexts, or clear the counters / write a counter per frame id as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147149
Approved by: https://github.com/jansel
2025-03-10 19:20:09 +00:00
Sam Larsen
40c2505f16 [logging] Log individual Triton kernel compilation times to dynamo_compile (#147022)
Summary: Gather the compilation time of individual triton kernels and log them to dynamo_compile:
* Time compilation in `_worker_compile_triton` and pass back to the main process and logged from `get_result()`.
* Added a way to track the "top N" (or N most-expensive compiles) in the metrics_context. I did this because I doubt we really care to capture potentially thousands of kernel compile times. That would be problematic for scuba logging anyway, so let's limit the number we track from the beginning. Arbitrarily chose 25 for now.
* Format the list of compile times as a json string before logging.
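
The "top N" idea in the list above can be sketched with a bounded min-heap. This is an illustrative stand-alone version, not the code from the PR: the `TopN` class shown here, its method names, and the kernel names are assumptions.

```python
import heapq
import json
from typing import List, Tuple


class TopN:
    """Keep only the N largest (value, key) pairs seen so far."""

    def __init__(self, at_most: int = 25) -> None:
        self.at_most = at_most
        self._heap: List[Tuple[int, str]] = []  # min-heap ordered by value

    def add(self, key: str, value: int) -> None:
        if len(self._heap) < self.at_most:
            heapq.heappush(self._heap, (value, key))
        else:
            # Push the new entry, then drop the smallest, keeping the heap at size N.
            heapq.heappushpop(self._heap, (value, key))

    def items(self) -> List[Tuple[str, int]]:
        """Return (key, value) pairs, most expensive first."""
        return [(key, value) for value, key in sorted(self._heap, reverse=True)]


top = TopN(at_most=2)
for name, micros in [("kernel_a", 300), ("kernel_b", 900), ("kernel_c", 500)]:
    top.add(name, micros)

# Serialize as a JSON string before logging, as the last bullet describes.
payload = json.dumps(top.items())
print(payload)
```

Memory stays bounded at N entries no matter how many kernels are compiled, which is the property that matters for logging.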

Test Plan:
`python benchmarks/dynamo/torchbench.py --performance --training --amp --backend inductor --device cuda --print-compilation-time --repeat 5 --cold-start-latency --only nanogpt`
Scuba: https://fburl.com/scuba/dynamo_compile/sandbox/nc4dzm3r

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147022
Approved by: https://github.com/jamesjwu
2025-03-03 19:32:17 +00:00
Raymond Li
c5bf9aaf1c Log graph breaks (#146537)
Graph breaks currently aren't logged to dynamo_compile and pt2_compile_events. We want to log them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/146537
Approved by: https://github.com/c00w
2025-02-27 11:06:33 +00:00
Simon Fan
1d4adf4e1f [dynamo] log recompile reason to dynamo_compile (#146117)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146117
Approved by: https://github.com/bobrenjc93
2025-02-03 21:04:04 +00:00
Colin L. Rice
c1161957a4 inductor_config_logging: Don't drop keys (#144700)
This bit me while I was trying to debug some trace issues.
In general this config is already quite large when dumping, so adding
more fields doesn't make it significantly worse.

Also a number of the items we are type checking for (except the test
configs), don't even show up. Primarily this will help us when debugging
rocm, halide, and trace configs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/144700
Approved by: https://github.com/ezyang
2025-01-27 23:47:25 +00:00
Animesh Jain
ef60de07a0 [dynamo] Log guard latency (#145132)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145132
Approved by: https://github.com/ezyang
ghstack dependencies: #145509
2025-01-25 03:01:18 +00:00
PyTorch MergeBot
6f60c65a3a Revert "[dynamo] Log guard latency (#145132)"
This reverts commit 0a310d7388.

Reverted https://github.com/pytorch/pytorch/pull/145132 on behalf of https://github.com/anijain2305 due to CI failures observed after PR was merged ([comment](https://github.com/pytorch/pytorch/pull/145132#issuecomment-2611268421))
2025-01-24 00:11:50 +00:00
Animesh Jain
0a310d7388 [dynamo] Log guard latency (#145132)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/145132
Approved by: https://github.com/ezyang
ghstack dependencies: #145351, #145420
2025-01-23 23:30:07 +00:00
Colin L. Rice
73278e6a5d easy: sort dictionary keys for inductor config when publishing (#143307)
This means we should get consistent logging strings for the same
config on different ranks
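
A minimal illustration of why sorting matters, assuming the config is serialized with `json.dumps` (an assumption; the config keys below are only examples, and `publish` is a hypothetical helper):

```python
import json

# Same settings, but built in a different insertion order on each rank.
rank0_config = {"max_autotune": True, "cpp_wrapper": False}
rank1_config = {"cpp_wrapper": False, "max_autotune": True}


def publish(config: dict) -> str:
    """Serialize with sorted keys so equal configs yield identical strings."""
    return json.dumps(config, sort_keys=True)


print(publish(rank0_config))
print(publish(rank1_config))
```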

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143307
Approved by: https://github.com/xmfan
2025-01-09 18:01:20 +00:00
Colin L. Rice
d79fbf6b6d test/dynamo/test_utils: logging - Stop testing for impossible things. (#143535)
We don't support assigning to objects or numeric constants at the top level in
config modules, no need to test for them.

(This specifically breaks later sorting refactoring, since it requires <
to be implemented).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143535
Approved by: https://github.com/ppanchalia
2024-12-20 17:21:49 +00:00
bobrenjc93
8850a7b62c add some logging for tensorify (#143391)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143391
Approved by: https://github.com/jamesjwu
2024-12-19 20:06:26 +00:00
qiurc
90cc43f270 Support garbage collection after pt2 compilation (#143364)
Summary:
Support garbage collection after pt2 compilation.
Add jk to control the global rollout / rollback of this functionality
Add env var to control individual job's rollout
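
A per-job switch of this kind might look like the sketch below. The environment variable name and the helper are hypothetical, and the global (jk) gate is omitted; this is not the code from the PR.

```python
import gc
import os

ENV_VAR = "EXAMPLE_GC_AFTER_COMPILE"  # hypothetical name, for illustration only


def maybe_collect_after_compile() -> bool:
    """Run a full garbage collection if the job opted in via the env var."""
    if os.environ.get(ENV_VAR, "0") == "1":
        gc.collect()
        return True
    return False


os.environ[ENV_VAR] = "1"
ran = maybe_collect_after_compile()
print(ran)
```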

Test Plan:
Test the model training job with / without this changes

Reviewers:
@yuxihu @ezyang , @Yuzhen11 ,

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143364
Approved by: https://github.com/ezyang
2024-12-18 07:25:11 +00:00
Sam Larsen
60c54467db [logging] Log runtime autotuning timing to scuba (#141919)
See test plan in internal diff [D66679369](https://our.internmc.facebook.com/intern/diff/D66679369)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141919
Approved by: https://github.com/jamesjwu, https://github.com/ezyang
2024-12-13 21:22:13 +00:00
Sam Larsen
30b61e521c [logging] Populate compile_time_autotune_time_us (#143104)
See testing in attached diff

Differential Revision: [D67128210](https://our.internmc.facebook.com/intern/diff/D67128210)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143104
Approved by: https://github.com/ezyang
2024-12-12 17:08:43 +00:00