Commit Graph

111 Commits

Author SHA1 Message Date
Rachel Guo
aaa4c3d60b [mm_logs] make aten mm info readable (#148800)
Summary:
As titled: format the aten mm info into a readable table, e.g. (also see the screenshot in the test plan):

| Name    | M   | N   | K   | Count |
|---------|-----|-----|-----|-------|
| aten.mm | 16  | 6   | 16  | 1     |
...

Test Plan: see the screenshot attached to the PR ({F1975907876}).

Differential Revision: D70825664

Pull Request resolved: https://github.com/pytorch/pytorch/pull/148800
Approved by: https://github.com/henrylhtsang
2025-03-17 17:00:58 +00:00
William Wen
4caeede799 [dynamo] more better error messages [3/N] (#147494)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147494
Approved by: https://github.com/jansel, https://github.com/zou3519
2025-02-28 06:23:28 +00:00
Riley Dulin
93316cfe94 Move ir_pre_fusion.txt and ir_post_fusion.txt to TORCH_LOGS (#147248)
Fixes #147002

Moves ir_{pre, post}_fusion.txt to be controlled by TORCH_LOGS instead of TORCH_COMPILE_DEBUG.
Updated tests of these logs as well.
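
A minimal sketch of enabling these dumps after this change. The artifact names are assumed to match the dumped file names (`ir_pre_fusion`, `ir_post_fusion`); check the logging registry if they differ:

```python
import os

# Assumed artifact names; set TORCH_LOGS before importing torch so the
# logging machinery picks it up.
os.environ["TORCH_LOGS"] = "ir_pre_fusion,ir_post_fusion"

import torch

@torch.compile
def f(x):
    return (x + 1).relu()

f(torch.randn(8))
```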

Pull Request resolved: https://github.com/pytorch/pytorch/pull/147248
Approved by: https://github.com/eellison
2025-02-20 00:26:17 +00:00
Michael Lazos
81eb2a78ad [Inductor] Add autotuning artifact logging (#147222)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/147222
Approved by: https://github.com/henrylhtsang, https://github.com/eellison
2025-02-19 09:22:42 +00:00
Ryan Guo
bfaf76bfc6 [dynamo] clear out traced frames at the start of test_log_traced_frames (#145640)
The test was being flaky in CI, and this patch fixes it.

Fixes #137461.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/145640
Approved by: https://github.com/williamwen42
2025-01-27 20:49:59 +00:00
Isuru Fernando
0efa843392 Dynamic shape guards in C++ (#139899)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139899
Approved by: https://github.com/anijain2305, https://github.com/albanD, https://github.com/jansel
ghstack dependencies: #143385, #143164
2025-01-22 14:58:35 +00:00
Isuru Fernando
fbaef0ac03 Add a language option for symbolic shape guards (#143164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143164
Approved by: https://github.com/ezyang
ghstack dependencies: #143385
2025-01-22 14:58:35 +00:00
Tom Ritchford
d25e6e623f Fix unused Python variables in test/[a-d]* (#134665)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665
Approved by: https://github.com/albanD
2024-12-13 22:13:12 +00:00
Shangdi Yu
8fae4397b4 Add "inductor_pre_grad_graph" logging (#142717) (#143126)
Summary:

Add new structured logging "inductor_pre_grad_graph"

This is so the inductor provenance tracking front end can load this graph from tlparse.
ghstack-source-id: 257581974
exported-using-ghexport

Test Plan:
```
buck2 run 'fbcode//mode/dev-nosan' //caffe2/test/dynamo:test_dynamo -- -r StructuredTraceTest
```

Differential Revision: D67150288

Pull Request resolved: https://github.com/pytorch/pytorch/pull/143126
Approved by: https://github.com/desertfire
2024-12-13 21:48:25 +00:00
Michael Lazos
49e4307686 [Dynamo] add debug logging for graph region expansion (#141382)
This PR adds debug logging for the region expansion algorithm.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141382
Approved by: https://github.com/williamwen42
ghstack dependencies: #141381
2024-12-11 02:22:21 +00:00
Yuanhao Ji
67ba79676f [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [7/N] (#140922)
related commits:

- #139706
- #140238
- #140247
- #140253
- #140663
- #140688
- #140922
- #140924
- #140933
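
The migration these related PRs apply is roughly the following (illustrative sketch only; `fn` and the backend choice are placeholders):

```python
import torch

def fn(x):
    return torch.sin(x) + torch.cos(x)

# Old style being replaced across these PRs:
compiled_old = torch._dynamo.optimize("inductor")(fn)

# New public entry point:
compiled_new = torch.compile(fn, backend="inductor")

x = torch.randn(4)
print(torch.allclose(compiled_old(x), compiled_new(x)))
```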

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140922
Approved by: https://github.com/williamwen42
2024-12-06 07:07:29 +00:00
eellison
f83361b274 inductor dtype propagation fixes (#141495)
- Add upcast_compute_type on creation of new tensors (loads, constants).
- Fix index_expr: right now we are inconsistent about dtype and don't always respect the dtype specified; it would be nice to fix fully, but that is not done in this PR.
- Bug fix in view dtype: we were always upcasting back to fp32 when the input was bf16/fp16, but we should only do that if the output is also bf16/fp16.
- For masked, avoid calling dtype propagation and just use the output dtype.

Turns on runtime dtype verification for opinfo tests. The separate test file is still useful because we can use it to test with codegen_upcast_to_fp32 turned off.

Follow ups:

- We could consider requiring fewer explicit upcast_compute_type calls and doing the upcast automatically. That would potentially make things easier, but be less flexible in the future; maybe it should have been done in this PR.
- Be more consistent about our index_expr dtype printing.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141495
Approved by: https://github.com/blaine-rister, https://github.com/arui-meta, https://github.com/ezyang
ghstack dependencies: #139945, #140057
2024-11-28 11:39:38 +00:00
Isuru Fernando
44186a0a4e Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-26 18:11:00 +00:00
PyTorch MergeBot
ad37afd590 Revert "Always unspecialize float in OSS (#138922)"
This reverts commit ba5253da9b.

Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/yf225 due to perf regression on torchbench ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2499277511))
2024-11-26 00:03:03 +00:00
Bob Ren
ba5253da9b Always unspecialize float in OSS (#138922)
Fixes https://github.com/pytorch/pytorch/issues/107277

Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922
Approved by: https://github.com/ezyang

Co-authored-by: Edward Z. Yang <ezyang@meta.com>
2024-11-24 01:58:13 +00:00
PyTorch MergeBot
a8c90e5140 Revert "Always unspecialize float in OSS (#138922)"
This reverts commit 6d779d0549.

Reverted https://github.com/pytorch/pytorch/pull/138922 on behalf of https://github.com/huydhn due to Sorry for reverting your change but there is some slow tests failing after this land ([comment](https://github.com/pytorch/pytorch/pull/138922#issuecomment-2495076878))
2024-11-22 23:18:36 +00:00
Bob Ren
6d779d0549 Always unspecialize float in OSS (#138922)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138922
Approved by: https://github.com/ezyang

Co-authored-by: Edward Z. Yang <ezyang@meta.com>
2024-11-22 17:54:42 +00:00
PyTorch MergeBot
f23621ec56 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit c25b201583.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Trunk is sad again after this lands, this looks like a landrace this time, so please do a rebase ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2494052978))
2024-11-22 15:43:39 +00:00
Isuru Fernando
c25b201583 Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-22 02:04:36 +00:00
PyTorch MergeBot
701e06b643 Revert "Move Sympy printers to torch/utils/_sympy/printers.py (#140597)"
This reverts commit aefcdb3c9f.

Reverted https://github.com/pytorch/pytorch/pull/140597 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I think it fails inductor/test_padding in trunk. This is a target determination miss and that failed test was not run in your PR ([comment](https://github.com/pytorch/pytorch/pull/140597#issuecomment-2489641453))
2024-11-20 22:13:57 +00:00
Isuru Fernando
aefcdb3c9f Move Sympy printers to torch/utils/_sympy/printers.py (#140597)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/140597
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2024-11-20 20:26:49 +00:00
Edward Z. Yang
e05a096c49 Ignore polyfill when reporting user backtraces in summarized form (#139850)
Fixes https://github.com/pytorch/pytorch/issues/139316

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139850
Approved by: https://github.com/bobrenjc93
2024-11-06 16:33:34 +00:00
Xuan Zhang
2980aed65b [inductor][memory] restructuring memory.py and turn on the flag (#137205)
Addressing additional comments given in PR https://github.com/pytorch/pytorch/pull/134874

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137205
Approved by: https://github.com/eellison
2024-10-25 17:19:34 +00:00
William Wen
93bbc8abcc [dynamo, 3.13] use 3.13 multiline traceback in get_instruction_source_311 (#137617)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137617
Approved by: https://github.com/jansel
2024-10-10 20:19:27 +00:00
Michael Lazos
d50d5df2fb Add warning for non static grads in optimizer variable (#137554)
Fixes https://github.com/pytorch/pytorch/issues/112548

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137554
Approved by: https://github.com/williamwen42
2024-10-10 01:23:21 +00:00
Michael Lazos
d5785d4295 [Dynamo] Handle torch function subclass/mode dispatch on generic tensor methods (#137119)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137119
Approved by: https://github.com/williamwen42, https://github.com/anijain2305
ghstack dependencies: #137114, #137115, #137116, #137117, #137120, #137227
2024-10-09 02:29:40 +00:00
PyTorch MergeBot
c88c0e6c65 Revert "[Dynamo] Handle torch function subclass/mode dispatch on generic tensor methods (#137119)"
This reverts commit d255b34c0a.

Reverted https://github.com/pytorch/pytorch/pull/137119 on behalf of https://github.com/malfet due to Need to revert to be able to revert https://github.com/pytorch/pytorch/pull/136910 ([comment](https://github.com/pytorch/pytorch/pull/137119#issuecomment-2400401262))
2024-10-08 17:09:26 +00:00
Ryan Guo
900f57216f [dynamo] Log a summary of frames Dynamo traced (#137297)
This patch adds logging for all frames Dynamo traced, during each invocation of a Dynamo-optimized function.

Example:
```python
import torch

@torch.compile
def foo():
    x = torch.ones([10])
    def bar():
        y = x + x
        torch._dynamo.graph_break()
        z = y * x
        return z

    return bar(), bar

foo()
foo()
```

Running `TORCH_LOGS="dynamo" python` on the above dumps the following near the very end.
```
......
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] starting from foo /Users/ryanguo99/Documents/work/scratch/test.py:4, torchdynamo attempted to trace the following frames: [
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486]   * foo /Users/ryanguo99/Documents/work/scratch/test.py:4
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486]   * bar /Users/ryanguo99/Documents/work/scratch/test.py:7
I1003 12:18:31.058000 177 torch/_dynamo/eval_frame.py:486] ]
I1003 12:18:31.064000 177 torch/_dynamo/eval_frame.py:486] starting from foo /Users/ryanguo99/Documents/work/scratch/test.py:4, torchdynamo attempted to trace the following frames: []
......
```

Fixes #118262.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137297
Approved by: https://github.com/williamwen42
2024-10-07 19:44:41 +00:00
Michael Lazos
d255b34c0a [Dynamo] Handle torch function subclass/mode dispatch on generic tensor methods (#137119)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/137119
Approved by: https://github.com/williamwen42
ghstack dependencies: #137114, #137115, #137116, #137117, #137120, #137227
2024-10-07 18:55:26 +00:00
Edward Z. Yang
9dbc6bacff Propagate detailed location information of shape guards to guards/recompiles output (#136917)
To see the payoff, look at test/dynamo/test_logging.py

The general idea is to refactor produce_guards into produce_guards_verbose which also returns verbose code parts, which have our annotations.

The rest of the logic is plumbing around SLocs to the places they need to be so we can print them. Guards are easy; value ranges and duck sizing take more care.
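
To see the annotated guard and recompile output in practice, the relevant artifacts can be enabled like so (a small sketch; the toy function is illustrative):

```python
import torch

# Equivalent to running with TORCH_LOGS="guards,recompiles".
torch._logging.set_logs(guards=True, recompiles=True)

@torch.compile
def f(x):
    return x * 2

f(torch.randn(4))
f(torch.randn(5))  # size change -> guard failure -> recompile log with source locations
```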

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/136917
Approved by: https://github.com/anijain2305
2024-09-30 00:43:12 +00:00
Bob Ren
5314ae2660 Don't use exception chaining for BackendCompilerFailed (#135545)
Commandeered from https://github.com/pytorch/pytorch/pull/135496 as I'm now helping @ezyang ship dynamic float arguments in PT2.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/135545
Approved by: https://github.com/ezyang
2024-09-11 17:49:18 +00:00
Shunting Zhang
1e92d7b688 [inductor] move loop ordering after fusion (#126254)
Restarts the work from PR https://github.com/pytorch/pytorch/pull/100331 in this new PR, since the old one is hard to rebase. Some code is copied from the previous PR, and the main idea is the same.

Previously we saw a relatively large compilation-time increase because too many loop orders were considered. This PR continues the work by pruning and only considering loop orders that we know for sure are relevant (i.e., doing it on demand).

Some manually created cases where loop ordering matters are added as unit tests; the PR makes sure inductor does not miss fusion opportunities for them.

This PR should solve the unable-to-fuse problem in https://github.com/pytorch/pytorch/issues/130015

Right now there is still a significant increase in compilation time, so the feature is disabled by default; once the compilation-time issue is resolved, it will be enabled by default.
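
As a hedged sketch, opting in presumably means flipping the corresponding inductor config flag; the name used below (`loop_ordering_after_fusion`) is assumed and should be verified against `torch/_inductor/config.py`:

```python
import torch
import torch._inductor.config as inductor_config

# Assumed flag name; the feature is off by default per the commit message.
inductor_config.loop_ordering_after_fusion = True

@torch.compile
def f(a, b):
    # A layout mismatch that can benefit from reordering loops at fusion time.
    return (a.transpose(0, 1) + b).sum()

f(torch.randn(64, 128), torch.randn(128, 64))
```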

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126254
Approved by: https://github.com/jansel
2024-08-29 21:50:07 +00:00
Xu Han
d503217ea4 [inductor] calibration inductor windows uts (15/N) (#134586)
Fix the `test_logs_out` UT on Windows, making all UTs in `test/dynamo/test_logging.py` pass on Windows.

Changes:
1. Close the `NamedTemporaryFile` to release its file handle and avoid a PermissionError.
2. Create the temp file with `delete=False` so it is not auto-deleted (the auto-delete is what triggers the PermissionError).
3. Open the log file as UTF-8 to align with Linux.
4. Handle the process-wrapping difference on Windows.
5. Delete the tmp file manually (see the sketch after this list).
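
A small, self-contained illustration of that temp-file pattern (independent of the test itself):

```python
import os
import tempfile

# On Windows the file cannot be reopened while the original handle is held,
# so create it with delete=False, close it, reopen with an explicit encoding,
# and remove it manually afterwards.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".log", delete=False)
try:
    tmp.write("hello\n")
    tmp.close()  # release the handle before reopening
    with open(tmp.name, encoding="utf-8") as f:
        print(f.read())
finally:
    os.remove(tmp.name)  # delete=False means we clean up ourselves
```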

Pull Request resolved: https://github.com/pytorch/pytorch/pull/134586
Approved by: https://github.com/jansel
2024-08-29 16:18:40 +00:00
IvanKobzarev
8ae4f82243 [aotd] Support HOP effects in backward (#132638)
Support for effectful operations in backward:

1/ AOTD collects metadata from the forward fn only, so effectful ops may be used in backward that were not used in forward => allow token discovery while tracing the joint function.

FunctionalTensorMode holds _tokens; in the joint function, after tracing forward, we memoize _tokens as `_tokens_forward_output`.

2/ Tokens are added as primals inputs (forward) in EffectTokensWrapper.
Tokens that will be used in backward are among the partitioner's saved values; we do not control which positions they occupy in the forward outputs.

3/ If new tokens are discovered in backward after tracing the joint fn, they are manually appended to the end of the primals in the resulting graph (_aot_autograd/utils.py).

4/ All effectful ops in backward are marked with the 'must_be_in_backward' partitioner tag, to prevent the partitioner from placing them in forward.

For that, functional_tensor_mode gains a new optional state `self._effects_partitioner_tag` for effectful ops, set after tracing forward.

There are additional changes in the partitioner to improve the functionality of 'must_be_in_backward'.

5/ Unlifting tokens now runs for both forward and backward.
- Since tokens saved for backward are placed in non-static positions, we identify the input and output tokens to erase via the inputs and outputs of the `with_effects` operation.
- In forward we can have input tokens, discovered in backward, that are not used in with_effects ops in forward but are saved for backward; we identify them by their position in the forward inputs.

6/ Add aot debug logging for graphs before unlifting and before adding the additional primals for backward tokens.

Tests:
```
python test/higher_order_ops/test_with_effects.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132638
Approved by: https://github.com/bdhirsh
2024-08-23 15:30:58 +00:00
Nicolas Macchioni
5cb05a82b4 [BC breaking] move benchmarking + prefer inductor path (#132827)
Move benchmarking out of `torch._inductor.runtime.runtime_utils` and into `torch._inductor.runtime.benchmarking`, and prefer this path over directly accessing Triton's benchmarking.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132827
Approved by: https://github.com/eellison
2024-08-08 00:47:45 +00:00
Michael Lazos
a8f0979962 Add cudagraph static inputs logging (#132726)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132726
Approved by: https://github.com/anijain2305
2024-08-06 12:01:20 +00:00
Nick Westlake
053e5080f6 Enable exception chaining in call_user_compiler (#131186)
Enable exception chaining of BackendCompilerFailed exception in call_user_compiler. This prevents the original exception and traceback, which is often the most useful for debugging, from being discarded.
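
The underlying Python mechanism is explicit chaining with `raise ... from ...`, which attaches the original exception as `__cause__` and keeps its traceback; a minimal illustration (the class and function names below are stand-ins, not the real dynamo symbols):

```python
class BackendCompilerFailed(RuntimeError):
    """Stand-in for the real dynamo exception type."""

def backend_compile():
    raise RuntimeError("shape error in scatter op")

def call_user_compiler():
    try:
        backend_compile()
    except Exception as e:
        # "from e" preserves the original exception and traceback as __cause__,
        # producing "The above exception was the direct cause of ..." output.
        raise BackendCompilerFailed("inductor backend failed") from e

call_user_compiler()
```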

Example output without the patch
> Traceback (most recent call last):
> [Traceback from test_slice_scatter_issue122291 to raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(]
> [Traceback from call_user_compiler to _inplace_generalized_scatter raise RuntimeError]
>  torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
>  RuntimeError: shape error in scatter op, can not broadcast torch.Size([16, 2]) to torch.Size([16, 6])
> Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

Example output with the patch
> Traceback (most recent call last):
> [Traceback from _inplace_generalized_scatter to raise error_type(message_evaluated)]
> RuntimeError: expand: attempting to expand a dimension of length 2!
> The above exception was the direct cause of the following exception:
> Traceback (most recent call last):
> [Traceback from call_user_compiler to _inplace_generalized_scatter raise RuntimeError]
> RuntimeError: shape error in scatter op, can not broadcast torch.Size([16, 2]) to torch.Size([16, 6])
> The above exception was the direct cause of the following exception:
> Traceback (most recent call last):
> [Traceback from test_slice_scatter_issue122291 to raise BackendCompilerFailed(self.compiler_fn, e) with e]
> RuntimeError: shape error in scatter op, can not broadcast torch.Size([16, 2]) to torch.Size([16, 6])
> Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131186
Approved by: https://github.com/jansel
2024-08-02 14:07:06 +00:00
Oguz Ulgen
920f0426ae Add None return type to init -- tests rest (#132376)
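The change itself is mechanical: give `__init__` an explicit `None` return annotation, e.g. (illustrative class, not one from the tests):

```python
class Config:
    # Before: def __init__(self, name: str):
    def __init__(self, name: str) -> None:
        self.name = name
```
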
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376
Approved by: https://github.com/jamesjwu
ghstack dependencies: #132335, #132351, #132352
2024-08-01 15:44:51 +00:00
Xuehai Pan
918ece4f4d [BE][Easy][11/19] enforce style for empty lines in import segments in test/dy*/ (#129762)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129762
Approved by: https://github.com/anijain2305
2024-07-27 17:43:53 +00:00
Edward Z. Yang
6f54e961ea Add trace_shape_events artifact tracing for ShapeEnv events (#130473)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130473
Approved by: https://github.com/lezcano
2024-07-12 13:50:25 +00:00
Pian Pawakapan
1b3b4c2fb9 [runtime asserts] deduplicate runtime asserts & CSE (#128599) (#130380)
original PR: https://github.com/pytorch/pytorch/pull/128599 (re-created after revert + poisoned diff train)

Summary:
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Test Plan:
contbuild & OSS CI, see 940e4477ab

Original Phabricator Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Differential Revision: D59543603

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130380
Approved by: https://github.com/izaitsevfb
2024-07-10 19:23:37 +00:00
PyTorch MergeBot
9c9744c3ac Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 940e4477ab.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/izaitsevfb due to breaking internal APS tests, see D59498864 ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2218724762))
2024-07-09 21:03:49 +00:00
Edward Z. Yang
e836ee1955 Enhancements to recompiles logs (#130043)
----

- We now record on CacheEntry the compile id that populated it, so we can say why a specific frame was rejected.
- Add a structured log for recompiles under the artifact name "recompile_reasons". As it stands, it's not terribly structured, but this was the easiest thing to start with.
- Slightly reformat multi-reason printing; since we only report one guard failure, it seems better to have it as a single line.

Example output:

```
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles] Recompiling function f in /data/users/ezyang/a/pytorch/b.py:3
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles]     triggered by the following guard failure(s):
V0703 10:34:13.273000 140345997743104 torch/_dynamo/guards.py:2590] [0/1] [__recompiles]     - 0/0: tensor 'L['x']' size mismatch at index 0. expected 4, actual 5
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130043
Approved by: https://github.com/anijain2305
2024-07-09 03:40:56 +00:00
Pian Pawakapan
940e4477ab [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-07 20:10:14 +00:00
PyTorch MergeBot
963f430d13 Revert "[runtime asserts] deduplicate runtime asserts & CSE (#128599)"
This reverts commit 0267b2ddcb.

Reverted https://github.com/pytorch/pytorch/pull/128599 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it seems to cause a landrace and fails inductor/test_cudagraph_trees in trunk 0267b2ddcb ([comment](https://github.com/pytorch/pytorch/pull/128599#issuecomment-2211690518))
2024-07-06 07:20:05 +00:00
Pian Pawakapan
0267b2ddcb [runtime asserts] deduplicate runtime asserts & CSE (#128599)
This PR adds deduplication and CSE for runtime asserts. Existing size computation in the graph is CSE'd along with added runtime asserts, and redundant asserts are removed. Shape calls on intermediate tensors are also turned into compute on input sizes if possible, allowing intermediate tensors to be freed earlier. For example:
```
z = torch.cat([x, x], dim=0)  # 2*s0
w = z.repeat(y.shape[0])  # 2*s0*s1
_w = w.shape[0]
# something with _w ...

# turns into ->
s0 = x.shape[0]
s1 = y.shape[0]
_w0 = 2 * s0
_w = _w0 * s1
```

Additionally, constrain_range calls are deduplicated. Single-symbol bound checks for unbacked symbols (e.g. u0 >= 0, u0 <= 5) and sym_constrain_range.default calls are also removed, since they accumulate range info in the ShapeEnv, and are replaced with two _assert_scalar.default calls that check the min/max bounds. For example:
```
torch.sym_constrain_range_for_size(n, min=2, max=16)
torch.sym_constrain_range(n, min=4, max=20)
torch._check(n >= 0)
torch._check(n >= 3)
torch._check(n <= 14)

# turns into
torch.sym_constrain_range_for_size(n)
torch._check(n >= 4)
torch._check(n <= 14)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128599
Approved by: https://github.com/ezyang
2024-07-06 03:44:49 +00:00
Edward Z. Yang
b6bcd09173 Get rid of tabular and sizes, beef up verbosity of output graph (#125507)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/125507
Approved by: https://github.com/Chillee, https://github.com/jansel
ghstack dependencies: #125505
2024-05-06 13:41:58 +00:00
William Wen
55c705b602 [dynamo] add trace_bytecode logging artifact (#125360)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125360
Approved by: https://github.com/ezyang
2024-05-02 22:01:00 +00:00
Simon Fan
43a7ab2a21 [compiled autograd] introduce verbose logs, add autograd node info to graph (#124954)
- Sets the autograd node info as a fake stack trace, since we don't have a generic comment feature.
- When verbose is disabled, this still adds a context manager and flag checks; the alternative is to use macros, but those wouldn't be usable with TORCH_LOGS.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124954
Approved by: https://github.com/jansel
2024-04-27 01:10:37 +00:00
Xuehai Pan
93e249969b [BE] enable ruff rule RSE and remove useless parentheses in raise statements (#124261)
Remove useless parentheses in `raise` statements if the exception type is raised with no argument.
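
For reference, the pattern this rule (ruff's RSE102) flags, as an illustrative snippet:

```python
def check_positive(x: int) -> int:
    if x <= 0:
        # Was: raise ValueError()  -- the empty parentheses are redundant.
        raise ValueError
    return x

check_positive(3)
```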

Pull Request resolved: https://github.com/pytorch/pytorch/pull/124261
Approved by: https://github.com/albanD
2024-04-17 19:29:34 +00:00