Commit Graph

93 Commits

Author SHA1 Message Date
PyTorch MergeBot
3f1824742c Revert "Fix comparing inductor actual strides vs bw graph for activations should not throw DDE. (#166277)"
This reverts commit b2a0f90501.

Reverted https://github.com/pytorch/pytorch/pull/166277 on behalf of https://github.com/atalman due to Breaks internal executorch tests ([comment](https://github.com/pytorch/pytorch/pull/166277#issuecomment-3468696623))
2025-10-30 15:49:23 +00:00
Laith Sakka
b2a0f90501 Fix comparing inductor actual strides vs bw graph for activations should not throw DDE. (#166277)
Fix https://github.com/pytorch/pytorch/issues/163894

Pull Request resolved: https://github.com/pytorch/pytorch/pull/166277
Approved by: https://github.com/Lucaskabela
2025-10-30 00:34:05 +00:00
Scott Wolchok
331b7cc054 Fix double dispatch to Python for detach (#163671)
This fixes #71725.

Differential Revision: [D83857880](https://our.internmc.facebook.com/intern/diff/D83857880)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163671
Approved by: https://github.com/ezyang, https://github.com/albanD
2025-10-15 17:24:50 +00:00
Animesh Jain
c9b2a09530 [export] Turn on install_free_tensors flag (#164691)
The final step in removing the discrepancy between
torch.compile(fullgraph=True) and torch.export(strict=True).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164691
Approved by: https://github.com/avikchaudhuri
2025-10-14 15:33:50 +00:00
PyTorch MergeBot
fa3916f466 Revert "[export] Turn on install_free_tensors flag (#164691)"
This reverts commit 220a34118f.

Reverted https://github.com/pytorch/pytorch/pull/164691 on behalf of https://github.com/seemethere due to Breaks some internal things, both me and author agreed that revert was the best course of action ([comment](https://github.com/pytorch/pytorch/pull/164691#issuecomment-3400013759))
2025-10-14 03:58:12 +00:00
PyTorch MergeBot
267348fe7f Revert "Fix double dispatch to Python for detach (#163671)"
This reverts commit a3e3efe474.

Reverted https://github.com/pytorch/pytorch/pull/163671 on behalf of https://github.com/seemethere due to We should've reverted this when we decided to revert https://github.com/pytorch/pytorch/pull/164691 since they were actually stacked ([comment](https://github.com/pytorch/pytorch/pull/163671#issuecomment-3400009953))
2025-10-14 03:55:36 +00:00
PyTorch MergeBot
1803d40c99 Reapply "[export] Turn on install_free_tensors flag (#164691)" (#165353)
This reverts commit 9166f6120f.

Reverted https://github.com/pytorch/pytorch/pull/165353 on behalf of https://github.com/seemethere due to This is causing merge conflicts since a dependent PR wasn't reverted ([comment](https://github.com/pytorch/pytorch/pull/165353#issuecomment-3400006587))
2025-10-14 03:52:50 +00:00
Animesh Jain
9166f6120f Revert "[export] Turn on install_free_tensors flag (#164691)" (#165353)
This reverts commit 220a34118f.

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/165353
Approved by: https://github.com/seemethere
2025-10-13 23:40:11 +00:00
Scott Wolchok
a3e3efe474 Fix double dispatch to Python for detach (#163671)
This fixes #71725.

Differential Revision: [D83857880](https://our.internmc.facebook.com/intern/diff/D83857880)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163671
Approved by: https://github.com/ezyang, https://github.com/albanD
2025-10-13 16:10:17 +00:00
Animesh Jain
220a34118f [export] Turn on install_free_tensors flag (#164691)
The final step in removing the discrepancy between
torch.compile(fullgraph=True) and torch.export(strict=True).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164691
Approved by: https://github.com/avikchaudhuri
2025-10-11 04:26:09 +00:00
Lucas Kabela
f363114852 [Bugfix][Inductor][Dynamo] Fix stride incorrectness issues for stride 0 tensor (#164897)
Fixes #164814 - we update to include cases where we know symbolic expression is statically one.  There are two errors here; first in graph capture, where a tensor with size 0 yet symbolic stride would attempt to keep the symbolic stride, resulting in a mismatch.  The second is in inductor code gen, where we only checked in squeeze if size == 1, missing the case where a symbolic stride equals 1.

Also fixes #164924 (@bobrenjc93  for fuzzer finding an issue affecting users : )

### Test plan:
```
python test/dynamo/test_aot_autograd.py AotAutogradFallbackTests
```

Results in:
```
..
----------------------------------------------------------------------
Ran 49 tests in 45.622s

OK (expected failures=1)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164897
Approved by: https://github.com/laithsakka
2025-10-10 21:26:57 +00:00
PyTorch MergeBot
34ac9b61cb Revert "[export] Turn on install_free_tensors flag (#164691)"
This reverts commit 0e9b3a772a.

Reverted https://github.com/pytorch/pytorch/pull/164691 on behalf of https://github.com/izaitsevfb due to breaks tests internally, author asked to revert, see [D84230990](https://www.internalfb.com/diff/D84230990) ([comment](https://github.com/pytorch/pytorch/pull/164691#issuecomment-3387718323))
2025-10-09 22:53:50 +00:00
Sherlock Huang
e532f62e0d Introduce joint_custom_pass callback (#164981)
```
        def joint_custom_pass(joint_gm: torch.fx.GraphModule, joint_inputs):
           # apply your pass for joint graph here

            return joint_gm

        class M(torch.nn.Module):
            def forward(self, x):
                return x.sin()

        x = torch.randn(10, requires_grad=False)
        compiled_fn = torch.compile(M(), backend="aot_eager")

        with torch._functorch.config.patch("joint_custom_pass", joint_custom_pass):
            out = compiled_fn(x)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164981
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2025-10-09 04:40:54 +00:00
Animesh Jain
0e9b3a772a [export] Turn on install_free_tensors flag (#164691)
The final step in removing the discrepancy between
torch.compile(fullgraph=True) and torch.export(strict=True).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164691
Approved by: https://github.com/avikchaudhuri
ghstack dependencies: #164721
2025-10-09 03:25:15 +00:00
PyTorch MergeBot
97463d4cf3 Revert "Fix double dispatch to Python for detach (#163671)"
This reverts commit c32118dc3e.

Reverted https://github.com/pytorch/pytorch/pull/163671 on behalf of https://github.com/izaitsevfb due to breaks export tests ([comment](https://github.com/pytorch/pytorch/pull/163671#issuecomment-3379281422))
2025-10-08 01:46:45 +00:00
Scott Wolchok
c32118dc3e Fix double dispatch to Python for detach (#163671)
This fixes #71725.

Differential Revision: [D83857880](https://our.internmc.facebook.com/intern/diff/D83857880)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/163671
Approved by: https://github.com/ezyang, https://github.com/albanD
2025-10-07 23:34:37 +00:00
Shangdi Yu
6b768e1890 Support propagating custom meta field to backward graph nodes (#164174)
# Propagate custom meta data to backward

Support propagating the user annotation tags to backward graph, by extending the `copy_fwd_metadata_to_bw_nodes` utils (recommended by @xmfan , thanks!).

Example annotation API (added in https://github.com/pytorch/pytorch/pull/163673):

```
class M(torch.nn.Module):
    def forward(self, x):
        with fx_traceback.annotate({"pp_stage": 0}):
            with fx_traceback.annotate({"fdsp_bucket": 0}):
                x = x + 1
            x = x - 2
            with fx_traceback.annotate({"cuda_stream": 2, "fsdp_bucket": 1}):
                x = x * 2
        x = x / 3
        return x
```

Assumptions (some inherited from https://github.com/pytorch/pytorch/pull/126573):

- I am trusting the seq_nr mapping introduced to aot_autograd nodes in https://github.com/pytorch/pytorch/pull/103129
- I am also trusting that the forward is single threaded, since seq_nr is thread local.  If this isn't always true, we'll need to also plumb thread_id through the same machinery which is populating seq_nr.
- **(This is changed in this PR!) I assume all backward graph nodes has "is_backward" for 'partitioner_tag', and all other nodes are forward graph nodes**.  If we don't run export before `aot_export_join_with_descriptors`, then none of the nodes has "nn_module_stack" in node meta. If we do run export first, then we don't need this change.
- I copy "custom" node meta from forward to backward graph nodes.

Question:
- Is it a good idea to copy all "custom" node meta? Or should we create a dedicated key in custom node meta to be copied? @SherlockNoMad
- Do we expect people to run export before using `aot_export_join_with_descriptors`?
- Can we assume the following for graph produced by `aot_export_join_with_descriptors`? "all backward graph nodes has "is_backward" for 'partitioner_tag', and all other nodes are forward graph nodes". Maybe this is a question for @ezyang

```
python test/functorch/test_aot_joint_with_descriptors.py -k test_preserve_
python test/export/test_export.py -k preserve_anno
python test/distributed/tensor/test_dtensor_export.py
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/164174
Approved by: https://github.com/xmfan, https://github.com/SherlockNoMad
2025-10-04 05:03:32 +00:00
Lucas Kabela
5d89634ca8 Graph break with error message (#158800)
Fixes #157452

Test with
```
python test/dynamo/test_repros.py ReproTests.test_nn_parameter_ctor_graph_breaks
```

### Release Notes

Change to nn.Parameter Constructor Behavior in Dynamo

Semantic change introduced in the nn.Parameter constructor; previously, if the constructor lacked a clean source, the system would attempt to infer arguments to construct a clone and lift this synthetic proxy in the computation graph. This approach had many potential edge cases and was difficult to reason about. The new behavior defaults to graph breaking when the nn.Parameter constructor does not have a clean source. Users are now suggested to manually move the constructor out of the graph in such cases. This change improves clarity and reduces complexity in graph construction and debugging.  Users can escape hatch to old semantics with `torch.dynamo.config.graph_break_on_nn_param_ctor=False` if this cannot be done.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158800
Approved by: https://github.com/anijain2305
2025-07-29 17:34:49 +00:00
PyTorch MergeBot
8d2a1d6e18 Revert "Graph break with error message (#158800)"
This reverts commit cae4746952.

Reverted https://github.com/pytorch/pytorch/pull/158800 on behalf of https://github.com/clee2000 due to broke some tests on main inductor/test_distributed_patterns.py::DistributedPatternTests::test_nn_param_return4 [GH job link](https://github.com/pytorch/pytorch/actions/runs/16507837934/job/46685704688) [HUD commit link](cae4746952), note to self: bad TD, but also dynamo/test_repros failed but didn't get skipped by TD so maybe a landrace, or I just blaming the wrong commit entirely.. ([comment](https://github.com/pytorch/pytorch/pull/158800#issuecomment-3115224608))
2025-07-24 22:45:58 +00:00
Lucas Kabela
cae4746952 Graph break with error message (#158800)
Fixes #157452

Test with
```
python test/dynamo/test_repros.py ReproTests.test_nn_parameter_ctor_graph_breaks
```

### Release Notes

Change to nn.Parameter Constructor Behavior in Dynamo

Semantic change introduced in the nn.Parameter constructor; previously, if the constructor lacked a clean source, the system would attempt to infer arguments to construct a clone and lift this synthetic proxy in the computation graph. This approach had many potential edge cases and was difficult to reason about. The new behavior defaults to graph breaking when the nn.Parameter constructor does not have a clean source. Users are now suggested to manually move the constructor out of the graph in such cases. This change improves clarity and reduces complexity in graph construction and debugging.  Users can escape hatch to old semantics with `torch.dynamo.config.graph_break_on_nn_param_ctor=False` if this cannot be done.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158800
Approved by: https://github.com/anijain2305
2025-07-24 21:05:17 +00:00
Edward Z. Yang
979fae761c Rename modules in AOTAutograd (#158449)
Fixes https://github.com/pytorch/pytorch/issues/158382

```
renamed:    torch/_functorch/_aot_autograd/dispatch_and_compile_graph.py -> torch/_functorch/_aot_autograd/graph_capture.py
renamed:    torch/_functorch/_aot_autograd/traced_function_transforms.py -> torch/_functorch/_aot_autograd/graph_capture_wrappers.py
renamed:    torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py -> torch/_functorch/_aot_autograd/graph_compile.py
```

Everything else is ONLY import changes. I did not rename any functions
even if we probably should have.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158449
Approved by: https://github.com/jamesjwu
2025-07-21 13:27:07 +00:00
Xuehai Pan
02715d0876 [BE][5/6] fix typos in test/ (test/dynamo/) (#157639)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157639
Approved by: https://github.com/yewentao256, https://github.com/jansel
ghstack dependencies: #157638
2025-07-06 06:34:25 +00:00
Animesh Jain
f1787ee0f7 [dynamo] Remove L scoping for recompilation messages (#148917)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148917
Approved by: https://github.com/williamwen42
2025-03-11 14:26:26 +00:00
Brian Hirsh
447a142de2 support input mutations on tangents in compile (#141131)
Fixes https://github.com/pytorch/pytorch/issues/141111. We previously supported mutations on saved activations that happened in the backward. This PR extends the support to tangents

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141131
Approved by: https://github.com/zou3519
2025-02-13 17:48:56 +00:00
Yuanhao Ji
6bbbb08458 [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [10/N] (#142451)
> This is the last one

related commits:

- #139706
- #140238
- #140247
- #140253
- #140663
- #140688
- #140922
- #140924
- #140933
- #142451

Pull Request resolved: https://github.com/pytorch/pytorch/pull/142451
Approved by: https://github.com/bdhirsh
2024-12-17 12:18:29 +00:00
Tom Ritchford
d25e6e623f Fix unused Python variables in test/[a-d]* (#134665)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/134665
Approved by: https://github.com/albanD
2024-12-13 22:13:12 +00:00
James Wu
fbbafd0320 Turn on AOTAutogradCache by default on open source (#141981)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141981
Approved by: https://github.com/bdhirsh, https://github.com/oulgen
2024-12-12 04:21:11 +00:00
Yukio Siraichi
470b775d7a Remove functorch config: _max_aliased_inputs_with_dynamic_shapes_enabled. (#141680)
This PR removes the functorch config that set an upper limit on the number of aliased
inputs with dynamic shapes. After moving them to be run at runtime in C++, the compilation
time and runtime (in true alias cases) improved, rendering the error no longer relevant.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141680
Approved by: https://github.com/bdhirsh
ghstack dependencies: #139554, #139555, #140013
2024-12-05 14:43:58 +00:00
Yukio Siraichi
12d28a5929 Move overlapping guards to C++. (#140013)
This PR moves the logic for computing the overlapping relations between input tensors that
share a storage instance to C++.

In summary, this PR:

- Moves both `tensors_definitely_do_not_overlap` and part of `compute_overlapping_tensors`
to C++
- Introduces a `check_overlapping` function that re-runs `compute_overlapping_tensors`,
checking that the result is consistent with what is expected
- Introduces the `StorageOverlapChecker` class
    - Keeps track of overlapping and non-overlapping tensors
    - Actually checks the overlapping relation (call `check_overlapping`) when all tensors
    are collected
- Introduces the `STORAGE_OVERLAPPING` relational guard
    - Has a reference to a `StorageOverlapChecker`
    - Stores the to-be-checked tensors in the checker, and triggers its check
- Introduces `install_storage_overlapping_guard` python function
    - Creates an instance of `StorageOverlapChecker`
    - Creates 2 instances of the `STORAGE_OVERLAPPING` guard (for overlapping and
    non-overlapping tensors), referencing the same `StorageOverlapChecker` instance

**Why is `StorageOverlapChecker` needed?**

The way `GuardManager` is implemented, we have no control over the order in which the
check methods are called, i.e. no control over the order the tensors are collected. So, we
can't easily split them in "overlapping" and non-overlapping kinds.

Instead, we create 2 instances of `STORAGE_OVERLAPPING` guard, each of which helps
collecting the tensors for one of the kinds mentioned above. They are then used in a
single `StorageOverlapChecker` instance.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140013
Approved by: https://github.com/bdhirsh
ghstack dependencies: #139554, #139555
2024-12-05 14:43:58 +00:00
Yukio Siraichi
3a1ded5caa Add tensor overlapping guards. (#139555)
Fix: #118214

This PR replaces the guards introduced by running `_tensors_definitely_do_not_overlap` at
compile-time by a single `___check_overlapping` guard. When evaluated, this function calls
the original `_tensors_definitely_do_not_overlap` so as to check whether the current state
of the inputs are consistent, i.e. tensors that should overlap do overlap, and those that
shouldn't don't.

In summary, the changes are:

- Introduce `StorageOverlap` derived class from `GuardEnvExpr`
- Plumb `AOTConfig` to the `compute_overlapping_inputs` function, so as to have access to
AOTAutograd input sources
- Suppress the guards generated by `_tensors_definitely_do_not_overlap` function at runtime
- Issue a `StorageOverlap` AOTAutograd guard, specifying the sources that should and
shouldn't overlap

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139555
Approved by: https://github.com/bdhirsh
ghstack dependencies: #139554
2024-12-05 14:43:58 +00:00
Boyuan Feng
3ef031909f [Donated Buffer] support metadata mutation ops (#141308)
### Background:

`set(x,y)` changes the untyped storage of x to be the same as y.

```python
import torch
from torch._subclasses.fake_tensor import FakeTensorMode

x1 = torch.ones(2,3)
y1 = torch.ones(2,3)
z1 = torch.ops.aten.set_.source_Tensor(x1, y1)

fake_tensor_mode = FakeTensorMode()
x2 = fake_tensor_mode.from_tensor(torch.ones(2,3))
y2 = fake_tensor_mode.from_tensor(torch.ones(2,3))
z2 = torch.ops.aten.set_.source_Tensor(x2, y2)

print(f"x1: {x1.untyped_storage()._cdata}, y1: {y1.untyped_storage()._cdata}, z1: {z1.untyped_storage()._cdata}")
print(f"x2: {x2.untyped_storage()._cdata}, y2: {y2.untyped_storage()._cdata}, z2: {z2.untyped_storage()._cdata}")
# x1: 99973024, y1: 99973024, z1: 99973024
# x2: 112107232, y2: 112107232, z2: 112107232
```

### Error before this diff

Consider this example:
```python
import torch

def fn(x):
    p = torch.nn.Parameter(x + 123)
    return p, p.sin()

opt = torch.compile(fn, fullgraph=True)
x = torch.ones(16, device="cuda", requires_grad=True)

p, r = opt(x)
r.sum().backward()
```

When running with `TORCH_LOGS=aot`, we have `set_` in the graph.
```
def forward(self, primals_1: "f32[16][1]cuda:0", primals_2: "f32[16][1]cuda:0"):
   # File: /home/boyuan/playground/inductor/donated_buffer.py:4 in fn, code: p = torch.nn.Parameter(x + 123)
  add: "f32[16][1]cuda:0" = torch.ops.aten.add.Tensor(primals_1, 123);  primals_1 = None

   # File: /home/boyuan/playground/inductor/donated_buffer.py:5 in fn, code: return p, p.sin()
  sin: "f32[16][1]cuda:0" = torch.ops.aten.sin.default(add)

  # No stacktrace found for following nodes
  set_: "f32[16][1]cuda:0" = torch.ops.aten.set_.source_Tensor(primals_2, add);  primals_2 = set_ = None
  return (sin, add)
```

`set_: "f32[16][1]cuda:0" = torch.ops.aten.set_.source_Tensor(primals_2, add)` should change the storage of `primals_2` to be the same as `add`. However, this is not true before this diff. We found different untyped_storage() for meta['val'] of `set_`, `add`, and `primals_2`.

This also leads to an error with donated buffer (#130580), which checks alias by untyped_storage. Since `add` and `primals_2` have different untyped_storage (which is wrong), add is wrongly marked as donated buffer.

### Root Cause

During tracing, we have args, kwargs, out, and proxy_args, proxy_kwargs, proxy_out.

We use args and kwargs to compute `out = func(*args, **kwargs)` ([Here](https://github.com/pytorch/pytorch/blob/main/torch/fx/experimental/proxy_tensor.py#L912)). Later, we set out to its proxy, essentially calling `proxy_out.node.meta["val"] = out.detach()`.

Due to the detach, the storage change happens on args but not on proxy_args.node.meta["val"] when func is torch.ops.aten.set_. I repro'ed this behavior of detach in eager code.

```python
import torch

x = torch.ones(2,3)
x_detach = x.detach()
y = torch.ones(2,3)
z = torch.ops.aten.set_.source_Tensor(x_detach, y)

print(f"x: {x.untyped_storage()._cdata}, x_detach: {x_detach.untyped_storage()._cdata}, y: {y.untyped_storage()._cdata}, z: {z.untyped_storage()._cdata}")
# x: 97023632, x_detach: 97026480, y: 97026480, z: 97026480
```

To fix the issue, this PR manually resets node.meta["val"] if the storage has changed.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/141308
Approved by: https://github.com/bdhirsh
2024-11-26 17:06:46 +00:00
zeshengzong
cb71bcc542 Replace clone.detach with detach.clone (#140264)
Fixes #64532

As state in issue, replace `clone.detach` by `detach.clone`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/140264
Approved by: https://github.com/soulitzer
2024-11-13 07:01:02 +00:00
Yuanhao Ji
7f1e248b50 [Dynamo] Replace torch._dynamo.optimize() with torch.compile() [1/N] (#139706)
``torch._dynamo.optimize()`` is wrapped for convenience by ``torch.compile()``.

related commits:

- #139706
- #140238
- #140247
- #140253

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139706
Approved by: https://github.com/jansel, https://github.com/ezyang
2024-11-11 20:04:08 +00:00
Boyuan Feng
87059d4547 [AOTAutograd] Handle edge cases for donated buffer & enable in oss (#139669)
This PR enables donated buffer in OSS and handles two edge cases:

1. While donated buffer relies on storage to check alias, sparse tensor subclasses does not provide access to storage. So we skip sparse tensor subclasses for donated buffer.
2. Handles missing "val" from n.meta. This is observed from `inductor/test_fused_attention.py::SDPAPatternRewriterCpuTests::test_sdpa_rewriter_11_cpu`,
`functorch/test_aotdispatch.py::TestAOTAutograd::test_input_mutation_simple_with_none_and_nontensor`, and
`inductor/test_compiled_autograd.py::TestCompiledAutograd::test_trace_run_with_rng_state`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/139669
Approved by: https://github.com/bdhirsh
2024-11-05 18:38:20 +00:00
PyTorch MergeBot
796c3c3415 Revert "Disallow FakeTensor.data_ptr access in eager mode (#137221)"
This reverts commit 7e13e7dd7e.

Reverted https://github.com/pytorch/pytorch/pull/137221 on behalf of https://github.com/jovianjaison due to failing internal tests ([comment](https://github.com/pytorch/pytorch/pull/137221#issuecomment-2397957081))
2024-10-07 21:46:13 +00:00
rzou
7e13e7dd7e Disallow FakeTensor.data_ptr access in eager mode (#137221)
Previously we raised a deprecation warning (beginning PyTorch 2.4). Now
that we are on 2.6, we're completing the deprecation and disallowing
this behavior.

Test Plan:
- tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/137221
Approved by: https://github.com/albanD, https://github.com/eellison
2024-10-03 23:47:55 +00:00
vasiliy
48f7bdbbe1 aot_autograd: copy metadata from fw to bw nodes (#126573)
Summary:

Uses the `seq_nr` field (introduced to aot_autograd nodes in
https://github.com/pytorch/pytorch/pull/103129) to map the aot_autograd
fx bw nodes to the corresponding fw nodes, and copy the metadata over.

I am trusting the `seq_nr` mapping in the linked PR here. I did
some validation with a toy LLaMa 3 8b training run and the mapping seemed
correct.

I am also trusting that the forward is single threaded, since `seq_nr` is thread local.  If this isn't always true, we'll need to also plumb `thread_id` through the same machinery which is populating `seq_nr`.

I'd like to use this data in a future PR to make inductor kernels easily
attributable to the nn.Module path in modeling land, to make it easier
to do performance debugging.

Test Plan:

```
// 1. unit test
python test/dynamo/test_aot_autograd.py -k test_aot_sequence_nr

// 2. manual test
// run LLaMa 3 8B fw + bw with torch.compile, print out the inductor graphs
// seen in `torch/_inductor/utils.py::get_kernel_metadata`, they seemed
// right to me.
```

Reviewers:

Subscribers:

Tasks:

Tags:

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126573
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2024-08-07 21:25:09 +00:00
Oguz Ulgen
920f0426ae Add None return type to init -- tests rest (#132376)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132376
Approved by: https://github.com/jamesjwu
ghstack dependencies: #132335, #132351, #132352
2024-08-01 15:44:51 +00:00
Xuehai Pan
918ece4f4d [BE][Easy][11/19] enforce style for empty lines in import segments in test/dy*/ (#129762)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129762
Approved by: https://github.com/anijain2305
2024-07-27 17:43:53 +00:00
Boyuan Feng
40cc5c0697 [AOT Autograd] Donated Buffer (#130580)
Implements donated buffer feature and adds unit tests. Donated buffer is a saved tensor that is not aliased with forward inputs, fw_outputs (except saved tensors), and bw_outputs. We detect donated buffers during `aot_dispatch_autograd` and store donated buffers in `ViewAndMutationMetadata`, such that it can be accssed in inductor.

Fixes #129496

Pull Request resolved: https://github.com/pytorch/pytorch/pull/130580
Approved by: https://github.com/bdhirsh
2024-07-26 17:14:34 +00:00
Xuehai Pan
26f4f10ac8 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
2024-05-27 14:49:57 +00:00
PyTorch MergeBot
55c0ab2887 Revert "[5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)"
This reverts commit 7763c83af6.

Reverted https://github.com/pytorch/pytorch/pull/127126 on behalf of https://github.com/XuehaiPan due to Broken CI ([comment](https://github.com/pytorch/pytorch/pull/127126#issuecomment-2133044286))
2024-05-27 09:22:08 +00:00
Xuehai Pan
7763c83af6 [5/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort torch (#127126)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127126
Approved by: https://github.com/kit1980
ghstack dependencies: #127122, #127123, #127124, #127125
2024-05-27 04:22:18 +00:00
Xuehai Pan
a28bfb5ed5 [4/N][Easy] fix typo for usort config in pyproject.toml (kown -> known): sort functorch (#127125)
The `usort` config in `pyproject.toml` has no effect due to a typo. Fixing the typo make `usort` do more and generate the changes in the PR. Except `pyproject.toml`, all changes are generated by `lintrunner -a --take UFMT --all-files`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127125
Approved by: https://github.com/Skylion007
ghstack dependencies: #127122, #127123, #127124
2024-05-25 22:45:38 +00:00
Animesh Jain
1346ebf12e [dynamo][guards] Delay DUPLICATE_INPUT guard because of incorrect ordering (#123605)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123605
Approved by: https://github.com/jansel
ghstack dependencies: #123606
2024-04-10 07:30:02 +00:00
rzou
fd60752786 Turn _allow_unsafe_data_ptr_access into a config option (#123291)
We're not planning on having this flag around for very long (see
deprecation in next PR), so it's better as a config option.

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123291
Approved by: https://github.com/eellison
ghstack dependencies: #123261, #123282
2024-04-04 20:35:24 +00:00
rzou
c81c9ba472 Disallow {FakeTensor,FunctionalTensor}.data_ptr (#122514)
This PR:
- disallows FakeTensor.data_ptr when it is called inside PT2 or fx tracing.
- disallows FunctionalTensor.data_ptr (python FunctionalTensor is only used in
  PT2)

The motivation behind this is that the leading cause of segfaults when
using custom ops with PT2 is calling .data_ptr on FunctionalTensor or
FakeTensor.

This change is BC-breaking. If your code broke as a result of this, it's
because there was a bug in it (these .data_ptr should never be
accessed!). You can either fix the bug (recommended) or get the previous
behavior back with:
```
from torch._subclasses.fake_tensor import FakeTensor
from torch._subclasses.functional_tensor import FunctionalTensor

data_ptr = 0 if isinstance(tensor, (FakeTensor, FunctionalTensor)) else tensor.data_ptr()
```

Test Plan:
- existing tests

Differential Revision: [D55366199](https://our.internmc.facebook.com/intern/diff/D55366199)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122514
Approved by: https://github.com/ezyang, https://github.com/albanD, https://github.com/yifuwang, https://github.com/kurtamohler
2024-03-26 23:55:42 +00:00
rzou
e6cf3e90a5 [AOTAutograd / Functionalization] Fix incorrect expand_inverse (#122114)
This is a rebase of https://github.com/pytorch/pytorch/pull/114538,
originally submited by @jon-chuang.

Fixes #114302

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122114
Approved by: https://github.com/bdhirsh
2024-03-18 22:52:57 +00:00
soulitzer
55483fc2c9 Min-cut partitioner always saves tensors that are returned as-is in backward (#114970)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114970
Approved by: https://github.com/Chillee
2024-02-13 00:04:41 +00:00
Sergii Dymchenko
bd9db6a9c7 Update to TorchFix 0.4.0 (#119424)
`torch.library.Library` updated to `torch.library._scoped_library` in files with many tests where it seems obvious to do, otherwise `noqa: TOR901` added - see https://github.com/pytorch/pytorch/pull/118318 for more context.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119424
Approved by: https://github.com/zou3519
2024-02-12 23:30:12 +00:00