Commit Graph

821 Commits

Author SHA1 Message Date
Yanbo Liang
ab5385fc50 [Dynamo][6.3/N] Further cleanup torch.py (#114669)
A follow-up PR to clean up what I found during the refactor of torch.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114669
Approved by: https://github.com/jansel
2023-12-01 04:08:29 +00:00
Yanbo Liang
7f40640342 [Dynamo] Support torch.amp.autocast as decorator (#114845)
Fixes #114818
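
For illustration, a minimal sketch of the now-supported decorator usage (toy function; the device/dtype choices are arbitrary):

```python
import torch

# Using torch.amp.autocast as a decorator (rather than a context manager)
# previously failed to trace under torch.compile.
@torch.amp.autocast(device_type="cpu", dtype=torch.bfloat16)
def mm(a, b):
    return torch.mm(a, b)

cfn = torch.compile(mm, backend="eager", fullgraph=True)
out = cfn(torch.randn(4, 4), torch.randn(4, 4))
```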

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114845
Approved by: https://github.com/jansel
2023-11-30 23:54:57 +00:00
vfdev
f93ea14309 [dynamo] Added support for math ops on ints with dynamic shapes (#114507)
Fixes #114218

```
import math
import torch

def func(x, a):
    b = math.floor(a + 0.5)
    b = math.radians(a) + b
    y = x + b
    return y

cfunc = torch.compile(func, dynamic=True, fullgraph=True, backend="eager")
x = torch.tensor([0, 1, 2, 3], dtype=torch.float32)
a = 12

out = cfunc(x, a)
```

```
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG] TRACED GRAPH
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]  ===== __compiled_fn_0 =====
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]  <eval_with_key>.0 class GraphModule(torch.nn.Module):
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]     def forward(self, L_a_ : torch.SymInt, s1 : torch.SymInt, L_x_ : torch.Tensor):
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         l_a_ = L_a_
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         l_x_ = L_x_
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         # File: check_math_ops.py:7, code: b = math.floor(a + 0.5)
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         add = l_a_ + 0.5
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         floor = math_floor(add);  add = None
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         # File: /pytorch/torch/_dynamo/polyfill.py:28, code: return math.pi / 180.0 * x
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         mul = 0.017453292519943295 * l_a_;  l_a_ = None
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         # File: check_math_ops.py:9, code: b = math.radians(a) + b
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         add_1 = mul + floor;  mul = floor = None
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         # File: check_math_ops.py:13, code: y = x + b
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         y = l_x_ + add_1;  l_x_ = add_1 = None
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]         return (y,)
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
[2023-11-29 18:10:08,385] [0/0] torch._dynamo.output_graph.__graph_code: [DEBUG]
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114507
Approved by: https://github.com/lezcano
2023-11-30 14:11:57 +00:00
rzou
ce4bff4013 [dynamo] fix functools.wraps on nested functions (#114279)
Updated version of #108885 addressing the review. In this PR:
- We add a VT.can_reconstruct utility that checks if VT.reconstruct()
  does something.
- If functools.wraps(fn) is passed a `fn` that either has a source or
  has .can_reconstruct() == True, then we stash the source (or the VT)
- Later on, we use the source (or VT.reconstruct) to actually
  reconstruct the object in codegen.
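
For illustration, a minimal sketch of the kind of code this enables (toy functions, eager backend; not taken from the PR's tests):

```python
import functools
import torch

def outer(x):
    def inner(y):
        return y.sin()

    # Wrapping a nested (sourceless) function used to graph break.
    @functools.wraps(inner)
    def wrapper(y):
        return inner(y)

    return wrapper(x)

cfn = torch.compile(outer, backend="eager", fullgraph=True)
cfn(torch.randn(3))
```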

Test Plan:
- New tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114279
Approved by: https://github.com/voznesenskym
2023-11-28 22:34:59 +00:00
voznesenskym
ddf1cb7870 AOTAutograd: handle set_(), detect metadata mutations that cancel out (#111554)
This should be enough to get @voznesenskym's FSDP branch to plumb `set_()` through AOTAutograd properly and have everything properly no-op out. Main changes are:

(1) graph break on `aten::set_.source_Tensor_storage_offset` (we could support it but it isn't needed, seems safer to graph break)

(2) Functionalization: add a "proper" functionalization kernel for `aten::set_.source_Tensor`. The previous one we had was codegen'd and it was wrong (it would just clone() and call set_(), which does not do the right thing). I also manually mark on the `FunctionalTensorWrapper` when a given tensor has been mutated by a `set_()` call.

(3) AOTAutograd: I added a new field, `InputAliasInfo.mutates_storage_metadata`, so we can distinguish between "regular" metadata mutations, and metadata mutations due to `set_()` calls. This is mainly because at runtime, one requires calling `as_strided_()` to fix up metadata, while the other requires calling `set_()`.

(4) Made AOTAutograd's detection for metadata mutations / set_() mutations smarter and detect no-ops (if the storage and metadata are all the same).

I also killed `was_updated()` and `was_metadata_updated()`, and replaced them with the (existing) `has_data_mutation()` and a (new) `has_metadata_mutation()`, which can more accurately distinguish between data mutations vs. `set_()` calls vs. metadata mutations.

**This PR is still silently incorrect in one case though**, which I'd like to discuss more. In particular, this example:
```
def f(x):
    x_view = x.view(-1)
    x.set_(torch.ones(2))
    x_view.mul_(2)
    return
```

If you have an input that experiences both a data-mutation **and** a `x_old.set_(x_new)` call, there are two cases:

(a) the data mutation happened on the storage of `x_new`. This case should be handled automatically: if x_new is a graph intermediate then we will functionalize the mutation. If x_new is a different graph input, then we will perform the usual `copy_()` on that other graph input

(b) the data mutation happened on the storage of `x_old`. This is more of a pain to handle, and doesn't currently work. At runtime, the right thing to do is probably something like:
```
def functionalized_f(x):
    x_view = x.view(-1)
    # set_() desugars into a no-op; later usages of x will use x_output
    x_output = torch.ones(2)
    # functionalize the mutation on x_view
    x_view_updated = x.mul(2)
    x_updated = x_view_updated.view(x.shape)
    # x experienced TWO TYPES of mutations; a data mutation and a metadata mutation
    # We need to return both updated tensors in our graph
    return x_updated, x_output
def runtime_wrapper(x):
    x_data_mutation_result, x_set_mutation_result = compiled_graph(x)
    # First, perform the data mutation on x's old storage
    x.copy_(x_data_mutation_result)
    # Then, swap out the storage of x with the new storage
    x.set_(x_set_mutation_result)
```

There are two things that make this difficult to do though:

(1) Functionalization: the functionalization rule for `set_()` will fully throw away the old `FunctionalStorageImpl` on the graph input. So if there are any mutations to that `FunctionalStorageImpl` later on in the graph, the current graph input won't know about it. Maybe we can have a given `FunctionalTensorWrapper` remember all previous storages that it had, and track mutations on all of them - although this feels pretty complicated.

(2) AOTAutograd now needs to know that we might have *two* graph outputs that correspond to a single "mutated input", which is annoying.

It's worth pointing out that this issue is probably extremely unlikely for anyone to run into - can we just detect it and error? This feels slightly easier than solving it, although not significantly easier. We would still need `FunctionalTensorWrapper` to keep track of mutations on any of its "previous" storages, so it can report this info back to AOTAutograd so we can raise an error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111554
Approved by: https://github.com/ezyang
ghstack dependencies: #113926
2023-11-28 19:33:35 +00:00
Bin Bao
0bef97fac3 [dynamo] Support itertools.groupby (#114192)
Summary: for https://github.com/pytorch/pytorch/issues/108698
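
For illustration, a toy sketch of code that can now be traced (the function and inputs are made up):

```python
import itertools
import torch

@torch.compile(backend="eager", fullgraph=True)
def f(x, labels):
    out = x
    # groupby over a plain Python iterable is traced instead of graph breaking
    for key, group in itertools.groupby(labels):
        out = out + key * len(list(group))
    return out

f(torch.randn(3), [0, 0, 1, 1, 2])
```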

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114192
Approved by: https://github.com/jansel
2023-11-28 14:58:59 +00:00
lezcano
79ee99e6d2 [easy] Dispatch torch.from_numpy to torch.as_tensor (#114609)
...rather than detaching the tensor

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114609
Approved by: https://github.com/larryliu0820, https://github.com/voznesenskym
ghstack dependencies: #114608
2023-11-28 12:04:37 +00:00
lezcano
0bb2600c28 Allow to differentiate through NumPy code (#114608)
With this PR it is possible to differentiate through NumPy code modulo
the usual caveats that apply to differentiation:
- That there are no graphbreaks
- That the decomposition in `torch._numpy` is differentiable

@ev-br and I were somewhat careful to achieve the second point, but
it is not tested through and through, so YMMV
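
For illustration, a toy example of the sort of program this enables, assuming no graph breaks (the function is made up):

```python
import numpy as np
import torch

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    # Inside the compiled region the np.* calls are traced as torch ops
    # (torch._numpy), so gradients can flow through them.
    y = np.sin(x.numpy()) * 2
    return torch.from_numpy(y).sum()

x = torch.randn(3, requires_grad=True)
f(x).backward()
print(x.grad)  # 2 * cos(x)
```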

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114608
Approved by: https://github.com/voznesenskym
2023-11-28 12:04:37 +00:00
Angela Yi
dffa5f3f23 [dynamo][reland] ExecutorchCallDelegateHigherOrderVariable - add sanity check that input and output tensors are disjoint (#114167)
Summary: Reland of https://github.com/pytorch/pytorch/pull/111960, Fixes https://github.com/pytorch/pytorch/issues/111917

Original PR broke some internal tests which the current diff has resolved.

Test Plan: CI

Differential Revision: D51473196

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114167
Approved by: https://github.com/jon-chuang, https://github.com/zou3519
2023-11-28 00:27:23 +00:00
ydwu4
2ac0b61e60 [HigherOrderOp] dedup repeated get_attr placeholders in branches of cond (#112874)
We further de-duplicate the duplicated get_attr nodes.

For code below:
```python
def test_cond_free_variable_in_both_branches(self):
    # Test excerpt: EagerAndRecordGraphs and CompileCounterWithBackend come from
    # torch._dynamo.testing; control_flow comes from functorch.experimental.
    backend = EagerAndRecordGraphs()
    cnt = CompileCounterWithBackend(backend)

    z = torch.ones(4, 4)

    class Foo(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.register_buffer("buffer", torch.ones(6, 4))

        def forward(self, x, y):
            def true_fn(x):
                return x.sum() + self.buffer.sum() + z.sum()

            def false_fn(x):
                return x.sum() - z.sum() - self.buffer.sum()

            return control_flow.cond(y, true_fn, false_fn, [x])

    mod_for_compile = torch.compile(
        Foo(), backend=cnt, dynamic=True, fullgraph=True
    )
```

Before de-duplication, we have the following graph module:
```python
class GraphModule(torch.nn.Module):
    def forward(self, L_y_ : torch.Tensor, L_x_ : torch.Tensor, s0 : torch.SymInt, L_z_ : torch.Tensor):
        l_y_ = L_y_
        l_x_ = L_x_
        l_z_ = L_z_

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1243, code: return x.sum() + self.buffer.sum() + z.sum()
        l__self___buffer = self.L__self___buffer

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1246, code: return x.sum() - z.sum() - self.buffer.sum()
        l__self___buffer_1 = self.L__self___buffer

        # File: /home/yidi/local/pytorch/torch/_higher_order_ops/cond.py:118, code: return cond_op(pred, true_fn, false_fn, operands)
        cond_true_0 = self.cond_true_0
        cond_false_0 = self.cond_false_0
        cond = torch.ops.higher_order.cond(l_y_, cond_true_0, cond_false_0, [l_x_, l_z_, l__self___buffer, l__self___buffer_1]);  l_y_ = cond_true_0 = cond_false_0 = l_x_ = l_z_ = l__self___buffer = l__self___buffer_1 = None
        return (cond,)

    class GraphModule(torch.nn.Module):
        def forward(self, l_x_, l_z_, l__self___buffer_true_branch, l__self___buffer_1_false_branch):
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1243, code: return x.sum() + self.buffer.sum() + z.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l__self___buffer_true_branch.sum();  l__self___buffer_true_branch = None
            add = sum_1 + sum_2;  sum_1 = sum_2 = None
            sum_3 = l_z__1.sum();  l_z__1 = None
            add_1 = add + sum_3;  add = sum_3 = None
            return add_1

    class GraphModule(torch.nn.Module):
        def forward(self, l_x_, l_z_, l__self___buffer_true_branch, l__self___buffer_1_false_branch):
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1246, code: return x.sum() - z.sum() - self.buffer.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l_z__1.sum();  l_z__1 = None
            sub = sum_1 - sum_2;  sum_1 = sum_2 = None
            sum_3 = l__self___buffer_1_false_branch.sum();  l__self___buffer_1_false_branch = None
            sub_1 = sub - sum_3;  sub = sum_3 = None
            return sub_1
```

After de-duplication, we have the following graph module:
```python
class GraphModule(torch.nn.Module):
    def forward(self, L_x_ : torch.Tensor, L_y_ : torch.Tensor, s0 : torch.SymInt, L_z_ : torch.Tensor):
        l_x_ = L_x_
        l_y_ = L_y_
        l_z_ = L_z_

        # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1232, code: return x.sum() + self.buffer.sum() + z.sum()
        l__self___buffer = self.L__self___buffer

        # File: /home/yidi/local/pytorch/torch/_higher_order_ops/cond.py:118, code: return cond_op(pred, true_fn, false_fn, operands)
        cond_true_0 = self.cond_true_0
        cond_false_0 = self.cond_false_0
        cond = torch.ops.higher_order.cond(l_y_, cond_true_0, cond_false_0, [l__self___buffer, l_x_, l_z_]);  l_y_ = cond_true_0 = cond_false_0 = l__self___buffer = l_x_ = l_z_ = None
        return (cond,)

    class GraphModule(torch.nn.Module):
        def forward(self, l__self___buffer, l_x_, l_z_):
            l__self___buffer_1 = l__self___buffer
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1232, code: return x.sum() + self.buffer.sum() + z.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l__self___buffer_1.sum();  l__self___buffer_1 = None
            add = sum_1 + sum_2;  sum_1 = sum_2 = None
            sum_3 = l_z__1.sum();  l_z__1 = None
            add_1 = add + sum_3;  add = sum_3 = None
            return add_1

    class GraphModule(torch.nn.Module):
        def forward(self, l__self___buffer_1, l_x_, l_z_):
            l__self___buffer_2 = l__self___buffer_1
            l_x__1 = l_x_
            l_z__1 = l_z_

            # File: /home/yidi/local/pytorch/test/dynamo/test_higher_order_ops.py:1235, code: return x.sum() - z.sum() - self.buffer.sum()
            sum_1 = l_x__1.sum();  l_x__1 = None
            sum_2 = l_z__1.sum();  l_z__1 = None
            sub = sum_1 - sum_2;  sum_1 = sum_2 = None
            sum_3 = l__self___buffer_2.sum();  l__self___buffer_2 = None
            sub_1 = sub - sum_3;  sub = sum_3 = None
            return sub_1

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112874
Approved by: https://github.com/zou3519
2023-11-27 22:07:42 +00:00
voznesenskym
081c5b3adc Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926) (#114526)
Summary:

The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with ezyang and eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down on concerns of complexity in fake mode combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and whether that is an anti-pattern (it is) that we want to hack around with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (ezyang did this)

cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

imported-using-ghimport

Test Plan: Imported from OSS

Reviewed By: huydhn, Chillee

Differential Revision: D51566250

Pulled By: voznesenskym

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114526
Approved by: https://github.com/Chillee, https://github.com/huydhn
2023-11-26 23:40:32 +00:00
PyTorch MergeBot
2f3beb715c Revert "Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)"
This reverts commit 2ca1119d53.

Reverted https://github.com/pytorch/pytorch/pull/113926 on behalf of https://github.com/DanilBaibak due to Break internal build ([comment](https://github.com/pytorch/pytorch/pull/113926#issuecomment-1822713852))
2023-11-22 12:52:33 +00:00
PyTorch MergeBot
3e1abde46d Revert "AOTAutograd: handle set_(), detect metadata mutations that cancel out (#111554)"
This reverts commit a911b4db9d.

Reverted https://github.com/pytorch/pytorch/pull/111554 on behalf of https://github.com/DanilBaibak due to The lower PR in the stack #113926 breaks the internal build ([comment](https://github.com/pytorch/pytorch/pull/111554#issuecomment-1822472206))
2023-11-22 10:13:48 +00:00
Jon Chuang
172a103857 [dynamo] strict=True kwarg for zip (#114047)
Fixes https://github.com/pytorch/pytorch/issues/113894
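
For illustration, a small sketch of the now-supported call (requires Python >= 3.10, where `strict=` was added; toy inputs):

```python
import torch

@torch.compile(backend="eager", fullgraph=True)
def f(x, ys, zs):
    # mismatched lengths raise instead of silently truncating
    for y, z in zip(ys, zs, strict=True):
        x = x + y * z
    return x

f(torch.ones(3), [1.0, 2.0], [3.0, 4.0])
```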

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114047
Approved by: https://github.com/ezyang
2023-11-22 08:48:51 +00:00
ydwu4
c5ddfa79b3 [HigherOrderOp] add output tensor meta check for cond (#113900)
This PR checks the tensor metadata of the outputs of cond's branches. This helped us identify several tests whose branches return outputs with differing requires_grad. It also fixes the error messages, which were previously raised in torch.ops.higher_order.cond and are now raised in dynamo's CondHigherOrder.

Test Plan:
Existing tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113900
Approved by: https://github.com/zou3519
ghstack dependencies: #113819
2023-11-22 04:06:30 +00:00
ydwu4
9e657ce2ed [HigherOrderOp] set should_flatten_output=True for cond (#113819)
This PR adds should_flatten_output=True for cond. This effectively allows cond to support pytree outputs, with the output being flattened. Note: a single tensor output will be automatically cast to a tuple for torch.ops.higher_order.cond.

This PR also adds support for comparing BuiltinVariables (e.g. tuple); this lets dynamo inline the comparison of two tree_specs, to make sure both branches return the same tree_spec.
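
For illustration, a hypothetical sketch of a pytree-returning cond (assuming outputs of this shape are supported as described):

```python
import torch
from functorch.experimental import control_flow

def true_fn(x):
    return {"out": x.sin(), "aux": (x.cos(),)}

def false_fn(x):
    return {"out": x.cos(), "aux": (x.sin(),)}

@torch.compile(backend="eager", fullgraph=True)
def f(pred, x):
    # both branches must return the same tree_spec
    return control_flow.cond(pred, true_fn, false_fn, [x])

f(torch.tensor(True), torch.randn(3))
```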

Test Plan:
Existing tests. Will add more pytree tests and update the documentation in follow-up PRs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113819
Approved by: https://github.com/zou3519
2023-11-22 04:06:30 +00:00
Jon Chuang
f66add9b85 [dynamo] graph break on np.ndarray.tobytes (#114208)
We can't model this accurately across np and tnp (see https://github.com/pytorch/pytorch/issues/114204#issuecomment-1820269949).

So let's not even try. Just graph break.

Fixes: https://github.com/pytorch/pytorch/issues/114204

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114208
Approved by: https://github.com/lezcano
2023-11-21 18:19:37 +00:00
voznesenskym
a911b4db9d AOTAutograd: handle set_(), detect metadata mutations that cancel out (#111554)
This should be enough to get @voznesenskym's FSDP branch to plumb `set_()` through AOTAutograd properly and have everything properly no-op out. Main changes are:

(1) graph break on `aten::set_.source_Tensor_storage_offset` (we could support it but it isn't needed, seems safer to graph break)

(2) Functionalization: add a "proper" functionalization kernel for `aten::set_.source_Tensor`. The previous one we had was codegen'd and it was wrong (it would just clone() and call set_(), which does not do the right thing). I also manually mark on the `FunctionalTensorWrapper` when a given tensor has been mutated by a `set_()` call.

(3) AOTAutograd: I added a new field, `InputAliasInfo.mutates_storage_metadata`, so we can distinguish between "regular" metadata mutations, and metadata mutations due to `set_()` calls. This is mainly because at runtime, one requires calling `as_strided_()` to fix up metadata, while the other requires calling `set_()`.

(4) Made AOTAutograd's detection for metadata mutations / set_() mutations smarter and detect no-ops (if the storage and metadata are all the same).

I also killed `was_updated()` and `was_metadata_updated()`, and replaced them with the (existing) `has_data_mutation()` and a (new) `has_metadata_mutation()`, which can more accurately distinguish between data mutations vs. `set_()` calls vs. metadata mutations.

**This PR is still silently incorrect in one case though**, which I'd like to discuss more. In particular, this example:
```
def f(x):
    x_view = x.view(-1)
    x.set_(torch.ones(2))
    x_view.mul_(2)
    return
```

If you have an input that experiences both a data-mutation **and** a `x_old.set_(x_new)` call, there are two cases:

(a) the data mutation happened on the storage of `x_new`. This case should be handled automatically: if x_new is a graph intermediate then we will functionalize the mutation. If x_new is a different graph input, then we will perform the usual `copy_()` on that other graph input

(b) the data mutation happened on the storage of `x_old`. This is more of a pain to handle, and doesn't currently work. At runtime, the right thing to do is probably something like:
```
def functionalized_f(x):
    x_view = x.view(-1)
    # set_() desugars into a no-op; later usages of x will use x_output
    x_output = torch.ones(2)
    # functionalize the mutation on x_view
    x_view_updated = x.mul(2)
    x_updated = x_view_updated.view(x.shape)
    # x experienced TWO TYPES of mutations; a data mutation and a metadata mutation
    # We need to return both updated tensors in our graph
    return x_updated, x_output
def runtime_wrapper(x):
    x_data_mutation_result, x_set_mutation_result = compiled_graph(x)
    # First, perform the data mutation on x's old storage
    x.copy_(x_data_mutation_result)
    # Then, swap out the storage of x with the new storage
    x.set_(x_set_mutation_result)
```

There are two things that make this difficult to do though:

(1) Functionalization: the functionalization rule for `set_()` will fully throw away the old `FunctionalStorageImpl` on the graph input. So if there are any mutations to that `FunctionalStorageImpl` later on in the graph, the current graph input won't know about it. Maybe we can have a given `FunctionalTensorWrapper` remember all previous storages that it had, and track mutations on all of them - although this feels pretty complicated.

(2) AOTAutograd now needs to know that we might have *two* graph outputs that correspond to a single "mutated input", which is annoying.

It's worth pointing out that this issue is probably extremely unlikely for anyone to run into - can we just detect it and error? This feels slightly easier than solving it, although not significantly easier. We would still need `FunctionalTensorWrapper` to keep track of mutations on any of its "previous" storages, so it can report this info back to AOTAutograd so we can raise an error.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111554
Approved by: https://github.com/ezyang
ghstack dependencies: #113926
2023-11-21 01:52:46 +00:00
voznesenskym
2ca1119d53 Add Stateful/Stateless symbolic contexts, use fresh fake mode for dynamo backends (#113926)
The primary problem we are setting out to solve here is fake tensor freshness. Before this PR, fake tensors after dynamo represented fake tensors *at the end* of trace, so subsequent retraces like aot_autograd would start off with fake tensors in the wrong (end result) state, rather than their expected fresh state. The solution here is to start a fresh fake mode, and re-fakify the tensors. The nuance comes from ensuring that symbols are uniformly created for the symbolic sizes and strides of the tensor.

This PR is the result of *a lot* of back and forth with @ezyang and @eellison. Initially, the first pass at this was not super different from what we have in the PR - the broad strokes were the same:

1) We cache source->symbol in shape_env
2) We pass policy objects around, stored at dynamo fakification time, and reused for later fakification
3) We create a new fake mode for backends
(from https://github.com/pytorch/pytorch/pull/113605/files)

This is ugly, and has some layering violations. We detoured our decision making through a few other alternatives. Immutable/mutable fake tensor mode was the most interesting alternative, https://github.com/pytorch/pytorch/pull/113653, and was struck down on concerns of complexity in fake mode combined with it not covering all edge cases. We also detoured on what to do about tensor memoization returning potentially different tensors than requested, and whether that is an anti-pattern (it is) that we want to hack around with the symbol cache (we don't).

We went back to the drawing board here, but with a few concessions:
1) the cache for source->symbol must live outside of shape_env, for both lifecycle and layering reasons
2) A good amount of work needs to be done to pipe policy around fake_mode and meta_utils correctly, to cover all the cases (@ezyang did this)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113926
Approved by: https://github.com/ezyang, https://github.com/eellison
2023-11-20 23:06:37 +00:00
Edward Z. Yang
59ad51e10a Insert deferred runtime asserts into Dynamo FX graph (#113958)
During the course of fake tensor propagation (and, potentially, also Dynamo execution, although I do not believe it is possible to exercise this right now), we may generate deferred runtime asserts, which represent "guards" on unbacked symbols which cannot be immediately checked on entry to a code block; instead, they have to be checked at runtime. However, we currently accumulate these deferred runtime asserts into the ShapeEnv, and don't do anything with them.

This PR modifies Dynamo to automatically insert these runtime asserts into the FX graph, before passing it on to the backend compiler. The assert format coincides with the export assert format as practiced in `torch/_export/passes/add_runtime_assertions_for_constraints_pass.py`, but actually these passes are completely disjoint right now as I only handle deferred runtime asserts, while export only handles ranges (which I should probably also handle, but don't in this PR.)

The assertions must be inserted by Dynamo, because you could potentially then pass the asserts onto another backend like "eager" which no longer looks at the ShapeEnv before. Thanks to previous work in export, these asserts are preserved in AOTAutograd, but they are dropped by Inductor, which needs to be fixed in future work. This piece will be a bit awkward, as Inductor would have preferred to work with the Sympy expressions directly, ah well.

Here is what the Dynamo traced FX graph looks like for the test in question:

```
  <eval_with_key>.0 class GraphModule(torch.nn.Module):
     def forward(self, L_x_ : torch.Tensor):
         l_x_ = L_x_

         # File: /data/users/ezyang/c/pytorch/wu.py:8, code: y = x.item()
         item = l_x_.item()

         # No stacktrace found for following nodes
         ge_1 = item >= 0
         scalar_tensor_default = torch.ops.aten.scalar_tensor.default(ge_1);  ge_1 = None
         _assert_async_msg = torch.ops.aten._assert_async.msg(scalar_tensor_default, "Deferred runtime assert failed: i0 >= 0, where i0 was defined by 'item' (for more information, run with TORCH_LOGS=+dynamo,dynamic)");  scalar_tensor_default = None

         # File: /data/users/ezyang/c/pytorch/wu.py:9, code: torch._check_is_size

         _check_is_size = torch._check_is_size(item)

         # File: /data/users/ezyang/c/pytorch/wu.py:10, code: if y >= 0:
         ge = item >= 0;  item = None

         # File: /data/users/ezyang/c/pytorch/wu.py:11, code: return x * 2
         mul = l_x_ * 2;  l_x_ = None
         return (mul,)

```

Note that we actually keep the `_check_is_size` call in the graph redundantly. `assert_async` is retained through later passes, whereas `_check_is_size` ends up getting DCE'd.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113958
Approved by: https://github.com/aakhundov, https://github.com/tugsbayasgalan
ghstack dependencies: #113978
2023-11-20 21:25:11 +00:00
Edward Z. Yang
934e9c3346 Boolean masking backwards doesn't work even with dynamic output shape ops, break accordingly (#114126)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114126
Approved by: https://github.com/albanD
2023-11-20 19:07:37 +00:00
Jon Chuang
9d2425c8a4 [dynamo] Be clearer about dict subtype source availability (#114069)
```
# [NOTE] OrderedDict, dict subtypes must always have source
# We cannot instantiate such subtypes in-graph due to builtin __new__
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114069
Approved by: https://github.com/ezyang
2023-11-20 18:49:42 +00:00
Jon Chuang
100b9952b1 [dynamo] Fix user defined object sourceless callable (#114066)
Fixes https://github.com/pytorch/pytorch/issues/114019
We do not need to guard on a callable user-defined object that is instantiated in-graph.
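
For illustration, a toy sketch of the pattern (an object constructed inside the compiled function, then called):

```python
import torch

class Scale:
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    s = Scale(2.0)  # instantiated in-graph: sourceless, so there is nothing to guard on
    return s(x)

f(torch.randn(3))
```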

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114066
Approved by: https://github.com/ezyang
2023-11-20 18:38:03 +00:00
Yanbo Liang
870539670a [Dynamo] Support skip/inline function by name and consolidate skip/inline check logics (#113888)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113888
Approved by: https://github.com/mlazos
2023-11-18 21:36:29 +00:00
Yanbo Liang
033d7b670a [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is split from #113009; please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang
2023-11-17 23:42:00 +00:00
Edward Z. Yang
e2b114ab9f [BE] Package dynamic_dims/constraint_dims into CreateSymbolicPolicy (#113802)
This will make it more convenient to propagate more information through
all of these functions in the future (e.g., for storage offset
information).

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113802
Approved by: https://github.com/davidberard98, https://github.com/voznesenskym
2023-11-17 18:22:46 +00:00
vfdev-5
a56af02913 [dynamo] Added support for is_contiguous with dynamic shapes (#113645)
Description:
- Added support for `x.is_contiguous` with dynamic shapes

On `main` the following code is giving a graph break:
```python
import torch

@torch.compile(backend="eager", dynamic=True, fullgraph=True)
def f(x):
    if x.is_contiguous():
        return x
    else:
        return 0

x = torch.randn(13, 14)
f(x)
```
with the error message:
```
  File "pytorch/torch/_dynamo/variables/builder.py", line 1541, in wrap_fx_proxy_cls
    unimplemented(
  File "pytorch/torch/_dynamo/exc.py", line 193, in unimplemented
    raise Unsupported(msg)
torch._dynamo.exc.Unsupported: torch.* op returned non-Tensor bool call_method is_contiguous

from user code:
   File "check_is_contig_dynamic_true.py", line 37, in f
    if x.is_contiguous():
```

This PR fixes the issue.
```
TORCH_COMPILE_DEBUG=1 python check_is_contig_dynamic_true.py
[2023-11-14 15:49:04,399] [0/0] torch._dynamo.symbolic_convert: [INFO] Step 1: torchdynamo start tracing f check_is_contig_dynamic_true.py:34
[2023-11-14 15:49:04,403] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG] TRACE starts_line check_is_contig_dynamic_true.py:34 in f ()
[2023-11-14 15:49:04,403] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG]     @torch.compile(backend="eager", dynamic=True, fullgraph=True)
[2023-11-14 15:49:04,405] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG] TRACE starts_line check_is_contig_dynamic_true.py:37 in f (f)
[2023-11-14 15:49:04,405] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG]         if x.is_contiguous():
[2023-11-14 15:49:04,405] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_FAST x []
[2023-11-14 15:49:04,405] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_ATTR is_contiguous [LazyVariableTracker()]
[2023-11-14 15:49:04,804] [0/0] torch._dynamo.output_graph: [DEBUG] create_graph_input L_x_ L['x']
[2023-11-14 15:49:04,805] [0/0] torch._dynamo.variables.builder: [DEBUG] wrap_to_fake L['x'] (5, 4) [<DimDynamic.DUCK: 1>, <DimDynamic.DUCK: 1>] [None, None]
[2023-11-14 15:49:04,839] [0/0] torch._dynamo.output_graph: [DEBUG] create_graph_input s0 L['x'].size()[0]
[2023-11-14 15:49:04,840] [0/0] torch._dynamo.output_graph: [DEBUG] create_graph_input s1 L['x'].size()[1]
[2023-11-14 15:49:04,840] [0/0] torch._dynamo.output_graph: [DEBUG] create_graph_input s2 L['x'].stride()[0]
[2023-11-14 15:49:04,840] [0/0] torch._dynamo.output_graph: [DEBUG] create_graph_input s1 L['x'].stride()[1]
[2023-11-14 15:49:04,840] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE CALL_FUNCTION 0 [GetAttrVariable(TensorVariable(), is_contiguous)]
[2023-11-14 15:49:04,843] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE POP_JUMP_IF_FALSE 12 [ConstantVariable(bool)]
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG] TRACE starts_line check_is_contig_dynamic_true.py:42 in f (f)
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.symbolic_convert.__trace_source: [DEBUG]             return 0
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE LOAD_CONST 0 []
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.symbolic_convert: [DEBUG] TRACE RETURN_VALUE None [ConstantVariable(int)]
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.convert_frame: [DEBUG] Skipping frame because no content in function call f                     check_is_contig_dynamic_true.py 34
[2023-11-14 15:49:04,844] [0/0] torch._dynamo.convert_frame: [DEBUG] No graph captured with one_graph=True
[2023-11-14 15:49:04,848] torch._dynamo.utils: [INFO] TorchDynamo compilation metrics:
[2023-11-14 15:49:04,848] torch._dynamo.utils: [INFO] Function                           Runtimes (s)
[2023-11-14 15:49:04,848] torch._dynamo.utils: [INFO] -------------------------------  --------------
[2023-11-14 15:49:04,848] torch._dynamo.utils: [INFO] _compile.<locals>.compile_inner          1.2083
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113645
Approved by: https://github.com/lezcano
2023-11-17 12:32:38 +00:00
angelayi
f27ab241a4 [dynamo] Fix UnspecializedNNModuleVariable's source (#113852)
Fixes https://github.com/pytorch/pytorch/issues/113041

In the case where we have an object represented as an UnspecializedNNModuleVariable, the source of an attribute on that object is `AttrSource(base=NotNNModuleSource(base=NNModuleSource(base=AttrSource(base=LocalSource(local_name='self', cell_or_freevar=False), member='seq'))), member='b')`. This causes dynamo to add an extra attribute as it doesn't go to this [`register_attr` step](eddce3c054/torch/_dynamo/variables/builder.py (L955-L962)).

However if we have an object represented as a UserDefinedObjectVariable, the source of an attribute on that object is `AttrSource(base=NNModuleSource(base=AttrSource(base=LocalSource(local_name='self', cell_or_freevar=False), member='seq')), member='b')`.

It seems that UnspecializedNNModuleVariables should behave in the same way as UserDefinedObjectVariables, but the source in these two cases is different. So, I removed the part that changes the source for UnspecializedNNModuleVariables, and it seems to work! And CI is green (+ reduced graph breaks).

```
   def test_unspecialized_nnmodule(self):
        class TestModule(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.a = torch.tensor(1.0)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return x + self.a

        def forward_hook(
            module: torch.nn.Module, inputs, output
        ) -> torch.Tensor:
            return 2 * output

        seq = torch.nn.Sequential(TestModule()).eval()
        seq.b = torch.tensor(2)
        handle = seq.register_forward_hook(forward_hook)

        class M(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.seq = seq

            def forward(self, x):
                # self.seq.b has source: AttrSource(base=NotNNModuleSource(base=NNModuleSource(base=AttrSource(base=LocalSource(local_name='self', cell_or_freevar=False), member='seq'))), member='b')
                return self.seq(x) + self.seq.b

        inp = (torch.randn(2, 8),)
        ep = export(M(), inp)
```
```
    def test_user_defined_var(self):
        class TestModule(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.a = torch.tensor(1.0)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                return x + self.a

        class UserDefined:
            def __init__(self):
                self.test_module = TestModule()
                self.b = torch.tensor(2)

            def __call__(self, x):
                return self.test_module(x)

        class M(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.seq = UserDefined()

            def forward(self, x):
                # self.seq.b has source: AttrSource(base=NNModuleSource(base=AttrSource(base=LocalSource(local_name='self', cell_or_freevar=False), member='seq')), member='b')
                return self.seq(x) + self.seq.b

        inp = (torch.randn(2, 8),)
        ep = export(M(), inp)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113852
Approved by: https://github.com/yanboliang
2023-11-17 08:17:27 +00:00
David Berard
7c38b76efe Make offsets dynamic by default (#113734)
Copied from @ezyang's #113693.

The motivation for this change is that we'd like to guard on storage offset in inductor, to make assumptions about data alignment.

create_symbolic_sizes_strides_storage_offset() creates the sizes/strides/offset for fake tensors - they can either be integers or symints. This PR changes storage_offset to always be dynamic. In variables/builder.py, we remove a conditional so that all tensors get added to tracked_fakes. This is because the storage offset will be dynamic even if the other logic in builder.py suggests that it will be static; otherwise, we run into this issue:

1e260c851b/torch/fx/experimental/symbolic_shapes.py (L892-L895)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113734
Approved by: https://github.com/ezyang
2023-11-17 07:57:21 +00:00
Jon Chuang
c94fdebd3e [dynamo] chore: Fallback on const_handler instead of special-casing on ConstantVariable (#113893)
Fixes https://github.com/pytorch/pytorch/pull/113874#issuecomment-1815269686

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113893
Approved by: https://github.com/ezyang
2023-11-17 07:46:58 +00:00
ydwu4
0894981f6c [HigherOrderOp][BE] change _make_inlined check callable() (#113881)
A follow-up to the discussion in #113814

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113881
Approved by: https://github.com/Skylion007
2023-11-17 02:44:12 +00:00
Jon Chuang
277229d0c6 [dynamo] Fix incorrectly casting SymNode to int when input is bool (#113871)
Fixes https://github.com/pytorch/pytorch/issues/113393, https://github.com/pytorch/pytorch/pull/113848#issuecomment-1814624510

Incorrectly casting the SymNode type causes it to take the wrong path in symbolic_shapes

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113871
Approved by: https://github.com/jansel
2023-11-16 23:24:57 +00:00
Yanbo Liang
bab41f44b8 [dynamo] Fix allow_in_graph decorator doesn't work on autograd.Function (#113510)
Fixes #111032
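
For illustration, a minimal sketch of the decorator usage this fixes (toy autograd.Function):

```python
import torch

@torch._dynamo.allow_in_graph  # previously had no effect on autograd.Function subclasses
class Double(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x * 2

    @staticmethod
    def backward(ctx, grad):
        return grad * 2

@torch.compile(backend="eager", fullgraph=True)
def f(x):
    return Double.apply(x)

f(torch.randn(3, requires_grad=True))
```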

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113510
Approved by: https://github.com/zou3519
2023-11-16 22:44:46 +00:00
PyTorch MergeBot
98df3088c3 Revert "Make offsets dynamic by default (#113734)"
This reverts commit 9efbb4ea73.

Reverted https://github.com/pytorch/pytorch/pull/113734 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is causing a memory leak in one of the test 9efbb4ea73 ([comment](https://github.com/pytorch/pytorch/pull/113734#issuecomment-1815297222))
2023-11-16 20:56:27 +00:00
ydwu4
7183926622 [HigherOrderOp][BE] consolidate UserFunctionVariable.call_function pattern to _make_inlined (#113814)
We saw some use cases in higher-order operators that try to directly inline a user-level function (e.g. pytree.tree_flatten and pytree.tree_unflatten) with no tensor operations, by manually constructing a UserFunctionVariable and running call_function on it.

This PR consolidates this pattern a bit by adding a _make_inlined helper function to make the UX better (i.e. the calling convention is kept the same as the function that we'd like to inline) and also to reduce redundancy and increase readability.

Test Plan:
Existing tests.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113814
Approved by: https://github.com/yanboliang
2023-11-16 16:56:24 +00:00
David Berard
9efbb4ea73 Make offsets dynamic by default (#113734)
Copied from @ezyang's #113693.

The motivation for this change is that we'd like to guard on storage offset in inductor, to make assumptions about data alignment.

create_symbolic_sizes_strides_storage_offset() creates the sizes/strides/offset for fake tensors - they can either be integers or symints. This PR changes storage_offset to always be dynamic. In variables/builder.py, we remove a conditional so that all tensors get added to tracked_fakes. This is because the storage offset will be dynamic even if the other logic in builder.py suggests that it will be static; otherwise, we run into this issue:

1e260c851b/torch/fx/experimental/symbolic_shapes.py (L892-L895)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113734
Approved by: https://github.com/ezyang
2023-11-16 06:49:09 +00:00
Brian Hirsh
cebad9867b graph break on intermediate leaves that require grad (#113277)
fixes https://github.com/pytorch/pytorch/issues/90552. This is a simpler fix that just detects the situation where AOTAutograd can't create a proper backward graph for the situation and graph breaks. This was technically a silent correctness issue before.

This PR tries to always graph break when we see a factory function that returns a tensor requiring grad. I check this by seeing if the op returned a `TensorVariable` in dynamo, and if one of the input arguments was a `requires_grad=True` kwarg. I think this is high-fidelity enough, and I'm also hoping that this is uncommon enough that a graph break is reasonable here.

The fix to avoid the graph break in user land is also pretty easy - just instantiate your tensor outside of the compiled region and plumb it in.
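
For illustration, a sketch of the failing pattern and the workaround (names and shapes are made up):

```python
import torch

@torch.compile(backend="eager")
def f(x):
    # factory call creating a leaf that requires grad inside the graph: now a graph break
    w = torch.ones(3, requires_grad=True)
    return (x * w).sum()

# workaround: create the leaf outside the compiled region and plumb it in
w = torch.ones(3, requires_grad=True)

@torch.compile(backend="eager", fullgraph=True)
def g(x, w):
    return (x * w).sum()

f(torch.randn(3))     # compiles, with a graph break at the factory call
g(torch.randn(3), w)  # no graph break
```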

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113277
Approved by: https://github.com/eellison
ghstack dependencies: #113267, #113416, #113584
2023-11-16 02:47:45 +00:00
Will Feng
d52b9ba6a8 [torch.compile + selective checkpoint] Attach context_fn to the checkpointed graph module, fixing flaky tests (#112672)
The torch.compile + SAC unit test is causing adjacent unit tests to be flaky due to its modification of a shared singleton object. This PR attaches the checkpoint context fn to the checkpointed GraphModule and looks it up during execution, avoiding the need to make the higher-order op stateful.

Specifically, we attach the `context_fn` to the checkpointed GraphModule. These two will be gc'ed at the same time, so it satisfies the lifetime requirement.
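
For context, the `context_fn` hook on non-reentrant checkpoint looks roughly like this (no-op contexts; purely illustrative, not the selective-checkpoint implementation):

```python
import contextlib
import torch
from torch.utils.checkpoint import checkpoint

def context_fn():
    # returns (forward_context, recompute_context); selective checkpointing
    # installs dispatch modes here to decide what to save vs. recompute
    return contextlib.nullcontext(), contextlib.nullcontext()

def f(x):
    return x.sin().cos()

x = torch.randn(3, requires_grad=True)
out = checkpoint(f, x, use_reentrant=False, context_fn=context_fn)
out.sum().backward()
```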

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112672
Approved by: https://github.com/wanchaol
2023-11-16 01:34:52 +00:00
PyTorch MergeBot
5d170fce29 Revert "Support tensors as Dict keys (#111196)"
This reverts commit b0805fa5d0.

Reverted https://github.com/pytorch/pytorch/pull/111196 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing internally. I will provide the details there ([comment](https://github.com/pytorch/pytorch/pull/111196#issuecomment-1813410149))
2023-11-15 23:08:00 +00:00
PyTorch MergeBot
7137f5f8c3 Revert "[easy]Remove specialized value (#112252)"
This reverts commit 149b9dfd04.

Reverted https://github.com/pytorch/pytorch/pull/112252 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but https://github.com/pytorch/pytorch/pull/111196 is failing internally. I will provide the details there ([comment](https://github.com/pytorch/pytorch/pull/112252#issuecomment-1813401896))
2023-11-15 23:02:49 +00:00
voznesenskym
6435fc17bb Remove ignore_subclass from FakeTensorMode (#113795)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113795
Approved by: https://github.com/ezyang
2023-11-15 22:30:13 +00:00
Brian Hirsh
720e866d18 graph break on out= ops with noncontiguous out args (#113267)
Fixes https://github.com/pytorch/pytorch/issues/113010

In eager mode, when you call an out= op like `add(..., out=out_arg)` with an out argument that is noncontiguous, the noncontiguous out arg will be returned directly. When we functionalize though, functionalization replaces it with a call to `add(...)` which ignores the contiguity of the original out arg.

Instead of trying to support this, this PR detects that situation and graph breaks
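
For illustration, a small repro of the eager behavior in question (toy shapes):

```python
import torch

x = torch.randn(4)
out = torch.empty(8)[::2]          # noncontiguous out buffer
result = torch.add(x, 1, out=out)  # eager returns `out` itself, preserving its strides
assert result is out and not result.is_contiguous()
```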

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113267
Approved by: https://github.com/albanD
2023-11-15 19:55:47 +00:00
Philip Meier
9146ca6a07 use sourceless builder for builtin getattr (#113340)
In TorchVision we use the following (simplified) dispatch mechanism:

```python
import torch

def kernel1(tensor):
    return tensor + 2

def dispatcher1(input):
    kernel = get_kernel(dispatcher1, type(input))
    return kernel(input)

def kernel2(tensor):
    return tensor - 2

def dispatcher2(input):
    kernel = get_kernel(dispatcher2, type(input))
    return kernel(input)

# We actually use the function and type as keys, rather than their names.
# However, this currently not supported, but should be easy to add after
# https://github.com/pytorch/pytorch/pull/111196
REGISTRY = {
    "dispatcher1": {"Tensor": kernel1},
    "dispatcher2": {"Tensor": kernel2},
}

def get_kernel(dispatcher, input_type):
    dispatcher_registry = REGISTRY[dispatcher.__name__]
    for cls in input_type.__mro__:
        kernel = dispatcher_registry[cls.__name__]
        break
    return kernel
```

This can be compiled without graph breaks:

```python
cfn = torch.compile(dispatcher1, fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 5)

cfn = torch.compile(dispatcher2, fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 1)
```

However, if we start chaining these calls, we hit some issues:

```python
class Pipeline(torch.nn.Module):
    def forward(self, input):
        input = dispatcher1(input)
        input = dispatcher2(input)
        return input

cfn = torch.compile(Pipeline(), fullgraph=True)
torch.testing.assert_close(int(cfn(torch.tensor(3))), 3)
```

```
Can't access members of type(obj) for a generated custom object. Please use __class__ instead
```

The error message is not really helpful here. The following happens: when compiling `dispatcher1`, `get_kernel` gets inlined. That means when hitting `dispatcher2`, the `type` call no longer happens on an input with a source. Thus, in the first iteration we hit the top branch, while in the second we hit the bottom:

addb8e29cd/torch/_dynamo/variables/builtin.py (L1264-L1268)

And the error message I posted above originates from the type being treated as constant. This PR replaces this with a `SourcelessBuilder` instead.

With that fix in place, we hit another error, pointing to `input_type.__mro__`:

```
AssertionError: Consider SourcelessBuilder for ephemeral objects, usually objects created locally.
```

Fix is similar: instead of using a `VariableBuilder` here, we use a `SourcelessBuilder` in case we have no `source`:

addb8e29cd/torch/_dynamo/variables/builtin.py (L1167-L1168)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113340
Approved by: https://github.com/peterbell10, https://github.com/lezcano
2023-11-15 13:01:20 +00:00
PyTorch MergeBot
77f66ade66 Revert "use sourceless builder for builtin getattr (#113340)"
This reverts commit d64bc8f0f8.

Reverted https://github.com/pytorch/pytorch/pull/113340 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but the test is failing internally ([comment](https://github.com/pytorch/pytorch/pull/113340#issuecomment-1811684167))
2023-11-15 02:06:00 +00:00
lezcano
149b9dfd04 [easy]Remove specialized value (#112252)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112252
Approved by: https://github.com/jansel
ghstack dependencies: #111196
2023-11-14 19:14:03 +00:00
lezcano
b0805fa5d0 Support tensors as Dict keys (#111196)
This prepares the PR where we implement sets in terms of dicts.
To do so, rather than storing internally a dictionary that maps literals
to VariableTrackers, it stores (pretty much) a dictionary from VTs to VTs.
Concretely, keys are wrapped in an opaque internal class `_Hashable`.
The Hashable class is opaque on purpose so that it fails hard
if it inadvertently leaks back into user code.
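
For illustration, a toy sketch of the now-supported pattern (assuming the key tensor is a graph input):

```python
import torch

@torch.compile(backend="eager", fullgraph=True)
def f(x, key):
    d = {key: x.sin()}  # a Tensor used as a dict key
    return d[key] + 1

f(torch.randn(3), torch.randn(()))
```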

We also found and fixed a number of latent bugs and inconsistencies
in the way dynamo checked what can be a dict key. More generally, we
make much clearer what are the things that need to be modified to add
a new supported key type to Dicts.

Fixes https://github.com/pytorch/pytorch/issues/107595
Fixes https://github.com/pytorch/pytorch/issues/111603
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111196
Approved by: https://github.com/jansel
2023-11-14 19:14:03 +00:00
Jez Ng
5b95715bc0 Make {Tracing,Compile}Context.get() return non-optional type (#113535)
They are used in many contexts that don't actually check if the returned
type is `None`. I have also created `try_get()` for the cases where we
do actually want an Optional type returned.
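
Roughly, the resulting API shape looks like this (simplified sketch, not the actual implementation):

```python
from typing import Optional

class TracingContext:
    _current: Optional["TracingContext"] = None

    @staticmethod
    def try_get() -> Optional["TracingContext"]:
        # for callers that genuinely want to handle the "no context" case
        return TracingContext._current

    @staticmethod
    def get() -> "TracingContext":
        ctx = TracingContext.try_get()
        assert ctx is not None, "TracingContext.get() called without an active context"
        return ctx
```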

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113535
Approved by: https://github.com/ezyang
ghstack dependencies: #113412
2023-11-14 04:31:12 +00:00
ydwu4
3eacdaf1b3 [HigherOrderOp] add pytree operands tests for cond (#112661)
This is a follow-up of #111611. After this PR, we allow pytree with tensor-only leaves as operands of branches.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112661
Approved by: https://github.com/zou3519
2023-11-13 23:09:46 +00:00
Jez Ng
68278cf7a8 [dynamo] Initialize tensor_weakref_to_sizes_strides with a weak dict (#113412)
Spotted while working on getting output_graph.py to typecheck.

The type hint indicates that it was intended to be initialized with a
WeakIdKeyDictionary, but the actual runtime value was a regular dict.
Not sure if there's some kind of test we should add for this fix.
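
For illustration, a small example of why the weak dict matters here (using torch.utils.weak.WeakIdKeyDictionary, which matches the field's type hint):

```python
import torch
from torch.utils.weak import WeakIdKeyDictionary

cache = WeakIdKeyDictionary()
t = torch.randn(2, 3)
cache[t] = (t.size(), t.stride())
print(len(cache))  # 1
del t              # a regular dict would keep the tensor alive here
print(len(cache))  # 0 -- the entry is dropped once the tensor is collected
```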

Looks like the code was originally added in
https://github.com/pytorch/pytorch/pull/100128.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113412
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
ghstack dependencies: #113413, #113518, #113519
2023-11-13 22:53:47 +00:00
PyTorch MergeBot
0e6b6a2483 Revert "AOTAutograd: handle set_(), detect metadata mutations that cancel out (#111554)"
This reverts commit 3afb4e5cf7.

Reverted https://github.com/pytorch/pytorch/pull/111554 on behalf of https://github.com/clee2000 due to the xla failure is real sorry, log classifier is showing the wrong line ([comment](https://github.com/pytorch/pytorch/pull/111554#issuecomment-1809177978))
2023-11-13 21:46:57 +00:00