262 Commits

Author SHA1 Message Date
Edward Z. Yang
89e16c4f18 Assume sympy is always installed (#94903)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94903
Approved by: https://github.com/Skylion007, https://github.com/malfet
2023-02-16 14:09:58 +00:00
PyTorch MergeBot
a049bbb100 Revert "Change test_torchinductor_opinfo.py to mark skips/xfails in a better way (#94813)"
This reverts commit bfc0d5e22c.

Reverted https://github.com/pytorch/pytorch/pull/94813 on behalf of https://github.com/huydhn due to Sorry for reverting your PR, but it causes failures on trunk bfc0d5e22c due to a landrace with b6df987671
2023-02-16 05:08:23 +00:00
Fabio Rocha
bfc0d5e22c Change test_torchinductor_opinfo.py to mark skips/xfails in a better way (#94813)
With this change, expected failures will be correctly reported as such by pytest (instead of passes as before).
It was sometimes a little confusing to see operators you did not expect to work in inductor reported as passing their tests.

One downside is that expected failures/skips for test variants now have to be identified by tuples, i.e., `("max", "reduction_no_dim"): {f16},` instead of just `"max.reduction_no_dim": {f16}`. It seems to me it is worth it.

This change would also allow simplifying the `TestInductorOpInfo` class a little, since it no longer has to handle the skips/xfails, but that might require dropping support for things like `PYTORCH_COLLECT_EXPECT` and `PYTORCH_FAIL_ON_SUCCESS`, so I didn't do it.

Also a couple of other minor changes:

 - Got rid of c32, c64, c128 in torchinductor_opinfo. We don't support complex numbers, so they shouldn't be necessary.
 - Renamed TestExpect Enum to ExpectedTestResult to get rid of a pytest warning that thinks it is a class that has tests.
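
The tuple-keyed expected failures mentioned above can be sketched as follows. This is a minimal illustration only: the dictionary contents, the string stand-ins for dtypes, and the helper `expected_failure_dtypes` are hypothetical and are not taken from `test_torchinductor_opinfo.py`.

```python
# Hypothetical stand-ins for dtype objects; the real test file uses torch dtypes.
f16, f32 = "f16", "f32"

# Keys are either a bare op name or an (op name, variant name) tuple.
expected_failures = {
    "abs": {f32},
    ("max", "reduction_no_dim"): {f16},
}


def expected_failure_dtypes(op_name, variant_name=""):
    """Return the dtypes expected to fail for an op, or an empty set."""
    key = (op_name, variant_name) if variant_name else op_name
    return expected_failures.get(key, set())


print(expected_failure_dtypes("max", "reduction_no_dim"))
print(expected_failure_dtypes("max"))
```

Keying on a tuple keeps the op name and the variant name separate, so a lookup never has to split a dotted string.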

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94813
Approved by: https://github.com/lezcano, https://github.com/jansel
2023-02-16 03:32:01 +00:00
min-jean-cho
b6df987671 [Inductor] Added aten.normal_ decomp (#91207)
Fixes #91085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91207
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano
2023-02-15 21:21:46 +00:00
Edward Z. Yang
abf59f5703 Make _simplified kwarg private (#94782)
CR on https://github.com/pytorch/pytorch/pull/94404

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94782
Approved by: https://github.com/voznesenskym
2023-02-15 01:52:16 +00:00
Edward Z. Yang
f1f26fe8ec Streamlining guard expect tests (#94404)
Changes:
* Add `simplified` kwarg to let you only render guards that are nontrivial (excludes duck sizing)
* Make a list of strings valid for sources, if you just have some variable names you want to bind to
* Add test helper `show_guards` using these facilities, switch a few tests to it

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94404
Approved by: https://github.com/Chillee
2023-02-13 23:36:21 +00:00
Aaron Gokaslan
3d82d8d0ed [BE] Enable more flake8-comprehensions checks (#94601)
I applied some flake8 fixes and enabled checking for them in the linter. I also enabled some checks for my previous comprehensions PR.

This is a follow up to #94323 where I enable the flake8 checkers for the fixes I made and fix a few more of them.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94601
Approved by: https://github.com/ezyang
2023-02-10 23:40:29 +00:00
mingfeima
c620ece726 port sparse_mm.reduce to pytorch and optimize it on CPU (#83727)
### Motivation of this PR

This patch migrates `spmm_reduce` from `torch-sparse` (a third-party dependency of PyG) to `torch`, in response to the initial proposal for fusing **Gather, Apply, Scatter** in message passing for GNN inference/training: https://github.com/pytorch/pytorch/issues/71300

**GAS** is the major step of message passing. Its behavior falls into two kinds, depending on the storage type of `EdgeIndex`, which records the connections between nodes:

* COO: the hotspot is `scatter_reduce`
* CSR: the hotspot is `spmm_reduce`

The reduce type can be chosen from: "sum", "mean", "max", "min".

This PR extends `torch.sparse.mm` with a `reduce` argument, which maps to `torch.sparse_mm.reduce` internally.
`sparse_mm_reduce` is registered under the TensorTypeId of `SparseCsrCPU`, and this operator requires an internal interface `_sparse_mm_reduce_impl` which has dual outputs:
* `out` - the actual output
* `arg_out` - records output indices in the non zero elements if the reduce type is "max" or "min", this is only useful for training. So for inference, it will not be calculated.
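
To make the two outputs concrete, here is a minimal pure-Python sketch of a CSR sparse-times-dense product with a reduce step. It is not the kernel added by this PR: the function name `spmm_reduce_csr`, the list-based data layout, and the handling of empty rows (value `0.0`, index `-1`) are assumptions made for illustration.

```python
def spmm_reduce_csr(crow, col, val, dense, reduce="sum"):
    """Compute a reduce-style (CSR sparse) @ dense product using plain lists.

    Returns (out, arg_out). For "max"/"min", arg_out[i][j] is the position in
    `val` of the non-zero that produced out[i][j]; otherwise arg_out is None.
    """
    if reduce not in ("sum", "mean", "max", "min"):
        raise ValueError(f"unsupported reduce type: {reduce}")
    n_rows, n_cols = len(crow) - 1, len(dense[0])
    track = reduce in ("max", "min")
    out = [[0.0] * n_cols for _ in range(n_rows)]
    arg_out = [[-1] * n_cols for _ in range(n_rows)] if track else None
    for i in range(n_rows):
        entries = range(crow[i], crow[i + 1])  # non-zeros stored for row i
        if len(entries) == 0:
            continue  # assumption: an empty row keeps 0.0 and index -1
        for j in range(n_cols):
            prods = [val[e] * dense[col[e]][j] for e in entries]
            if track:
                pick = max if reduce == "max" else min
                k = pick(range(len(prods)), key=prods.__getitem__)
                out[i][j] = prods[k]
                arg_out[i][j] = entries[k]
            else:
                total = sum(prods)
                out[i][j] = total / len(prods) if reduce == "mean" else total
    return out, arg_out


# CSR form of the sparse matrix [[1, 0, 2], [0, 3, 0]].
crow, col, val = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
dense = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
print(spmm_reduce_csr(crow, col, val, dense, "sum"))
print(spmm_reduce_csr(crow, col, val, dense, "max"))
```

In this sketch the index bookkeeping is skipped entirely for "sum" and "mean", which mirrors the remark that `arg_out` is only needed for "max" and "min".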

### Performance

Benchmark on GCN for ogbn-products on a single-socket Xeon: the workload is improved by `4.3x` with this patch.

The performance benefit for training will be bigger: the original backward implementation for `sum|mean` is sequential, and the original backward implementation for `max|min` is not fused.

#### before:
```
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                         Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
       torch_sparse::spmm_sum        97.09%       56.086s        97.09%       56.088s        6.232s             9
                 aten::linear         0.00%      85.000us         1.38%     795.485ms      88.387ms             9
                 aten::matmul         0.00%      57.000us         1.38%     795.260ms      88.362ms             9
                     aten::mm         1.38%     795.201ms         1.38%     795.203ms      88.356ms             9
                   aten::relu         0.00%      50.000us         0.76%     440.434ms      73.406ms             6
              aten::clamp_min         0.76%     440.384ms         0.76%     440.384ms      73.397ms             6
                   aten::add_         0.57%     327.801ms         0.57%     327.801ms      36.422ms             9
            aten::log_softmax         0.00%      23.000us         0.10%      55.503ms      18.501ms             3
```

#### after
```
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
                         Name    Self CPU %      Self CPU   CPU total %     CPU total  CPU time avg    # of Calls
-----------------------------  ------------  ------------  ------------  ------------  ------------  ------------
               aten::spmm_sum        87.35%       11.826s        87.36%       11.827s        1.314s             9
                 aten::linear         0.00%      92.000us         5.87%     794.451ms      88.272ms             9
                 aten::matmul         0.00%      62.000us         5.87%     794.208ms      88.245ms             9
                     aten::mm         5.87%     794.143ms         5.87%     794.146ms      88.238ms             9
                   aten::relu         0.00%      53.000us         3.35%     452.977ms      75.496ms             6
              aten::clamp_min         3.35%     452.924ms         3.35%     452.924ms      75.487ms             6
                   aten::add_         2.58%     348.663ms         2.58%     348.663ms      38.740ms             9
                 aten::argmax         0.42%      57.473ms         0.42%      57.475ms      14.369ms             4
            aten::log_softmax         0.00%      22.000us         0.39%      52.605ms      17.535ms             3
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83727
Approved by: https://github.com/jgong5, https://github.com/cpuhrsch, https://github.com/rusty1s, https://github.com/pearu
2023-02-10 15:56:40 +00:00
albanD
496c0a207b Make segment_reduce properly private. (#93166)
I am attempting not to change the aten function to reduce the amount of BC issues on the torchscript side.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93166
Approved by: https://github.com/ngimel
2023-02-06 18:32:23 +00:00
Michael Voznesensky
60a3b7425d Small refactor of shape guards to allow for 1:1 code_parts (#93894)
By moving guard string assembly into dynamo's default behavior and letting code_parts do the work, we can have much better shape guard failures.

Before this fix, the guard failure in the test would look like:

```
'x.size()[1] == x.size()[0] and x.stride()[0] == x.[264 chars]!= 1' != 'x.size()[0] < 3'
- x.size()[1] == x.size()[0] and x.stride()[0] == x.size()[0] and x.stride()[1] == 1 and x.storage_offset() == 0 and y.size()[0] == x.size()[0] and y.size()[1] == x.size()[0] and y.stride()[0] == x.size()[0] and y.stride()[1] == 1 and y.storage_offset() == 0 and x.size()[0] < 3 and x.size()[0] != 0 and x.size()[0] != 1
+ x.size()[0] < 3
```
now it is
```
"x.size()[0] < 3"
```
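
The benefit of keeping guards as separate code parts can be sketched like this. The `Shape` class and the helper `first_failing_part` are hypothetical stand-ins, not Dynamo's actual guard machinery; the point is only that evaluating each part on its own lets a failure name the single expression that was violated.

```python
class Shape:
    """Hypothetical stand-in that exposes size() the way a tensor does."""

    def __init__(self, *sizes):
        self._sizes = sizes

    def size(self):
        return self._sizes


# Each guard is kept as its own string instead of one long "a and b and c".
code_parts = ["x.size()[1] == x.size()[0]", "x.size()[0] < 3"]


def first_failing_part(parts, scope):
    """Evaluate the parts one by one and return the first that is false."""
    for part in parts:
        if not eval(part, {}, scope):
            return part
    return None


print(first_failing_part(code_parts, {"x": Shape(4, 4)}))
```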

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93894
Approved by: https://github.com/ezyang
2023-02-05 09:24:12 +00:00
Michael Suo
4e4293f15f Add meta registration for bucketize (#93893)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93893
Approved by: https://github.com/zhxchen17
2023-02-02 21:03:08 +00:00
jon-chuang
d5901fcc80 fix(fx): make all make_fx invocations isolated (opaque to higher make_fx invocations) by default (#93290)
Fixes https://github.com/pytorch/pytorch/issues/88996#issuecomment-1409174554

Example code:
```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx, wrapper_and_args_for_make_fx

@torch.fx.wrap
def func(a, b):
    return b.expand([1, a.shape[0], b.shape[-1]])

a = torch.randn(3, 4)
b = torch.randn(4)

class TestMode(torch.overrides.TorchFunctionMode):
    def __torch_function__(self, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}  # avoid a shared mutable default argument
        if torch.overrides.resolve_name(func) in ["torch.Tensor.expand"]:
            print(f"TestMode: {func} {args} {kwargs}")
            wrapped, all_args = wrapper_and_args_for_make_fx(func, args, kwargs)
            # Nested make_fx call, made while the outer symbolic trace is running.
            gm = make_fx(wrapped, tracing_mode="real")(all_args)

        return func(*args, **kwargs)

with TestMode():
    gm = make_fx(func, tracing_mode="symbolic")(a, b)

gm.graph.print_tabular()
```
Before:
```
opcode         name        target               args                              kwargs
-------------  ----------  -------------------  --------------------------------  --------
placeholder    a_1         a_1                  ()                                {}
placeholder    b_1         b_1                  ()                                {}
call_function  detach      aten.detach.default  (b_1,)                            {}
call_function  detach_1    aten.detach.default  (detach,)                         {}
call_function  sym_size    aten.sym_size        (a_1, 0)                          {}
call_function  sym_size_1  aten.sym_size        (b_1, 0)                          {}
call_function  expand      aten.expand.default  (b_1, [1, sym_size, sym_size_1])  {}
call_function  detach_2    aten.detach.default  (expand,)                         {}
call_function  expand_1    aten.expand.default  (b_1, [1, sym_size, sym_size_1])  {}
output         output      output               (expand_1,)                       {}
```

After:
```
opcode         name        target               args                              kwargs
-------------  ----------  -------------------  --------------------------------  --------
placeholder    a_1         a_1                  ()                                {}
placeholder    b_1         b_1                  ()                                {}
call_function  sym_size    aten.sym_size        (a_1, 0)                          {}
call_function  sym_size_1  aten.sym_size        (b_1, 0)                          {}
call_function  expand      aten.expand.default  (b_1, [1, sym_size, sym_size_1])  {}
output         output      output               (expand_1,)                       {}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93290
Approved by: https://github.com/ezyang
2023-02-01 17:28:48 +00:00
Ivan Yashchuk
fba13d94a1 Remove deprecated torch.symeig (#70988)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`.

- [x] XLA PR: https://github.com/pytorch/xla/pull/4498

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988
Approved by: https://github.com/lezcano, https://github.com/kit1980, https://github.com/malfet
2023-01-31 11:59:11 +00:00
Edward Z. Yang
ec2461bbd8 Remove proxy tensor's check for data dependent output (#93265)
We'll rely on the underlying fake tensor to raise an error in these cases.  We only raise the error if there is an input to the data dependent operation that is a real tensor (and thus we are at risk of accidentally burning in real values)

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93265
Approved by: https://github.com/albanD
2023-01-31 11:58:49 +00:00
Aaron Gokaslan
e790281a85 SymInt'ify view_as (#93242)
Follow up to #93241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93242
Approved by: https://github.com/ezyang
2023-01-30 01:56:50 +00:00
Edward Z. Yang
3c570a2be3 SymInt'ify reshape_as (#93241)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93241
Approved by: https://github.com/Skylion007
2023-01-30 01:46:16 +00:00
Edward Z. Yang
1b5bfe9dd1 Properly compute device for elementwise operations with CPU scalar tensor (#93073)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93073
Approved by: https://github.com/eellison, https://github.com/bdhirsh
2023-01-26 21:27:57 +00:00
Edward Z. Yang
17803fb36e Make meshgrid support symbolic shapes (#93075)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93075
Approved by: https://github.com/Skylion007
2023-01-26 20:57:29 +00:00
Joel Schlosser
e5fd7e6d8f Fix to use upsample_bicubic2d.vec decomp for dynamic shape support (#92854)
For the `crossvit_9_240` model - it works now with dynamo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92854
Approved by: https://github.com/ezyang
2023-01-25 05:08:02 +00:00
PyTorch MergeBot
01f1097770 Revert "Fix to use upsample_bicubic2d.vec decomp for dynamic shape support (#92854)"
This reverts commit d49187bf88.

Reverted https://github.com/pytorch/pytorch/pull/92854 on behalf of https://github.com/malfet due to Resulted in 50+% flaky failures in dynamo, reverting
2023-01-25 00:10:14 +00:00
Joel Schlosser
d49187bf88 Fix to use upsample_bicubic2d.vec decomp for dynamic shape support (#92854)
For the `crossvit_9_240` model - it works now with dynamo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92854
Approved by: https://github.com/ezyang
2023-01-24 21:36:17 +00:00
PyTorch MergeBot
acdd462b1a Revert "Remove deprecated torch.symeig (#70988)"
This reverts commit d70ed68162.

Reverted https://github.com/pytorch/pytorch/pull/70988 on behalf of https://github.com/kit1980 due to Failing XLA tests, forward fix unsuccessful
2023-01-24 19:03:40 +00:00
Ivan Yashchuk
d70ed68162 Remove deprecated torch.symeig (#70988)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.symeig`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/70988
Approved by: https://github.com/lezcano, https://github.com/kit1980
2023-01-23 22:51:40 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
8f3600b966 [RELAND] Add metadata coverage for unsafe_split and unsafe_split_with_sizes (#92802)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92802
Approved by: https://github.com/soumith
2023-01-23 10:57:10 +00:00
Edward Z. Yang
c4501593c3 Delete get_pyobj() entirely (#92638)
Opt for the shorter and more direct node attribute access.

I need to do this because I'm going to publicly document
SymInt and SymFloat but I don't want to doc get_pyobj().

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92638
Approved by: https://github.com/Chillee, https://github.com/albanD, https://github.com/voznesenskym, https://github.com/bdhirsh
2023-01-20 19:06:56 +00:00
kshitij12345
274958ef43 [vmap] unsafe_split : batching rule and OpInfo (#92291)
Ref: https://github.com/pytorch/functorch/issues/1089

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92291
Approved by: https://github.com/Chillee
2023-01-20 10:31:56 +00:00
PyTorch MergeBot
827e22ec2d Revert "[vmap] unsafe_split : batching rule and OpInfo (#92291)"
This reverts commit 0510ae59b3.

Reverted https://github.com/pytorch/pytorch/pull/92291 on behalf of https://github.com/kshitij12345 due to Broke trunk
2023-01-19 13:49:43 +00:00
kshitij12345
0510ae59b3 [vmap] unsafe_split : batching rule and OpInfo (#92291)
Ref: https://github.com/pytorch/functorch/issues/1089

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92291
Approved by: https://github.com/Chillee
2023-01-19 06:34:45 +00:00
Peter Bell
8770a7ed6f Decompose more inplace ops (#90967)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90967
Approved by: https://github.com/anijain2305
2023-01-18 21:07:47 +00:00
Richard Zou
5d01277fea Deprecate torch.nn.utils.stateless.functional_call (#92280)
This PR:
- Updates the docs to say it is deprecated
- Raises a UserWarning
- Changes most of the callsites inside PyTorch to use
torch.func.functional_call, minus the test_stateless testing.

The motivation behind this is that we can now align behind a single
functional_call API in PyTorch.
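
The deprecation pattern described above (warn, then keep working) can be sketched in plain Python. Both functions below are hypothetical stand-ins and do not reproduce PyTorch's signatures; in particular, the assumption that the old entry point simply forwards to the new one is made for illustration only.

```python
import warnings


def functional_call(fn, params, args):
    """Hypothetical stand-in for the preferred API."""
    return fn(params, *args)


def stateless_functional_call(fn, params, args):
    """Hypothetical stand-in for the deprecated entry point."""
    warnings.warn(
        "stateless_functional_call is deprecated; use functional_call instead.",
        UserWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return functional_call(fn, params, args)


def scale(params, x):
    return params["weight"] * x


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = stateless_functional_call(scale, {"weight": 2.0}, (3.0,))

print(result, [str(w.message) for w in caught])
```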

Test Plan:
- existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92280
Approved by: https://github.com/albanD
2023-01-18 14:26:25 +00:00
Peter Bell
f0b592dae7 Make masked_fill reference traceable (#90972)
As the comment states, `item()` cannot be used since you can't trace through a
scalar.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90972
Approved by: https://github.com/ngimel
2023-01-18 10:54:42 +00:00
Avik Chaudhuri
bb11e072ae Squash and merge linalg meta kernels (#92335)
Squashed changes from https://github.com/pytorch/pytorch/pull/92021 and https://github.com/pytorch/pytorch/pull/92020 and https://github.com/pytorch/pytorch/pull/92019

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92335
Approved by: https://github.com/avikchaudhuri
2023-01-18 05:55:52 +00:00
lezcano
138a0188e0 Add support for logaddexp(float16) in CUDA and implement its reference (#91869)
The reference is implemented so that it generates efficient and
numerically stable triton code.
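
For readers unfamiliar with the numerical issue, a common stable formulation is `max(a, b) + log1p(exp(-|a - b|))`. The scalar sketch below illustrates that identity in plain Python; it is an assumption that the reference in this PR uses the same formulation, and the function here is not the PyTorch implementation.

```python
import math


def logaddexp(a: float, b: float) -> float:
    """Compute log(exp(a) + exp(b)) without overflowing exp()."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == b and math.isinf(a):
        # inf - inf would be nan, so handle equal infinities directly.
        return a
    hi, lo = max(a, b), min(a, b)
    # exp(lo - hi) is at most 1, so it cannot overflow.
    return hi + math.log1p(math.exp(lo - hi))


print(logaddexp(1000.0, 1000.0))
```

A naive `math.log(math.exp(1000.0) + math.exp(1000.0))` raises `OverflowError`, whereas the rearranged form only ever exponentiates a non-positive number.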

Fixes https://github.com/pytorch/pytorch/issues/91683

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91869
Approved by: https://github.com/ngimel
2023-01-10 00:19:24 +00:00
Xia, Weiwen
de9c82f41a [Meta] Register aten.pixel_shuffle.default for meta (#91605)
**Summary**
Fixes #91551
`aten.pixel_shuffle.default` is not registered for meta and it always generates contiguous (channels-first) layout of outputs. It can be reproduced by `torch.compile` (as described in the issue #91551) and running in FakeTensorMode.
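
A meta function only has to describe the output, not compute it. As a rough illustration, the shape half of such a function for pixel shuffle could look like the sketch below. The helper name `pixel_shuffle_output_shape` is hypothetical, and the sketch deliberately ignores strides and memory format, which are the part this PR is actually concerned with.

```python
def pixel_shuffle_output_shape(input_shape, upscale_factor):
    """Map (*, C * r**2, H, W) to (*, C, H * r, W * r) without touching data."""
    if len(input_shape) < 3:
        raise ValueError("expected at least 3 dimensions")
    *batch, channels, height, width = input_shape
    r = upscale_factor
    if r <= 0 or channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by upscale_factor**2")
    return (*batch, channels // (r * r), height * r, width * r)


print(pixel_shuffle_output_shape((1, 8, 3, 3), 2))
```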

**Test plan**
python test/inductor/test_torchinductor.py -k test_pixel_shuffle_channels_last
python test/test_proxy_tensor.py

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91605
Approved by: https://github.com/jgong5, https://github.com/mingfeima, https://github.com/anijain2305
2023-01-06 00:45:14 +00:00
Edward Z. Yang
f8740db410 Properly resolve source_ref when constructing shape guards (#91058)
Whenever you guard on something, you're supposed to tell GuardBuilder about it, so GuardBuilder knows that it has to actually bind it in scope when it creates the guard function. But shape env guards bypass that mechanism completely. Well, now they don't.

For the most part, this didn't matter in practice, because we usually had a `TENSOR_MATCH` guard floating around that made sure that the guard stayed live. But if we ever eliminate those guards (e.g., because we build it into the shape guard directly; something we'll probably want to do when https://github.com/pytorch/pytorch/pull/89707 goes online) then this will indeed matter.

One complication: some of the shape env guards are on globals. You have to make sure to shunt the usage to the correct guard builder in that case. Maybe it would be better if we refactored things so there is only one GuardBuilder. Not sure.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91058
Approved by: https://github.com/voznesenskym
2022-12-30 05:56:56 +00:00
Edward Z. Yang
bcf15cd93b Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.
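
To illustrate why structured sources are easier to work with than name strings, here is a simplified sketch in which each source knows how to render itself. The class names echo the ones mentioned above, but the fields and the rendered strings are assumptions for illustration and do not reproduce the real definitions in `torch._guards`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalSource:
    local_name: str

    def name(self) -> str:
        return self.local_name


@dataclass(frozen=True)
class TensorPropertySource:
    base: object
    prop: str  # "size", "stride", or "storage_offset"
    idx: Optional[int] = None

    def name(self) -> str:
        call = f"{self.base.name()}.{self.prop}()"
        return call if self.idx is None else f"{call}[{self.idx}]"


@dataclass(frozen=True)
class NegateSource:
    base: object

    def name(self) -> str:
        return f"-{self.base.name()}"


x = LocalSource("x")
size0 = TensorPropertySource(x, "size", 0)
print(size0.name(), NegateSource(size0).name())
```

Because sources compose, codegen for a negated size no longer needs any string manipulation at the call site.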

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym, https://github.com/zou3519
2022-12-30 05:56:56 +00:00
Joel Schlosser
8b55b86dbd Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317)
This PR moves the definitions for:
* `sym_int`
* `sym_ceil` (used only for `sym_int`)
* `sym_floor` (used only for `sym_int`)
* `sym_float`

from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined.

This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way!
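
One plausible reason `sym_int` needs both `sym_floor` and `sym_ceil` is that converting a float to an integer truncates toward zero: floor for positive values, ceil otherwise. That reading is an assumption, not something this commit message states, and the helper below is a plain-Python illustration with no symbolic types.

```python
import math


def int_like(a: float) -> int:
    """Truncate toward zero using only floor and ceil."""
    return math.floor(a) if a > 0 else math.ceil(a)


print(int_like(2.7), int_like(-2.7))
```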

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2022-12-28 16:08:16 +00:00
Joel Schlosser
1c40ec46ff Decomps and meta registrations for upsample_nearest 1D / 2D / 3D (#91260)
Adds decompositions and meta registrations for the 1D, 2D, and 3D implementations of `upsample_nearest`. All related OpInfo-based tests for AOTAutograd now pass.
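
As a rough idea of what a nearest-neighbor decomposition has to compute, the sketch below maps each output position to a source index in 1D. It uses integer arithmetic for clarity; the actual decompositions may compute the scale in floating point, and the helper name is hypothetical.

```python
def nearest_source_indices(in_size: int, out_size: int) -> list:
    """For each output position, pick the input index floor(dst * in / out)."""
    return [min((dst * in_size) // out_size, in_size - 1) for dst in range(out_size)]


print(nearest_source_indices(3, 6))
print(nearest_source_indices(4, 2))
```

Once the index list exists, the upsample itself is just a gather along the spatial dimension, and the 2D and 3D cases repeat the same step per dimension.
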
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91260
Approved by: https://github.com/ezyang
2022-12-28 16:03:25 +00:00
PyTorch MergeBot
b68fd7e319 Revert "Store source, not sname, in Symbol (#91057)"
This reverts commit 88c581be87.

Reverted https://github.com/pytorch/pytorch/pull/91057 on behalf of https://github.com/atalman due to causing internal build failures
2022-12-21 22:33:15 +00:00
Edward Z. Yang
88c581be87 Store source, not sname, in Symbol (#91057)
I'm going to need this in the follow up PR. Instead of storing only Source.name() in Symbol, I now store a full on Source. Lots of replumbing reoccurs. In particular:

- Move Source to torch._guards to break cycles
- I have to add TensorPropertySource and NegateSource to handle x.size()[0] and -x codegen that I was doing with string manipulation previously
- I tighten up invariants so that I never pass source=None; instead I pass ConstantSource (these are constant sources right) and test for that rather than source being missing. I think this is more parsimonious
- Some mypy wobbles from new imports

I didn't move LocalSource and friends to torch._guards, but I ended up needing to access them in a few places. The main annoyance with moving these is that then I also need to move the bytecode codegen stuff, and that's not so easy to move without bringing in the kitchen sink.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91057
Approved by: https://github.com/albanD, https://github.com/voznesenskym
2022-12-21 04:51:51 +00:00
Edward Z. Yang
e48c91688b DebugInterpreter works with symbolic shapes now, plus test (#90913)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90913
Approved by: https://github.com/voznesenskym
2022-12-16 05:22:56 +00:00
Edward Z. Yang
67436f621a Add utility for binding symbols based on arguments passed to placeholders (#90912)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90912
Approved by: https://github.com/voznesenskym
2022-12-16 05:22:56 +00:00
Edward Z. Yang
54563e6288 Don't put tracing state on Tensor (#90628)
Fixes https://github.com/pytorch/pytorch/issues/89626

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90628
Approved by: https://github.com/voznesenskym
2022-12-15 08:43:08 +00:00
Tugsbayasgalan (Tugsuu) Manlaibaatar
1aab755320 Fakify params and weights under private config (#90417)
Previously, we planned to lift the parameters and weights while exporting and implement our own transformer to "unlift" the lifted weights and params back to the graph as attributes. But this is a bit challenging because:

- We need to maintain correct ordering for weights and parameters that are passed as inputs so that we know how to map them back.
- Some weights are unused in the graph, so our transformer needs to be aware of which weights and parameters are not used in the graph. And we need to distinguish which are real user input and which are parameters.
- There can be more edge cases we haven't seen in other models yet.

I am aware that @Chillee  and @bdhirsh mentioned that functionalization won't work with fake-tensor attributes but this is fine for the short term as we don't expect users to be modifying weights and params in inference mode. In fact, we explicitly disable attribute mutation in torchdynamo export mode right now.

Given the above conditions, it might be OK to just fakify params when needed. I use a flag to gate this change.

Differential Revision: [D41891201](https://our.internmc.facebook.com/intern/diff/D41891201)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90417
Approved by: https://github.com/eellison
2022-12-14 09:33:18 +00:00
Joel Schlosser
4a5f4416d0 Make at::outer SymInt-aware (#90714)
Fixes matmul and related ops with meta; no more xfails needed. The non-working case for matmul was the matrix-vector case, which dispatches to `outer`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90714
Approved by: https://github.com/lezcano
2022-12-13 18:16:09 +00:00
Edward Z. Yang
f7365eca90 Add unbacked symints support; item works now (#90624)
The big idea is to add `create_unbacked_symfloat` and `create_unbacked_symint` to ShapeEnv, allowing you to allocate symbolic floats/ints corresponding to data you don't know about at compile time. Then, instead of immediately erroring out when you try to call local_scalar_dense on a FakeTensor, we instead create a fresh symint/symfloat and return that.

There are a bunch of odds and ends that need to be handled:

* A number of `numel` calls converted to `sym_numel`
* When we finally return from item(), we need to ensure we actually produce a SymInt/SymFloat when appropriate. The previous binding code assumed that you would have to get a normal Python item. I add a pybind11 binding for Scalar (to PyObject only) and refactor the code to use that. There is some trickiness where you are NOT allowed to go through c10::SymInt if there isn't actually any SymInt involved. See comment.
* One of our unit tests tripped an implicit data dependent access which occurs when you pass a Tensor as an argument to a sizes parameter. This is also converted to support symbolic shapes
* We now support tracking bare SymInt/SymFloat returns in proxy tensor mode (this was already in symbolic-shapes branch)
* Whenever we allocate an unbacked symint, we record the stack trace it was allocated at. These get printed when you attempt data dependent access on the symint (e.g., you try to guard on it)
* Subtlety: unbacked symints are not necessarily > 1. I added a test for this.

These unbacked symints are not very useful right now as you will almost always immediately raise an error later when you try to guard on them. The next logical step is adding an assertion refinement system that lets ShapeEnv learn facts about unbacked symints so it can do a better job eliding guards that are unnecessary.
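
The allocate-then-fail-on-guard behavior described above can be sketched with a toy environment. `TinyShapeEnv`, `DataDependentGuardError`, and the `i0`-style symbol names are hypothetical; only the idea (record the allocation stack, report it when someone tries to guard) comes from the description.

```python
import traceback


class DataDependentGuardError(RuntimeError):
    """Hypothetical error raised when guarding on an unbacked symbol."""


class TinyShapeEnv:
    def __init__(self):
        self._next_id = 0
        self.allocation_stacks = {}

    def create_unbacked_symint(self) -> str:
        """Allocate a fresh symbol with no known value and remember where."""
        name = f"i{self._next_id}"
        self._next_id += 1
        self.allocation_stacks[name] = "".join(traceback.format_stack(limit=5))
        return name

    def guard_on(self, name: str):
        """Guarding needs a concrete value, which an unbacked symbol lacks."""
        stack = self.allocation_stacks[name]
        raise DataDependentGuardError(
            f"cannot guard on unbacked symbol {name}; allocated at:\n{stack}"
        )


env = TinyShapeEnv()
sym = env.create_unbacked_symint()
try:
    env.guard_on(sym)
except DataDependentGuardError as err:
    print(str(err).splitlines()[0])
```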

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90624
Approved by: https://github.com/Skylion007, https://github.com/voznesenskym
2022-12-12 13:33:07 +00:00
Edward Z. Yang
e33f1eeeb7 SymIntify resize_ and deduplicate memory format logic (#90442)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90442
Approved by: https://github.com/bdhirsh
2022-12-11 14:38:38 +00:00
Edward Z. Yang
45109ec30a Completely redo how ShapeEnv guards are generated (#90528)
Instead of inferring shape mappings from a bunch of data structures that were plumbed in InstructionTranslator, we instead work out mappings by just iterating over the GraphArgs and mapping symbols to arguments as they show up. If multiple argument sizes/strides/offset map to the same symbol, this means they are duck sized, so we also generate extra equality tests that they must be equal. Finally, we generate 0/1 specialization guards. The resulting code is much shorter, and I think also easier to understand.

TODO: Delete all the tensor ref tracking code, it's unnecessary
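
The strategy described above can be sketched as a single pass over (source expression, symbol) pairs. The function `build_guards` and its input format are hypothetical simplifications; the real code walks GraphArgs and works with sympy symbols rather than strings.

```python
def build_guards(arg_symbols):
    """Turn (source expression, symbol or constant) pairs into guard strings."""
    first_source = {}  # symbol -> first source expression that mentioned it
    guards = []
    for source, sym in arg_symbols:
        if isinstance(sym, int):
            # Value was specialized to a constant, so guard on equality.
            guards.append(f"{source} == {sym}")
        elif sym in first_source:
            # Same symbol seen again: duck sizing, so the two must stay equal.
            guards.append(f"{source} == {first_source[sym]}")
        else:
            first_source[sym] = source
    for source in first_source.values():
        # 0/1 specialization: a symbolic value is assumed to be neither 0 nor 1.
        guards.append(f"{source} != 0")
        guards.append(f"{source} != 1")
    return guards


args = [
    ("x.size()[0]", "s0"),
    ("x.size()[1]", "s0"),
    ("x.stride()[0]", "s0"),
    ("x.stride()[1]", 1),
]
print(" and ".join(build_guards(args)))
```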

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90528
Approved by: https://github.com/voznesenskym
2022-12-10 13:35:04 +00:00
Edward Z. Yang
49c674e155 Revert guaranteed symint allocation (#90381)
So, uh, I have a new strategy for generating dupe guards, one where I don't actually need to allocate symints for every tensor that is fakeified. So I'm reverting the changes I made from earlier PRs in this one.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90381
Approved by: https://github.com/voznesenskym
2022-12-10 13:17:34 +00:00
Edward Z. Yang
e03cde07e4 Guarantee symbol allocation for all sizes/strides/storage offset (#89879)
We may need to express guards on the size/stride/storage offset of
a tensor, but we cannot do this if it's already been duck sized.
This PR guarantees that we allocate a symbol (or negation of the
symbol) whenever we ask to create a SymInt, and propagates this
symbol to SymNode so that Dynamo can look at it (not in this PR).

This PR doesn't actually add guards, nor does Dynamo do anything
with these symbols.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89879
Approved by: https://github.com/albanD
2022-12-01 13:43:10 +00:00