Commit Graph

559 Commits

Author SHA1 Message Date
PyTorch MergeBot
9a28a7b498 Revert "Add support for torch.Generator type in TorchScript (#110413)"
This reverts commit 27e31ab6e8.

Reverted https://github.com/pytorch/pytorch/pull/110413 on behalf of https://github.com/PaliC due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/110413#issuecomment-1799003164))
2023-11-07 15:53:32 +00:00
Antonio Kim
27e31ab6e8 Add support for torch.Generator type in TorchScript (#110413)
- Add support for `torch.Generator` type in TorchScript
- Add `generator` args to all `torch.nn.init` functions that call `uniform_` or `normal_`
- Add support for `torch.Generator` in LTC's TorchScript backend (CC: @wconstab)

CC: @eellison @davidberard98 @GlebKazantaev @behzad-a
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110413
Approved by: https://github.com/wconstab, https://github.com/albanD, https://github.com/glebk-cerebras, https://github.com/davidberard98
2023-11-06 21:27:02 +00:00
Maxime Arthaud
62cbe86ac0 [torch] Skip the assertion on the return type when the annotation is a forward reference (#112870)
Summary:
The assertion is causing build failures when running Pysa, our security-focused static analyzer.
This is because we run `pyre infer` on the source code before analyzing it, which introduces annotations such as `def foo() -> 'torch._tensor.Tensor'`.
This does not work with the `out_wrapper` decorator which relies on inspecting the signature of the decorated function.
Let's skip the check on the return type if we detect that it was introduced by `pyre infer`.

Test Plan: eyes

Differential Revision: D50976601

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112870
Approved by: https://github.com/ZainRizvi
2023-11-04 00:22:13 +00:00
Peter Bell
46e80ce58a [ATen] Support multi dim any and all reductions (#110310)
This adds a new overload to `all` and `any` with support for multiple reduction dims.
```
all.dims(Tensor self, int[1]? dim=None, bool keepdim=False) -> Tensor
any.dims(Tensor self, int[1]? dim=None, bool keepdim=False) -> Tensor
```
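
A brief usage sketch of the new overload (assuming it is reachable through the regular `torch.all`/`torch.any` Python API, as with other reduction overloads):
```python
import torch

x = torch.rand(2, 3, 4) > 0.5
# Reduce over several dimensions at once instead of chaining single-dim calls.
print(torch.any(x, dim=(0, 2)).shape)                 # torch.Size([3])
print(torch.all(x, dim=(0, 2), keepdim=True).shape)   # torch.Size([1, 3, 1])
```
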
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110310
Approved by: https://github.com/lezcano, https://github.com/albanD, https://github.com/justinchuby
2023-10-24 21:33:53 +00:00
Zhang, Wuxun
f32eb9bc55 fix missing non-contiguous output handling for add op (#111758)
Patch for https://github.com/pytorch/pytorch/pull/104689, which is missing similar handling for the add op

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111758
Approved by: https://github.com/karthiknagasub, https://github.com/ezyang
2023-10-24 17:27:50 +00:00
Aaron Gokaslan
cb856b08b2 [BE]: Attach cause to some exceptions and enable RUFF TRY200 (#111496)
Did some easy fixes from enabling TRY200. Most of these seem like oversights rather than intentional. The proper way to silence intentional errors is with `from None` to note that you thought about whether it should contain the cause and decided against it.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111496
Approved by: https://github.com/malfet
2023-10-19 21:56:36 +00:00
lezcano
2fd546aa5e Allow strided layout in torch.normal (#111205)
Fixes https://github.com/pytorch/pytorch/issues/111119

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111205
Approved by: https://github.com/ezyang
2023-10-13 21:17:38 +00:00
isdanni
2f53085f3f [BE] Enable Ruff's Flake8 PYI030 (#111103)
Enable [unnecessary-literal-union (PYI030)](https://docs.astral.sh/ruff/rules/unnecessary-literal-union/)

Link: #110950
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111103
Approved by: https://github.com/albanD
2023-10-12 13:31:59 +00:00
PyTorch MergeBot
98c329b19e Revert "[core ATen IR] Add decompositions for max, min, var_mean (#110906)"
This reverts commit 9606cda64e.

Reverted https://github.com/pytorch/pytorch/pull/110906 on behalf of https://github.com/SS-JIA due to Breaks internal CI ([comment](https://github.com/pytorch/pytorch/pull/110906#issuecomment-1757490740))
2023-10-11 11:41:21 +00:00
Edward Z. Yang
24bf9aeb6b Fix arange with dynamic end argument. (#110979)
Fixes https://github.com/pytorch/pytorch/issues/93468

There are a few extra tests that are sort of unrelated, but I ended up writing them while working on the fix and decided to keep them. The big idea here is to split the `_check` so that `expect_true` works; I could have probably also improved the symbolic reasoning but I'm lazy. One small logging fix too.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110979
Approved by: https://github.com/Skylion007
2023-10-11 00:32:34 +00:00
SS-JIA
9606cda64e [core ATen IR] Add decompositions for max, min, var_mean (#110906)
## Context

Add decompositions for `aten.max`, `aten.min`, and `aten.var_mean`. These operators follow a pattern of returning a tuple of outputs from two component operators:

```
aten.max(x) -> return aten.amax(x), aten.argmax(x)
aten.min(x) -> return aten.amin(x), aten.argmin(x)
aten.var_mean(x) -> return aten.var(x), aten.mean(x)
```
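
A hedged Python sketch of the `var_mean` case (illustrative only; the real decompositions are registered through the decomposition tables and handle all overloads):
```python
import torch

def var_mean_sketch(x, dim=None, correction=1, keepdim=False):
    # Return the same tuple aten.var_mean produces, built from the two
    # component reductions.
    return (
        torch.var(x, dim=dim, correction=correction, keepdim=keepdim),
        torch.mean(x, dim=dim, keepdim=keepdim),
    )
```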

For `var_mean`, the `refs` implementation was doing something similar, so I changed it to call `torch.` ops instead, as was done for other `refs` implementations previously. cc: @peterbell10 @lezcano

Note that Inductor lowers all these directly, so they are excluded from the Inductor decomp table.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110906
Approved by: https://github.com/manuelcandales
2023-10-11 00:06:24 +00:00
Stephen Jia
c2e7a0d689 [core IR] Add decomps for aten.sum and aten.squeeze variants (#110645)
Summary:
## Context

Both `aten.sum` and `aten.squeeze` have a "most generic" variant in the form of `aten.sum.dim_IntList` and `aten.squeeze.dims` respectively. Add decompositions for the other, non-generic variants of these operators to express them using the most generic variant.

Note that to register these decomps, the reference implementations under `_refs` had to be removed from the registered decompositions. cc: @lezcano @peterbell10
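
A hedged sketch of the direction for the `sum` case (hypothetical helper; the actual registration goes through the decomposition machinery):
```python
import torch

def sum_default_sketch(x, *, dtype=None):
    # Express the no-dim variant via the generic overload: dim=None means
    # "reduce over every dimension".
    return torch.ops.aten.sum.dim_IntList(x, None, False, dtype=dtype)
```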

Test Plan: Github CI + Meta Internal CI

Differential Revision: D49965952

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110645
Approved by: https://github.com/peterbell10, https://github.com/digantdesai, https://github.com/manuelcandales
2023-10-07 04:21:51 +00:00
Peter Bell
d796518485 [refs] Fix size check from #108360 (#109083)
PR #108360 uses the same default `last_dim_size` formula from complex-to-real (C2R) transforms for
complex-to-complex (C2C) and real-to-complex (R2C). However, this is not correct because for C2R
the input is only half the size of the full tensor, which is not the case for C2C and R2C.

This error is mostly benign since `last_dim_size` was only used for the `>= 1` condition which is
almost always met anyway.

For this PR I now use it as the argument to `_apply_norm`, which makes it load-bearing for correctness,
so it is now thoroughly tested.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109083
Approved by: https://github.com/lezcano
2023-09-27 23:59:29 +00:00
SS-JIA
dec140f1ea [core IR] Add a core decomposition for aten.all (#110093)
## Context

Change the ref implementation of `aten.all` to only use other `torch` operators such that we can use it for the core ATen decomposition table. This will replace the decomposition for `aten.all` that was used specifically by Inductor.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110093
Approved by: https://github.com/manuelcandales, https://github.com/peterbell10, https://github.com/lezcano
2023-09-27 01:31:41 +00:00
SS-JIA
5df8aca994 [core IR] Add a core decomposition for floor_divide (#110046)
## Context

Introduce a core decomposition for `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table.

This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition

```
# TorchInductor-only decomposition. It should not be taken to core.
# See https://github.com/pytorch/torchdynamo/pull/1120
```

but couldn't discern the reason why this is the case. cc: @lezcano
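
A hedged sketch of what such a decomposition can look like (illustrative only; the actual core decomposition may be written differently):
```python
import torch

def floor_divide_sketch(a, b):
    # floor_divide is true division rounded toward negative infinity.
    return torch.div(a, b, rounding_mode="floor")
```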

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046
Approved by: https://github.com/peterbell10
2023-09-26 08:39:21 +00:00
Mwiza Kunda
5c4b5baf21 Fix python decomps for OpOverloadPackets and add tests (#107707)
- Extend `test_torch_dispatch_meta_outplace` to test torch ops that do not have an out parameter but have aten op overloads that have out parameters. Additionally, Python decompositions may register `OpOverloadPacket`s, so decompositions need to be tested to ensure all `OpOverload`s still function for the `Meta` key (e.g. if a Python decomposition is registered for an aten op `aten.foo` with overloads `[default, out]`, the Python function needs to support receiving out arguments)

- Add out parameter wrappers to python decomps for aten ops that have out overloads

CC. @ezyang @albanD @lezcano

Fixes #107713

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107707
Approved by: https://github.com/lezcano
2023-09-25 20:53:30 +00:00
SS-JIA
7de669f2f9 [core IR] Remove trunc decomp and add trunc to core (#109902)
Following up from [this comment](https://github.com/pytorch/pytorch/pull/109319#discussion_r1330803226). Remove the decomposition for `trunc`, and add it as a core operator.

Going forward, provide similar treatment for operators that map cleanly to hardware instructions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109902
Approved by: https://github.com/peterbell10
2023-09-25 18:18:06 +00:00
Mwiza Kunda
6b7b9c796e Fix registering jit decompositions for jvp for out wrapped decomps (#109367)
Python decompositions wrapped by `out_wrapper` need to be unwrapped before compiling with TorchScript since:
- `out_wrapper` extends the decompositions signature with an out parameter, however this `out` parameter is not present in the source code of the original decomposition so the resulting `ScriptFunction` will not have an `out` parameter
- `out_wrapper` is in the `torch._prims_common.wrappers` module, so its `globals()` are different from the globals of the decomposition to be wrapped. This may cause symbol resolution to fail in the TorchScript compiler since it is compiling the unwrapped decomp's source code rather than the wrapper

The python decomposition for `aten.trace` is wrapped as an example, other decompositions are to be fixed in https://github.com/pytorch/pytorch/pull/107707
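
A hedged sketch of the unwrapping step (assuming, as with `functools.wraps`, that the decorated decomposition keeps a reference to the original function in `__wrapped__`):
```python
import torch

def script_decomposition(decomp):
    # Compile the original, unwrapped function so TorchScript sees the real
    # signature (no injected `out` parameter) and the right globals.
    original = getattr(decomp, "__wrapped__", decomp)
    return torch.jit.script(original)
```
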
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109367
Approved by: https://github.com/lezcano
2023-09-21 16:36:51 +00:00
Peter Bell
9e629dd73c [decomp] Add all std and std_mean overloads to core decompositions (#109667)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109667
Approved by: https://github.com/lezcano
2023-09-20 18:45:56 +00:00
Salil Desai
2e721aab98 [Decomposition] Trunc (#109319)
Summary:
Add Decomp for Trunc and add it to core_aten_decompositions

Differential Revision: D49042033

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109319
Approved by: https://github.com/SherlockNoMad
2023-09-19 13:30:13 +00:00
Edward Z. Yang
677a1010e6 Implement traceable torch.tensor when you have SymInt/SymFloat inputs (#109515)
I just ported the C++ torch.tensor implementation to Python, swapping out the inner bits to successively stack tensors together, so that we can trace through `scalar_tensor`.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109515
Approved by: https://github.com/voznesenskym
ghstack dependencies: #109513
2023-09-19 13:19:57 +00:00
Li-Huai (Allan) Lin
b2cba439b4 Introduce Tensor overload to linspace and logspace (#104889)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104889
Approved by: https://github.com/zou3519
ghstack dependencies: #107958
2023-09-11 23:30:40 +00:00
PyTorch MergeBot
a7f5abeade Revert "Introduce Tensor overload to linspace and logspace (#104889)"
This reverts commit 57e5239321.

Reverted https://github.com/pytorch/pytorch/pull/104889 on behalf of https://github.com/clee2000 due to sorry have to revert this to revert https://github.com/pytorch/pytorch/pull/107958 ([comment](https://github.com/pytorch/pytorch/pull/104889#issuecomment-1714305768))
2023-09-11 17:33:48 +00:00
Li-Huai (Allan) Lin
57e5239321 Introduce Tensor overload to linspace and logspace (#104889)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104889
Approved by: https://github.com/zou3519
ghstack dependencies: #107958
2023-09-11 15:29:39 +00:00
ekamiti
0f88d93b10 decomposition spectral ops fixes (#108360)
Fixes https://github.com/pytorch/pytorch/issues/105986, https://github.com/pytorch/pytorch/issues/108204, https://github.com/pytorch/pytorch/issues/108205

Fix all issues flagged when making changes for https://github.com/pytorch/pytorch/pull/107421

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108360
Approved by: https://github.com/ezyang
2023-09-09 04:48:09 +00:00
Ken Jin
c458fa0d35 Decompose/add reference for view_as_complex (#108005)
Aten source: d4a99631dd/aten/src/ATen/native/ComplexHelper.h (L78)

Documentation reference:
https://pytorch.org/docs/stable/generated/torch.view_as_complex.html

Note: this adds a new primitive `view_of_dtype`, which is trivially implemented, as its meta function is already implemented elsewhere.

Finally, this is not registered as a decomposition (yet), because TorchInductor does not yet support complex types. It should be added once we do.

Closes https://github.com/pytorch/pytorch/issues/108020 as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108005
Approved by: https://github.com/peterbell10, https://github.com/ezyang
2023-09-07 23:49:20 +00:00
Guilherme Leobas
7e878c9d10 Add decomposition for aten.take_along_dim (#108185)
xref #107875

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108185
Approved by: https://github.com/lezcano
2023-09-04 13:49:53 +00:00
Vishwa Raj Singh
1b3dc05c3e Use contiguous() to handle noncontiguous outputs during elementwise decomposition (#108140)
Fixes https://github.com/pytorch/pytorch/issues/108218

Use contiguous() API to handle noncontiguous outputs during elementwise decomp

With this change, the op decomposes properly (test case from the bug):
```
graph():
    %arg0_1 : [#users=3] = placeholder[target=arg0_1]
    %abs_1 : [#users=1] = call_function[target=torch.ops.aten.abs.default](args = (%arg0_1,), kwargs = {})
    %floor : [#users=1] = call_function[target=torch.ops.aten.floor.default](args = (%abs_1,), kwargs = {})
    %sign : [#users=1] = call_function[target=torch.ops.aten.sign.default](args = (%arg0_1,), kwargs = {})
    %mul : [#users=1] = call_function[target=torch.ops.aten.mul.Tensor](args = (%floor, %sign), kwargs = {})
    %sub : [#users=1] = call_function[target=torch.ops.aten.sub.Tensor](args = (%arg0_1, %mul), kwargs = {})
    return (sub,)
```
Output:
```
tensor([[ 0.2871,  0.7189,  0.7297],
        [ 0.8782, -0.4899,  0.7055]], device='hpu:0')
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108140
Approved by: https://github.com/ezyang
2023-09-03 04:32:22 +00:00
lezcano
239ee76177 Add refs/decomps for dot/vdot (#108194)
Follow-up on https://github.com/pytorch/pytorch/issues/108127#issuecomment-1698142427

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108194
Approved by: https://github.com/peterbell10
ghstack dependencies: #108188
2023-08-31 15:30:23 +00:00
lezcano
239fed7e1e Add reference for linalg.vecdot (#108188)
Was addressing https://github.com/pytorch/pytorch/issues/108127, but
then I realised that vecdot is already CompositeImplicit. Pushing anyway
as a short-and-sweet PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108188
Approved by: https://github.com/peterbell10
2023-08-31 15:30:23 +00:00
David Watson
598babf017 Added normal op decomposition for specializations of the normal op (#106792)
This fixes running normal with the meta key.

```
import torch

t = torch.tensor(4.0, device='meta')
torch.normal(0.5, t)
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106792
Approved by: https://github.com/lezcano
2023-08-25 16:18:28 +00:00
Vishwa Raj Singh
35de780aa6 Fix Inplace tensor update on transpose (#104689)
Fixes https://github.com/pytorch/pytorch/issues/103650

- To align with the HPU device backend architecture, ensure all non-view ops return contiguous fake tensor outputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/104689
Approved by: https://github.com/ezyang
2023-08-24 16:58:50 +00:00
Sherlock Huang
ee4b99cc3a Decomp for aten.dropout (#106274)
When exporting dropout with a CPU tensor, we get the following graph module
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            empty_memory_format: f32[512, 10] = torch.ops.aten.empty.memory_format([512, 10], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False, memory_format = torch.contiguous_format)
            bernoulli_p: f32[512, 10] = torch.ops.aten.bernoulli.p(empty_memory_format, 0.9);  empty_memory_format = None
            div_scalar: f32[512, 10] = torch.ops.aten.div.Scalar(bernoulli_p, 0.9);  bernoulli_p = None
            mul_tensor: f32[512, 10] = torch.ops.aten.mul.Tensor(arg0_1, div_scalar);  arg0_1 = div_scalar = None
            return (mul_tensor,)
```

In addition, if we export in eval() mode, we get an empty graph.

However, when exporting with a CUDA tensor, we get
```
    class GraphModule(torch.nn.Module):
        def forward(self, arg0_1: f32[512, 10]):
            native_dropout_default = torch.ops.aten.native_dropout.default(arg0_1, 0.1, True);  arg0_1 = None
            getitem: f32[512, 10] = native_dropout_default[0];  native_dropout_default = None
            return (getitem,)
```
and exporting in eval() mode still leaves a dropout node in the graph.

This PR makes exporting with a CPU tensor also produce aten.native_dropout.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106274
Approved by: https://github.com/ezyang
2023-08-23 21:12:37 +00:00
Aaron Gokaslan
660e8060ad [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it to keep it that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-22 23:16:38 +00:00
PyTorch MergeBot
d59a6864fb Revert "[BE]: Update ruff to 0.285 (#107519)"
This reverts commit 88ab3e4322.

Reverted https://github.com/pytorch/pytorch/pull/107519 on behalf of https://github.com/ZainRizvi due to Sorry, but this PR breaks internal tests. @ezyang, can you please help them get unblocked? It seems like one of the strings was probably accidentally modified ([comment](https://github.com/pytorch/pytorch/pull/107519#issuecomment-1688833480))
2023-08-22 19:53:32 +00:00
Aaron Gokaslan
88ab3e4322 [BE]: Update ruff to 0.285 (#107519)
This updates ruff to 0.285, which is faster, better, and fixes a bunch of false negatives with regard to f-strings.

I also enabled RUF017, which looks for accidental quadratic list summation. Luckily, it seems there are no instances of it in our codebase, so I'm enabling it to keep it that way. :)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107519
Approved by: https://github.com/ezyang
2023-08-20 01:36:18 +00:00
Edward Z. Yang
5673c0874c Use expect_true to make split with unbacked sizes work. (#106788)
This pattern shows up in torchrec KeyedJaggedTensor. Most
of the change in this PR is mechanical: whenever we failed
an unbacked symint test due to just error checking, we replace the
conditional with something that calls expect_true (e.g.,
torch._check or TORCH_SYM_CHECK).

Some of the changes are a bit more nuanced, I've commented on the PR
accordingly.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106788
Approved by: https://github.com/lezcano
ghstack dependencies: #106720
2023-08-15 20:31:30 +00:00
lezcano
2c5f96deac [Inductor] Make softshrink composite implicit (#107052)
The backward is pretty much equivalent to the one we had written

Pull Request resolved: https://github.com/pytorch/pytorch/pull/107052
Approved by: https://github.com/peterbell10
ghstack dependencies: #107038, #107039, #107051
2023-08-14 21:01:50 +00:00
lezcano
3b1254e800 Make hardshrink's decomp composite implicit (#107039)
The generated code is the same
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107039
Approved by: https://github.com/peterbell10
ghstack dependencies: #107038
2023-08-14 21:01:50 +00:00
lezcano
45c7880486 Simplify some decompositions. (#107038)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107038
Approved by: https://github.com/peterbell10
2023-08-14 21:01:50 +00:00
Peter Bell
ab6efb1649 [pt2] Add reference implementations of torch.{stft,istft} (#106400)
This allows symbolic shapes to be traced through `torch.stft` and `torch.istft`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106400
Approved by: https://github.com/lezcano
ghstack dependencies: #106319
2023-08-07 20:59:30 +00:00
Yanbo Liang
0ad93a3d56 Fix aten.logspace decomposition (#105201)
Fixes #104118

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105201
Approved by: https://github.com/ezyang
2023-07-22 04:10:20 +00:00
Justin Chu
8a688277a2 [BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432
Approved by: https://github.com/ezyang
2023-07-19 13:48:44 +00:00
Nikita Shulga
5837e95d30 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`

Unrelated, to bypass CI failures due to the gcc9 dependency update in Ubuntu-18.04:
- Add hack to squash older libstdc++ from the conda environment in favor of the one from the OS in `.ci/docker/install_conda.sh`
- Update bazel cuda builds to focal, as with libstdc++-6.0.32 bazel builds lose the ability to catch exceptions (probably because they link with cupti statically, but I could not find where it is done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-15 20:30:20 +00:00
PyTorch MergeBot
15fd1ea118 Revert "[Reland] Update mypy to 1.4.1 (#105227)"
This reverts commit c9c4f8efc3.

Reverted https://github.com/pytorch/pytorch/pull/105227 on behalf of https://github.com/atalman due to trying to mitigate ci sev #105248 ([comment](https://github.com/pytorch/pytorch/pull/105227#issuecomment-1636510935))
2023-07-14 22:28:35 +00:00
Nikita Shulga
c9c4f8efc3 [Reland] Update mypy to 1.4.1 (#105227)
This PR re-lands
- [Typing] Fix PEP 484 Violation (#105022)
- Update mypy to 1.4.1 (#91983)

That were reverted due to the conflict with internal source repo.

Mostly fixes for PEP-484 violation (i.e. when default arg is set to None, but type is not annotated as optional)
Plus few real fixes:
  - Add missing `_get_upgraders_entry_map` to `torch/_C/__init__.pyi`
  - Add missing return statement to `torch._export.deserialize_graph`
  - Fix error message in `torch.ao.ns.fx.weight_utils.get_lstm_mod_weights`
  - Add assert in `torch/optim/optimizer.py` that the Optional list is not None
TODO (in followup PR):
  - Fix erroneous `isinstance` check in `torch/ao/quantization/_pt2e/qat_utils.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105227
Approved by: https://github.com/atalman, https://github.com/albanD, https://github.com/Skylion007
2023-07-14 20:45:12 +00:00
PyTorch MergeBot
b4d91b1c5b Revert "[Typing] Fix PEP 484 Violation (#105022)"
This reverts commit 4148b7bada.

Reverted https://github.com/pytorch/pytorch/pull/105022 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/105022#issuecomment-1635967734))
2023-07-14 14:45:09 +00:00
Nikita Shulga
4148b7bada [Typing] Fix PEP 484 Violation (#105022)
Not sure how it worked before, but arguments must be annotated as Optional if they default to None.

Towards enabling mypy-1.4.1 in lintrunner
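
A minimal example of the rule being enforced (illustrative only):
```python
from typing import Optional

# Flagged by mypy 1.4.1 under PEP 484: implicit-Optional defaults are rejected.
def load_bad(path: str = None): ...       # error: incompatible default

# Fixed: a None default requires an explicit Optional annotation.
def load_ok(path: Optional[str] = None): ...
```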

### <samp>🤖 Generated by Copilot at 5e1b9f4</samp>

> _We annotate the arguments of doom_
> _To show the `None` values of gloom_
> _We improve the type checking and readability_
> _With `Optional` annotations of metal-ity_

Pull Request resolved: https://github.com/pytorch/pytorch/pull/105022
Approved by: https://github.com/izaitsevfb, https://github.com/huydhn, https://github.com/Skylion007
2023-07-12 10:20:48 +00:00
Peter Bell
5c580a9846 [decomp] Add test tracking core ATen operators (#104262)
This adds an expect-test that finds the set of core ATen operators by
subtracting the operators with decomposition in core_aten_decompositions from the
set of all operators that have decompositions and could be decomposed.

This is useful because if you add a new decomposition but forget to add it to
the list of core decompositions, it will appear in the PR diff.

Also, by going through this list I have identified some operators where the
functional variant is decomposed, but not the inplace variant, which must be an
oversight.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/104262
Approved by: https://github.com/lezcano
2023-07-04 16:41:44 +00:00
Peter Bell
8b418f197c [decomp] Add decomposition for torch.renorm (#103858)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103858
Approved by: https://github.com/ezyang, https://github.com/nkaretnikov
2023-06-21 20:57:43 +00:00
Peter Bell
a61096fb94 [decomp] Decompose logaddexp2 (#103765)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103765
Approved by: https://github.com/Chillee
2023-06-21 20:16:24 +00:00
Kurt Mohler
ee83c646bb Replace _prims_common.check with torch._check* (#103240)
This relands most of the changes from #102219 which were backed out by #103128. However, instead of removing `_prims_common.check`, it adds a warning and a comment mentioning that it will be removed in the future and `torch._check*` should be used instead. As mentioned in https://github.com/pytorch/pytorch/pull/103128#pullrequestreview-1466414415, `_prims_common.check` cannot yet be removed because of some internal usage

Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103240
Approved by: https://github.com/albanD
2023-06-21 00:46:17 +00:00
PyTorch MergeBot
7b6dc72ffa Revert "[decomp] Decompose logaddexp2 (#103765)"
This reverts commit bab21d20eb.

Reverted https://github.com/pytorch/pytorch/pull/103765 on behalf of https://github.com/ezyang due to looks like land race ([comment](https://github.com/pytorch/pytorch/pull/103765#issuecomment-1599030496))
2023-06-20 15:35:02 +00:00
Peter Bell
bab21d20eb [decomp] Decompose logaddexp2 (#103765)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103765
Approved by: https://github.com/Chillee
2023-06-20 09:24:21 +00:00
BowenBao
724a1ba2de Tidy __all__ under torch._refs (#103712)
- Added ops that were missing under `__all__`.
- Some misc changes to helper functions to make them private.
- Set correct `fn.__module__` for `fn` created by `_make_alias`, when called in another module.

All modifications largely reference results from a hacked version of `test_public_bindings::test_correct_module_names`.
By default `torch._refs` is not included in the test because it is technically a private package.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103712
Approved by: https://github.com/lezcano
2023-06-20 00:04:58 +00:00
ekkapricious
5d34656fd7 Update dynamo sum dtype handling to match eager (#103037)
The current behaviour for dynamo is to set the dtype to torch.int64 for integral types if the dtype is not specified explicitly, which results in mismatched behaviour compared to eager mode. In eager mode the semantics (illustrated below) are:
- If both out is specified and dtype is specified then they have to match
- If dtype is not specified but out is specified then the dtype is set to match the out dtype
- If neither dtype nor out is set then the dtype is set to kLong if it is a bool or an integral type
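
A small eager-mode illustration of those rules:
```python
import torch

x = torch.ones(3, dtype=torch.int8)
print(torch.sum(x).dtype)                        # torch.int64 (integral input promotes to long)
print(torch.sum(x, dtype=torch.int8).dtype)      # torch.int8  (explicit dtype wins)
out = torch.empty((), dtype=torch.int32)
print(torch.sum(x, dim=0, out=out).dtype)        # torch.int32 (dtype taken from out)
```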

Fixes #100698

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103037
Approved by: https://github.com/ngimel
2023-06-19 22:26:37 +00:00
Yu, Guangye
ad4ee297ed allow cpu scalar to be moved to xpu in masked_fill (#103645)
# Motivation
Align with the CUDA scenario: allow a CPU scalar to be moved to the XPU device in masked_fill.

# Solution
Add "xpu" support in condition control.

# Additional
No additional unit tests needed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/103645
Approved by: https://github.com/jgong5, https://github.com/ezyang
2023-06-16 12:15:43 +00:00
Yanbo Liang
686d7e4c48 [Inductor] Fix x.view(dtype) decomp and make inductor support it (#102920)
Fixes #99804

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102920
Approved by: https://github.com/jansel, https://github.com/ngimel
2023-06-07 17:10:54 +00:00
Ivan Zaitsev
821493715c Back out "Remove check from _prims_common, replace with torch._check* (#102219)", Back out "Forwatd fix for D46427687" (#103128)
Test Plan: revertitparrot

Reviewed By: malfet

Differential Revision: D46506433

Pull Request resolved: https://github.com/pytorch/pytorch/pull/103128
Approved by: https://github.com/malfet
2023-06-07 01:41:41 +00:00
Kurt Mohler
a84bb2709a Remove check from _prims_common, replace with torch._check* (#102219)
Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219
Approved by: https://github.com/lezcano, https://github.com/albanD
2023-06-03 02:23:21 +00:00
PyTorch MergeBot
a7efa0ce35 Revert "Remove check from _prims_common, replace with torch._check* (#102219)"
This reverts commit fb79d43649.

Reverted https://github.com/pytorch/pytorch/pull/102219 on behalf of https://github.com/malfet due to Broke lint, see https://github.com/pytorch/pytorch/actions/runs/5158949959/jobs/9293466925 ([comment](https://github.com/pytorch/pytorch/pull/102219#issuecomment-1574245414))
2023-06-02 20:00:48 +00:00
Kurt Mohler
fb79d43649 Remove check from _prims_common, replace with torch._check* (#102219)
Part of #72948

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102219
Approved by: https://github.com/lezcano, https://github.com/albanD
2023-06-02 19:13:45 +00:00
vfdev-5
319a1cb4e5 [inductor] Replaced refs.op by torch.op in _refs/* (#102176)
Description:
- Replaced `refs.op` with `torch.op` in `_refs/*`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102176
Approved by: https://github.com/lezcano
2023-05-29 22:36:14 +00:00
vfdev-5
e3d97b6213 [inductor] Added smooth_l1_loss refs (#102077)
Added `smooth_l1_loss` to refs + tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102077
Approved by: https://github.com/lezcano, https://github.com/ngimel
2023-05-24 15:07:08 +00:00
Khushi
51fe53e619 [opinfo] item (#100313)
Follows #100223

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100313
Approved by: https://github.com/ezyang
2023-05-10 11:32:45 +00:00
Khushi
5a933d044f [opinfo prims] equal (#100663)
Follows: #100223
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100663
Approved by: https://github.com/ezyang
2023-05-10 08:16:00 +00:00
Nikita Karetnikov
e87ed2a88d [primTorch] add ref for polar (#100345)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100345
Approved by: https://github.com/ezyang
2023-05-04 01:37:02 +00:00
Angela Yi
d06b93b0c7 Decompose arange.default to arange.start_step (#99739)
The aten op arange.default is not in the core aten IR, and should decompose into the arange.start_step op.
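
A hedged sketch of that rewrite (hypothetical helper; the registered decomposition may differ):
```python
import torch

def arange_default_sketch(end, **factory_kwargs):
    # arange(end) rewritten as arange.start_step(0, end, 1).
    return torch.ops.aten.arange.start_step(0, end, 1, **factory_kwargs)
```
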
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99739
Approved by: https://github.com/SherlockNoMad
2023-04-27 19:06:36 +00:00
Yanbo Liang
4c6f7cbc86 Fix prims unbind if given dimension size is 0 (#100122)
Fixes #99832

Pull Request resolved: https://github.com/pytorch/pytorch/pull/100122
Approved by: https://github.com/ngimel
2023-04-26 23:40:21 +00:00
Nikita Karetnikov
f89b7c2bec [pt2] add SymInt support for roll (#99114)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/99114
Approved by: https://github.com/ezyang
2023-04-15 18:01:39 +00:00
Peter Bell
7b91bd2a7b [primTorch] Add count_nonzero (#98995)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98995
Approved by: https://github.com/lezcano
2023-04-13 22:08:19 +00:00
Peter Bell
7d74dca780 [primTorch] Add rad2deg and deg2rad (#98994)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98994
Approved by: https://github.com/lezcano
2023-04-13 22:08:19 +00:00
Nikita Karetnikov
ff825de442 [primTorch] add ref for cumprod (#98670)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98670
Approved by: https://github.com/ezyang
2023-04-09 15:22:28 +00:00
albanD
0210481dcb Fix _like meta registrations (#98160)
The meta implementation for these _like functions is wrong whenever device != "meta" (it doesn't fill the memory!).
zeros_like is special due to sparse and is fixed directly by always filling it with zeros.
Every other one is a CompositeExplicit implementation; I went with removing their meta registrations and tweaking code to avoid infinite recursion.
I could do the same as zeros_like (and add the proper filling for each) but that would duplicate the C++ logic and make the meta registrations non-trivial. I can do it if you prefer that to removal.

test_meta works fine with these fixes, relying on CI to see if other tests are breaking as well.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98160
Approved by: https://github.com/ezyang
2023-04-06 18:44:34 +00:00
Aaron Gokaslan
9c3fbe7475 [BE] Enable flake8-simplify checks (#97984)
Enable some sensible flake8-simplify rules. Mainly wanted to enable the SIM101 and `yield from` SIM103 checks. @kit1980 since you wanted to be tagged on this CI check.

Enabling this check also helped flag one logical bug so it's definitely beneficial (also fixed in this PR).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97984
Approved by: https://github.com/ezyang
2023-03-31 03:40:21 +00:00
Aaron Gokaslan
47dca20d80 [BE] Enable flake8-comprehension rule C417 (#97880)
Enables flake8-comprehension rule C417. Ruff autogenerated these fixes to the codebase.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880
Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD
2023-03-30 14:34:24 +00:00
Aaron Gokaslan
597b558c51 [BE]: Update flake8 and plugins and fix bugs (#97795)
Update flake8 and flake8-plugins in lintrunner to a modern version. Enables more checks and makes flake8 checks significantly faster. Added a few additional rule ignores that will need to be fixed in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97795
Approved by: https://github.com/alexsio27444, https://github.com/janeyx99, https://github.com/ezyang
2023-03-28 23:51:55 +00:00
Vivek Khandelwal
5da86bbb68 Add decomposition for aten.squeeze.dims op (#97020)
Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97020
Approved by: https://github.com/jansel
2023-03-27 20:13:19 +00:00
Chung-chieh Shan
2c588b3ad5 Allow new_full's fill_value argument type to be complex (#91345)
It seems that this code should type-check but doesn't:
```python
torch.zeros((2,3),dtype=torch.cdouble).new_full((4,5),complex(6,7))
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91345
Approved by: https://github.com/zou3519, https://github.com/ezyang
2023-03-21 12:34:00 +00:00
Edward Z. Yang
3606f59366 Default specialize_int to False (#96624)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624
Approved by: https://github.com/janeyx99
2023-03-16 02:54:18 +00:00
BowenBao
60a68477a6 Bump black version to 23.1.0 (#96578)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578
Approved by: https://github.com/ezyang
2023-03-15 06:27:59 +00:00
PyTorch MergeBot
ba4fb9b6ad Revert "Default specialize_int to False (#96624)"
This reverts commit 1ac8782db2.

Reverted https://github.com/pytorch/pytorch/pull/96624 on behalf of https://github.com/kit1980 due to Broke inductor/test_torchinductor_dynamic_shapes.py
2023-03-14 19:43:47 +00:00
Edward Z. Yang
1ac8782db2 Default specialize_int to False (#96624)
Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96624
Approved by: https://github.com/janeyx99
2023-03-14 18:37:47 +00:00
Khushi Agrawal
301a28bf8c [primTorch] move diagonal & add linalg.diagonal refs (#95774)
Fixes #85419

Also, add `_refs.linalg.diagonal`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95774
Approved by: https://github.com/lezcano
2023-03-06 17:59:47 +00:00
Nikita Karetnikov
c72fbf2e5a [inductor] do not use ceil in arange ref (#95773)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95773
Approved by: https://github.com/ezyang
2023-03-03 20:38:18 +00:00
mfkasim1
975333d80c Logaddexp for complex in CPU (#95717)
Continuation of PR #93153, where I implemented logaddexp for complex but didn't expose it to `torch.logaddexp`. So this PR exposes the complex logaddexp through `torch.logaddexp`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95717
Approved by: https://github.com/lezcano
2023-03-01 20:37:46 +00:00
Brian Hirsh
84e2d957a1 fix primtorch handling for sub.scalar with alpha and float64 arg (#95421)
This fixes the primtorch issue stemming from https://github.com/pytorch/pytorch/issues/95181

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95421
Approved by: https://github.com/ngimel, https://github.com/SherlockNoMad
2023-02-28 00:24:38 +00:00
Edward Z. Yang
4833e47feb Add support for nonzero, some improvements to reduce guards (#95387)
This takes the strategy described in https://docs.google.com/document/d/1lFRYAJo5nrfxRhwIzGnfi2pbLpU6T4ytSRSuLJ5qebI/edit#

It is essentially https://github.com/pytorch/pytorch/pull/95222 but squashed and with changes that are unnecessary given that we assume nonzero returns > 1.

What's in the PR:

* nonzero now supports meta propagation. When `capture_dynamic_output_shape_ops`, it will return a tensor with an unbacked SymInt representing the size in question.
* The unbacked SymInt is UNSOUNDLY assumed to be not equal to 0/1. We will still error if you guard otherwise.
* PrimTorch pointwise operators are updated to use empty_permuted, to avoid guarding on unbacked SymInt from empty_strided (tested in `test_dynamic_pointwise_scalar`)
* Convolution is updated to skip backend selection if batch is unbacked, to avoid guarding on unbacked SymInt (tested in `test_unbacked_batch_resnet`)
* I kept the helper utilities like `definitely_true` for working with possibly unbacked SymInts. They're not used right now but maybe someone will find them useful.
* Added `constrain_unify` to let you specify two unbacked SymInts must have the same value

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95387
Approved by: https://github.com/voznesenskym
2023-02-24 00:27:45 +00:00
Peter Bell
bc438af6fe std/var: support floating point correction value (#94073)
Ref https://github.com/pytorch/pytorch/issues/61492#issuecomment-1413003480

The array API specifies correction to be `Union[int, float]` while we currently only support integers.
https://data-apis.org/array-api/latest/API_specification/generated/array_api.std.html

As std/var are currently calculated, the final count of elements is already done
in floating point, so we can make the correction floating point without any loss
of precision or generality.
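
For example (assuming the updated overload accepts a float where an int was previously required):
```python
import torch

x = torch.randn(100)
print(torch.std(x, correction=0.5))           # fractional correction, per the array API
print(torch.var(x, dim=0, correction=1.5))
```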

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94073
Approved by: https://github.com/ezyang
2023-02-23 05:50:45 +00:00
Peter Bell
640b9c80f9 [primTorch] Redefine prim.collapse{,_view} end point to be inclusive (#92017)
This makes `prims.collapse(a, start, end)` match the behavior of
`torch.flatten(a, start, end)` more closely.
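
For reference, `torch.flatten` with its inclusive endpoints (the convention `prims.collapse` now mirrors):
```python
import torch

a = torch.randn(2, 3, 4, 5)
# Both endpoints are inclusive: dims 1..2 are merged into one dimension of size 12.
print(torch.flatten(a, 1, 2).shape)   # torch.Size([2, 12, 5])
# After this change, prims.collapse(a, 1, 2) follows the same inclusive convention.
```
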
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92017
Approved by: https://github.com/mruberry
2023-02-21 20:36:50 +00:00
Edward Z. Yang
ce950b412f Reland "Add torch.empty_permuted (#95069)" (#95208)
This reverts commit 92e03cd583.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95208
Approved by: https://github.com/albanD
2023-02-21 18:02:48 +00:00
PyTorch MergeBot
92e03cd583 Revert "Add torch.empty_permuted (#95069)"
This reverts commit bedeb1f014.

Reverted https://github.com/pytorch/pytorch/pull/95069 on behalf of https://github.com/jeanschmidt due to Breaking internal builds. More in https://fburl.com/phabricator/ztrxrroq
2023-02-21 12:05:20 +00:00
Edward Z. Yang
bedeb1f014 Add torch.empty_permuted (#95069)
torch.empty_permuted is a generalized version of torch.empty(memory_format=...), where you can pass an arbitrary physical layout as a tuple of dims to allow you to setup dense, non-overlapping tensors with non-standard memory format. Check the docblock for a full description of semantics.

The initial motivation for this PR is with guard-less unbacked SymInts. Traditionally, the way we allocate dense tensors with arbitrary layout is with `empty_strided`. However, `empty_strided` does not know that the given strides are actually contiguous, and must test this manually to find out if it is the case. With `empty_permuted`, this is known statically to be the case and helps us skip some 0/1 guards.

However, I also think torch.empty_permuted is a useful API in its own right. It is technically possible to simulate this with an empty and a permute; however, there are some downsides:

* The manual incantation is tricky to work out. To allocate an NHWC tensor, the invocation is `torch.empty(N, H, W, C).permute(0, 3, 1, 2)`; the permute call has to take NHWC to NCHW, and is the *inverse* of the permutation people are typically thinking of when they talk about NHWC (0, 2, 3, 1). Instead, torch.empty_permuted lets you say `torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))`, letting you provide the intuitive permutation. It can literally be read off as NHWC if you assign N=0, C=1, H=2, W=3.
* An empty(requires_grad=True).permute() is no longer a leaf tensor. You can force it to be a leaf with a detach(), but it is more straightforward and less error prone to allow directly allocating a tensor with the correct permutation.

It is also technically possible to simulate this with empty_strided. However, this requires the user to manually compute the contiguous output strides and is bad from a reduction of guards perspective. For what it's worth, this is one of the more common uses of as_strided in the wild, and it would be nice to get rid of it.

A nice enhancement of this feature would be to accept `physical_layout` anywhere `memory_format` is accepted. However, this would be a pretty involved change, so I'm doing the easy thing instead.
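
A small illustration of the described semantics:
```python
import torch

N, C, H, W = 2, 3, 4, 5
# Logical NCHW sizes, physical NHWC layout, written as the intuitive permutation.
x = torch.empty_permuted((N, C, H, W), (0, 2, 3, 1))
print(x.shape)     # torch.Size([2, 3, 4, 5])
print(x.stride())  # strides of a dense NHWC buffer viewed as NCHW: (60, 1, 15, 3)
# The manual equivalent described above (note the inverse permutation):
y = torch.empty(N, H, W, C).permute(0, 3, 1, 2)
assert x.stride() == y.stride()
```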

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95069
Approved by: https://github.com/malfet, https://github.com/ngimel, https://github.com/albanD, https://github.com/dagitses
2023-02-20 00:23:10 +00:00
kshitij12345
06489a3c1c [functorch] roll : fix batching rule for scalar tensor (#95048)
Fixes https://github.com/pytorch/pytorch/issues/94925

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95048
Approved by: https://github.com/Skylion007, https://github.com/ngimel
2023-02-19 09:30:30 +00:00
Edward Z. Yang
ef5de0a4cf Don't use PrimTorch decomposition for empty (#94512)
This PR removes the unnecessary == 0 guard when constructing empty tensors, by ensuring that when we create a contiguous tensor we go directly to the C++ torch.empty implementation (instead of indirecting through empty_strided), where we can bypass doing zero tests when computing the size of the storage. This probably also speeds up trace time.

When I did this, I found out that `empty_tensor_restride_symint` was flagrantly wrong (we had never exercised it before because we redirected to `empty_strided` in PrimTorch decomp, which doesn't hit this codepath.) The bugs:

* Stride computation was wrong (only `last_idx` was ever written to)
* Using set_sizes_and_strides with `sym_sizes` input doesn't work, because there is some sort of ordering problem where `clone_symvec` isn't safe when you clone a vector into itself. Probably should fix this.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94512
Approved by: https://github.com/ngimel
2023-02-16 16:04:41 +00:00
min-jean-cho
b6df987671 [Inductor] Added aten.normal_ decomp (#91207)
Fixes #91085

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91207
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano
2023-02-15 21:21:46 +00:00
min-jean-cho
22e2fd554c OpInfo for aten.exponential, Add check for dtype, parameter in decomp ref (#92709)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92709
Approved by: https://github.com/lezcano
2023-02-14 10:11:07 +00:00
Fabio Rocha
1dbaa5c290 Use decompositions for some fallbacks introduced in #94039 (#94206)
In some cases, implements required inductor primitives.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94206
Approved by: https://github.com/jansel, https://github.com/ngimel
2023-02-14 09:31:30 +00:00
Aaron Gokaslan
67d9790985 [BE] Apply almost all remaining flake8-comprehension checks (#94676)
Applies the remaining flake8-comprehension fixes and checks. This change replaces all remaining unnecessary generator expressions with list/dict/set comprehensions, which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as `set(a for a in b)`, resolving them into just the set call.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676
Approved by: https://github.com/ezyang
2023-02-12 01:01:25 +00:00
Fabio Rocha
e116ca93e1 Run test_torchinductor*.py with implicit_fallbacks=False (#94039)
This way it errors out for ops that don't have decomps and
requires you to add explicit fallbacks to lowering.py

Turns out there are a lot, and this commit adds them as well.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94039
Approved by: https://github.com/lezcano, https://github.com/jansel, https://github.com/ngimel
2023-02-10 18:10:56 +00:00
Xuehai Pan
69e0bda999 [BE] Import Literal, Protocol, and Final from standard library typing as of Python 3.8+ (#94490)
Changes:

1. `typing_extensions -> typing-extensions` in dependencies. Use a dash rather than an underscore to fit the [PEP 503: Normalized Names](https://peps.python.org/pep-0503/#normalized-names) convention.

```python
import re

def normalize(name):
    return re.sub(r"[-_.]+", "-", name).lower()
```

2. Import `Literal`, `Protocol`, and `Final` from the standard library as of Python 3.8+
3. Replace `Union[Literal[XXX], Literal[YYY]]` with `Literal[XXX, YYY]`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94490
Approved by: https://github.com/ezyang, https://github.com/albanD
2023-02-09 19:17:49 +00:00
min-jean-cho
81853354c3 added aten.log_normal_ decomp (#91674)
Fixes #91275

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91674
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano
2023-02-09 18:34:25 +00:00
min-jean-cho
92f569fe11 [Inductor] added aten.geometric_ decomp (#91672)
Fixes #91671

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91672
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano
2023-02-09 07:29:14 +00:00
min-jean-cho
66ae3aa096 [Inductor] added aten.cauchy_ decomp (#92047)
Fixes #91675

TODO: compare perf of decomposed tan vs. libdevice tan and aten tan for the triton and cpp backends

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92047
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano, https://github.com/ngimel
2023-02-09 00:02:56 +00:00
Peter Bell
819990f595 [decomp] Decompose std/std_mean into aten.var/var_mean (#94072)
These are currently decomposed into prims.var, which is less useful for inductor.
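
A hedged sketch of the target form (illustrative only; the registered decompositions handle the full set of overloads):
```python
import torch

def std_sketch(x, dim=None, *, correction=1, keepdim=False):
    # std expressed through aten.var, which inductor can lower directly.
    return torch.var(x, dim=dim, correction=correction, keepdim=keepdim).sqrt()

def std_mean_sketch(x, dim=None, *, correction=1, keepdim=False):
    var, mean = torch.var_mean(x, dim=dim, correction=correction, keepdim=keepdim)
    return var.sqrt(), mean
```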

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94072
Approved by: https://github.com/lezcano
2023-02-06 10:22:07 +00:00
Natalia Gimelshein
3c79ea2607 Removes stray print (#94079)
Per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94079
Approved by: https://github.com/voznesenskym
2023-02-03 21:56:45 +00:00
Peter Bell
77acb556e6 [primTorch] Rewrite nan_to_num ref in terms of aten functions (#93952)
This de-duplicates `_refs.nan_to_num` with the inductor decomposition
and simplifies it to not reimplement `isnan`, `isposinf` and `isneginf`.
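
A hedged sketch of such a rewrite for floating-point inputs (illustrative only; the actual ref also handles integer dtypes and argument validation):
```python
import torch

def nan_to_num_sketch(a, nan=0.0, posinf=None, neginf=None):
    # Replace NaN and infinities using aten ops directly rather than
    # reimplementing the predicates.
    posinf = torch.finfo(a.dtype).max if posinf is None else posinf
    neginf = torch.finfo(a.dtype).min if neginf is None else neginf
    a = torch.where(torch.isnan(a), nan, a)
    a = torch.where(torch.isposinf(a), posinf, a)
    return torch.where(torch.isneginf(a), neginf, a)
```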

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93952
Approved by: https://github.com/lezcano
2023-02-03 13:51:37 +00:00
Peter Bell
72385bbd03 [primTorch] Rewrite is{,pos,neg}inf refs in terms of aten functions (#93951)
`isposinf` and `isneginf` currently fall back in inductor. Here, I
enable the existing decompositions to work with inductor.

`isinf` can also be written with aten functions, however I don't add
it to inductor's decompositions because `isinf` is lowered to
`tl.libdevice.isinf` in triton.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93951
Approved by: https://github.com/lezcano
2023-02-03 13:51:37 +00:00
Peter Bell
5817695bfa [pt2] Fix arange to match ATen behavior (#93353)
Fixes #92676

`arange` infers the output dtype from the argument types, but in order to reduce
falling back to ATen, inductor preferred to cast whole-number float arguments to
int, which gave the wrong output dtype. Instead, this decomposes floating-point
arange into the prim equivalent for integers.

This also changes the signature of `prims.arange` to

```python
prims.iota(length, *, start, step, **factory_kwargs)
```

which only supports integer arguments. This is done because calculating the
output size from `start, end, step` is surprisingly complex and liable to off-by-one
errors, so it should not be duplicated in each backend.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93353
Approved by: https://github.com/ngimel, https://github.com/lezcano
2023-02-03 00:44:32 +00:00
Edward Z. Yang
37fcc53096 Remove import cycle from torch._refs.nn.functional (#93948)
This makes it possible to import torch._refs from
torch._subclasses.fake_tensor

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93948
Approved by: https://github.com/albanD
2023-02-02 21:06:37 +00:00
XiaobingSuper
db87396474 inductor: align the decomposition output stride with the non-decomposition path for torch.lerp (#93336)
As title: we need to align the decomposition output stride with the non-decomposition path for torch.lerp. Also enable its lowering path for inductor.

After this PR, for the following case:

```

def fn(i0, i1):
    # i0: (10, 3, 10)
    # i1: (3, 10, 10)
    x1 = i0.transpose(-2, -3)
    #y = torch.lerp(x1, x1, 70000)
    z = torch.lerp(i1, x1, 70000)
    return z

x0 = torch.rand(10, 3, 10)
x1 = torch.rand(3, 10, 10)
ret_eager = fn(x0, x1)
print('==== Eager mode OK! ====')
compiled = torch.compile(fn, fullgraph=True)
ret_compiled = compiled(x0, x1)
print('==== compile mode OK! ====')
ret_compiled = compiled(x0, x1)
print(torch.equal(ret_eager, ret_compiled))
print(ret_eager.stride()==ret_compiled.stride())
```

the inductor output code will look like this (CPU):

```

from ctypes import c_void_p, c_long
import torch
import random
from torch import empty_strided, as_strided, device
from torch._inductor.codecache import AsyncCompile
from torch._inductor.select_algorithm import extern_kernels

aten = torch.ops.aten
assert_size_stride = torch._C._dynamo.guards.assert_size_stride
async_compile = AsyncCompile()

kernel_cpp_0 = async_compile.cpp('''
#include "/tmp/torchinductor_xiaobing/77/c7773nj5pwikpmm2pwa62rcudlf7p3if7eyqb5k4sjsvewwje4le.h"
extern "C" void kernel(const float* __restrict__ in_ptr0,
                       const float* __restrict__ in_ptr1,
                       float* __restrict__ out_ptr0)
{
    {
        #pragma GCC ivdep
        for(long i0=0; i0<3; i0+=1)
        {
            #pragma GCC ivdep
            for(long i1=0; i1<10; i1+=1)
            {
                for(long i2=0; i2<0; i2+=1)
                {
                    auto tmp7 = at::vec::Vectorized<float>::loadu(in_ptr0 + (10*i0) + (16*i2) + (30*i1));
                    auto tmp8 = at::vec::Vectorized<float>::loadu(in_ptr1 + (10*i1) + (16*i2) + (100*i0));
                    auto tmp0 = at::vec::Vectorized<float>(static_cast<float>(70000.0));
                    auto tmp1 = tmp0.abs();
                    auto tmp2 = at::vec::Vectorized<float>(static_cast<float>(0.5));
                    auto tmp3 = tmp1 >= tmp2;
                    auto tmp4 = at::vec::Vectorized<float>(static_cast<float>(1));
                    auto tmp5 = tmp0 - tmp4;
                    auto tmp6 = decltype(tmp5)::blendv(tmp0, tmp5, tmp3);
                    auto tmp9 = tmp7 - tmp8;
                    auto tmp10 = tmp6 * tmp9;
                    auto tmp11 = decltype(tmp7)::blendv(tmp8, tmp7, tmp3);
                    auto tmp12 = tmp10 + tmp11;
                    tmp12.store(out_ptr0 + (10*i1) + (16*i2) + (100*i0));
                }
                #pragma omp simd simdlen(8)
                for(long i2=0; i2<10; i2+=1)
                {
                    auto tmp7 = in_ptr0[i2 + (10*i0) + (30*i1)];
                    auto tmp8 = in_ptr1[i2 + (10*i1) + (100*i0)];
                    auto tmp0 = static_cast<float>(70000.0);
                    auto tmp1 = std::abs(tmp0);
                    auto tmp2 = static_cast<float>(0.5);
                    auto tmp3 = tmp1 >= tmp2;
                    auto tmp4 = static_cast<float>(1);
                    auto tmp5 = tmp0 - tmp4;
                    auto tmp6 = tmp3 ? tmp5 : tmp0;
                    auto tmp9 = tmp7 - tmp8;
                    auto tmp10 = tmp6 * tmp9;
                    auto tmp11 = tmp3 ? tmp7 : tmp8;
                    auto tmp12 = tmp10 + tmp11;
                    out_ptr0[i2 + (10*i1) + (100*i0)] = tmp12;
                }
            }
        }
    }
}
''')

async_compile.wait(globals())
del async_compile

def call(args):
    arg0_1, arg1_1 = args
    args.clear()
    buf1 = empty_strided((3, 10, 10), (100, 10, 1), device='cpu', dtype=torch.float32)
    kernel_cpp_0(c_void_p(arg0_1.data_ptr()), c_void_p(arg1_1.data_ptr()), c_void_p(buf1.data_ptr()))
    del arg0_1
    del arg1_1
    return (buf1, )

if __name__ == "__main__":
    from torch._dynamo.testing import rand_strided
    from torch._inductor.utils import print_performance
    arg0_1 = rand_strided((10, 3, 10), (30, 10, 1), device='cpu', dtype=torch.float32)
    arg1_1 = rand_strided((3, 10, 10), (100, 10, 1), device='cpu', dtype=torch.float32)
    print_performance(lambda: call([arg0_1, arg1_1]))

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93336
Approved by: https://github.com/jansel
2023-02-02 07:40:28 +00:00
Sherlock Huang
438f12d91a Rewrite some decomps to allow producing aten ops (#93099)
This introduces a new stop on the decomposition train: before reaching prims.view_of, it will stop at aten.alias. The export path wants to get off the train at aten ops.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/93099
Approved by: https://github.com/ngimel
2023-01-31 17:46:20 +00:00
Peter Bell
5644059489 [inductor] Lower torch.exp2 and use it for torch.pow(2, x) (#92632)
Before
```python
    tmp0 = 2.0
    tmp2 = tl.libdevice.pow(tmp0, tmp1)
```

After
```python
    tmp1 = tl.libdevice.exp2(tmp0)
```

I've benchmarked on CPU and CUDA with the following examples
```
@torch._dynamo.optimize()
def exp2(x):
    return torch.pow(2, x)

@torch._dynamo.optimize()
def logaddexp2(a, b):
    m = torch.maximum(a, b)
    return m + torch.log2(1 + torch.pow(2, -torch.abs(a-b)))
```

Triton is able to specialize `pow(2, x)` on CUDA such that this makes
no difference, but on CPU I see a surprisingly large speedup.

| device | Function  | Master (us) | This PR (us) | Speedup |
|--------|-----------|-------------|--------------|---------|
| CUDA   | exp2       | 64          | 63           | 1.0     |
|        | logaddexp2 | 109         | 107          | 1.0     |
| CPU    | exp2       | 220         | 40           | 5.5     |
|        | logaddexp2 | 282         | 140          | 2.0     |
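
For reference, one way to reproduce the CPU comparison with `torch.utils.benchmark` (a sketch; the input size is an assumption and timings vary by machine):
```python
import torch
import torch.utils.benchmark as benchmark

@torch._dynamo.optimize()
def exp2(x):
    return torch.pow(2, x)

x = torch.randn(2**20)   # assumed size
exp2(x)                  # warm up so compilation is excluded from timing
t = benchmark.Timer(stmt="exp2(x)", globals={"exp2": exp2, "x": x})
print(t.blocked_autorange())
```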

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92632
Approved by: https://github.com/lezcano, https://github.com/ngimel
2023-01-20 22:06:23 +00:00
Peter Bell
dd760c98f8 [decomp] Use new squeeze.dims overload in decompositions (#91602)
This removes the now-redundant `_squeeze_multiple` helpers and instead decomposes into a single call to `aten::squeeze.dims` which also has the effect of reducing the lowered graph size in inductor.
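
Illustratively (a sketch; the helper body below is assumed from its name, and `squeeze.dims` requires a build where the new overload exists):
```python
import torch

x = torch.randn(1, 3, 1, 4)

# old pattern: a loop of single-dim squeezes, one op per dim
def _squeeze_multiple(a, dims):
    for d in sorted(dims, reverse=True):
        a = a.squeeze(d)
    return a

# new pattern: a single call to the squeeze.dims overload
y = torch.ops.aten.squeeze.dims(x, [0, 2])
assert y.shape == _squeeze_multiple(x, (0, 2)).shape == (3, 4)
```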
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91602
Approved by: https://github.com/ngimel
2023-01-20 18:08:18 +00:00
Peter Bell
a9f4462847 [primTorch] Remove prims.to_dtype (#92380)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92380
Approved by: https://github.com/lezcano, https://github.com/ngimel
2023-01-19 12:07:47 +00:00
lezcano
8b861544f9 Remove lowering and decompositions of zero_, zero, zeros_like... in favour of their references (#92071)
The generated triton code is identical.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92071
Approved by: https://github.com/ngimel
2023-01-18 23:22:36 +00:00
Peter Bell
8770a7ed6f Decompose more inplace ops (#90967)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90967
Approved by: https://github.com/anijain2305
2023-01-18 21:07:47 +00:00
Peter Bell
f0b592dae7 Make masked_fill reference traceable (#90972)
As the comment states, `item()` cannot be used since you can't trace through a
scalar.
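
A minimal sketch of a traceable formulation, assuming the reference keeps the fill value as a 0-dim tensor and selects with `where` (not necessarily the exact code that landed):
```python
import torch

def masked_fill_ref(a, mask, value):
    # keep `value` as a tensor instead of calling .item() on it,
    # so there is no scalar to trace through
    v = torch.scalar_tensor(value, dtype=a.dtype, device=a.device)
    return torch.where(mask, v, a)
```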

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90972
Approved by: https://github.com/ngimel
2023-01-18 10:54:42 +00:00
min-jean-cho
fb50a4b4ce [Inductor] added aten.exponential_ decomp (#91673)
Fixes #91276
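
For intuition, `exponential_` can be decomposed via inverse transform sampling; a sketch under that assumption, not necessarily the exact decomp that landed:
```python
import torch

def exponential_decomp(x, lambd=1.0):
    # if u ~ U[0, 1), then -log(1 - u) / lambd ~ Exp(lambd)
    u = torch.rand_like(x)
    return x.copy_(-torch.log1p(-u) / lambd)
```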

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91673
Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano
2023-01-18 09:19:35 +00:00
lezcano
d162c8f92b Assorted decomposition fixes (#87183)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87183
Approved by: https://github.com/ngimel
2023-01-17 16:53:31 +00:00
lezcano
da58f9eb8f Rewrite out-of-place decompositions in terms of out-of-place ops (#92003)
Fixes https://github.com/pytorch/torchdynamo/issues/1863

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92003
Approved by: https://github.com/ngimel
2023-01-17 16:53:27 +00:00
Peter Bell
fb1427ea8f squeeze: allow squeezing multiple dimensions at once (#89017)
Ref #70924

This addresses part 1 of the issue, allowing `torch.squeeze` to be
passed a tuple of dimensions. e.g.
```python
x.squeeze(0).squeeze(0)
```
can now be written
```python
x.squeeze((0, 1))
```
(assuming x has at least 2 dimensions)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89017
Approved by: https://github.com/albanD
2023-01-17 14:20:15 +00:00
David Berard
d7dc1c2fd5 Support zero dimensions in softmax decompositions (#91322)
The eager implementation of softmax supports computation along zero dimensions, but many of the other implementations did not, including:
* decompositions & refs (this was causing dynamo failures)
* forward AD for logsumexp
* MPS log_softmax_backward

This PR handles the `input.numel() == 0` cases separately to avoid running `amax()`, which fails for zero dimensions, and updates opinfos.

example of "computation along zero dimensions":

```python
# example of computation along a zero dimension
import torch

t = torch.rand((4, 0, 0))
print("~")
print(torch.nn.functional.softmax(t, dim=-1))  # this passes
print("~")
torch._refs.softmax(t, dim=-1)  # this fails
print("~")
```
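
The special-casing described above might look roughly like this (assumed form):
```python
import torch

def softmax_ref(x, dim):
    if x.numel() == 0:
        # nothing to reduce over: exp on an empty tensor is well defined,
        # while amax() would fail here
        return torch.exp(x)
    shifted = x - x.amax(dim, keepdim=True)
    e = torch.exp(shifted)
    return e / e.sum(dim, keepdim=True)
```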
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91322
Approved by: https://github.com/lezcano
2023-01-11 09:35:43 +00:00
Nikita Karetnikov
d1cc64b2ac [primTorch] Fix masking in logsumexp ref (#91941)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91941
Approved by: https://github.com/ngimel, https://github.com/lezcano
2023-01-10 10:55:04 +00:00
lezcano
138a0188e0 Add support for logaddexp(float16) in CUDA and implement its reference (#91869)
The reference is implemented so that it generates efficient and
numerically stable triton code.

Fixes https://github.com/pytorch/pytorch/issues/91683

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91869
Approved by: https://github.com/ngimel
2023-01-10 00:19:24 +00:00
Nikita Karetnikov
00e5f3a9c5 [primTorch] Move logsumexp decomp to refs (#91860)
Fixes #91843.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91860
Approved by: https://github.com/lezcano
2023-01-09 17:00:43 +00:00
Natalia Gimelshein
2c00064113 remove unnecessary decomps (#91828)
in favor of refs. Generated triton code is the same.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91828
Approved by: https://github.com/lezcano, https://github.com/soumith
2023-01-07 20:37:12 +00:00
PyTorch MergeBot
c73147f741 Revert "[decomp] Use new squeeze.dims overload in decompositions (#91602)"
This reverts commit 9262ffc692.

Reverted https://github.com/pytorch/pytorch/pull/91602 on behalf of https://github.com/clee2000 due to stacked pr was reverted, this is dependent
2023-01-05 20:39:52 +00:00
PyTorch MergeBot
df4b3b13bc Revert "squeeze: allow squeezing multiple dimensions at once (#89017)"
This reverts commit e26cb06681.

Reverted https://github.com/pytorch/pytorch/pull/89017 on behalf of https://github.com/mehtanirav due to Internal breakages
2023-01-05 19:25:08 +00:00
Peter Bell
9262ffc692 [decomp] Use new squeeze.dims overload in decompositions (#91602)
This removes the now-redundant `_squeeze_multiple` helpers and instead decomposes into a single call to `aten::squeeze.dims` which also has the effect of reducing the lowered graph size in inductor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91602
Approved by: https://github.com/ngimel
2023-01-05 17:59:32 +00:00
lezcano
700399e3f1 Make sure the ends of linspace are correct regardless of the precision (#91625)
This operation is usually called with small sizes, so adding a couple of
operations should be alright. Moreover, given the structure of the data,
the branching in the `where` is practically free.
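
The idea, sketched (assumed form, ignoring the `steps <= 1` edge cases): compute the usual affine ramp, then overwrite both ends with the exact values via `where`:
```python
import torch

def linspace_exact_ends(start, end, steps):
    i = torch.arange(steps, dtype=torch.float64)
    out = start + i * ((end - start) / (steps - 1))
    out = torch.where(i == 0, torch.full_like(out, start), out)
    return torch.where(i == steps - 1, torch.full_like(out, end), out)
```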

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91625
Approved by: https://github.com/peterbell10, https://github.com/ngimel
2023-01-05 00:23:19 +00:00
lezcano
223d1aa692 Improve linspace decomposition and remove its lowering (#91621)
The code produced by the lowering and the decomposition is now the same
modulo a casting to `float32`. This casting is necessary as otherwise
the tests do not pass due to accuracy errors. We prefer accuracy over
speed here, given that this is an associative scan, and thus it's prone
to numerical errors.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91621
Approved by: https://github.com/ngimel
2023-01-05 00:23:19 +00:00
Peter Bell
e26cb06681 squeeze: allow squeezing multiple dimensions at once (#89017)
Ref #70924

This addresses part 1 of the issue, allowing `torch.squeeze` to be
passed a tuple of dimensions. e.g.
```python
x.squeeze(0).squeeze(0)
```
can now be written
```python
x.squeeze((0, 1))
```
(assuming x has at least 2 dimensions)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89017
Approved by: https://github.com/albanD
2023-01-04 14:40:56 +00:00
Joel Schlosser
8b55b86dbd Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317)
This PR moves the definitions for:
* `sym_int`
* `sym_ceil` (used only for `sym_int`)
* `sym_floor` (used only for `sym_int`)
* `sym_float`

from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined.

This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317
Approved by: https://github.com/ezyang, https://github.com/anijain2305
2022-12-28 16:08:16 +00:00
Brian Hirsh
c47bdd7522 *_scatter ops should preserve input stride/storage_offset (#91029)
It turns out that we *do* need to update *_scatter ops to return the exact same strides as their inputs. I added a test to `test/test_functionalization.py`, which now trips thanks to Ed's functionalization stride debugging check. It only actually ends up tripping silent correctness if you try to .backward() on that function.
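
An illustrative check in the spirit of the new test (the specific op and shapes are assumptions):
```python
import torch

base = torch.randn(4, 4).t()                 # non-contiguous input, stride (1, 4)
src = torch.randn(4)
out = torch.select_scatter(base, src, dim=0, index=1)
assert out.stride() == base.stride()         # strides preserved after this change
```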

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91029
Approved by: https://github.com/ezyang
2022-12-22 19:41:53 +00:00
Nikita Shulga
fd3a7264ae [MPS] Add group_norm[fwd+backward] and mean_var (take 2) (#91190)
Use Prims to implement group_norm, group_norm_backward and mean_var

Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in
order to make them importable from `torch/backend/mps/__init__.py`, as the `torch.ops` alias, defined in
15af4b1cee/torch/__init__.py (L1095),
is executed last during the init process.

Add `__all__` to `torch/backends/mps/__init__.py` as well as alias all imports as private

Add `TestNNMPS.test_group_norm_backward` that validates no NaNs are generated during the backward pass

Fixes https://github.com/pytorch/pytorch/issues/88331
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190
Approved by: https://github.com/albanD
2022-12-22 08:54:37 +00:00
PyTorch MergeBot
645eda0a00 Revert "[MPS] Add group_norm[fwd+backward] and mean_var (#91190)"
This reverts commit 371716eb36.

Reverted https://github.com/pytorch/pytorch/pull/91190 on behalf of https://github.com/kit1980 due to Broke test_correct_module_names because of underscore _ops
2022-12-21 19:37:43 +00:00
Nikita Shulga
371716eb36 [MPS] Add group_norm[fwd+backward] and mean_var (#91190)
Use Prims to implement group_norm, group_norm_backward and mean_var

Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in
order to make them importable from `torch/backend/mps/__init__.py`, as the `torch.ops` alias, defined in
15af4b1cee/torch/__init__.py (L1095),
is executed last during the init process.

Depends on https://github.com/pytorch/pytorch/pull/91203

Fixes https://github.com/pytorch/pytorch/issues/88331
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190
Approved by: https://github.com/albanD
2022-12-21 17:33:27 +00:00
Nikita Shulga
c8546c930f [BE] Use aten global in torch._refs (#91189)
Similar to the pattern used in `torch._decomp`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91189
Approved by: https://github.com/ngimel
2022-12-21 02:28:51 +00:00
soulitzer
98a9235dce Fix prelu ref when a.ndim < 2 (#89809)
Fixes https://github.com/pytorch/pytorch/issues/89560

Previously the test case for "input is 1-D or scalar + weight is not scalar" did not exist; adding it introduced some failures:
- forward AD (fixed in this PR)
- vmap (filed https://github.com/pytorch/pytorch/issues/89895)
- ref/meta (fixed in this PR, though this also regresses nvFuser support)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89809
Approved by: https://github.com/ngimel
2022-12-12 23:55:31 +00:00
Peter Bell
79406378ae [primTorch] Add prim and ref for as_strided_scatter (#88426)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88426
Approved by: https://github.com/mruberry
2022-12-08 00:17:39 +00:00
Peter Bell
5caa27a3fd as_strided: Fix default storage_offset for reference implementation (#89513)
This fixes the default storage_offset to take it from the input. This was
previously untested, so I've also added a new OpInfo which includes samples with
non-zero storage_offsets on the input tensor.
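
The corrected default, sketched (assumed form):
```python
import torch

def as_strided_ref(a, size, stride, storage_offset=None):
    if storage_offset is None:
        # take the offset from the input rather than defaulting to 0
        storage_offset = a.storage_offset()
    return torch.as_strided(a, size, stride, storage_offset)
```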
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-12-06 22:39:21 +00:00
Yanbo Liang
25f39c1bce Fix uniform ref implementation (#90094)
Fixes https://github.com/pytorch/torchdynamo/issues/1954

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90094
Approved by: https://github.com/ngimel
2022-12-06 21:28:17 +00:00
PyTorch MergeBot
e645771e95 Revert "as_strided: Fix default storage_offset for reference implementation (#89513)"
This reverts commit ba70a8be03.

Reverted https://github.com/pytorch/pytorch/pull/89513 on behalf of https://github.com/kit1980 due to Broke multiple workflows, 2 unexpected successes for autograd tests
2022-12-06 07:14:16 +00:00
Peter Bell
ba70a8be03 as_strided: Fix default storage_offset for reference implementation (#89513)
This fixes the default storage_offset to take it from the input. This was
previously untested, so I've also added a new OpInfo which includes samples with
non-zero storage_offsets on the input tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-12-06 04:07:16 +00:00
PyTorch MergeBot
8845a8f899 Revert "as_strided: Fix default storage_offset for reference implementation (#89513)"
This reverts commit eded97ac72.

Reverted https://github.com/pytorch/pytorch/pull/89513 on behalf of https://github.com/peterbell10 due to broke master
2022-12-05 17:53:23 +00:00
Peter Bell
eded97ac72 as_strided: Fix default storage_offset for reference implementation (#89513)
This fixes the default storage_offset to take it from the input. This was
previously untested, so I've also added a new OpInfo which includes samples with
non-zero storage_offsets on the input tensor.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89513
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-12-05 15:52:49 +00:00
Nikita Karetnikov
0a1a53083e [primTorch] Enable regex error testing for some refs (#87765)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87765
Approved by: https://github.com/mruberry
2022-11-23 23:36:27 +00:00
Peter Bell
ac19c5be82 FFT: disable dimension wrapping for scalar tensors (#89234)
Fixes #88985

By default, `maybe_wrap_dim` allows through `dim=0` or `dim=-1`
for scalar tensors, which leads to an invalid dimension being used to
index into `tensor.sizes()`, as in the code sample from the issue.
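
A repro in the spirit of the issue (illustrative; the exact exception type is an assumption):
```python
import torch

t = torch.tensor(1 + 0j)        # scalar (0-dim) tensor
try:
    torch.fft.fft(t, dim=0)     # previously dim=0 was wrapped and mis-indexed sizes()
except (IndexError, RuntimeError) as e:
    print(e)
```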

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89234
Approved by: https://github.com/mruberry
2022-11-23 21:55:00 +00:00
Sergii Dymchenko
504570d577 Delete unused variable assignment in _refs/__init__.py (#89538)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89538
Approved by: https://github.com/huydhn
2022-11-23 02:59:25 +00:00
Edward Z. Yang
dbeacf1182 Fix cat striding in PrimTorch (#89332)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89332
Approved by: https://github.com/ngimel
2022-11-20 04:05:33 +00:00
Sherlock Huang
caf3d5319f Symintify numel(), infer_size, prims.elementwise_meta (#88956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88956
Approved by: https://github.com/ezyang
2022-11-20 00:42:03 +00:00
PyTorch MergeBot
8ad39536d7 Revert "Symintify numel(), infer_size, prims.elementwise_meta (#88956)"
This reverts commit ce2f8700ba.

Reverted https://github.com/pytorch/pytorch/pull/88956 on behalf of https://github.com/ezyang due to somehow breaks torch.numel
2022-11-19 21:47:55 +00:00
lezcano
154e58c032 Add most in-place references/decompositions (#88117)
We add most in-place references in a generic way. We also implement a
wrapper to handle the annoying interface that `nn.functional`
nonlinearities have.

Along the way, we fix a couple of decompositions for some non-linearities by
extending the arguments that the references take.
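
A minimal sketch of such a generic wrapper (assumed shape; the real helper also deals with the `nn.functional` interface quirks mentioned above):
```python
import torch

def _make_inplace(fn):
    def inplace(a, *args, **kwargs):
        # compute out-of-place, then write the result back into `a`
        return a.copy_(fn(a, *args, **kwargs))
    inplace.__name__ = fn.__name__ + "_"
    return inplace

relu_ = _make_inplace(torch.relu)
```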
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117
Approved by: https://github.com/mruberry
2022-11-18 14:59:46 +00:00
lezcano
ce0e22a81a Fix names of some reference functions (#88115)
The `__name__` field of some binary reference functions was wrong. We
fix this to be consistent with unary reference functions. In the future,
we should probably make the binary reference wrapper return a wrapper
itself to avoid all those calls to `partial`.

This change helps treat functions homogeneously by
their name.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88115
Approved by: https://github.com/mruberry
2022-11-18 14:59:43 +00:00
Kazuaki Ishizaki
1cd6ebe095 Fix typos in messages under torch (#89049)
This PR fixes typos in messages in `.py` files under the torch directory.
In `torch/onnx/symbolic_opset16.py` only, it also fixes a typo in a comment to make the operator name correct.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89049
Approved by: https://github.com/lezcano
2022-11-17 04:18:14 +00:00
lezcano
e1ecf53d84 Simplify linspace decomp and increase its tolerance (#87203)
This is an interesting one

Since this is an operation that's intrinsically defined on the reals,
we should always perform the ops in that dtype and just cast to
the desired dtype at the end. This simplifies the decomposition.

I started looking at this one when I saw failures in a
test added in a later PR. What's going on is that upcasting
to a higher dtype and then casting down to integers sometimes
produces an off-by-one error. I think this is fine, as the decomposition
is more accurate than the original function, which is in line with
the whole PrimTorch effort.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87203
Approved by: https://github.com/mruberry
2022-11-16 17:46:54 +00:00
Sherlock Huang
ce2f8700ba Symintify numel(), infer_size, prims.elementwise_meta (#88956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88956
Approved by: https://github.com/ezyang
2022-11-16 03:36:00 +00:00
Khushi Agrawal
f1a5044de0 [primTorch] _refs & opinfo alpha_dropout (#87989)
Add _refs and OpInfo for `nn.functional.alpha_dropout`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87989
Approved by: https://github.com/mruberry
2022-11-14 18:18:45 +00:00
Natalia Gimelshein
06f1b52705 don't use prims.unsqueeze in group_norm (#88927)
Inductor doesn't have a prims.squeeze lowering, so this breaks it. Longer term, `squeeze` with multiple dimensions is not a prim: nvFuser implements it with a loop, and inductor uses the `_squeeze_multiple` helper, which turns it into a loop. The prim should accept only a single dimension.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88927
Approved by: https://github.com/eellison
2022-11-14 17:37:24 +00:00
Nikita Karetnikov
76af71444a [primTorch] Add ref for complex (#88562)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88562
Approved by: https://github.com/ezyang
2022-11-13 20:31:16 +00:00
Nikita Karetnikov
4270bb37da [primTorch] Improve narrow and narrow_copy: refs, tests, docs (#87045)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045
Approved by: https://github.com/mruberry
2022-11-12 15:03:50 +00:00
Sherlock Huang
495e7b1c72 Ref for aten.full; symint changes in prim (#88762)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88762
Approved by: https://github.com/ezyang
2022-11-11 02:32:09 +00:00
Ryan Spring
534ae6ae47 [primTorch] Implement group norm reference (#87054)
Add group norm reference
Split from #81191
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87054
Approved by: https://github.com/mruberry
2022-11-11 01:08:20 +00:00
PyTorch MergeBot
93d3bd626e Revert "[primTorch] Improve narrow and narrow_copy: refs, tests, docs (#87045)"
This reverts commit aa8279bcb8.

Reverted https://github.com/pytorch/pytorch/pull/87045 on behalf of https://github.com/izaitsevfb due to BC-breaking change, D41161182
2022-11-09 20:48:32 +00:00
Nikita Karetnikov
aa8279bcb8 [primTorch] Improve narrow and narrow_copy: refs, tests, docs (#87045)
Fixes #87019.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87045
Approved by: https://github.com/mruberry
2022-11-09 09:19:28 +00:00
Edward Z. Yang
860e354d1c Support diag_embed.out decomposition (#88671)
This is a little tricky: there is a diag_embed.out, but it's not bound
in Python because it's autogenerated; see https://github.com/pytorch/pytorch/issues/88598.
So I can't "just" add the out variant to the ref, as this makes it
inconsistent with the torch API. To work around this, I mark the ref
as supporting out, but not the original function.
This is useful to do, because it means that diag_embed.out now supports
symbolic shapes.  However, this cannot be easily tested because
I can't mark the out variant as being supported in the normal OpInfo test.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88671
Approved by: https://github.com/mruberry
2022-11-08 18:28:36 +00:00
lezcano
1a7c4b0de7 Create _make_alias to preserve the name of a function when creating an alias (#88114)
Before, we would inherit the name of the aliased function, which was
very confusing and disallowed the homogeneous treatment of references
that we perform later in this stack.
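
A plausible shape for the helper (assumed; the point is only that the alias gets its own `__name__`):
```python
import functools

def _make_alias(fn, name):
    @functools.wraps(fn)
    def alias(*args, **kwargs):
        return fn(*args, **kwargs)
    alias.__name__ = name   # override the name copied over by wraps()
    return alias
```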

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88114
Approved by: https://github.com/mruberry
2022-11-08 13:09:34 +00:00
Sherlock Huang
95d57b54e0 Handle pin_memory in refs.randn (#88473)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88473
Approved by: https://github.com/mruberry
2022-11-07 20:25:56 +00:00
lezcano
39d9d2ed70 Implement reference for lerp (#87424)
We follow the vectorised CPU implementation for numerical accuracy
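
That formulation, sketched for tensor inputs (it mirrors the abs/blendv pattern visible in the inductor kernel quoted earlier in this log): anchor at the nearer endpoint so the result is exact at weight 0 and weight 1.
```python
import torch

def lerp_ref(start, end, weight):
    near_end = weight.abs() >= 0.5
    # end + (weight - 1) * (end - start) == start + weight * (end - start),
    # but is exact when weight == 1; the other branch is exact at weight == 0
    w = torch.where(near_end, weight - 1, weight)
    base = torch.where(near_end, end, start)
    return w * (end - start) + base
```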

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87424
Approved by: https://github.com/ezyang
2022-11-02 11:21:01 +00:00
Sherlock Huang
0a4ca9d083 Fix meta for aten.angle and aten.index_copy (#88066)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88066
Approved by: https://github.com/albanD
2022-10-31 17:11:29 +00:00
Khushi
a3f8495b84 [primTorch fix] use _maybe_convert_to_dtype (#85163)
Fixes #84561

- [x] fix lint tests

cc: @Lezcano!!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85163
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-31 17:08:55 +00:00
Sherlock Huang
5723fd503c Fix meta function for aten.flip and aten.rot90 (#88065)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88065
Approved by: https://github.com/mruberry
2022-10-31 16:52:05 +00:00
Sherlock Huang
e8a97a3721 FakeTensorMode and Prims.add/sub/mul/div support scalar only inputs (#87759)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87759
Approved by: https://github.com/ngimel, https://github.com/mruberry, https://github.com/eellison
2022-10-28 04:34:25 +00:00
lezcano
fd27246c16 Fix decomposition for std (#87181)
The previous implementation was lacking a few features and incurred a
pretty large error.
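
At its core the decomposition is just `sqrt(var(...))`; a sketch, assuming the fixed version mainly improves how `correction` and dtypes are handled:
```python
import torch

def std_decomp(a, dim=None, correction=1, keepdim=False):
    return torch.sqrt(torch.var(a, dim=dim, correction=correction, keepdim=keepdim))
```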

cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181
Approved by: https://github.com/ngimel, https://github.com/peterbell10
2022-10-28 00:50:29 +00:00
lezcano
f21d0b310c Add decomposition for diagonal_scatter (#87282)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
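
A natural form for this decomposition (assumed, but it follows directly from the op's semantics): clone, then write through a diagonal view of the clone.
```python
import torch

def diagonal_scatter_decomp(x, src, offset=0, dim1=0, dim2=1):
    out = x.clone()
    out.diagonal(offset, dim1, dim2).copy_(src)
    return out
```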
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87282
Approved by: https://github.com/mruberry
2022-10-28 00:50:29 +00:00
Sherlock Huang
b21fe312c0 Fix meta for index_add and index_put (#87775)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87775
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-10-26 20:33:23 +00:00
Sherlock Huang
0b162f5b49 Fix stride for prims.where (#87563)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87563
Approved by: https://github.com/ngimel, https://github.com/mruberry
2022-10-25 21:22:50 +00:00
Sherlock Huang
ece3758afc Fix _refs for aten.zeros/ones/empty/randn (#87569)
The refs for aten.zeros/ones/empty/randn don't support the .names overload.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87569
Approved by: https://github.com/ngimel
2022-10-25 20:06:57 +00:00
Sherlock Huang
eb99c1efce Prefer python meta function over c++ meta function (#87426)
This is a policy update for meta registration. **We now prefer python meta implementation over C++ meta function.**  This is a flip of the previous policy, where we prefer C++ meta function over python meta function if they both exist.

Here's the meta registration process:
1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`.  However, they will NOT register them into dispatcher.
2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd.
3. We will unconditionally register all of them into the python dispatcher, and also register them into the C++ dispatcher, unless one of the following three cases applies:
    1. the op is a CompositeImplicitAutograd, and should rely on the decomposed op's meta
    2. the op is a view op, as the MetaTensor doesn't support aliased storage
    3. the op is in the blocklist (due to UT failures; we will burn down this list op by op)

Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have their cpp meta overridden by a python meta. There are still 400 op_overloads using a cpp meta. The exact list can be found here: https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5
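
Step 2's selection rule, sketched in plain Python (illustrative pseudocode; the table layout is an assumption):
```python
PREFERENCE = ("Meta", "PostAutograd", "PreAutograd")

def pick_active(global_decomp_table, op):
    # pick the most specific decomp registered for `op`
    candidates = global_decomp_table.get(op, {})
    for key in PREFERENCE:
        if key in candidates:
            return candidates[key]
    return None
```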

cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426
Approved by: https://github.com/ezyang, https://github.com/jansel
2022-10-25 16:49:02 +00:00
lezcano
faf9c47abb Simplify a few diagonal-related functions (#87180)
`diag` was unnecessarily implemented as a kernel rather than as a composite
function, which made it unnecessarily difficult (explicit backward + all it entails).

We also change a few uses of `diag` on 2D tensors for `diagonal()`. The
latter returns a view rather than creating a new tensor.

We also upgrade its meta implementation to a fully-fledged
decomposition

I tried implementing the backwards of `diagonal()` via `diag_scatter` (or better `diag_scatter_` to keep the perf) but functionalisation was failing and I was not sure how to fix this, so I moved on. It may be possible to simplify that one as well if @soulitzer or someone knows how to do this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87180
Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/mruberry
2022-10-24 06:11:53 +00:00
lezcano
08c2314d98 [PrimTorch] Add maker for *_copy variants of view functions (#87278)
Implements `diagonal_copy` as an example. This PR also fixes a number of
correctness issues with `diagonal_copy`.

cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87278
Approved by: https://github.com/mruberry
2022-10-24 06:11:53 +00:00
Ryan Spring
9bb4926de0 Add xlogy and xlog1py references (#77712)
* Add reference implementations for `xlogy` and `xlog1py` (see the sketch below)
* Replace the `_wrap_scalar` helper function with the `scalar_tensor` prim
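
A sketch of an `xlog1py` reference under those conventions (assumed form; following the scipy semantics where x == 0 yields 0 and NaN in y propagates; x and y are assumed to share a shape):
```python
import torch

def xlog1py_ref(x, y):
    out = torch.where(x == 0, torch.zeros_like(x), x * torch.log1p(y))
    return torch.where(torch.isnan(y), y, out)
```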
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77712
Approved by: https://github.com/mruberry
2022-10-22 17:59:25 +00:00
Edward Z. Yang
d73d4aa7de Audit for error prone isinstance int/float and add lint (#87345)
We recently fixed a bug on the symbolic-shapes branch where
an isinstance(x, int) test failed when passed a SymIntNode.
To prevent this, I've added a lint for all the codepaths
where we may pass SymInt/SymFloat directly to reject
direct isinstance int/float tests, and instead use one of
the aliases.  The lint rule explains the options.  I then
go and fix all of them.
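
The failure mode, sketched; the `(int, torch.SymInt)` spelling is an assumption standing in for whichever alias the lint rule recommends:
```python
import torch

def bad_check(dim):
    return isinstance(dim, int)                   # False when dim is a SymInt

def good_check(dim):
    return isinstance(dim, (int, torch.SymInt))   # also accepts symbolic ints
```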

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87345
Approved by: https://github.com/bdhirsh, https://github.com/albanD
2022-10-21 15:55:24 +00:00
Nikita Karetnikov
1b8af28fe8 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-20 12:29:04 +00:00
PyTorch MergeBot
cd21613526 Revert "[primTorch] Add refs for softmax, softmin, log_softmax (#84956)"
This reverts commit c09ca93e47.

Reverted https://github.com/pytorch/pytorch/pull/84956 on behalf of https://github.com/ZainRizvi due to This is causing the MPS test test_output_match_log_softmax_with_dtype_cpu_float32 (__main__.TestConsistencyCPU) to fail
2022-10-19 20:36:55 +00:00
Nikita Karetnikov
c09ca93e47 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-19 18:45:40 +00:00
Nikita Karetnikov
b886cd15f5 [primTorch] Add a ref for NumPy-style T (#86850)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86850
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-18 10:19:47 +00:00
Nikita Karetnikov
841995d53b [primTorch] Add refs for data conversion ops (#86561)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86561
Approved by: https://github.com/lezcano, https://github.com/mruberry, https://github.com/zou3519
2022-10-18 08:38:51 +00:00
Nikita Karetnikov
91b3cd0b5a [primTorch] Add a ref for narrow_copy (#86748)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86748
Approved by: https://github.com/mruberry
2022-10-17 10:16:05 +00:00
Ryan Spring
847ded6db3 [primTorch] Implement NLL loss reference (#81128)
Add Reference:
- nll_loss

Depends on:
- expand https://github.com/pytorch/pytorch/pull/79820
- advance indexing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81128
Approved by: https://github.com/mruberry
2022-10-17 06:20:31 +00:00
Nikita Karetnikov
4460e40db4 [primTorch] Add a ref for addcmul (#86731)
Based on:
https://github.com/pytorch/pytorch/pull/79827
https://github.com/pytorch/pytorch/pull/72949
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86731
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-14 14:26:23 +00:00
Will Constable
b97ae59e29 Change legacy wrap_dim to work with symint == (#86842)
- previously, sizes == vector<T>({0}) failed to hit SymInt::operator==, causing the loop to bail out too early and make an invalid call to the downstream maybe_wrap_dim helper

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86842
Approved by: https://github.com/Chillee, https://github.com/malfet, https://github.com/albanD
2022-10-13 15:10:46 +00:00
Brian Hirsh
e17732b234 [test] add cross-ref tests for python meta kernels (#86228)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86228
Approved by: https://github.com/albanD
2022-10-13 14:14:26 +00:00
Khushi Agrawal
77d29bcee2 [primTorch] special: ndtr, ndtri, log_ndtr, erfcx (#86077)
- Adds prims and _refs for `erfcx` and `ndtri`.
- Adds _refs for `ndtr` and `log_ndtr`.
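
For instance, `ndtr` (the standard normal CDF) can be written directly in terms of `erf`; a sketch, not necessarily the exact ref:
```python
import math
import torch

def ndtr_ref(x):
    # Phi(x) = (1 + erf(x / sqrt(2))) / 2
    return 0.5 * (1 + torch.erf(x * (1.0 / math.sqrt(2.0))))
```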

cc @kshitij12345 @lezcano @mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86077
Approved by: https://github.com/mruberry
2022-10-13 01:18:30 +00:00
Nikita Karetnikov
d56017a14f [primTorch] Add ref for triplet_margin_loss, improve triplet_margin_with_distance_loss (#85614)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85614
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-12 18:37:58 +00:00
Ivan Yashchuk
cd7c86eaa4 Add prims.clone (#86705)
This simple PR adds `clone` as a primitive.
The current implementation of `clone` is not supported by the nvFuser executor because it decomposes into `empty_like` + `copy_to`.
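
That composition, spelled out at the aten level (illustrative):
```python
import torch

def clone_via_composition(a):
    out = torch.empty_like(a)   # the empty_like ...
    out.copy_(a)                # ... plus copy_to pair that prims.clone replaces
    return out
```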
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86705
Approved by: https://github.com/mruberry
2022-10-12 18:22:00 +00:00
Fabio Rocha
493ded249e [primTorch] decomposition for bucketize (#86366)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86366
Approved by: https://github.com/mruberry
2022-10-12 12:25:42 +00:00
Khushi
2344135179 [primTorch] special: entr, expit (#86592)
Add _refs for `entr` & `expit`.
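
Sketches of the two (assumed forms; `expit` is the logistic sigmoid, `entr` follows the scipy convention):
```python
import torch

def expit_ref(x):
    return torch.sigmoid(x)

def entr_ref(x):
    # x > 0: -x*log(x); x == 0: 0; x < 0: -inf
    neg_inf = torch.full_like(x, -float("inf"))
    return torch.where(x > 0, -x * torch.log(x),
                       torch.where(x == 0, torch.zeros_like(x), neg_inf))
```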

cc @mruberry @kshitij12345!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86592
Approved by: https://github.com/mruberry
2022-10-12 07:00:40 +00:00
Sherlock Huang
8a3a54e012 Fix index_select decomp (#86469)
For decomposing index_select with a 0-dim index tensor, we cannot write `x.unsqueeze(0)[index].squeeze(0).clone()`, as tensor[index] will trigger index.item() when index is a 0-dim tensor, and .item() cannot be symbolically traced with FakeTensor.

We use `torch.ops.aten.index(x.unsqueeze(0), [index]).squeeze(0).clone()` as a workaround.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86469
Approved by: https://github.com/ngimel
2022-10-07 22:59:49 +00:00