`torch._C.has_mkldnn` does not respect cases where users try to disable mkldnn using `torch._C._set_mkldnn_enabled()`. This is relevant to edge use cases, where users do not want decompositions to go to the ATen opset and do not want the mkldnn operator to appear in the graph.
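A minimal sketch of the discrepancy, using the existing runtime/build flags (the decomposition call sites this PR touches are not shown here):
```
import torch

# Disable mkldnn at runtime, the way an edge user would.
torch._C._set_mkldnn_enabled(False)

# The build-time checks are unaffected by the runtime switch:
print(torch._C.has_mkldnn)                   # still True on an mkldnn build
print(torch.backends.mkldnn.is_available())  # same build-time check

# The runtime switch reflects the user's choice, so decomposition code
# should consult it (or torch.backends.mkldnn) rather than has_mkldnn alone.
print(torch._C._get_mkldnn_enabled())        # False

torch._C._set_mkldnn_enabled(True)           # restore the default
```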
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115448
Approved by: https://github.com/jgong5, https://github.com/ydwu4
`_scaled_dot_product_flash_attention` does not have a `Tensor? attn_mask=None` argument, but `scaled_dot_product_attention` does. In the original decomp there was a mix-up where I added this argument to `_scaled_dot_product_flash_attention`. Fix it so that `_scaled_dot_product_flash_attention` is decomposed correctly.
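For reference, a quick way to compare the two schemas (a sketch; assuming the `.default` overload names):
```
import torch

# The public op takes `Tensor? attn_mask=None`; the flash-attention op does
# not, which is why its decomp must not add that argument.
print(torch.ops.aten.scaled_dot_product_attention.default._schema)
print(torch.ops.aten._scaled_dot_product_flash_attention.default._schema)
```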
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113102
Approved by: https://github.com/ezyang
## Context
Add decompositions for `aten.max`, `aten.min`, and `aten.var_mean`. These operators follow a pattern of returning a tuple of outputs from two component operators:
```
aten.max(x) -> return aten.amax(x), aten.argmax(x)
aten.min(x) -> return aten.amin(x), aten.argmin(x)
aten.var_mean(x) -> return aten.var(x), aten.mean(x)
```
For `var_mean`, the `_refs` implementation was doing something similar, so I changed it to call `torch.*` ops instead, as was done for other `_refs` implementations previously. cc: @peterbell10 @lezcano
Note that Inductor lowers all these directly, so they are excluded from the Inductor decomp table.
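A minimal sketch of the pattern, assuming the `dim` overload (e.g. `aten.max.dim`), which is the one returning a `(values, indices)` tuple; the real decomps live in `torch/_decomp/decompositions.py`:
```
import torch

def max_dim_decomp(self: torch.Tensor, dim: int, keepdim: bool = False):
    # Compose the tuple output from the two component reductions.
    return torch.amax(self, dim, keepdim), torch.argmax(self, dim, keepdim)

x = torch.randn(3, 4)
values, indices = max_dim_decomp(x, dim=1)
ref = torch.max(x, dim=1)
assert torch.equal(values, ref.values) and torch.equal(indices, ref.indices)
```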
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110906
Approved by: https://github.com/manuelcandales
Summary:
## Context
Both `aten.sum` and `aten.squeeze` have a "most generic" variant, in the form of `aten.sum.dim_IntList` and `aten.squeeze.dims` respectively. Add decompositions for the other, non-generic variants of these operators that express them using the most generic variant.
Note that to register these decomps, the reference implementations under `_refs` had to be removed as registered decompositions. cc: @lezcano @peterbell10
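A rough sketch of the idea (not the exact decomps added here): the non-generic overloads call the generic one with "reduce/squeeze everything" arguments.
```
import torch

aten = torch.ops.aten

def sum_default_decomp(self):
    # aten.sum(x) reduces over all dims; dim_IntList expresses that with dim=None.
    return aten.sum.dim_IntList(self, None)

def squeeze_default_decomp(self):
    # aten.squeeze(x) drops every size-1 dim; squeeze.dims does the same
    # when given the full list of dims.
    return aten.squeeze.dims(self, list(range(self.dim())))

x = torch.randn(1, 3, 1, 4)
assert torch.allclose(sum_default_decomp(x), x.sum())
assert squeeze_default_decomp(x).shape == x.squeeze().shape
```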
Test Plan: GitHub CI + Meta Internal CI
Differential Revision: D49965952
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110645
Approved by: https://github.com/peterbell10, https://github.com/digantdesai, https://github.com/manuelcandales
## Context
Add existing decomps for `lift_fresh`, `split.Tensor`, and `unbind` to the core ATen decomposition table. Do not use them in Inductor, since Inductor currently lowers these directly.
One note, though, is that `lift_fresh`'s decomposition has a comment saying it is not correct under autograd. However, my understanding is that these decompositions are registered to the `"post_autograd"` decomposition table, meaning autograd wouldn't be a factor. I would like some confirmation that this premise is correct.
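A quick way to check the table after this change (assuming `torch._decomp.core_aten_decompositions` and these overload names):
```
import torch
from torch._decomp import core_aten_decompositions

table = core_aten_decompositions()
for op in (
    torch.ops.aten.lift_fresh.default,
    torch.ops.aten.split.Tensor,
    torch.ops.aten.unbind.int,
):
    print(op, op in table)  # expected: True for each
```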
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110102
Approved by: https://github.com/jansel
## Context
Introduce a core decomposition for `aten.floor_divide` into other `aten` ops, and add it to the core ATen decomposition table.
This replaces the decomposition of `floor_divide` that was used by Inductor. I noticed there was a note on that decomposition
```
# TorchInductor-only decomposition. It should not be taken to core.
# See https://github.com/pytorch/torchdynamo/pull/1120
```
but I couldn't discern the reason why this is the case. cc: @lezcano
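For reference, a minimal sketch of what such a core decomposition can look like (not necessarily the exact one added here): express `floor_divide` via `aten.div` with floor rounding.
```
import torch

aten = torch.ops.aten

def floor_divide_decomp(self, other):
    # Floor division is just division with rounding_mode="floor".
    return aten.div.Tensor_mode(self, other, rounding_mode="floor")

x = torch.tensor([7, -7, 7, -7])
y = torch.tensor([2, 2, -2, -2])
assert torch.equal(floor_divide_decomp(x, y), torch.floor_divide(x, y))
```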
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110046
Approved by: https://github.com/peterbell10
- Extend `test_torch_dispatch_meta_outplace` to test torch ops that do not have an `out` parameter but have aten op overloads that do. Additionally, Python decompositions may be registered for an `OpOverloadPacket`, so decompositions need to be tested to ensure all `OpOverload`s still function for the `Meta` key (e.g. if a Python decomposition is registered for an aten op `aten.foo` with overloads `[default, out]`, the Python function needs to support receiving `out` arguments).
- Add `out` parameter wrappers to Python decomps for aten ops that have `out` overloads (see the sketch below).
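A minimal sketch of the wrapper idea for the single-tensor-output case (the real `torch._prims_common.wrappers.out_wrapper` also handles multiple outputs, dtype/device checks, etc.):
```
import functools
import torch

def simple_out_wrapper(decomp):
    # Add an out= keyword argument so the decomp can also serve the aten op's
    # out overload.
    @functools.wraps(decomp)
    def wrapper(*args, out=None, **kwargs):
        result = decomp(*args, **kwargs)
        if out is None:
            return result
        out.resize_(result.shape)
        return out.copy_(result)
    return wrapper

@simple_out_wrapper
def trace_decomp(self: torch.Tensor) -> torch.Tensor:
    return torch.sum(torch.diagonal(self))

x = torch.randn(3, 3)
buf = torch.empty(())
assert torch.allclose(trace_decomp(x, out=buf), torch.trace(x))
```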
cc: @ezyang @albanD @lezcano
Fixes #107713
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107707
Approved by: https://github.com/lezcano
Python decompositions wrapped by `out_wrapper` need to be unwrapped before being compiled with TorchScript, since:
- `out_wrapper` extends the decomposition's signature with an `out` parameter; however, this `out` parameter is not present in the source code of the original decomposition, so the resulting `ScriptFunction` will not have an `out` parameter.
- `out_wrapper` lives in the `torch._prims_common.wrappers` module, so its `globals()` differ from the globals of the decomposition being wrapped. This may cause symbol resolution to fail in the TorchScript compiler, since it compiles the unwrapped decomp's source code rather than the wrapper's.
The Python decomposition for `aten.trace` is wrapped as an example; other decompositions are to be fixed in https://github.com/pytorch/pytorch/pull/107707
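A rough sketch of the unwrapping step, assuming the wrapper records the original function via `functools.wraps` (which sets `__wrapped__`):
```
import torch

def script_decomp(fn):
    # Compile the original decomposition rather than the out_wrapper around it,
    # so the ScriptFunction's signature and globals match the compiled source.
    unwrapped = getattr(fn, "__wrapped__", fn)
    return torch.jit.script(unwrapped)
```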
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109367
Approved by: https://github.com/lezcano
The "safety" aspect refers to the output not being registered as aliasing the
input, but after AOTAutograd I don't think this distinction matters. However,
we shouldn't use the same decomposition as the safe variant in case the backend
doesn't want to decompose split.
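A minimal sketch of the idea, with op names assumed for illustration (the text above does not spell them out): decompose the unsafe variant by calling the safe op, so a backend that keeps `aten.split` un-decomposed still sees it in the graph.
```
import torch

aten = torch.ops.aten

def unsafe_split_decomp(self, split_size, dim=0):
    # Delegate to the safe op instead of reusing split's own decomposition;
    # post-AOTAutograd the aliasing distinction no longer matters.
    return aten.split.Tensor(self, split_size, dim)
```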
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109668
Approved by: https://github.com/lezcano
ghstack dependencies: #109667