Fixes #88985
By default, `maybe_wrap_dim` lets `dim=0` and `dim=-1` through for
scalar tensors, which leads to an invalid dimension being used to
index into `tensor.sizes()`, as in the code sample from the issue.
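To make the failure mode concrete, here is a minimal sketch (a hypothetical repro, not the exact sample from the issue):

```python
import torch

x = torch.tensor(3.0)   # scalar (0-dim) tensor, x.dim() == 0
print(x.shape)          # torch.Size([]) -- there is no dimension to index
# With scalar wrapping enabled, maybe_wrap_dim accepts dim=0 and dim=-1 here,
# and downstream code that then does sizes()[dim] reads out of bounds.
```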
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89234
Approved by: https://github.com/mruberry
We add most in-place references in a generic way. We also implement a
wrapper that handles the awkward interface that `nn.functional`
nonlinearities have.
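Roughly, the wrapper maps `nn.functional`'s `inplace=` keyword onto an out-of-place reference. A minimal sketch (the decorator name is illustrative, not the actual PrimTorch wrapper):

```python
import torch

def with_inplace_kwarg(ref):
    # Map the `inplace=` keyword of nn.functional nonlinearities onto an
    # out-of-place reference.
    def wrapper(a, *, inplace=False, **kwargs):
        result = ref(a, **kwargs)
        if inplace:
            return a.copy_(result)   # emulate the in-place variant
        return result
    return wrapper

@with_inplace_kwarg
def relu_ref(a):
    return torch.where(a > 0, a, torch.zeros_like(a))

x = torch.tensor([-1.0, 2.0])
print(relu_ref(x, inplace=True))     # tensor([0., 2.]), and x is modified
```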
Along the way, we fix a couple of decompositions for some nonlinearities
by extending the arguments that the references accept.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117
Approved by: https://github.com/mruberry
The `__name__` field of some binary reference functions was wrong. We
fix this to be consistent with unary reference functions. In the future,
we should probably make the binary reference wrapper return a wrapper
itself to avoid all those calls to `partial`.
This change makes it possible to treat functions uniformly by their
name.
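A hedged sketch of what "fixing `__name__`" amounts to (the wrapper name and signature are illustrative, not the actual `_refs` code):

```python
import torch

def _make_elementwise_binary_reference(prim, *, name):
    def ref(a, b):
        return prim(a, b)
    ref.__name__ = name        # keep __name__ consistent with the unary refs
    return ref

add_ref = _make_elementwise_binary_reference(torch.add, name="add")
assert add_ref.__name__ == "add"
```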
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88115
Approved by: https://github.com/mruberry
This is an interesting one.
Since this is an operation that's intrinsically defined on the reals,
we should perform the ops on that dtype always, and just cast to
the desired dtype at the end. This simplifies the decomposition.
I started looking at this one when I saw failures in a test that's
added in a later PR. What's going on here is that, by upcasting to a
higher dtype and then casting down to integers, we sometimes get an
off-by-one error. I think this is fine, as the decomposition is more
accurate than the original function, which is in line with the whole
PrimTorch effort.
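A hedged sketch of the pattern (the op is illustrative, not the one decomposed in this PR): do the math on the reals and cast to the requested dtype only at the end.

```python
import torch

def linspace_like_ref(start, end, steps, *, dtype=torch.int64):
    idx = torch.arange(steps, dtype=torch.float64)
    out = start + (end - start) * idx / max(steps - 1, 1)   # math on the reals
    # casting down to an integer dtype here can land one off from a kernel
    # that accumulates in the output dtype, but it is the more accurate result
    return out.to(dtype)

print(linspace_like_ref(0, 10, 5))   # tensor([ 0,  2,  5,  7, 10])
```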
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87203
Approved by: https://github.com/mruberry
inductor doesn't have a `prims.squeeze` lowering, so this breaks it. Longer term, `squeeze` with multiple dimensions is not a prim: nvfuser implements it with a loop, and inductor uses the `_squeeze_multiple` helper, which turns it into a loop. The prim should accept only a single dimension.
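For reference, a minimal sketch of a `_squeeze_multiple`-style helper that lowers a multi-dim squeeze to a loop of single-dim squeezes (the exact helper signature is an assumption):

```python
import torch

def squeeze_multiple(a: torch.Tensor, dims):
    dims = sorted(d % a.dim() for d in dims)
    for d in reversed(dims):              # back-to-front so earlier indices stay valid
        if a.size(d) == 1:
            a = a.squeeze(d)              # single-dim squeeze is the only prim needed
    return a

x = torch.randn(1, 3, 1, 4)
print(squeeze_multiple(x, (0, 2)).shape)  # torch.Size([3, 4])
```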
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88927
Approved by: https://github.com/eellison
This is a little tricky: there is a diag_embed.out, but it's not bound
in Python because it's autogenerated, see https://github.com/pytorch/pytorch/issues/88598
So I can't "just" add the out variant to the ref, as that would make it
inconsistent with the torch API. To work around this, I mark the ref
as supporting out, but not the original function.
This is useful to do, because it means that diag_embed.out now supports
symbolic shapes. However, this cannot be easily tested, because
I can't mark the out variant as supported in the normal OpInfo test.
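Concretely, "mark the ref as supporting out" means the reference grows an `out=` keyword that the public `torch.diag_embed` binding doesn't expose. A simplified sketch with a hypothetical helper (not the actual `_refs` out-wrapper machinery):

```python
import torch

def with_out_kwarg(ref):
    # Add an `out=` keyword on top of an out-of-place reference.
    def wrapper(*args, out=None, **kwargs):
        result = ref(*args, **kwargs)
        if out is None:
            return result
        out.resize_(result.shape)
        return out.copy_(result)
    return wrapper

@with_out_kwarg
def diag_embed_ref(t, offset=0, dim1=-2, dim2=-1):
    return torch.diag_embed(t, offset, dim1, dim2)

x = torch.ones(3)
buf = torch.empty(3, 3)
print(diag_embed_ref(x, out=buf))   # 3x3 matrix with ones on the diagonal
```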
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88671
Approved by: https://github.com/mruberry
This is a policy update for meta registration. **We now prefer the python meta implementation over the C++ meta function.** This is a flip of the previous policy, where we preferred the C++ meta function over the python meta function when both existed.
Here's the meta registration process:
1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`. However, they will NOT register them into dispatcher.
2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd.
3. We will unconditionally register all of them into the python dispatcher, and register them into the C++ dispatcher unless the op is one of the following 3 cases:
- 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta
- 2. the op is a view op, as the MetaTensor doesn't support aliased storage
- 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op)
Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have their cpp meta overridden by a python meta. There are still 400 op_overloads using the cpp meta. The exact list can be found here: https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5
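For context, a python meta implementation of the kind this policy prefers looks roughly like the sketch below (the op and checks are illustrative; in the codebase such a function is picked up via `register_meta` and the decomp table rather than being called directly):

```python
import torch

def meta_mm(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    assert a.dim() == 2 and b.dim() == 2, "mm expects 2D tensors"
    assert a.size(1) == b.size(0), "mat1 and mat2 shapes cannot be multiplied"
    # only shapes/dtypes are propagated; no data is touched
    return a.new_empty((a.size(0), b.size(1)))

x = torch.empty(3, 4, device="meta")
y = torch.empty(4, 5, device="meta")
print(meta_mm(x, y).shape)   # torch.Size([3, 5])
```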
cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426
Approved by: https://github.com/ezyang, https://github.com/jansel
`diag` was unnecessarily implemented as a kernel rather than as a composite
function, which made it harder to maintain than necessary (an explicit backward + all it entails).
We also change a few uses of `diag` on 2D tensors to `diagonal()`. The
latter returns a view rather than creating a new tensor.
We also upgrade its meta implementation to a fully-fledged
decomposition.
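A hedged sketch of what the composite looks like (illustrative, not the exact decomposition that landed):

```python
import torch

def diag_ref(t: torch.Tensor, offset: int = 0) -> torch.Tensor:
    if t.dim() == 1:
        # vector -> matrix with the vector placed on the given diagonal
        return torch.diag_embed(t, offset=offset)
    if t.dim() == 2:
        # matrix -> the given diagonal; clone because diag returns a fresh
        # tensor, whereas diagonal() alone would give a view
        return torch.diagonal(t, offset=offset).clone()
    raise ValueError("diag expects a 1D or 2D tensor")

print(diag_ref(torch.ones(3)))                  # 3x3 matrix with ones on the diagonal
print(diag_ref(torch.arange(9).reshape(3, 3)))  # tensor([0, 4, 8])
```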
I tried implementing the backward of `diagonal()` via `diag_scatter` (or better, `diag_scatter_` to keep the perf), but functionalisation was failing and I was not sure how to fix it, so I moved on. It may be possible to simplify that one as well if @soulitzer or someone else knows how to do this.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87180
Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/mruberry
We recently fixed a bug on the symbolic-shapes branch where
an isinstance(x, int) test failed when passed a SymIntNode.
To prevent this, I've added a lint for all the codepaths
where we may pass SymInt/SymFloat directly, rejecting
direct isinstance int/float tests in favour of one of
the aliases. The lint rule explains the options. I then
go and fix all of them.
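To illustrate the pattern the lint enforces, a minimal sketch (using `torch.SymInt` directly where the codebase would use one of the shared aliases):

```python
import torch

def check_size(x):
    # flagged by the lint: a symbolic int is not a python int, so this check
    # silently rejects symbolic sizes
    # assert isinstance(x, int)

    # accepted: include the symbolic type explicitly (or use the shared alias)
    assert isinstance(x, (int, torch.SymInt))
    return x

print(check_size(3))
```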
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87345
Approved by: https://github.com/bdhirsh, https://github.com/albanD