pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
PyTorch MergeBot	2891cecd8d	Revert "Add meta kernel coverage for aten.unsafe_split, aten.unsafe_chunk (#92608 )" This reverts commit `4386f317b9`. Reverted https://github.com/pytorch/pytorch/pull/92608 on behalf of https://github.com/ZainRizvi due to test_aot_autograd_symbolic_exhaustive_unsafe_split_cpu_float32 (__main__.TestEagerFusionOpInfoCPU) is failing consistently since this PR was merged	2023-01-20 17:17:35 +00:00
Tugsbayasgalan Manlaibaatar	4386f317b9	Add meta kernel coverage for aten.unsafe_split, aten.unsafe_chunk (#92608 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/92608 Approved by: https://github.com/ngimel	2023-01-20 12:39:56 +00:00
lezcano	8b861544f9	Remove lowering and decompositions of zero_, zero, zeros_like... in favour of their references (#92071 ) The generated triton code is identical. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92071 Approved by: https://github.com/ngimel	2023-01-18 23:22:36 +00:00
Peter Bell	8770a7ed6f	Decompose more inplace ops (#90967 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90967 Approved by: https://github.com/anijain2305	2023-01-18 21:07:47 +00:00
Peter Bell	4058dedf21	Replace log(1 + x) with log1p(x) (#92114 ) `log1p` offers better precision near zero since `(1 + x) - 1` truncates any values less than the float epsilon to zero. For `soft_margin_loss` this also requires one fewer kernel invocation which for numel=1e7 gives me a 1.2x speedup on CUDA and a 1.1x speedup on CPU. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92114 Approved by: https://github.com/ngimel, https://github.com/lezcano	2023-01-18 10:43:56 +00:00
lezcano	da58f9eb8f	Rewrite out-of-place decompositions in terms of out-of-place ops (#92003 ) Fixes https://github.com/pytorch/torchdynamo/issues/1863 Pull Request resolved: https://github.com/pytorch/pytorch/pull/92003 Approved by: https://github.com/ngimel	2023-01-17 16:53:27 +00:00
vfdev-5	5f55335c2e	Fixed output memory format mismatch for bicubic2d (#90470 ) Description: - output memory format is matching input for bicubic2d Problem: output tensor's memory format does not match input format for bicubic2d ```python import torch i = torch.rand(1, 3, 32, 32).contiguous(memory_format=torch.channels_last) assert i.is_contiguous(memory_format=torch.channels_last) o = torch.nn.functional.interpolate(i, size=(4, 4), mode="bicubic") assert o.is_contiguous(memory_format=torch.channels_last), f"Should be channels last but given channels first ({o.is_contiguous(memory_format=torch.contiguous_format)})" > AssertionError: Should be channels last but given channels first (True) ``` Related PR fixing bilinear ops: https://github.com/pytorch/pytorch/pull/53535 (cc @VitalyFedyunin @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @bdhirsh ) Discovered together with @NicolasHug while working on https://github.com/pytorch/pytorch/tree/interpolate_uint8_images_linear_cpu_support_dev - Updated code to match grad input / output memory formats - temporary tensor creation matches memory format in `separable_upsample_generic_Nd_kernel_impl` - Updated tests - Added missing forward AD support for bicubic with antialiasing Pull Request resolved: https://github.com/pytorch/pytorch/pull/90470 Approved by: https://github.com/NicolasHug, https://github.com/lezcano	2023-01-12 19:52:28 +00:00
min-jean-cho	af242eedfb	[Inductor] Added aten.uniform_ decomp (#90869 ) Fixes #90815 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90869 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano, https://github.com/ngimel, https://github.com/albanD	2023-01-11 23:23:42 +00:00
David Berard	d7dc1c2fd5	Support zero dimensions in softmax decompositions (#91322 ) The eager implementation of softmax supports computation along zero dimensions, but many of the other implementations did not, including: * decompositions & refs (this was causing dynamo failures) * forward AD for logsumexp * MPS log_softmax_backward This PR handles the `input.numel() == 0` cases separately to avoid running `amax()`, which fails for zero dimensions, and updates opinfos. example of "computation along zero dimensions": ```python # example of where import torch t = torch.rand((4, 0, 0)) print("~") print(torch.nn.functional.softmax(t, dim=-1)) # this passes print("~") torch._refs.softmax(t, dim=-1) # this fails print("~") ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91322 Approved by: https://github.com/lezcano	2023-01-11 09:35:43 +00:00
XiaobingSuper	3790b50505	inductor: fix .to(memort_format) issue which doesn't generate right stride (#91948 ) Motivation: for .to(memory_format), the inductor doesn't generate the right stride, see the following example: ``` class Model(torch.nn.Module): def __init__(self): super(Model, self).__init__() def forward(self, x): x = x.to(memory_format=torch.contiguous_format) return x ``` The generated code doesn't do the memory format change and gets a wrong stride (802816, 1, 14336, 256), which is not a contiguous stride. ``` from ctypes import c_void_p, c_long import torch import random from torch import empty_strided, as_strided, device from torch._inductor.codecache import AsyncCompile aten = torch.ops.aten assert_size_stride = torch._C._dynamo.guards.assert_size_stride async_compile = AsyncCompile() async_compile.wait(globals()) del async_compile def call(args): arg0_1, = args args.clear() return (arg0_1, ) if __name__ == "__main__": from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance arg0_1 = rand_strided((128, 256, 56, 56), (802816, 1, 14336, 256), device='cpu', dtype=torch.float32) print_performance(lambda: call([arg0_1])) ``` After this PR, the generated code will have a memory format change: ``` from ctypes import c_void_p, c_long import torch import random from torch import empty_strided, as_strided, device from torch._inductor.codecache import AsyncCompile aten = torch.ops.aten assert_size_stride = torch._C._dynamo.guards.assert_size_stride async_compile = AsyncCompile() kernel_cpp_0 = async_compile.cpp(''' #include "/tmp/torchinductor_xiaobing/77/c7773nj5pwikpmm2pwa62rcudlf7p3if7eyqb5k4sjsvewwje4le.h" extern "C" void kernel(const float* __restrict__ in_ptr0, float* __restrict__ out_ptr0) { #pragma omp parallel num_threads(40) { { #pragma omp for for(long i0=0; i0<128; i0+=1) { #pragma GCC ivdep for(long i1=0; i1<256; i1+=1) { #pragma GCC ivdep for(long i2=0; i2<3136; i2+=1) { auto tmp0 = in_ptr0[i1 + (256*i2) + (802816*i0)]; out_ptr0[i2 + (3136*i1) + (802816*i0)] = tmp0; } } } } } } ''') async_compile.wait(globals()) del async_compile def call(args): arg0_1, = args args.clear() buf1 = empty_strided((128, 256, 56, 56), (802816, 3136, 56, 1), device='cpu', dtype=torch.float32) kernel_cpp_0(c_void_p(arg0_1.data_ptr()), c_void_p(buf1.data_ptr())) del arg0_1 return (buf1, ) if __name__ == "__main__": from torch._dynamo.testing import rand_strided from torch._inductor.utils import print_performance arg0_1 = rand_strided((128, 256, 56, 56), (802816, 1, 14336, 256), device='cpu', dtype=torch.float32) print_performance(lambda: call([arg0_1])) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/91948 Approved by: https://github.com/ngimel	2023-01-11 08:23:26 +00:00
min-jean-cho	364f526b9c	[Inductor] assert generator for random, dropout (#91833 ) See comment https://github.com/pytorch/pytorch/pull/90869#discussion_r1063731541 , https://github.com/pytorch/pytorch/pull/91673#discussion_r1061099337. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91833 Approved by: https://github.com/jansel	2023-01-11 03:24:10 +00:00
PyTorch MergeBot	43050b8301	Revert "[Inductor] Added aten.uniform_ decomp (#90869 )" This reverts commit `c55293d640`. Reverted https://github.com/pytorch/pytorch/pull/90869 on behalf of https://github.com/huydhn due to Crossref error cannot simply be ignored because it would break trunk for every commit after this, i.e. `fd0030fe74`. The failure would need to be handled gracefully, e.g. by adding an XFAIL	2023-01-11 01:18:11 +00:00
min-jean-cho	c55293d640	[Inductor] Added aten.uniform_ decomp (#90869 ) Fixes #90815 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90869 Approved by: https://github.com/jgong5, https://github.com/jansel, https://github.com/lezcano, https://github.com/ngimel, https://github.com/albanD	2023-01-10 23:05:01 +00:00
Nikita Karetnikov	00e5f3a9c5	[primTorch] Move `logsumexp` decomp to refs (#91860 ) Fixes #91843. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91860 Approved by: https://github.com/lezcano	2023-01-09 17:00:43 +00:00
Natalia Gimelshein	2c00064113	remove unnecessary decomps (#91828 ) in favor of refs. Generated triton code is the same. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91828 Approved by: https://github.com/lezcano, https://github.com/soumith	2023-01-07 20:37:12 +00:00
PyTorch MergeBot	c73147f741	Revert "[decomp] Use new squeeze.dims overload in decompositions (#91602 )" This reverts commit `9262ffc692`. Reverted https://github.com/pytorch/pytorch/pull/91602 on behalf of https://github.com/clee2000 due to stacked pr was reverted, this is dependent	2023-01-05 20:39:52 +00:00
Peter Bell	9262ffc692	[decomp] Use new squeeze.dims overload in decompositions (#91602 ) This removes the now-redundant `_squeeze_multiple` helpers and instead decomposes into a single call to `aten::squeeze.dims` which also has the effect of reducing the lowered graph size in inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91602 Approved by: https://github.com/ngimel	2023-01-05 17:59:32 +00:00
lezcano	484dd40022	Implement PReLU in a compositional way (#91238 ) The PReLU implementation was all over the place. This led to a number of bugs like https://github.com/pytorch/pytorch/issues/68760. We fix it by: - Keeping the weird broadcasting logic it has as a CompositeImplicit kernel that calls into a second kernel - This second kernel is just a good-ol' pointwise kernel. - We implement the derivative for the pointwise kernel via TI as well for speed. - We implement the second derivative for the pointwise kernel and the forward AD derivatives compositionally This fixes a number of issues: - We don't perform copies any more when the inputs are not contiguous - The derivatives are now correct - We fix vmap and many other functorch-related issues. - CPU and CUDA now share the relevant broadcasting logic - The implementation is about 1/3 the length. Fixes https://github.com/pytorch/pytorch/issues/68760 Fixes https://github.com/pytorch/pytorch/issues/89895 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91238 Approved by: https://github.com/kshitij12345, https://github.com/jbschlosser, https://github.com/albanD	2022-12-30 10:42:30 +00:00
Joel Schlosser	8b55b86dbd	Move sym_int and sym_float alongside SymInt / SymFloat in base torch package (#91317 ) This PR moves the definitions for: * `sym_int` * `sym_ceil` (used only for `sym_int`) * `sym_floor` (used only for `sym_int`) * `sym_float` from `torch/fx/experimental/symbolic_shapes.py` to `torch/__init__.py`, where `SymInt` and `SymFloat` are already defined. This removes the need for several in-line imports, and enables proper JIT script gating for #91318. I'm very open to doing this in a better way! Pull Request resolved: https://github.com/pytorch/pytorch/pull/91317 Approved by: https://github.com/ezyang, https://github.com/anijain2305	2022-12-28 16:08:16 +00:00
Joel Schlosser	1c40ec46ff	Decomps and meta registrations for upsample_nearest 1D / 2D / 3D (#91260 ) Adds decompositions and meta registrations for the 1D, 2D, and 3D implementations of `upsample_nearest`. All related OpInfo-based tests for AOTAutograd now pass. Pull Request resolved: https://github.com/pytorch/pytorch/pull/91260 Approved by: https://github.com/ezyang	2022-12-28 16:03:25 +00:00
Nikita Shulga	fd3a7264ae	[MPS] Add `group_norm[fwd+backward]` and `mean_var` (take 2) (#91190 ) Use Prims to implement group_norm, group_norm_backward and mean_var. Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in order to be able to make them importable from `torch/backends/mps/__init__.py`, as this alias, defined in `15af4b1cee/torch/__init__.py (L1095)`, is executed last during the init process. Add `__all__` to `torch/backends/mps/__init__.py` and alias all imports as private. Add `TestNNMPS.test_group_norm_backward`, which validates that no NaNs are generated during the backward pass. Fixes https://github.com/pytorch/pytorch/issues/88331 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190 Approved by: https://github.com/albanD	2022-12-22 08:54:37 +00:00
PyTorch MergeBot	645eda0a00	Revert "[MPS] Add `group_norm[fwd+backward]` and `mean_var` (#91190 )" This reverts commit `371716eb36`. Reverted https://github.com/pytorch/pytorch/pull/91190 on behalf of https://github.com/kit1980 due to Broke test_correct_module_names because of underscore _ops	2022-12-21 19:37:43 +00:00
Nikita Shulga	371716eb36	[MPS] Add `group_norm[fwd+backward]` and `mean_var` (#91190 ) Use Prims to implement group_norm, group_norm_backward and mean_var. Use `torch._ops.ops` instead of `torch.ops` in numerous subpackages in order to be able to make them importable from `torch/backends/mps/__init__.py`, as this alias, defined in `15af4b1cee/torch/__init__.py (L1095)`, is executed last during the init process. Depends on https://github.com/pytorch/pytorch/pull/91203 Fixes https://github.com/pytorch/pytorch/issues/88331 Pull Request resolved: https://github.com/pytorch/pytorch/pull/91190 Approved by: https://github.com/albanD	2022-12-21 17:33:27 +00:00
Nikita Shulga	46f64117db	[BE] Use `aten` global var (#91188 ) s/torch.ops.aten/aten/ Pull Request resolved: https://github.com/pytorch/pytorch/pull/91188 Approved by: https://github.com/ngimel	2022-12-21 02:28:51 +00:00
Peter Bell	e670c261c5	Decompose fill, zero, and zeros_like (#90968 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/90968 Approved by: https://github.com/ngimel	2022-12-21 00:59:50 +00:00
Natalia Gimelshein	e689c50922	Don't recompute var in bn decomp (#90984 ) Fixes https://github.com/pytorch/torchdynamo/issues/1988 Repeated `var` computation is not CSE'd for some reason. Pull Request resolved: https://github.com/pytorch/pytorch/pull/90984 Approved by: https://github.com/Chillee	2022-12-16 21:38:49 +00:00
Brian Hirsh	7a683eaeb8	aot_autograd: add assert for functional-only graph (#88816 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88816 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-12-16 21:04:36 +00:00
soulitzer	98a9235dce	Fix prelu ref when a.ndim < 2 (#89809 ) Fixes https://github.com/pytorch/pytorch/issues/89560 Previously the test case for "input is 1-D or scalar + weight is not scalar" did not exist; adding it introduced some failures: - forward AD (fixed in this PR) - vmap (filed https://github.com/pytorch/pytorch/issues/89895) - ref/meta (fixed this PR, though this also regresses nvFuser support) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89809 Approved by: https://github.com/ngimel	2022-12-12 23:55:31 +00:00
Bin Bao	282dfe8ba4	[inductor][Reland] Use decomposition for _to_copy (#90494 ) Summary: also contains a fix for https://github.com/pytorch/pytorch/issues/89633 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90494 Approved by: https://github.com/ngimel	2022-12-09 16:51:50 +00:00
PyTorch MergeBot	e89685b0b5	Revert "[inductor] Use decomposition for _to_copy (#90314 )" This reverts commit `3fdb5f2dda`. Reverted https://github.com/pytorch/pytorch/pull/90314 on behalf of https://github.com/desertfire due to regresses performance on hf_Bert	2022-12-08 18:29:06 +00:00
Bin Bao	3fdb5f2dda	[inductor] Use decomposition for _to_copy (#90314 ) Summary: also contains a fix for https://github.com/pytorch/pytorch/issues/89633 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90314 Approved by: https://github.com/ngimel	2022-12-08 15:25:44 +00:00
Peter Bell	e6a7278753	Give std/var correction overloads proper defaults (#56398 ) The correction overloads defaults were left off for forward compatibility reasons, but this FC window expired well over a year ago at this point. Differential Revision: [D29625593](https://our.internmc.facebook.com/intern/diff/D29625593) Pull Request resolved: https://github.com/pytorch/pytorch/pull/56398 Approved by: https://github.com/mruberry	2022-12-07 15:15:00 +00:00
Yanbo Liang	25f39c1bce	Fix uniform ref implementation (#90094 ) Fixes https://github.com/pytorch/torchdynamo/issues/1954 Pull Request resolved: https://github.com/pytorch/pytorch/pull/90094 Approved by: https://github.com/ngimel	2022-12-06 21:28:17 +00:00
Animesh Jain	c1950620c5	[decomp] Fix native_batch_norm_backward dtype of dweight and dbias (#89740 ) Discovered while debugging an accuracy issue for Inductor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89740 Approved by: https://github.com/soumith, https://github.com/ngimel	2022-11-29 03:15:20 +00:00
Brian Hirsh	e20ec44544	fixes for inductor <> batch norm (#89603 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/89603 Approved by: https://github.com/albanD	2022-11-29 02:16:52 +00:00
Jane Xu	8695f0cced	Rectify `native_batch_norm` schema by splitting it into two legit schemas (#88697 ) Using the same repro from the issue (but with BatchNorm2D) Rectifies native_batch_norm schema by splitting the schema into 2: 1. one will have NON-optional alias-able running_mean and running_var inputs 2. the other will just not have those parameters at all (no_stats variation) Calling for name suggestions! ## test plan I've added tests in test_functionalization.py as well as an entry in common_method_invocations.py for `native_batch_norm_legit` CI should pass. ## next steps Because of bc/fc reasons, we reroute native_batch_norm to call our new schemas ONLY through the python dispatcher, but in 2 weeks or so, we should make `native_batch_norm_legit` the official batch_norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88697 Approved by: https://github.com/albanD	2022-11-23 23:23:17 +00:00
Elias Ellison	a8d6b82167	Fix norm decomp when dtype is passed in (#89508 ) Fix for https://github.com/pytorch/torchdynamo/issues/1889. The wrapper was doing a downcast even when the dtype was explicitly passed in. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89508 Approved by: https://github.com/anijain2305	2022-11-23 20:49:09 +00:00
Elias Ellison	72110d7833	Fix Upsample Decomp Striding For Small Channels (#89528 ) Fix for https://github.com/pytorch/torchdynamo/issues/623. Pull Request resolved: https://github.com/pytorch/pytorch/pull/89528 Approved by: https://github.com/ngimel, https://github.com/anijain2305	2022-11-23 20:47:39 +00:00
lezcano	154e58c032	Add most in-place references/decompositions (#88117 ) We add most in-place references in a generic way. We also implement a wrapper to implement the annoying interface that `nn.functional` nonlinearities have. We fix along the way a couple decompositions for some non-linearities by extending the arguments that the references have. Pull Request resolved: https://github.com/pytorch/pytorch/pull/88117 Approved by: https://github.com/mruberry	2022-11-18 14:59:46 +00:00
lezcano	3320915303	Fix decomp for embedding_backward and simplify the decomposition of embedding_dense and embedding_dense_backward (#87204 ) See the title Pull Request resolved: https://github.com/pytorch/pytorch/pull/87204 Approved by: https://github.com/Chillee	2022-11-16 17:46:54 +00:00
Sherlock Huang	5faa2792fa	Symintify decomps for split and upsample_bilinear; Fix decomp for _softmax_backward_data and native_dropout_backward (#88761 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88761 Approved by: https://github.com/ezyang	2022-11-15 13:34:45 +00:00
PyTorch MergeBot	eea506aee1	Revert "Symintify decomps for split and upsample_bilinear; Fix decomp for _softmax_backward_data and native_dropout_backward (#88761 )" This reverts commit `9eabcc370f`. Reverted https://github.com/pytorch/pytorch/pull/88761 on behalf of https://github.com/suo due to much broken `9eabcc370f`	2022-11-14 01:58:47 +00:00
Sherlock Huang	9eabcc370f	Symintify decomps for split and upsample_bilinear; Fix decomp for _softmax_backward_data and native_dropout_backward (#88761 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88761 Approved by: https://github.com/ezyang	2022-11-13 21:30:53 +00:00
Horace He	37c5b42fa6	Fix matmul decomp to use reshape instead of contiguous().view() (#88832 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88832 Approved by: https://github.com/bertmaher, https://github.com/ngimel	2022-11-12 00:15:42 +00:00
Ryan Spring	534ae6ae47	[primTorch] Implement group norm reference (#87054 ) Add group norm reference Split from #81191 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87054 Approved by: https://github.com/mruberry	2022-11-11 01:08:20 +00:00
Sherlock Huang	c00c34fb69	Fix meta for aten.upsample_bilinear2d.vec (#88158 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88158 Approved by: https://github.com/ngimel	2022-11-02 16:58:29 +00:00
Sherlock Huang	de1f641f11	Fix meta function for aten.addmm (#88068 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/88068 Approved by: https://github.com/albanD	2022-11-01 17:05:48 +00:00
lezcano	fd27246c16	Fix decomposition for std (#87181 ) The previous implementation was lacking a few features and incurred a pretty large error. cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181 Approved by: https://github.com/ngimel, https://github.com/peterbell10	2022-10-28 00:50:29 +00:00
Sherlock Huang	eb99c1efce	Prefer python meta function over c++ meta function (#87426 ) This is a policy update for meta registration. We now prefer the python meta implementation over the C++ meta function. This is a flip of the previous policy, where we preferred the C++ meta function over the python meta function if they both existed. Here's the meta registration process: 1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`. However, they will NOT register them into the dispatcher. 2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd. 3. We will unconditionally register all of them into the python dispatcher, and register them into the C++ dispatcher unless it is one of the following 3 cases - 1. the op is a CompositeImplicitAutograd, and should rely on the decomposed op's meta - 2. the op is a view op, as the MetaTensor doesn't support aliased storage - 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op) Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5 cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426 Approved by: https://github.com/ezyang, https://github.com/jansel	2022-10-25 16:49:02 +00:00
Ryan Spring	9bb4926de0	Add xlogy and xlog1py references (#77712 ) * Add reference implementations for `xlogy` and `xlog1py` * Replace `_wrap_scalar` helper function with `scalar_tensor` prim Pull Request resolved: https://github.com/pytorch/pytorch/pull/77712 Approved by: https://github.com/mruberry	2022-10-22 17:59:25 +00:00
Edward Z. Yang	d73d4aa7de	Audit for error prone isinstance int/float and add lint (#87345 ) We recently fixed a bug on symbolic-shapes branch where an isinstance(x, int) test failed when passed a SymIntNode. To prevent this, I've added a lint for all the codepaths where we may pass SymInt/SymFloat directly to reject direct isinstance int/float tests, and instead use one of the aliases. The lint rule explains the options. I then go and fix all of them. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/87345 Approved by: https://github.com/bdhirsh, https://github.com/albanD	2022-10-21 15:55:24 +00:00
Sherlock Huang	f7da9db9c1	Unify decomp registries into global_decomposition_table (#86857 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86857 Approved by: https://github.com/ezyang	2022-10-20 21:29:05 +00:00
Sherlock Huang	ef045695e0	Fix decomp for huber_loss_backward (#86955 ) Fixes https://github.com/pytorch/pytorch/issues/86846 aten.huber_loss_backward calls aten.huber_loss_backward.out in its CompositeExplicitAutograd kernel. The decomp was mistakenly registered for both aten.huber_loss_backward.default and aten.huber_loss_backward.out. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86955 Approved by: https://github.com/Chillee	2022-10-14 18:53:02 +00:00
Nikita Karetnikov	4460e40db4	[primTorch] Add a ref for `addcmul` (#86731 ) Based on: https://github.com/pytorch/pytorch/pull/79827 https://github.com/pytorch/pytorch/pull/72949 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86731 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-14 14:26:23 +00:00
Brian Hirsh	e17732b234	[test] add cross-ref tests for python meta kernels (#86228 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86228 Approved by: https://github.com/albanD	2022-10-13 14:14:26 +00:00
Elias Ellison	d3f7c34cb3	Enable aten-aten decomps (#85921 ) Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921 Approved by: https://github.com/ezyang	2022-10-08 05:12:42 +00:00
PyTorch MergeBot	7ec12a559c	Revert "Enable aten-aten decomps (#85921 )" This reverts commit `62e4f51efd`. Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk `62e4f51efd`	2022-10-08 01:59:54 +00:00
Elias Ellison	62e4f51efd	Enable aten-aten decomps (#85921 ) Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921 Approved by: https://github.com/ezyang	2022-10-07 21:04:39 +00:00
lezcano	28a0b3fb18	Fix col2im and im2col decompositions (#86426 ) I threw in some tests for good measure. Fixes https://github.com/pytorch/pytorch/issues/86332 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86426 Approved by: https://github.com/ngimel	2022-10-07 08:14:06 +00:00
Elias Ellison	9ceadcadb2	Fix unfold backward decomp aliasing for 0 dim input (#86428 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86428 Approved by: https://github.com/ngimel, https://github.com/ezyang	2022-10-07 03:55:31 +00:00
lezcano	b67e022833	Fix ref / decomposition index_add (#86266 ) The decomposition of `index_add` was using `slice(None)`, when it should use just `None`. The reference for index_add was also wrong, as `x[idx] += t` does not use atomic add, so it does not work when several `idx`s point to the same location. This PR adds extra reference inputs to help test for this. Fixes https://github.com/pytorch/torchdynamo/issues/1356 Pull Request resolved: https://github.com/pytorch/pytorch/pull/86266 Approved by: https://github.com/ngimel	2022-10-05 19:59:15 +00:00
lezcano	c609768896	Add refs for torch.unfold and a decomposition for its backward. (#85629 ) It's not clear to me what the difference is between `unfold` and `unfold_copy`, as the latter is codegen'd. I also took this chance to clean up the implementation of unfold and its reference. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85629 Approved by: https://github.com/mruberry	2022-10-05 12:15:49 +00:00
Edward Z. Yang	d07b85393a	SymInt fixes from symbolic-shapes branch (#86242 ) symintify a few inplace meta functions symintify resize_(), nbytes(), functionalization input mutations meta funcs for avg_pool2d_backward Pull Request resolved: https://github.com/pytorch/pytorch/pull/86242 Approved by: https://github.com/Chillee	2022-10-05 04:52:02 +00:00
Peter Bell	b317736c39	Fix default correction value in std/var decompositions (#85839 ) `torch.std` and `torch.var` default to the unbiased estimator, i.e. `correction=1`. This only works as is because the default on this overload is not exercised by the tests. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85839 Approved by: https://github.com/ezyang	2022-10-04 23:23:39 +00:00
Horace He	82d9592f1b	Batch of symintifications to allow more models to pass in inference (#86104 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86104 Approved by: https://github.com/ezyang	2022-10-04 04:01:58 +00:00
Edward Z. Yang	f3d7ab5438	Unconditionally register Python decomps to Meta key in Python Dispatcher (#85750 ) This makes them available for Python Dispatcher to service them when symbolic shapes are involved. This is needed because under certain conditions, functionalization will directly call the Meta kernel for a function in order to produce a properly sized output wrapper tensor for a view operation. This direct call bypasses the normal decomposition table mechanism. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/85750 Approved by: https://github.com/wconstab	2022-10-03 22:49:25 +00:00
Horace He	37013bb443	Added _unsafe_view decomp (#86103 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86103 Approved by: https://github.com/ezyang	2022-10-03 20:38:31 +00:00
lezcano	07ce0b435b	Remove backward for im2col and col2im (#85542 ) `im2col` is a linear map, and `col2im` is its adjoint. As such, the adjoint to `col2im` is `im2col` (the adjoint of the adjoint is the original function). There's no point having explicit derivatives in ATen for these functions, so this PR deletes all of them. Furthermore, along the way, we fix an error for the derivative of im2col for non-batched inputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85542 Approved by: https://github.com/soulitzer, https://github.com/ngimel	2022-10-03 00:16:42 +00:00
Horace He	e6dd2965af	A bunch of coverage improvements (re for models in inference snext50, BERT_pytorch, mobilenet_v3_large, pytorch_CycleGAN_and_pix2pix, dcgan, resnet18, mnasnet1_0) (#86050 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86050 Approved by: https://github.com/ezyang	2022-10-02 20:46:20 +00:00
lezcano	787028cadb	Implement col2im decomposition and fix im2col and add a few preconditions (#85541 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/85541 Approved by: https://github.com/jansel	2022-09-30 09:31:53 +00:00
Elias Ellison	6a2b12dd65	Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471 Approved by: https://github.com/ezyang	2022-09-28 23:06:59 +00:00
Animesh Jain	796da4df4d	Return contiguous tensor from softmax decomposition (#85788 ) Fixes https://github.com/pytorch/torchdynamo/issues/1135 Softmax decomp's output stride does not match the aten softmax output stride. Not sure if it's desirable. Opening a PR for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85788 Approved by: https://github.com/ngimel, https://github.com/ezyang	2022-09-28 20:52:45 +00:00
Nikita Karetnikov	8dd45424ea	[primTorch] Add ref for `huber_loss` and error inputs (#85041 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85041 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-09-28 19:56:17 +00:00
Edward Z. Yang	793488cda2	Revert "Revert "Symintifying slice ops (#85196 )"" (#85746 ) This reverts commit `3a171dfb0c`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85746 Approved by: https://github.com/albanD	2022-09-28 04:37:35 +00:00
PyTorch MergeBot	3a171dfb0c	Revert "Symintifying slice ops (#85196 )" This reverts commit `4c01c51266`. Reverted https://github.com/pytorch/pytorch/pull/85196 on behalf of https://github.com/atalman due to Break internal build Exutorch	2022-09-27 18:01:27 +00:00
Fabio Rocha	d5ce2bbed2	[primTorch] decompositions for upsample_bicubic2d (#85403 ) FYI, this decomposition seems to be significantly slower than the lowering in torchinductor:
```
[------------------------------------- upsample_bicubic2d -------------------------------------]
                                                              |  lowering  |  Inductor  |  Eager
32 threads: ------------------------------------------------------------------------------------
      (torch.Size([16, 4, 128, 256]),), ((512, 1024), True)   |    1.8     |   3.880    |   1.4
      (torch.Size([16, 4, 128, 256]),), ((512, 1024), False)  |    1.9     |   3.887    |   1.4
```
This seems related to the fact that in the lowering we can use int32s as the indices and in the decomp we can only use int64s (see https://github.com/pytorch/torchdynamo/issues/1293). Pull Request resolved: https://github.com/pytorch/pytorch/pull/85403 Approved by: https://github.com/ngimel	2022-09-26 20:11:23 +00:00
Elias Ellison	bcc544e9d7	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-26 17:08:14 +00:00
Fabio Rocha	ffaff8896a	Removed None arg check in test/test_decomp.py (#85402 ) Not sure why this check was necessary? Tests seem to run fine without it. There were definitely tests this was skipping before that it shouldn't, e.g., pretty much all of the tests for `torch.nn.functional.interpolate` Pull Request resolved: https://github.com/pytorch/pytorch/pull/85402 Approved by: https://github.com/ezyang	2022-09-24 11:37:27 +00:00
Edward Z. Yang	4c01c51266	Symintifying slice ops (#85196 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85196 Approved by: https://github.com/ezyang	2022-09-23 22:01:32 +00:00
PyTorch MergeBot	d10de31cc8	Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 )" This reverts commit `78afa0cf0c`. Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk `78afa0cf0c`	2022-09-23 17:21:43 +00:00
PyTorch MergeBot	3b195fd33e	Revert "Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 )" This reverts commit `1e92eb8068`. Reverted https://github.com/pytorch/pytorch/pull/85471 on behalf of https://github.com/clee2000 due to stacked prs https://github.com/pytorch/pytorch/pull/85417 and https://github.com/pytorch/pytorch/pull/85434 broke trunk, reverting this so i can revert the others	2022-09-23 17:13:35 +00:00
Elias Ellison	1e92eb8068	Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471 Approved by: https://github.com/ezyang	2022-09-23 16:02:15 +00:00
Elias Ellison	78afa0cf0c	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-23 15:50:03 +00:00
Ryan Spring	71dddec6ea	Cast grad_input to half when input_dtype is half in _softmax_backward_data aten decomposition (#85497 ) Fixes #85504 `_softmax_backward_data` and `_log_softmax_backward_data` cast `grad_input` to half when the `input_dtype` is half. When running with amp without the cast, consumer ops can trigger `RuntimeError: expected scalar type Float but found Half`. https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/SoftMax.cpp#L70-L83 https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/SoftMax.cpp#L102-L113 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85497 Approved by: https://github.com/ngimel	2022-09-23 06:52:38 +00:00
PyTorch MergeBot	5043457a8e	Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 )" This reverts commit `9c77083965`. Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) `9c77083965`	2022-09-22 15:44:38 +00:00
Elias Ellison	9c77083965	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-22 13:03:57 +00:00
Horace He	2f4a517d67	Ported matmul compositeimplicitautograd impl into core (#85239 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85239 Approved by: https://github.com/ezyang, https://github.com/lezcano	2022-09-21 09:25:24 +00:00
lezcano	d17b144e65	Adding multigammaln ref and fix arange (#85153 ) Partially based on https://github.com/pytorch/pytorch/pull/83662. I'll help land this one, as Rob does not work in the PyTorch project anymore. I removed the data-dependent check for the args, as data dependencies are bad for many reasons (and it was failing when the input has NaNs). It also registers arange as a decomposition, and fixes the naming of its args. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85153 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-09-20 17:52:56 +00:00
lezcano	5dd9610e9d	Refs and decompositions for index_{add,copy,select,fill} (#85002 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002 Approved by: https://github.com/ngimel	2022-09-17 19:57:34 +00:00
PyTorch MergeBot	e33b464ffc	Revert "Refs and decompositions for index_{add,copy,select,fill} (#85002 )" This reverts commit `2f0b3de443`. Reverted https://github.com/pytorch/pytorch/pull/85002 on behalf of https://github.com/huydhn due to Broke trunk slow tests	2022-09-17 04:26:04 +00:00
lezcano	2f0b3de443	Refs and decompositions for index_{add,copy,select,fill} (#85002 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002 Approved by: https://github.com/ngimel	2022-09-16 23:59:35 +00:00
Sherlock Huang	29eba319b4	Use alias for nop decomp (#84727 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84727 Approved by: https://github.com/Chillee	2022-09-16 18:50:56 +00:00
Natalia Gimelshein	6162a04364	fix half_to_float arg in *softmax decomp (#85120 ) Fixes https://github.com/pytorch/torchdynamo/issues/1239 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85120 Approved by: https://github.com/Chillee	2022-09-16 15:54:50 +00:00
soulitzer	7f88934a8f	[reland 2] Call jit decomp in VariableType to improve forward AD coverage (#84976 ) Reland of https://github.com/pytorch/pytorch/pull/84675 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84976 Approved by: https://github.com/zou3519	2022-09-15 22:46:19 +00:00
PyTorch MergeBot	36d79143ce	Revert "[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 )" This reverts commit `bb4e96c964`. Reverted https://github.com/pytorch/pytorch/pull/84675 on behalf of https://github.com/osalpekar due to causing asan xplat link-time errors like ld.lld: error: undefined symbol: torch::jit::has_jit_decomposition(c10::FunctionSchema const&)	2022-09-13 22:54:54 +00:00
soulitzer	bb4e96c964	[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 ) This reverts commit `acb4a09628`. In addition, we also fix a memory leak in layer norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84675 Approved by: https://github.com/zou3519	2022-09-12 20:33:14 +00:00
Horace He	1459a909b4	Added mv, mm, and binary_cross_entropy_with_logits decomps (#84451 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84451 Approved by: https://github.com/ngimel	2022-09-08 17:56:18 +00:00
soulitzer	e31ad1c2d3	[reland] Move decompositions and helpers for jvp from functorch into core (#84581 ) Reland of https://github.com/pytorch/pytorch/pull/84358 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84581 Approved by: https://github.com/samdow	2022-09-07 15:31:46 +00:00
Ivan Yashchuk	6363b1b358	Add nvFuser support for aten.native_batch_norm_backward (#84546 ) Replacing `tensor.reshape(broadcast_mask)` with unsqueezes makes the implementation of `batch_norm_backward` more friendly for PrimTorch+nvFuser. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84546 Approved by: https://github.com/Chillee	2022-09-06 19:56:17 +00:00
Fabio Rocha	91a5f52f51	Decomp for nn.functional.grid_sampler_2d (#84350 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84350 Approved by: https://github.com/jansel, https://github.com/Lezcano	2022-09-05 21:33:26 +00:00
lezcano	3dfbf09afe	Optimise the decomposition for `adaptive_avg_pool2d` wrt. TorchInductor (#84483 ) This fixes some parts of the implementation that did not work with TorchInductor (e.g. the indices in TorchInductor need to be `int64`s, while in PyTorch we can have `int32`s). It also brings the performance of the kernel up to numbers similar to those of the lowering (benchmarks below). Pull Request resolved: https://github.com/pytorch/pytorch/pull/84483 Approved by: https://github.com/jansel	2022-09-02 22:25:09 +00:00
PyTorch MergeBot	375d6cd5b7	Revert "Move decompositions and helpers for jvp from functorch into core (#84358 )" This reverts commit `a3c60a4db4`. Reverted https://github.com/pytorch/pytorch/pull/84358 on behalf of https://github.com/malfet due to Broke lint	2022-09-01 23:42:48 +00:00
soulitzer	a3c60a4db4	Move decompositions and helpers for jvp from functorch into core (#84358 ) This refactor shouldn't change any behavior. At this point functorch still relies on the mechanism in DynamicLayerFront; we just moved some parts of it into core. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84358 Approved by: https://github.com/samdow	2022-09-01 22:39:15 +00:00
Sherlock Huang	ef3ab31f1c	Decomp for aten.im2col (#84303 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84303 Approved by: https://github.com/jansel, https://github.com/ngimel	2022-09-01 00:06:35 +00:00
Nikita Karetnikov	71ce9cd072	[primTorch] Add decomp for `soft_margin_loss` (#83804 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83804 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:34 +00:00
Nikita Shulga	b8e1c54f53	[Prim] Implement group_norm_backward (#84037 ) Test plan: CI, i.e. `python3 test_decomp.py -v -k test_comprehensive_nn_functional_group_norm` plus:
```
#!/usr/bin/env python3.8
import torch

func = torch.ops.aten.native_group_norm_backward.default
decomp = torch._decomp.decomposition_table[func]
for args in (
    (torch.rand(1, 6, 3), torch.rand(1, 6, 3), torch.rand(1, 2), torch.rand(1, 2), torch.rand(6), 1, 6, 3, 2, [True, True, True]),
    (torch.rand(64, 768, 7, 7), torch.rand(64, 768, 7, 7), torch.rand(64, 1), torch.rand(64, 1), torch.rand(768), 64, 768, 49, 1, [True, True, True]),
):
    # Run the native kernel and the Python decomposition on the same inputs.
    nrc = func(*args)
    drc = decomp(*args)
    # Print the largest difference for each of the returned gradients.
    for i in range(len(nrc)):
        print(i, torch.max(nrc[i] - drc[i]))
    print(all(torch.allclose(x, y) for (x, y) in zip(nrc, drc)))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84037 Approved by: https://github.com/Chillee, https://github.com/ngimel	2022-08-29 09:29:30 +00:00
Natalia Gimelshein	533203f5aa	_to_copy decomp (#84108 ) Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/84108 Approved by: https://github.com/Chillee	2022-08-29 02:25:02 +00:00
lezcano	9fc02f6bc5	Decomposition for adaptive_avg_pool2d (#84062 ) This was already implemented as a lowering in https://github.com/pytorch/torchdynamo/pull/962. I'm putting the idea up here ~(I haven't even run this code, so it surely has many issues, but I reckon the general idea should hopefully be alright).~ The tests now pass and I corrected the issues that the first implementation had. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84062 Approved by: https://github.com/jansel	2022-08-29 01:38:51 +00:00
PyTorch MergeBot	33db5da4c1	Revert "[Prim] Implement group_norm_backward (#84037 )" This reverts commit `bed85cce8b`. Reverted https://github.com/pytorch/pytorch/pull/84037 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-08-28 17:30:50 +00:00
PyTorch MergeBot	ff23f3ac1c	Revert "_to_copy decomp (#84108 )" This reverts commit `e33897cb99`. Reverted https://github.com/pytorch/pytorch/pull/84108 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-08-28 13:27:49 +00:00
Natalia Gimelshein	e33897cb99	_to_copy decomp (#84108 ) Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/84108 Approved by: https://github.com/Chillee	2022-08-27 03:51:03 +00:00
Nikita Shulga	bed85cce8b	[Prim] Implement group_norm_backward (#84037 ) Test plan: CI, i.e. `python3 test_decomp.py -v -k test_comprehensive_nn_functional_group_norm` plus:
```
#!/usr/bin/env python3.8
import torch

func = torch.ops.aten.native_group_norm_backward.default
decomp = torch._decomp.decomposition_table[func]
for args in (
    (torch.rand(1, 6, 3), torch.rand(1, 6, 3), torch.rand(1, 2), torch.rand(1, 2), torch.rand(6), 1, 6, 3, 2, [True, True, True]),
    (torch.rand(64, 768, 7, 7), torch.rand(64, 768, 7, 7), torch.rand(64, 1), torch.rand(64, 1), torch.rand(768), 64, 768, 49, 1, [True, True, True]),
):
    # Run the native kernel and the Python decomposition on the same inputs.
    nrc = func(*args)
    drc = decomp(*args)
    # Print the largest difference for each of the returned gradients.
    for i in range(len(nrc)):
        print(i, torch.max(nrc[i] - drc[i]))
    print(all(torch.allclose(x, y) for (x, y) in zip(nrc, drc)))
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84037 Approved by: https://github.com/Chillee, https://github.com/ngimel	2022-08-27 01:10:27 +00:00
Horace He	9a236c7ab4	Made some minor cleanups to decompositions (#83814 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83814 Approved by: https://github.com/ngimel	2022-08-26 10:55:31 +00:00
Animesh Jain	e2f75d63d4	Decomposition - batch_norm, save_mean and save_variance always float32 (#84013 ) AMP error shown here - https://github.com/pytorch/torchdynamo/issues/835 Test missing Pull Request resolved: https://github.com/pytorch/pytorch/pull/84013 Approved by: https://github.com/ezyang	2022-08-25 16:09:52 +00:00
Ivan Yashchuk	473b733bae	Replace .new_zeros(()) with 0.0 in torch/_decomp/decompositions (#83734 ) `new_zeros` is decomposed into `prims.empty_strided`+`prims.fill`+`prims.copy_to` and none of these are supported by prims+nvFuser executor currently. Replacing it with 0.0 makes these backward decompositions nvFuser friendly. Example with `torch.ops.aten.hardsigmoid_backward.default`:
```py
# Before this PR
opcode         name                      target                            args                                                          kwargs
-------------  ------------------------  --------------------------------  ------------------------------------------------------------  ----------------------------------------------------------------------------------------
placeholder    a_1                       a_1                               ()                                                            {}
placeholder    g_1                       g_1                               ()                                                            {}
call_function  gt_default                nvprims.gt.default                (a_1, -3.0)                                                   {}
call_function  lt_default                nvprims.lt.default                (a_1, 3.0)                                                    {}
call_function  bitwise_and_default       nvprims.bitwise_and.default       (gt_default, lt_default)                                      {}
call_function  mul_default               nvprims.mul.default               (g_1, 0.16666666666666666)                                    {}
call_function  empty_strided             prims.empty_strided.default       ([], [])                                                      {'dtype': torch.float32, 'device': device(type='cuda', index=0), 'requires_grad': False}
call_function  fill_default              prims.fill.default                (empty_strided, 0)                                            {}
call_function  copy_to_default           prims.copy_to.default             (empty_strided, fill_default)                                 {}
call_function  broadcast_in_dim_default  nvprims.broadcast_in_dim.default  (copy_to_default, [3, 2], [])                                 {}
call_function  where_default             nvprims.where.default             (bitwise_and_default, mul_default, broadcast_in_dim_default)  {}
output         output                    output                            (where_default,)                                              {}

# After this PR
opcode         name                 target                       args                                     kwargs
-------------  -------------------  ---------------------------  ---------------------------------------  --------
placeholder    a_1                  a_1                          ()                                       {}
placeholder    g_1                  g_1                          ()                                       {}
call_function  gt_default           nvprims.gt.default           (a_1, -3.0)                              {}
call_function  lt_default           nvprims.lt.default           (a_1, 3.0)                               {}
call_function  bitwise_and_default  nvprims.bitwise_and.default  (gt_default, lt_default)                 {}
call_function  mul_default          nvprims.mul.default          (g_1, 0.16666666666666666)               {}
call_function  where_default        nvprims.where.default        (bitwise_and_default, mul_default, 0.0)  {}
output         output               output                       (where_default,)                         {}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83734 Approved by: https://github.com/Chillee	2022-08-22 09:12:13 +00:00
Edward Z. Yang	02581f053b	Address CR comments for "Delete ProxyTensor wrapper subclass" (#83646 ) CR is on https://github.com/pytorch/pytorch/pull/83330 - Factor proxy slot getters/setters into helper functions - Use a weak map for storing proxies, so they go away when tracing is done - More documentation on SymDispatchMode Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83646 Approved by: https://github.com/Chillee	2022-08-18 22:18:09 +00:00
Edward Z. Yang	817a82704f	Delete ProxyTensor wrapper subclass (#83330 ) I was working on https://github.com/pytorch/torchdynamo/issues/80 and my working hypothesis for what was causing the error was that proxy tensor was not advertising correct dispatch keys, causing AMP to operate differently when you traced. I could have fixed this directly by replicating fake tensor's fix for setting dispatch keys to also apply to proxy tensor, but I was like, "Why must I repeat myself." This PR is the result. It completely deletes the ProxyTensor wrapper subclass, so that when we are tracing, the tensors flowing through the program are the original real or fake tensors, depending on what the user requested in the top-level API. There is no more wrapping. To store the Proxy objects necessary for actually doing tracing, I store the property directly on the tensors. (Note: I never clean up old entries from the map at the moment, this is easily fixed by using a weak map) Benefits of doing this: * No more tip-toeing around no_dispatch() creation of new ProxyTensors; we never create new tensors (except when we call the underlying func), so you don't have to worry about accidentally tracing them. * No more syncing up metadata from in place operators. In particular https://github.com/pytorch/pytorch/issues/81526 is mooted * This fixes https://github.com/pytorch/torchdynamo/issues/519 as we no longer need to teach proxy tensor to support sparse tensor. * No more schlepping symbolic integers from the inner fake tensor to the outer proxy tensor. If you can make a fake tensor with symbolic ints, you're done, nothing else to do. To avoid having to rewrite all of the guts, when I get to the actual proxy tensor handler, I first "fetch" the stored ProxyTensor data from the weakmap via a tree_map, and then operate on the consequent data as before. A more optimized implementation is possible. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83330 Approved by: https://github.com/Chillee	2022-08-18 01:56:07 +00:00
Nikita Karetnikov	cd86d25515	[primTorch] Move addcdiv from decompositions -> refs (#80842 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80842 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-16 17:23:00 +00:00
Horace He	f02f304657	Added nll_loss_forward decomposition + some other minor decomps (#83235 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83235 Approved by: https://github.com/ngimel	2022-08-13 10:24:58 +00:00
Brian Hirsh	1a51efd8bb	dispatch API for checking computed table, use it in prim decomps (#82358 ) Fixes https://github.com/pytorch/pytorch/issues/82331 Expose a `torch._C._dispatch_has_computed_kernel_for_dispatch_key` to check if an operator has a kernel registered to the given dispatch key in the computed table. Use it in the prim registration logic, making it more accurate and robust (so that it e.g. picks up `CompositeExplicitAutograd` kernels). It looks like before this change we'd register 134 prim ops to the meta key, and after we only register 62. So that's 72 ops that now use an existing C++ decomp to get meta working, instead of going directly through the prim decomp. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82358 Approved by: https://github.com/ezyang	2022-08-10 23:42:02 +00:00
Natalia Gimelshein	112ec24f09	Fix device behavior for masked_fill (#82737 ) Fixes #81018, based on #81036. It will create graph break for cpu 0d tensor value due to .item() call (we could maybe specialize on that instead of breaking?), but otherwise it would create graph break due to synchronizing `to` call, so there's no way around :-(, and for number `value` argument we already should be specializing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82737 Approved by: https://github.com/Chillee	2022-08-04 15:47:56 +00:00
Brian Hirsh	4a77bee661	prevent python view impls from getting registered to the meta key (#82007 ) We don't want to register view ops in python to the `Meta` dispatch key, because doing that prevents us from correctly aliasing storage information. This PR fixes the existing python registrations, and makes it an error to do that in the future. Example:
```
with FakeTensorMode.push() as mode:
    b = torch.ones(2)
    c = b.unsqueeze(-1)
    b_ = StorageWeakRef(b.storage())
    c_ = StorageWeakRef(c.storage())
    print(b_.cdata)
    print(c_.cdata)
    # their storages are different (now fixed in this PR)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82007 Approved by: https://github.com/ezyang, https://github.com/eellison	2022-07-27 17:15:05 +00:00
Shangdi Yu	9088757cc6	move aten.native_batch_norm_backward decomposition to core (#81522 ) Move aten.native_batch_norm_backward decomposition from https://github.com/pytorch/functorch/blob/main/functorch/_src/decompositions.py#L148. Changed to not recompute mean and invstd, added type cast. In functorch, changed `@register_decomposition_for(aten.native_batch_norm_backward)` to `@register_decomposition_for_jvp(aten.native_batch_norm_backward)` Passing `pytest test/test_decomp.py -k norm` Note that when the output mask is False for grad_weight and grad_bias, we should return None to be consistent with the non-decomposed operator's behavior. But "None" doesn't work with vjp, so the version of decomposition in functorch used zeros. See `b33c1f7dd4/functorch/functorch/_src/decompositions.py (L210)`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81522 Approved by: https://github.com/Chillee	2022-07-27 06:11:34 +00:00
lezcano	11fe277b62	[PrimTorch] Add reference for torch.norm (#81765 ) This ref does more things than `torch.norm`, and it fixes a few bugs that `torch.norm` has. This implementation and the `torch.norm` implementation come to terms in the next PR of this stack. We put this PR before, as otherwise `test_decomp.py` was failing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81765 Approved by: https://github.com/ngimel	2022-07-25 19:57:21 +00:00
Vivek Khandelwal	cb63ffc553	Add decomposition for `aten.upsample_bilinear2d.vec` (#80964 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80964 Approved by: https://github.com/jansel, https://github.com/Chillee	2022-07-23 02:22:15 +00:00
Huy Do	12cb26509a	Apply ufmt to torch internal (#81643 ) This is a big bang PR, merge conflicts are probably expected and will be addressed at merge. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81643 Approved by: https://github.com/ezyang	2022-07-22 02:19:50 +00:00
Horace He	a5fb41e3d3	Revert "Revert "Refactored prim utils into _prims_utils folder (#81746 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81746 Approved by: https://github.com/anijain2305, https://github.com/Krovatkin	2022-07-20 23:43:57 +00:00
PyTorch MergeBot	e43a02c314	Revert "Refactored prim utils into _prims_utils folder (#81088 )" This reverts commit `80231d0a72`. Reverted https://github.com/pytorch/pytorch/pull/81088 on behalf of https://github.com/jeanschmidt due to breaking internal tests	2022-07-19 19:56:41 +00:00
Horace He	80231d0a72	Refactored prim utils into _prims_utils folder (#81088 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81088 Approved by: https://github.com/ngimel	2022-07-19 03:55:51 +00:00
Natalia Gimelshein	50d205c551	make clamp decomps use torch.* calls, move clamp_min/clamp_max to refs (#81619 ) Per title, @chillee is anything else necessary to remove decomp other than decorating ref with `register_decomposition`? Pull Request resolved: https://github.com/pytorch/pytorch/pull/81619 Approved by: https://github.com/Chillee	2022-07-18 16:52:45 +00:00
Horace He	5139053e02	Fixed the decomposition for `embedding_dense_backward` (#81528 ) No guarantee about the strides of `grad_output`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81528 Approved by: https://github.com/jansel	2022-07-15 17:51:00 +00:00
Edward Z. Yang	fca03eeec1	Make proxy tensor support item() calls on torch.tensor constants (#81192 ) This PR is doing a few interrelated things, all of which are necessary to get correctness. Read the comment in torch/fx/experimental/proxy_tensor.py for the high level overview. Let's break down the parts of this PR: * Bug fix where `enable_torch_dispatch_mode` with `None` doesn't work. This makes `enable_torch_dispatch_mode(current_mode.inner)` work which is the basis for how we temporarily disable fake tensor mode. * Bug fix for when fake tensor mode is combined with a non-mode tensor subclass. This actually could be ablated from this PR but it affects where the logic for allowing non fake tensor inputs with lift goes, so it's all in here in one go. There are some relevant tests for the fix in fake tensor, but it turns out I didn't need this because I'm always using proxy tensors as a mode (which ensures the ordering is right.) * New `lift_fresh` view operator. Note that like lift, we have to manually write the functionalize kernel for these functions. * The actual change, which is to save constants when we see them in the proxy tensor mode, and then propagate them as we go (because otherwise you'll handle mutations on constants incorrectly--see test.) This is mildly BC-breaking if anyone was previously interposing on at::lift, but this operator was relatively new and I checked functorch which has no explicit reference to lift. So I think it should not be too disruptive. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/81192 Approved by: https://github.com/samdow, https://github.com/bdhirsh	2022-07-15 03:53:40 +00:00
lezcano	b5b9db9f84	Make `kl_div` a composite function. (#80334 ) Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285 Fixes https://github.com/pytorch/pytorch/issues/80158 Fixes https://github.com/pytorch/pytorch/issues/78867 Fixes https://github.com/pytorch/pytorch/issues/69230 Supersedes https://github.com/pytorch/pytorch/pull/79007 Supersedes https://github.com/pytorch/pytorch/pull/69212 Supersedes https://github.com/pytorch/pytorch/pull/19659 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334 Approved by: https://github.com/ezyang	2022-07-13 20:07:36 +00:00
PyTorch MergeBot	f2c8557521	Revert "Make `kl_div` a composite function. (#80334 )" This reverts commit `828c787ea9`. Reverted https://github.com/pytorch/pytorch/pull/80334 on behalf of https://github.com/ezyang due to doesn't work with xla	2022-07-06 17:51:06 +00:00
lezcano	eb0889cf7d	Add support for multiple inputs to out_wrapper and strict dtype checking (#80601 ) Reland of https://github.com/pytorch/pytorch/pull/79941 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80601 Approved by: https://github.com/albanD	2022-07-05 12:31:21 +00:00
lezcano	828c787ea9	Make `kl_div` a composite function. (#80334 ) Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285 Fixes https://github.com/pytorch/pytorch/issues/80158 Fixes https://github.com/pytorch/pytorch/issues/78867 Fixes https://github.com/pytorch/pytorch/issues/69230 Supersedes https://github.com/pytorch/pytorch/pull/79007 Supersedes https://github.com/pytorch/pytorch/pull/69212 Supersedes https://github.com/pytorch/pytorch/pull/19659 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334 Approved by: https://github.com/ezyang	2022-07-04 19:33:43 +00:00
PyTorch MergeBot	184a065ba7	Revert "Add support for multiple inputs to out_wrapper and strict dtype checking (#79941 )" This reverts commit `dc7066a8f0`. Reverted https://github.com/pytorch/pytorch/pull/79941 on behalf of https://github.com/suo due to broke master `dc7066a8f0`	2022-06-30 03:29:30 +00:00
lezcano	dc7066a8f0	Add support for multiple inputs to out_wrapper and strict dtype checking (#79941 ) When a function returns multiple parameters in PyTorch, the `out` parameter takes a tuple of tensors (see `linalg.svd` for example). The current implementation in `out_wrapper_multi` modelled this wrong, as it assumed that it would take a number of different named parameters. This PR implements the correct behaviour in `out_wrapper`. As a small side-effect, we now need to call `@out_wrapper()` when the output is just one tensor. This PR also implements an additional optional parameter that checks whether the dtype of the given `out` is exactly the dtype that the meta function requires. This is the behaviour that we currently have in PyTorch, and this check is necessary in eager when we call with these tensors into external libraries. We also make the functions with several outputs return a namedtuple, similar to what we do in PyTorch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79941 Approved by: https://github.com/mruberry, https://github.com/ezyang	2022-06-30 02:47:16 +00:00
Horace He	d43e6c9f4a	Revert "Revert "formatted _decomp folder with black"" This reverts commit `2027eae67c`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79226 Approved by: https://github.com/Krovatkin	2022-06-22 20:47:52 +00:00
Horace He	4193252de9	Revert "Revert "Added kl_div_backward decomp"" This reverts commit `60a13f4ec9`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79225 Approved by: https://github.com/Krovatkin	2022-06-22 18:09:52 +00:00
Horace He	e89676f76c	fix logical_not reland issues Pull Request resolved: https://github.com/pytorch/pytorch/pull/79900 Approved by: https://github.com/ngimel	2022-06-21 03:41:18 +00:00
PyTorch MergeBot	79507d2a9d	error when registering meta kernels to composite ops in core Pull Request resolved: https://github.com/pytorch/pytorch/pull/79741 Approved by: https://github.com/Chillee, https://github.com/albanD	2022-06-21 02:17:13 +00:00
Nikita Shulga	f5eb05f107	Revert "Reland #2 of "Added {logical_not, trace} refs, moved logical ops to use method overloads"" This reverts commit `f3665dd237`. Reverted https://github.com/pytorch/pytorch/pull/79819 on behalf of https://github.com/malfet due to land raced with softshrink refs	2022-06-20 14:22:15 -07:00
Horace He	f3665dd237	Reland #2 of "Added {logical_not, trace} refs, moved logical ops to use method overloads" Pull Request resolved: https://github.com/pytorch/pytorch/pull/79819 Approved by: https://github.com/mruberry	2022-06-20 19:50:43 +00:00
lezcano	16f30b494c	Make l1_loss composite Fixing the forward AD for `sgn` in the next PR of this stack uncovered a number of issues with the derivatives of `l1_loss`. Upon inspection, `l1_loss` was just implemented as a composite function, but it was not differentiable. This PR makes it a fully differentiable function. As a side note, `l1_loss_out` was incorrect in a number of ways. Even more, it is not exposed to the public as `F.l1_loss` does not accept an `out=` parameter. As such it is not even tested. I wonder how useful is to have `out=` variants for loss functions if we don't expose them at all. Even more, I wonder how useful is to have `_out` variants for loss functions, given that their most normal use case is to return just a real number cc jbschlosser Pull Request resolved: https://github.com/pytorch/pytorch/pull/79804 Approved by: https://github.com/zou3519, https://github.com/malfet	2022-06-20 19:10:54 +00:00
Jason Ansel	d2e18606e7	Fix view issue in embedding_dense_backward decomp (#79857 ) I was hitting: ``` File "/home/jansel/pytorch/torch/fx/experimental/proxy_tensor.py", line 66, in proxy_call return CURRENT_DECOMPOSITION_TABLE[func_overload](args, kwargs) File "/home/jansel/pytorch/torch/_decomp/decompositions.py", line 801, in embedding_dense_backward indices_rank1 = indices.view(numel) File "/home/jansel/pytorch/torch/fx/experimental/proxy_tensor.py", line 122, in __torch_dispatch__ return proxy_call(func_overload, args, kwargs) File "/home/jansel/pytorch/torch/fx/experimental/proxy_tensor.py", line 86, in proxy_call real_out = func_overload(args, *kwargs) File "/home/jansel/pytorch/torch/_ops.py", line 49, in __call__ return self._op(args, **kwargs or {}) RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead. ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79857 Approved by: https://github.com/Chillee	2022-06-20 17:58:14 +00:00
PyTorch MergeBot	d4a9438786	Revert "Make l1_loss composite" This reverts commit `61a5c779bf`. Reverted https://github.com/pytorch/pytorch/pull/78257 on behalf of https://github.com/malfet due to This breaks executorch	2022-06-17 18:14:21 +00:00
Ivan Yashchuk	bc1fef96af	Reference implementations for rsqrt and native_layer_norm (#79413 ) This PR adds references for: - `torch.rsqrt` - `torch.native_layer_norm` - `torch.nn.functional.layer_norm` `native_layer_norm` had a different number of dimensions if the input was 0-sized. I fixed that. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79413 Approved by: https://github.com/mruberry, https://github.com/Chillee	2022-06-17 07:24:02 +00:00
Jason Ansel	c8fb02b452	Use amax instead of max for softmax decomps (#79667 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79667 Approved by: https://github.com/Chillee	2022-06-16 04:09:33 +00:00
lezcano	61a5c779bf	Make l1_loss composite Fixing the forward AD for `sgn` in the next PR of this stack uncovered a number of issues with the derivatives of `l1_loss`. Upon inspection, `l1_loss` was just implemented as a composite function, but it was not differentiable. This PR makes it a fully differentiable function. As a side note, `l1_loss_out` was incorrect in a number of ways. Even more, it is not exposed to the public as `F.l1_loss` does not accept an `out=` parameter. As such it is not even tested. I wonder how useful is to have `out=` variants for loss functions if we don't expose them at all. Even more, I wonder how useful is to have `_out` variants for loss functions, given that their most normal use case is to return just a real number cc jbschlosser Pull Request resolved: https://github.com/pytorch/pytorch/pull/78257 Approved by: https://github.com/jbschlosser	2022-06-16 00:03:22 +00:00
PyTorch MergeBot	fefff54cad	Revert "Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads""" This reverts commit `a2d2981e8e`. Reverted https://github.com/pytorch/pytorch/pull/79224 on behalf of https://github.com/suo due to broke lots of things `a2d2981e8e`	2022-06-10 04:40:43 +00:00
Horace He	a2d2981e8e	Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads"" This reverts commit `d67309aefb`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79224 Approved by: https://github.com/mruberry	2022-06-10 03:07:14 +00:00
PyTorch MergeBot	d67309aefb	Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads" This reverts commit `64b6bd8c1e`. Reverted https://github.com/pytorch/pytorch/pull/79000 on behalf of https://github.com/malfet due to Introduces test failure, see https://hud.pytorch.org/pr/79000	2022-06-09 13:11:23 +00:00
PyTorch MergeBot	60a13f4ec9	Revert "Added kl_div_backward decomp" This reverts commit `a08685ebc9`. Reverted https://github.com/pytorch/pytorch/pull/79001 on behalf of https://github.com/malfet due to PR failed in newly added tests, see https://hud.pytorch.org/pr/79001	2022-06-09 13:08:30 +00:00
PyTorch MergeBot	2027eae67c	Revert "formatted _decomp folder with black" This reverts commit `4945c72151`. Reverted https://github.com/pytorch/pytorch/pull/79002 on behalf of https://github.com/janeyx99 due to Broke decomp tests on trunk + also on PR https://hud.pytorch.org/minihud#4945c72151e29cb524974e1714654cf790ddb37d	2022-06-09 12:58:03 +00:00
Horace He	4945c72151	formatted _decomp folder with black Pull Request resolved: https://github.com/pytorch/pytorch/pull/79002 Approved by: https://github.com/ezyang	2022-06-09 07:16:37 +00:00
Horace He	a08685ebc9	Added kl_div_backward decomp Pull Request resolved: https://github.com/pytorch/pytorch/pull/79001 Approved by: https://github.com/ezyang	2022-06-09 07:16:37 +00:00
Horace He	64b6bd8c1e	Added {logical_not, trace} refs, moved logical ops to use method overloads Pull Request resolved: https://github.com/pytorch/pytorch/pull/79000 Approved by: https://github.com/ezyang	2022-06-09 07:16:36 +00:00
Horace He	dc11a5642d	Improved stack ref and added more decomposition annotations Pull Request resolved: https://github.com/pytorch/pytorch/pull/78994 Approved by: https://github.com/mruberry	2022-06-09 03:20:28 +00:00
PyTorch MergeBot	8ce310b943	Revert "Revert "moved logit to use torch ops instead of refs + added a couple more decompositions"" (#79082 ) cc: @osalpekar Pull Request resolved: https://github.com/pytorch/pytorch/pull/79082 Approved by: https://github.com/eellison	2022-06-08 01:44:53 +00:00
PyTorch MergeBot	7d192d48d2	Revert "moved logit to use torch ops instead of refs + added a couple more decompositions" This reverts commit `1d9f445b5d`. Reverted https://github.com/pytorch/pytorch/pull/78984 on behalf of https://github.com/osalpekar due to broke some jobs, like meta functorch builds	2022-06-07 21:51:41 +00:00
Horace He	1d9f445b5d	moved logit to use torch ops instead of refs + added a couple more decompositions Pull Request resolved: https://github.com/pytorch/pytorch/pull/78984 Approved by: https://github.com/ezyang	2022-06-07 05:34:05 +00:00
Horace He	69778ee4eb	Ported nn.functional functions to use torch calls instead of ref calls Pull Request resolved: https://github.com/pytorch/pytorch/pull/78978 Approved by: https://github.com/ezyang	2022-06-07 05:09:05 +00:00
Horace He	e675dbadc4	Ported gelu decomp to ref (#78697 ) Ugh... these are actually so painful to write without operator overloading lol. Decided to just utilize operator overloading, and xfail the ref tests for now. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78697 Approved by: https://github.com/mruberry	2022-06-06 22:30:20 +00:00
Horace He	ea3c4d0c75	Added glu_backward decomp (#78919 ) Requested in https://github.com/pytorch/torchdynamo/issues/327 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78919 Approved by: https://github.com/mruberry	2022-06-06 19:56:01 +00:00
Horace He	080cf84bed	Reland hardtanh ref (again) (#78914 ) Fixes land race between `823ddb6e87` and Ed's stack. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78914 Approved by: https://github.com/wanchaol	2022-06-06 09:39:01 +00:00
PyTorch MergeBot	ddf1930734	Revert "reland Hardtanh ref (#78894 )" This reverts commit `823ddb6e87`. Reverted https://github.com/pytorch/pytorch/pull/78894 on behalf of https://github.com/suo due to this caused unexpected successes on master (lol), search test_python_ref_meta__refs_nn_functional_hardtanh_cpu_bfloat16: `823ddb6e87`	2022-06-06 03:59:53 +00:00
Horace He	823ddb6e87	reland Hardtanh ref (#78894 ) Reland of https://github.com/pytorch/pytorch/pull/78689 cc: @kit1980 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78894 Approved by: https://github.com/kit1980	2022-06-06 02:09:31 +00:00
PyTorch MergeBot	e6cc2e8d38	Revert "Ported hardtanh decomposition to ref (#78689 )" This reverts commit `484282a6fd`. Reverted https://github.com/pytorch/pytorch/pull/78689 on behalf of https://github.com/kit1980 due to test_meta_nn_functional_hardtanh_cuda_float32 failed on both PR and trunk, see `484282a6fd`	2022-06-05 17:46:54 +00:00
Horace He	484282a6fd	Ported hardtanh decomposition to ref (#78689 ) One note: The logic for handling scalar boundary conditions seems to be a bit different than other ops - I simply copied the ATen logic (https://github.com/pytorch/pytorch/blob/hardtanh_ref/aten/src/ATen/native/Activation.cpp#L370). Not sure if it's an inconsistency we should fix. Will add error opinfo after figuring out the scalar boundary condition stuff. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78689 Approved by: https://github.com/mruberry	2022-06-05 11:41:23 +00:00
Horace He	1ea4075bda	Ported t decomp to become a ref (#78686 ) Also added an error input for `t` Pull Request resolved: https://github.com/pytorch/pytorch/pull/78686 Approved by: https://github.com/mruberry	2022-06-03 01:16:20 +00:00
Shangdi Yu	02273f056b	Norm decomposition (#78582 ) A decomposition for torch.ops.aten.norm Pull Request resolved: https://github.com/pytorch/pytorch/pull/78582 Approved by: https://github.com/Chillee	2022-06-02 00:25:43 +00:00
Jason Ansel	dabf8f0569	Populate the torch._decomp table on import (#78476 ) #78041 broke TorchInductor, because of: ``` >>> from torch import _decomp >>> import torch >>> _decomp.get_decompositions([torch.ops.aten.leaky_relu]) {} >>> import torch._refs.nn.functional >>> _decomp.get_decompositions([torch.ops.aten.leaky_relu]) {<OpOverload(op='aten.leaky_relu', overload='default')>: <function leaky_relu at 0x7f5a39b56c10>, <OpOverload(op='aten.leaky_relu', overload='out')>: <function leaky_relu at 0x7f5a39b56c10>} ``` cc @Chillee Pull Request resolved: https://github.com/pytorch/pytorch/pull/78476 Approved by: https://github.com/Chillee	2022-05-31 03:46:38 +00:00
Bairen Yi	b6672b10e1	Fix incorrect decomposition for native_dropout (#77933 ) Quick sanity check: it should be identity function if p=0. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77933 Approved by: https://github.com/Chillee	2022-05-30 20:08:48 +00:00
Aidyn-A	31016eb81e	[primTorch] Elementwise Binary Ops I (#78023 ) This PR is a result of collaboration with @rdspring1 and @mruberry on primTorch. It adds the following prims: - `fmax` - `fmin` - `fmod` And adds the following refs: - `fmax` - `fmin` - `fmod` - `logical_xor` The work is in progress as there are some tests that fail. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78023 Approved by: https://github.com/mruberry	2022-05-26 20:22:27 +00:00
Horace He	ea5d01e629	[Primtorch] Tried porting leaky_relu into a ref (#78041 ) Feels good to delete it from `torch._decomps`. This is mainly to clarify the process for me - Seems like there's still some components missing of the `torch <-> refs` mapping? For example, seems like methods don't work yet for mapping from torch <-> refs, and neither do the meta tests? (cc: @ezyang). If I replace the `torch` with `refs`, then the tests seem to pass. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78041 Approved by: https://github.com/mruberry	2022-05-23 18:00:21 +00:00
Horace He	4428218945	[primtorch] Added `native_group_norm` decomp (#78029 ) cc: @jansel @bertmaher More or less identical in spirit to the layer norm and batch norm ones. One annoying thing about all 3 of these is that layer_norm has slightly different `mean/var` semantics than batch norm and group norm. After normalization, `layer_norm` keeps them unsqueezed (so they're something like [1, 5, 1, 1]) while batch norm and group norm squeeze out the 1-dims. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78029 Approved by: https://github.com/bertmaher	2022-05-21 08:07:02 +00:00
Edward Z. Yang	6b273444c4	Add logit ref; allow non-refs to be called in refs. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77816 Approved by: https://github.com/mruberry	2022-05-21 02:35:14 +00:00
Horace He	64b4bb4b01	Fix meta tests on norm (and relanding norm fixes) (#77930 ) Had a land race with meta tests. Will also be relanding https://github.com/pytorch/pytorch/pull/77407 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77930 Approved by: https://github.com/malfet, https://github.com/ezyang	2022-05-20 23:15:53 +00:00
PyTorch MergeBot	03546e9c07	Revert "Fixed type promotion semantics for native_batch_norm and native_layer_norm (#77407 )" This reverts commit `70d80fb424`. Reverted https://github.com/pytorch/pytorch/pull/77407 on behalf of https://github.com/malfet due to as it broke meta tests ( I guess due to landrace), see `70d80fb424`	2022-05-20 02:31:57 +00:00
Horace He	70d80fb424	Fixed type promotion semantics for native_batch_norm and native_layer_norm (#77407 ) Originally, when these were written, they simply used the naive strategy of "upcast all inputs to floats, and downcast all inputs back". In addition to being... not quite what the kernels did, they also didn't capture some additional semantics. Namely, that the norms (except for layer norm on CPU! cc: @ngimel) return fp32 for the mean and rstd values. Also, folks didn't like that I wrote `native_layer_norm` in terms of `native_batch_norm`. Which is fair - so I refactored the common logic into a `normalize` function. cc: @jansel / @bertmaher , who've been looking at lowering layer norm/batch norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77407 Approved by: https://github.com/bertmaher	2022-05-19 17:11:47 +00:00
Edward Z. Yang	88c89c9eb9	log_sigmoid_forward out support; out_wrapper_multi Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77739 Approved by: https://github.com/mruberry	2022-05-19 14:43:35 +00:00
Edward Z. Yang	4941e72e40	Revert "Revert "Implement sym_sizes to create proper IR for sym ints representing tensor sizes (#76836 )"" This reverts commit `c35bd8d423`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77719 Approved by: https://github.com/Chillee, https://github.com/malfet	2022-05-18 18:40:57 +00:00
Mike Ruberry	580a053832	[primTorch] Enforces stride metadata (#77542 ) This PR... Filed the Following Issues - https://github.com/pytorch/pytorch/issues/77553 - https://github.com/pytorch/pytorch/issues/77526 - https://github.com/pytorch/pytorch/issues/77600 Testing - Updates test_dtypes to no longer attempt to test the backward of sample inputs where no inputs require grad - Adds a new test_python_reference_errors; it ensures the meta operations for references throw errors as expected - Updates compare_tensor_meta to better handle CUDA devices, and (temporarily) restricts stride checking to the CUDA device type - Elementwise unary and elementwise binary operators now have arbitrarily strided reference inputs - Reference inputs for _like functions are added - An OpInfo for torch.empty is added - Reference inputs for torch.clone are added - A NumPy reference for clone is added - Adds OpInfos for refs.empty and refs.empty_like Prims - Renames the "max" and "min" prims to "maximum" and "minimum," respectively, to better conform to their ATen names - Adds the empty, empty_like, full, and full_like prims - Fixes the elementwise meta function's stride propagation - Fixes clone's meta function's stride propagation - Fixes convert_element_type's meta's stride propagation - Adds a (temporary) _to_dtype private prim that casts a tensor while preserving its stride permutation - Removes the _set prim comment - Adds utils.compute_elementwise_output_strides, which computes the correct output strides for elementwise operations - Corrects an issue where utils.make_contiguous_strides_for was creating the incorrect strides for tensors with no elements References - Adds the empty, empty_like, full, full_like, and ones_like refs - Extends make_elementwise_unary_reference to accept an additional callable to perform extra input validation - Adds an extra validation function to handle refs.neg(BoolTensor) - Updates the isfinite ref to call ones_like when appropriate - Models Python scalar handling for elementwise binary operations - Added a 64 dim check for the amin and amax references - opmath is now a flag that can be set separately for cpu and CUDA Pull Request resolved: https://github.com/pytorch/pytorch/pull/77542 Approved by: https://github.com/ezyang	2022-05-18 13:57:26 +00:00
PyTorch MergeBot	48581d74ad	Revert "Add dispatch mode testing for meta tensors and other stuff" This reverts commit `c1cdb1216b`. Reverted https://github.com/pytorch/pytorch/pull/77477 on behalf of https://github.com/malfet	2022-05-18 02:56:48 +00:00
Edward Z. Yang	c1cdb1216b	Add dispatch mode testing for meta tensors and other stuff We don't have any coverage for meta tensor correctness for backwards because torch function mode can only allow us to interpose on Python torch API calls, but backwards invocations happen from C++. To make this possible, I add torch_dispatch_meta test which runs the tests with __torch_dispatch__ While doing this, I needed to generate fresh expected failure / skip lists for the new test suite, and I discovered that my original scaffolding for this purpose was woefully insufficient. So I rewrote how the test framework worked, and at the same time rewrote the __torch_function__ code to also use the new logic. Here's what's new: - Expected failure / skip is now done on a per function call basis, rather than the entire test. This means that separate OpInfo samples for a function don't affect each other. - There are now only two lists: expect failure list (where the test consistently fails on all runs) and skip list (where the test sometimes passes and fails). - We explicitly notate the dtype that failed. I considered detecting when something failed on all dtypes, but this was complicated and listing everything out seemed to be nice and simple. To keep the dtypes short, I introduce a shorthand notation for dtypes. - Conversion to meta tensors is factored into its own class MetaConverter - To regenerate the expected failure / skip lists, just run with PYTORCH_COLLECT_EXPECT and filter on a specific test type (test_meta or test_dispatch_meta) for whichever you want to update. Other misc fixes: - Fix max_pool1d to work with BFloat16 in all circumstances, by making it dispatch and then fixing a minor compile error (constexpr doesn't work with BFloat16) - Add resolve_name for turning random torch API functions into string names - Add push classmethod to the Mode classes, so that you can more easily push a mode onto the mode stack - Add some more skips for missing LAPACK - Added an API to let you query if there's already a registration for a function, added a test to check that we register_meta for all decompositions (except detach, that decomp is wrong lol), and then update all the necessary sites to make the test pass. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77477 Approved by: https://github.com/zou3519	2022-05-18 00:18:34 +00:00
Horace He	8626f76555	Add trace and log_sigmoid_forward decomps (#77329 ) Main question mark is that `log_sigmoid_forward` uses `acc_t` instead of `opmath_t` - not sure if we have a decorator today for that? Glad to add one if we don't. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77329 Approved by: https://github.com/ezyang	2022-05-13 04:55:52 +00:00
Edward Z. Yang	d5ed73badd	Make it possible to register decompositions to Meta key Decompositions can be used to fill in meta support where necessary, assuming the operations they decompose to support meta key. This PR adds register_meta kwarg to register_decomposition that optionally lets you register the meta to the C++ dispatch table for meta tensors. I use this to then get the meta function for where and huber_loss for free. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/77353 Approved by: https://github.com/mruberry	2022-05-12 23:20:16 +00:00
Horace He	c25bdeea26	Added logsumexp decomposition (#77219 ) Pretty simple. cc: @jansel who mentioned this. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77219 Approved by: https://github.com/jansel	2022-05-12 02:01:31 +00:00
samdow	d694cf60fe	add decomposition for nll_loss2d_backward (#77198 ) Adds a decomposition for `nll_loss2d_backward` This will let us actually run all the tests for jvpvjp ([see this functorch PR](https://github.com/pytorch/functorch/pull/792)). I confirmed locally that this made those tests pass too Pull Request resolved: https://github.com/pytorch/pytorch/pull/77198 Approved by: https://github.com/Chillee	2022-05-11 20:41:20 +00:00
Mike Ruberry	bb8baea932	[primTorch] flatten, squeeze, unsqueeze... (#77043 ) This PR ... Makes the following testing changes: - Updates stride testing in test_python_reference_consistency to only check strides of dimensions with length > 1 - Creates reference inputs for reshape - Creates reference inputs for chunk - Extends the sample inputs for unsqueeze - Extends the sample inputs for stack -- test_conj_view and test_neg_view are now xfailed - https://github.com/pytorch/pytorch/issues/77046 Makes the following architecture changes: - Adds the refs.special (sub)module - Adds the refs.nn.functional (sub)module Adds the following prims: - expand_dims - view_of - rev - clone Adds the following references: - flatten - squeeze - unsqueeze - special.i0e - special.i1e - logical_or - logical_and - isclose - flip - stack - nn.functional.elu - chunk - clone - narrow Identifies the following bugs in PyTorch today: - https://github.com/pytorch/pytorch/issues/77054 - https://github.com/pytorch/pytorch/issues/77055 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77043 Approved by: https://github.com/ngimel	2022-05-09 11:24:55 +00:00
Mike Ruberry	c031643e39	Adds decorators for Python References and extends Python Reference testing (#76945 ) This PR does the following... Tests: - fixes test_type_promotion in test_binary_ufuncs to correctly generate scalar cpu tensors - fixes test_python_reference_consistency to use the Python Reference's reference inputs - extends Python reference testing to test_conj_view, test_neg_view, and test_neg_conj_view - adds a NaN propagation sample input for elementwise unary and binary operations - fixes the UnaryUfuncInfo class to properly register its reference inputs - Updates the Python Reference OpInfos to skip error inputs when their behavior on scalar inputs is inconsistent with their reference operators Code organization: - moves elementwise type promotion functionality to prims.utils Prims & Refs: - fixes scalar cpu tensor handling by having them pass through broadcasting and device and shape checks - adds two decorators, `elementwise_type_promotion_wrapper` and `out_wrapper`, the former allows for elementwise type promotion to be automated and the latter automatically adds the out kwarg and handles it properly cc @ezyang who also had some thoughts on cpu scalar tensor handling cc @chillee -- might want to use this new decorator as we converge decompositions and references Pull Request resolved: https://github.com/pytorch/pytorch/pull/76945 Approved by: https://github.com/ngimel	2022-05-07 03:42:24 +00:00
Edward Z. Yang	f2eed9400d	Register PrimTorch refs as decompositions. For the most part, PrimTorch refs have the same signature as their ATen equivalents. I modify most PrimTorch refs to register themselves as decompositions, using the prim name they wrap to find the aten name (except for a few cases where the prim/aten names mismatch). There are some exclusions, falling into one of two categories: - The torch equivalent was already implemented as a CompositeImplicitAutograd decomposition in C++ - The ref doesn't support enough features (e.g., the real deal has more kwargs / overloads than are currently implemented) PrimTorch refs are written as a single function that supports all overloads, and this style is convenient for cases where we have a bundle of overloads for what morally is a single overload with a Union type on an argument (which we ought to have supported in native_functions.yaml but blah); to support registering a single decomp for all the overloads, we modify register_decomposition to register to ALL overloads if you pass it an overload packet. This is technically BC breaking but no tests started failing because of it. Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76835 Approved by: https://github.com/Chillee, https://github.com/mruberry	2022-05-06 20:11:45 +00:00
Horace He	e9f34931ef	Add some shape decomps (t, transpose, rot90, stack) Also fixes xlogy (turns out the only thing it was missing was a type cast annotation! nice!) I also renamed `canonicalize_idx` => `canonicalize_dim` (to align with `canonicalize_dims`) and fixed a bug in it (cc: @mruberry) Pull Request resolved: https://github.com/pytorch/pytorch/pull/76873 Approved by: https://github.com/mruberry	2022-05-06 02:40:57 +00:00
Horace He	6917034afb	Added logit/reciprocal decomps, fixed var for complex, moved type promotion logic to standardize on primtorch's Pull Request resolved: https://github.com/pytorch/pytorch/pull/76633 Approved by: https://github.com/ezyang	2022-05-04 21:29:52 +00:00
PyTorch MergeBot	ce63c53c9b	Revert "Add binary_cross_entropy and trace decomp - fixed _log_softmax/_softmax dtype promotion semantics" This reverts commit `8a3e9255ea`. Reverted https://github.com/pytorch/pytorch/pull/76670 on behalf of https://github.com/mruberry	2022-05-04 10:42:39 +00:00
Horace He	ed18181d83	Added gelu decomposition ^ Pull Request resolved: https://github.com/pytorch/pytorch/pull/76763 Approved by: https://github.com/ezyang	2022-05-03 23:23:18 +00:00
Horace He	8a3e9255ea	Add binary_cross_entropy and trace decomp - fixed _log_softmax/_softmax dtype promotion semantics cc: @zou3519 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76670 Approved by: https://github.com/ezyang	2022-05-03 18:20:17 +00:00
Horace He	fb24614011	Port functorch decomps over and fix some tests Still some stuff to fix up, will finish later. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76621 Approved by: https://github.com/ezyang	2022-05-01 08:48:48 +00:00
Edward Z. Yang	a3f10ec281	Move functorch decompositions to PyTorch Signed-off-by: Edward Z. Yang <ezyangfb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/76311 Approved by: https://github.com/Chillee	2022-04-30 16:47:53 +00:00

350 Commits