Commit Graph

439 Commits

Author SHA1 Message Date
lezcano
fd27246c16 Fix decomposition for std (#87181)
The previous implementation was lacking a few features and incurred on a
pretty large error

cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181
Approved by: https://github.com/ngimel, https://github.com/peterbell10
2022-10-28 00:50:29 +00:00
Natalia Gimelshein
f1b78224ca Fix type promotion for 2 wrapped scalar args (#87845)
Fixes #76801

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87845
Approved by: https://github.com/SherlockNoMad, https://github.com/mruberry
2022-10-27 15:53:11 +00:00
Nikita Karetnikov
59b9d29260 [primTorch] Check error_regex in test_python_ref_errors (#86987)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86987
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-26 23:34:34 +00:00
Bin Bao
2c1efe7472 Enable some PyTorch core tests with inductor (#87490)
Summary:
1) Graph break on torch.random.set_rng_state since it blocks running
inductor core tests;
2) Add several inductor-specific skips;
3) Enable several core tests for inductor CI;

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87490
Approved by: https://github.com/eellison
2022-10-26 18:58:33 +00:00
Sherlock Huang
eb99c1efce Prefer python meta function over c++ meta function (#87426)
This is a policy update for meta registration. **We now prefer python meta implementation over C++ meta function.**  This is a flip of the previous policy, where we prefer C++ meta function over python meta function if they both exist.

Here's the meta registration process:
1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`.  However, they will NOT register them into dispatcher.
2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd.
3. We will unconditionally register all of them into python dispatcher. And register them into C++ dispatcher, unless it one of the following 3 cases
- 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta
- 2. the op is a view op, as the MetaTensor doesn't support aliased storage
- 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op)

Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads is using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5

cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426
Approved by: https://github.com/ezyang, https://github.com/jansel
2022-10-25 16:49:02 +00:00
Nikita Karetnikov
1b8af28fe8 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-20 12:29:04 +00:00
PyTorch MergeBot
cd21613526 Revert "[primTorch] Add refs for softmax, softmin, log_softmax (#84956)"
This reverts commit c09ca93e47.

Reverted https://github.com/pytorch/pytorch/pull/84956 on behalf of https://github.com/ZainRizvi due to This is causing the MPS test test_output_match_log_softmax_with_dtype_cpu_float32 (__main__.TestConsistencyCPU) to fail
2022-10-19 20:36:55 +00:00
Nikita Karetnikov
c09ca93e47 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-19 18:45:40 +00:00
Nikita Karetnikov
b886cd15f5 [primTorch] Add a ref for NumPy-style T (#86850)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86850
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-18 10:19:47 +00:00
Nikita Karetnikov
841995d53b [primTorch] Add refs for data conversion ops (#86561)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86561
Approved by: https://github.com/lezcano, https://github.com/mruberry, https://github.com/zou3519
2022-10-18 08:38:51 +00:00
Sean Ross-Ross
1bb609ad47 Added new test test_compare_cpu that checks if cpu and gpu results are consistent (#85011)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85011
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-14 20:15:16 +00:00
Ivan Yashchuk
fd80684784 Add nvFuser support for torch.Tensor.view (#84634)
This is an alternative to https://github.com/pytorch/pytorch/pull/83739. While PrimTorch has `view` as a reference, we would like to use nvFuser's implementation for `view` for now. Later we might transition to PrimTorch's `torch._refs.view`.

See `test_nvprims_view` for examples of things that are now sent to nvFuser. Note that nvFuser's `view` is a copy-like operation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84634
Approved by: https://github.com/kevinstephano, https://github.com/mruberry
2022-10-14 12:08:02 +00:00
Brian Hirsh
0feccda7d7 fix aliasing bug in pixel shuffle/unshuffle (#86608)
Fixes https://github.com/pytorch/pytorch/issues/82235

cc @albanD - `at::pixel_shuffle` and `at::pixel_unshuffle` advertise as being non-aliasing, but they have a C++ decomposition that internally uses reshape(), which means that it might return an alias.

I happened to notice this because a bunch of tests in `test/test_ops.py` failed when I ran locally with a `DEBUG=1` build.

(P.S.: when are we finally gonna get a debug build test in CI? 😃)

I fixed by adding an extra clone, which... is going to be an unnecessary perf hit in the case where the `reshape()` already properly cloned the input. My hope is that this is fine, because this only impacts the composite kernel- we already have a "fast" CPU kernel that does the right thing. Is `pixel_shuffle/unshuffle` commonly used with cuda? Maybe we should just add a fast cuda kernel for it if that's the case.

Alternatively, it seems like it would be nice if `reshape()` accepted an optional argument to unconditionally return a copy. That seems like a rabbit hole that isn't worth going down for now though - I remember a discussion a while ago about making `reshape()` copy-on-write

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86608
Approved by: https://github.com/albanD
2022-10-13 14:14:26 +00:00
Peter Bell
73c43ce2e2 Display unexpected exceptions raised from test_dtypes (#86599)
Currently `test_dtypes` swallows all exceptions which can make debugging failures more tricky.
This changes the test to save the exceptions and print only the unexpected ones at the end e.g.
```
AssertionError: The supported dtypes for nn.functional._scaled_dot_product_attention on device type cuda are incorrect!
The following dtypes did not work in backward but are listed by the OpInfo: {torch.bfloat16}.
Unexpected failures raised the following errors:
torch.bfloat16 - CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling [...]
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86599
Approved by: https://github.com/mruberry
2022-10-12 19:51:58 +00:00
Nikita Karetnikov
d56017a14f [primTorch] Add ref for triplet_margin_loss, improve triplet_margin_with_distance_loss (#85614)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85614
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-12 18:37:58 +00:00
Khushi
2344135179 [primTorch] special: entr, expit (#86592)
Add _refs for `entr` & `expit`.

cc @mruberry @kshitij12345!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86592
Approved by: https://github.com/mruberry
2022-10-12 07:00:40 +00:00
Elias Ellison
b409d1f65b Turn on Data Dependent Throwing (#86480)
This was already enabled in TorchDynamo, but was staged to make sure things don't break. Also makes backward single threaded for tests to fix a memory leak.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86480
Approved by: https://github.com/bdhirsh
2022-10-10 21:58:29 +00:00
Elias Ellison
d3f7c34cb3 Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-08 05:12:42 +00:00
PyTorch MergeBot
7ec12a559c Revert "Enable aten-aten decomps (#85921)"
This reverts commit 62e4f51efd.

Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk 62e4f51efd
2022-10-08 01:59:54 +00:00
Elias Ellison
62e4f51efd Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-07 21:04:39 +00:00
Elias Ellison
9ceadcadb2 Fix unfold backward decomp aliasing for 0 dim input (#86428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86428
Approved by: https://github.com/ngimel, https://github.com/ezyang
2022-10-07 03:55:31 +00:00
lezcano
c609768896 Add refs for torch.unfold and a decomposition for its backward. (#85629)
It's not clear to me what's the difference between `unfold` and `unfold_copy`, as this latter one is codegen'd

I also took this chance to clean the implementation of unfold and its reference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85629
Approved by: https://github.com/mruberry
2022-10-05 12:15:49 +00:00
Elias Ellison
6a2b12dd65 Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471
Approved by: https://github.com/ezyang
2022-09-28 23:06:59 +00:00
Elias Ellison
0b93afb112 add amp tests (#85434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434
Approved by: https://github.com/ngimel
2022-09-28 19:34:46 +00:00
samdow
18d8c548f4 [Modes] remove enable and rewrite mode stack (squashed) (#84774)
Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}

This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily

Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup

### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like

```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```

Many users were getting error from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation"  step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```

** Technical Details **
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-09-27 01:04:35 +00:00
Elias Ellison
bcc544e9d7 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-26 17:08:14 +00:00
PyTorch MergeBot
d10de31cc8 Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 78afa0cf0c.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk 78afa0cf0c
2022-09-23 17:21:43 +00:00
PyTorch MergeBot
eb570ab7d0 Revert "add amp tests (#85434)"
This reverts commit c2f4bbe669.

Reverted https://github.com/pytorch/pytorch/pull/85434 on behalf of https://github.com/clee2000 due to broke rocm and slow tests on trunk c2f4bbe669
2022-09-23 17:19:06 +00:00
PyTorch MergeBot
3b195fd33e Revert "Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)"
This reverts commit 1e92eb8068.

Reverted https://github.com/pytorch/pytorch/pull/85471 on behalf of https://github.com/clee2000 due to stacked prs https://github.com/pytorch/pytorch/pull/85417 and https://github.com/pytorch/pytorch/pull/85434 broke trunk, reverting this so i can revert the others
2022-09-23 17:13:35 +00:00
Elias Ellison
1e92eb8068 Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471
Approved by: https://github.com/ezyang
2022-09-23 16:02:15 +00:00
Elias Ellison
c2f4bbe669 add amp tests (#85434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434
Approved by: https://github.com/ngimel
2022-09-23 15:57:37 +00:00
Elias Ellison
78afa0cf0c Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-23 15:50:03 +00:00
PyTorch MergeBot
5043457a8e Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 9c77083965.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) 9c77083965
2022-09-22 15:44:38 +00:00
Elias Ellison
9c77083965 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-22 13:03:57 +00:00
Thomas Viehmann
764cba6848 add Python ref for isreal (#85361)
Dipping my toes into prims waters

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85361
Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-09-21 18:53:34 +00:00
Ivan Yashchuk
35943f30cb Reference implementation for torch.Tensor.sum_to_size (#85338)
New ref: `torch._refs.sum_to_size`.

View consistency validation is disabled because the ref returns a view instead of returning the input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85338
Approved by: https://github.com/mruberry
2022-09-21 18:12:52 +00:00
Horace He
2f4a517d67 Ported matmul compositeimplicitautograd impl into core (#85239)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85239
Approved by: https://github.com/ezyang, https://github.com/lezcano
2022-09-21 09:25:24 +00:00
Elias Ellison
a3afb2c2f6 Fake: fix conv_transpose2d striding (#82846)
The output striding channels-last preservation logic differs between cuda and cpu. For the meta kernel, we can peek at the fake tensor device and use that to determine whether to do cpu or cuda.

You could argue there's a leaking of abstraction here but this seems like a pretty minimal leak and I'm not sure there's a much cleaner way forward for device-specific striding tracing logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82846
Approved by: https://github.com/ezyang
2022-09-20 18:00:59 +00:00
lezcano
5dd9610e9d Refs and decompositions for index_{add,copy,select,fill} (#85002)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002
Approved by: https://github.com/ngimel
2022-09-17 19:57:34 +00:00
PyTorch MergeBot
e33b464ffc Revert "Refs and decompositions for index_{add,copy,select,fill} (#85002)"
This reverts commit 2f0b3de443.

Reverted https://github.com/pytorch/pytorch/pull/85002 on behalf of https://github.com/huydhn due to Broke trunk slow tests
2022-09-17 04:26:04 +00:00
lezcano
2f0b3de443 Refs and decompositions for index_{add,copy,select,fill} (#85002)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002
Approved by: https://github.com/ngimel
2022-09-16 23:59:35 +00:00
Horace He
4bdc0af53d Added support for symbolic is_contiguous (#84829)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84829
Approved by: https://github.com/ezyang
2022-09-16 04:54:01 +00:00
Sherlock Huang
17925122d0 Rewrite new_zeros, new_ones, new_full decomp with aten.full (#84946)
We should **NOT**  introducing non-functional op for decomps of functional op.

For example
```
make_fx(functionalize(lambda x: x.new_zeros(3)), decomposition_table=decomposition_table)(x)
```
is producing
```
def forward(self, x_1):
    empty = torch.ops.aten.empty.memory_format([3, 4], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False)
    zero_ = torch.ops.aten.zero_.default(empty);  empty = None
    return zero_
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84946
Approved by: https://github.com/ngimel
2022-09-15 05:45:40 +00:00
Ivan Yashchuk
6750946b82 Skip validate_view_consistency for nvFuser tests (#84858)
nvFuser's execute function always returns a copy for now.

Ref. https://github.com/pytorch/pytorch/pull/84629#discussion_r966375582
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84858
Approved by: https://github.com/mruberry, https://github.com/ngimel
2022-09-14 12:03:11 +00:00
Ryan Spring
d09e8b23bf [primTorch] Add repeat and unfold_copy references (#81374)
Add References:

- repeat
- unfold
- expand_as
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81374
Approved by: https://github.com/mruberry, https://github.com/ngimel
2022-09-12 22:19:06 +00:00
kshitij12345
4f6027b78a [opinfo] narrow: add new sample for Tensor overload (#84785)
`narrow` accepts `start` argument to be a Tensor. We add a sample to test this overload.

NOTE: This leads to a bunch of failed tests and hence the skips and xfails
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84785
Approved by: https://github.com/zou3519
2022-09-12 16:59:08 +00:00
Elias Ellison
15c5baf878 Throw on data dependent ops (#83567)
Previously, we would trace through the following with no error:
```
from torch.fx.experimental.proxy_tensor import make_fx
import torch

def f(x, y):
    return x[0, y:]
```

Even though the output shape is dependent on the data of `y`.  Now, throw on the conversion of `y` to an integer.

It would be nice to not break on constant tensors but I'll do that as the next PR (Edit: done with https://github.com/pytorch/pytorch/pull/84387). Sketching out how that would work (and keep in mind this is applicable Dynamo tracing and not just AOT Autograd)

I think to do that you would need to :
- hold strong refs to a set of constant tensors, and only allow them to be captured from `lift_fresh.copy`
- when you run a mutable op, either remove it from the set of constant tensors or run the operator for real
- limit to small constant tensors
Anything else ?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83567
Approved by: https://github.com/ezyang
2022-09-07 02:37:00 +00:00
Nikita Karetnikov
85b889fa5f [primTorch] Add ref for poisson_nll_loss (#83805)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83805
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-31 17:39:34 +00:00
Nikita Karetnikov
305af90d0f [primTorch] Add docstring and promotion for l1_loss ref (#83803)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83803
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-31 17:39:31 +00:00
Elias Ellison
9c452abcf1 Use reentrant mode when invoking prims, delete global prim_fake_mode (#84090)
Maybe I should be using the meta_impl instead of the prim_impl, but it's not terribly clear why, since the prim impl will be better tested and should work under the re-entrant FakeTensorMode.

Fixes https://github.com/pytorch/pytorch/issues/78613 in the process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84090
Approved by: https://github.com/ezyang, https://github.com/samdow
2022-08-31 01:58:44 +00:00
samdow
7532d5b125 [Modes] remove inner constructor kwarg (#83925)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83925
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-08-31 00:05:56 +00:00
jjsjann123
b078d242c4 Nvfuser to copy decomp to prim (#83782)
Conditional decomposing aten::_to_copy to nvprim::convert_element_type to allow fusion with type casting, which is introduced during type promotion phase at torch decomposition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83782
Approved by: https://github.com/ngimel
2022-08-28 04:26:36 +00:00
Horace He
9a236c7ab4 Made some minor cleanups to decompositions (#83814)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83814
Approved by: https://github.com/ngimel
2022-08-26 10:55:31 +00:00
jjsjann123
1407e6728c Nvfuser python api patch take 2 (#83684)
landing #83645 again.

Previously we are breaking on codegen bf16 kernel for cuda TK 10.2. Added a short-cut to disable bf tests on pre cuda 11 build.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83684
Approved by: https://github.com/ngimel
2022-08-19 16:05:39 +00:00
Nikita Karetnikov
1a49eea301 [primTorch] Add ref for diag_embed (#82322)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82322
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-17 20:32:56 +00:00
Fabio Rocha
2a096e940d [primTorch] support for a few magic methods (#83524)
Added support for mapping __rsub__, __rtruediv__,
__rfloordiv__, __floordiv__, __pow__,
and __rpow__ in TorchRefsMode.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83524
Approved by: https://github.com/ngimel
2022-08-17 09:48:15 +00:00
Nikita Karetnikov
b156f3329e [primTorch] Add ref for movedim (#83278)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83278
Approved by: https://github.com/ngimel
2022-08-16 18:38:28 +00:00
Ivan Yashchuk
2e8e386d6f Add refs for real and imag to __all__ (#83057)
`imag` and `real` were missing from the ref's `__all__` list.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83057
Approved by: https://github.com/ngimel
2022-08-16 13:40:43 +00:00
soulitzer
ba53efa6e7 Unskip CompositeCompliance tests for ARM (#83089)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83089
Approved by: https://github.com/albanD
2022-08-11 20:01:51 +00:00
Peter Bell
5e3d1ef49f Allow ufunc OpInfos to have no reference (#82348)
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
2022-08-09 04:38:17 +00:00
PyTorch MergeBot
814c19b266 Revert "Allow ufunc OpInfos to have no reference (#82348)"
This reverts commit 566d734396.

Reverted https://github.com/pytorch/pytorch/pull/82348 on behalf of https://github.com/peterbell10 due to This stack broke macos tests on trunk
2022-08-06 21:09:09 +00:00
Peter Bell
566d734396 Allow ufunc OpInfos to have no reference (#82348)
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
2022-08-06 20:01:39 +00:00
albanD
2255911f8a Make M1 tests green (#82213)
This is skipping all the failing tests and add a new master job to test on M1

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82213
Approved by: https://github.com/seemethere, https://github.com/soulitzer, https://github.com/malfet
2022-08-05 16:12:08 +00:00
Peter Bell
4d405517e4 Move OpInfo class into new opinfo folder (#82540)
Ref #82518

Starting small to minimize merge conflicts, this moves the top-level
class definitions and some helper functions into the `opinfos` folder.
It also brings `common_methods_invocations.py` to just below 1MB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82540
Approved by: https://github.com/albanD
2022-08-05 15:10:17 +00:00
Fabio Rocha
ff753cbc12 [primTorch] Added unbind OpInfo and ref (#81776)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81776
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-04 17:03:24 +00:00
Natalia Gimelshein
112ec24f09 Fix device behavior for masked_fill (#82737)
Fixes #81018, based on #81036.
It will create graph break for cpu 0d tensor value due to .item() call (we could maybe specialize on that instead of breaking?), but otherwise it would create graph break due to synchronizing `to` call, so there's no way around :-(, and for number `value` argument we already should be specializing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82737
Approved by: https://github.com/Chillee
2022-08-04 15:47:56 +00:00
Fabio Rocha
22fea8f654 [primTorch] Added reference for unflatten (#81231)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81231
Approved by: https://github.com/ngimel
2022-08-03 15:20:46 +00:00
Elias Ellison
9b46737fca Add tests for fake tensor striding (#82571)
Add tests for fake tensor striding in OpInfos. I know primtorch is not strictly committing to consistent stride propagation with ATen (see https://github.com/pytorch/pytorch/issues/78050), where as in fake tensor/meta the goal is be completely consistent. This is a little awkward because by default prim refs will register a meta implementation.

In any case, I think we can add the tests for fake with a disclaimer in the tests the failure is non-blocking for adding prims. At least as far as OpInfo tests get, the prims seem to do a pretty good job with stride propagation already.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82571
Approved by: https://github.com/ezyang
2022-08-01 22:01:23 +00:00
Elias Ellison
b2f6aa666e Add tests for aliasing in fake tensor (#82337)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82337
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-08-01 21:58:54 +00:00
Elias Ellison
642aed8b99 Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449)
From PR:
```
Note: [Fake Tensor Dispatch Keys]
In order to model the behavior of device-specific autocast
and autograd logic, we update the dispatch keys of FakeTensors
to reflect their fake device. This includes the BackendComponent
(DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent
related Autocast and Autograd keys. __torch__dispatch__ sits below
Autocast and Autograd, and is only invoked when we are at the
kernel for the BackendComponent. Then, we add Meta to the
thread-local dispatch include set to hit the meta kernel
instead of the kernel of the BackendComponent for the fake device.
```

Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge.

See: https://github.com/pytorch/pytorch/issues/81608

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449
Approved by: https://github.com/ezyang
2022-08-01 21:40:36 +00:00
soulitzer
16093a1d81 Fix primtorch out_wrapper semantics for factory functions (#82375)
This PR:
- introduces new OpInfo attribute `is_factory_function`
- updates OpInfo test_out to handle case when `is_factory_function=True`:
- correct primtorch out_wrapper
- update sample inputs for arange, linspace, logspace to not explicitly pass in dtype or device (having this sample is necessary for the test to get triggered)

Fixes https://github.com/pytorch/pytorch/issues/82364

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82375
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-07-29 00:57:57 +00:00
Elias Ellison
688b971876 Extend fake tensor tests to cuda, add support for index put (#82281)
Testing CUDA exposes some failures, such as `index_put` with CUDA self tensor and cpu value tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82281
Approved by: https://github.com/ezyang
2022-07-28 16:07:15 +00:00
Edward Z. Yang
3f740f6d7f Move test_dtypes so it runs later (#82169)
The error messages it gives are very unhelpful (because a failure
gets translated into "dtype was not supported" rather than the
actual backtrace), so I'd rather get error messages about this after
I've tested basic functionality.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82169
Approved by: https://github.com/zou3519, https://github.com/Chillee
2022-07-27 18:08:17 +00:00
soulitzer
80e2d5704b Add OpInfo and ref for linspace and logspace (#81826)
Implements linspace with arange, and logspace with linspace.
- Implements a more precise path in linspace's ref when dtype is integral to avoid off-by-one issues when output of computation is casted to int. The trade off is that there's an increased chance of overflow.
- Files several issues #82242, #82230, #81996, on preexisting issues with the linspace and logspace. These mainly concern when dtype is integral - the affect tests are xfailed in this PR.
- Fixes the check that the reference implementation is closer to precise implementation than torch implementation to also update the dtype kwarg to the precise dtype.

TODO:
- ~support negative bases~ (not in this PR)
- ~support complex. Since arange does not support complex, but linspace does, one solution is to just call linspace separately on the real and imag components and sum the results in the end~ (not in this PR)
- ~default dtypes need to be explicitly handled since computation is done in a different dtype than result~ (done)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81826
Approved by: https://github.com/ngimel
2022-07-27 05:53:06 +00:00
Ryan Spring
801f0d24bb [primTorch] Add rsub reference (#80421)
Add Reference:
- rsub

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80421
Approved by: https://github.com/mruberry
2022-07-26 20:31:44 +00:00
lezcano
11fe277b62 [PrimTorch] Add reference for torch.norm (#81765)
This ref does more things than `torch.norm`, and it fixes a few bugs
that `torch.norm` has. This implementation and the `torch.norm`
implementation come to terms in the next PR of this stack

We put this PR before, as otherwise `test_decomp.py` was failing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81765
Approved by: https://github.com/ngimel
2022-07-25 19:57:21 +00:00
samdow
2ac24675cc get rid of push_torch_{dispatch, function}_mode (#78215)
Currently we have 2 ways of doing the same thing for torch dispatch and function modes:
`with push_torch_dispatch_mode(X)` or `with X.push(...)`
is now the equivalent of doing
`with X()`

This removes the first API (which is older and private so we don't need to go through a deprecation cycle)

There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed.

EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215
Approved by: https://github.com/ezyang
2022-07-22 18:56:37 +00:00
soulitzer
f595467e5c Reenable slow gradcheck and make it pass (#80514)
Context: For a while slow gradcheck CI was skipping nearly all tests and this hid the fact that it should've been failing and timing out (10+h runtime for TestGradients). The CI configuration has since been fixed to correct this, revealing the test failures. This PR reenables slow gradcheck CI and makes it pass again.

This PR:
- makes slow and failing tests run in fast gradcheck mode only
- reduce the input size for slow gradcheck only for unary/binary ufuncs (alternatively, skip the test entirely)
- skip entire test files on slow gradcheck runner if they don't use gradcheck (test_ops, test_meta, test_decomp, test_ops_jit)
- reduces the input size for some ops

Follow ups:
1. Investigate slow mode failures https://github.com/pytorch/pytorch/issues/80411
2. See if we can re-enable slow gradcheck tests for some of the slow tests by reducing the sizes of their inputs

The following are failing in slow mode, they are now running in fast mode only.
```
test_fn_fwgrad_bwgrad___rmod___cuda_float64
test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_complex128
test_fn_fwgrad_bwgrad__masked_prod_cuda_complex128
test_fn_fwgrad_bwgrad__masked_prod_cuda_float64
test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_complex128
test_fn_fwgrad_bwgrad_cat_cuda_complex128
test_fn_fwgrad_bwgrad_linalg_lu_factor_ex_cuda_float64
test_fn_fwgrad_bwgrad_copysign_cuda_float64
test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_complex128
test_fn_fwgrad_bwgrad_float_power_cuda_complex128
test_fn_fwgrad_bwgrad_fmod_cuda_float64
test_fn_fwgrad_bwgrad_float_power_cuda_float64
test_fn_fwgrad_bwgrad_linalg_lu_cuda_float64
test_fn_fwgrad_bwgrad_remainder_cuda_float64
test_fn_fwgrad_bwgrad_repeat_cuda_complex128
test_fn_fwgrad_bwgrad_prod_cuda_complex128
test_fn_fwgrad_bwgrad_slice_scatter_cuda_float64
test_fn_fwgrad_bwgrad_tile_cuda_complex128
test_fn_fwgrad_bwgrad_pow_cuda_float64
test_fn_fwgrad_bwgrad_pow_cuda_complex128
test_fn_fwgrad_bwgrad_fft_*
test_fn_fwgrad_bwgrad_zero__cuda_complex128
test_fn_gradgrad_linalg_lu_factor_cuda_float64
test_fn_grad_div_trunc_rounding_cuda_float64
test_fn_grad_div_floor_rounding_cuda_float64
```

Marks the OpInfos for the following ops that run slowly in slow gradcheck as `fast_gradcheck` only (the left column represents runtime in seconds):
```
0  918.722  test_fn_fwgrad_bwgrad_nn_functional_conv_transpose3d_cuda_float64
1  795.042  test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_complex128
2  583.63  test_fn_fwgrad_bwgrad_nn_functional_max_pool3d_cuda_float64
3  516.946  test_fn_fwgrad_bwgrad_svd_cuda_complex128
4  503.179  test_fn_fwgrad_bwgrad_linalg_svd_cuda_complex128
5  460.985  test_fn_fwgrad_bwgrad_linalg_lu_cuda_complex128
6  401.04  test_fn_fwgrad_bwgrad_linalg_lstsq_grad_oriented_cuda_complex128
7  353.671  test_fn_fwgrad_bwgrad_nn_functional_max_pool2d_cuda_float64
8  321.903  test_fn_fwgrad_bwgrad_nn_functional_gaussian_nll_loss_cuda_float64
9  307.951  test_fn_fwgrad_bwgrad_stft_cuda_complex128
10  266.104  test_fn_fwgrad_bwgrad_svd_lowrank_cuda_float64
11  221.032  test_fn_fwgrad_bwgrad_istft_cuda_complex128
12  183.741  test_fn_fwgrad_bwgrad_lu_unpack_cuda_complex128
13  132.019  test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_float64
14  125.343  test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_complex128
15  124.2  test_fn_fwgrad_bwgrad_kron_cuda_complex128
16  123.721  test_fn_fwgrad_bwgrad_pca_lowrank_cuda_float64
17  121.074  test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64
18  119.387  test_fn_fwgrad_bwgrad_rot90_cuda_complex128
19  112.889  test_fn_fwgrad_bwgrad__masked_normalize_cuda_complex128
20  107.541  test_fn_fwgrad_bwgrad_dist_cuda_complex128
21  106.727  test_fn_fwgrad_bwgrad_diff_cuda_complex128
22  104.588  test_fn_fwgrad_bwgrad__masked_cumprod_cuda_complex128
23  100.135  test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64
24  88.359  test_fn_fwgrad_bwgrad_mH_cuda_complex128
25  86.214  test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64
26  83.037  test_fn_fwgrad_bwgrad_nn_functional_bilinear_cuda_float64
27  79.987  test_fn_fwgrad_bwgrad__masked_cumsum_cuda_complex128
28  77.822  test_fn_fwgrad_bwgrad_diag_embed_cuda_complex128
29  76.256  test_fn_fwgrad_bwgrad_mT_cuda_complex128
30  74.039  test_fn_fwgrad_bwgrad_linalg_lu_solve_cuda_complex128
```
```
0  334.142  test_fn_fwgrad_bwgrad_unfold_cuda_complex128
1  312.791  test_fn_fwgrad_bwgrad_linalg_lu_factor_cuda_complex128
2  121.963  test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64
3  108.085  test_fn_fwgrad_bwgrad_diff_cuda_complex128
4  89.418  test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64
5  72.231  test_fn_fwgrad_bwgrad___rdiv___cuda_complex128
6  69.433  test_fn_fwgrad_bwgrad___getitem___cuda_complex128
7  68.582  test_fn_fwgrad_bwgrad_ldexp_cuda_complex128
8  68.572  test_fn_fwgrad_bwgrad_linalg_pinv_cuda_complex128
9  67.585  test_fn_fwgrad_bwgrad_nn_functional_glu_cuda_float64
10  66.567  test_fn_fwgrad_bwgrad_lu_cuda_float64
```
```
0  630.13  test_fn_gradgrad_nn_functional_conv2d_cuda_complex128
1  81.086  test_fn_gradgrad_linalg_solve_triangular_cuda_complex128
2  71.332  test_fn_gradgrad_norm_cuda_complex128
3  64.308  test_fn_gradgrad__masked_std_cuda_complex128
4  59.519  test_fn_gradgrad_div_no_rounding_mode_cuda_complex128
5  58.836  test_fn_gradgrad_nn_functional_adaptive_avg_pool3
```

Reduces the sizes of the inputs for:
- diff
- diag_embed

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80514
Approved by: https://github.com/albanD
2022-07-22 02:05:37 +00:00
Horace He
a5fb41e3d3 Revert "Revert "Refactored prim utils into _prims_utils folder (#81746)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81746
Approved by: https://github.com/anijain2305, https://github.com/Krovatkin
2022-07-20 23:43:57 +00:00
Kshiteej K
8b5685da12 [composite compliance] test_operator correctness (#81600)
Time Before PR:
```
= 1111 passed, 45 skipped, 41020 deselected, 17 xfailed, 33 warnings in 52.55s =
```

Time After PR:
```
= 1105 passed, 51 skipped, 41020 deselected, 17 xfailed, 33 warnings in 70.03s (0:01:10) =
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81600
Approved by: https://github.com/zou3519
2022-07-20 21:18:56 +00:00
Ivan Yashchuk
a3d5d2ddf1 Add partitioned nvFuser executor with ATen fallbacks (#81043)
This PR introduces a new nvFuser executor for FX graphs containing different kinds of nodes, not just `torch.ops.prims` supported by nvFuser. The FX graph is partitioned based on whether nodes are supported or not by nvFuser and supported nodes are fused into subgraphs, that's all using Sherlock's work on the partitioner.

This new partitions-based executor with fallbacks to ATen is used by default with `executor="nvfuser"`. And the previous executor can be used with `executor="strictly_nvfuser"`, naming suggestions are welcome!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81043
Approved by: https://github.com/jjsjann123, https://github.com/SherlockNoMad
2022-07-20 19:51:20 +00:00
Kshiteej K
706b420a52 [composite compliance] check output of forward-ad with subclass args against regular tensor (#81464)
Time Before PR
```
= 880 passed, 274 skipped, 38170 deselected, 17 xfailed, 21 warnings in 808.96s (0:13:28) =
```

Time After PR
```
= 875 passed, 274 skipped, 38170 deselected, 22 xfailed, 21 warnings in 880.61s (0:14:40) =
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81464
Approved by: https://github.com/zou3519
2022-07-20 17:38:11 +00:00
PyTorch MergeBot
e43a02c314 Revert "Refactored prim utils into _prims_utils folder (#81088)"
This reverts commit 80231d0a72.

Reverted https://github.com/pytorch/pytorch/pull/81088 on behalf of https://github.com/jeanschmidt due to breaking internal tests
2022-07-19 19:56:41 +00:00
Catherine Lee
06a0cfc0ea pytest to run test_ops, test_ops_gradients, test_ops_jit in non linux cuda environments (#79898)
This PR uses pytest to run test_ops, test_ops_gradients, and test_ops_jit in parallel in non linux cuda environments to decrease TTS.  I am excluding linux cuda because running in parallel results in errors due to running out of memory

Notes:
* update hypothesis version for compatability with pytest
* use rerun-failures to rerun tests (similar to flaky tests, although these test files generally don't have flaky tests)
  * reruns are denoted by a rerun tag in the xml.  Failed reruns also have the failure tag.  Successes (meaning that the test is flaky) do not have the failure tag.
* see https://docs.google.com/spreadsheets/d/1aO0Rbg3y3ch7ghipt63PG2KNEUppl9a5b18Hmv2CZ4E/edit#gid=602543594 for info on speedup (or slowdown in the case of slow tests)
  * expecting windows tests to decrease by 60 minutes total
* slow test infra is expected to stay the same - verified by running pytest and unittest on the same job and check the number of skipped/run tests
* test reports to s3 changed - add entirely new table to keep track of invoking_file times
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79898
Approved by: https://github.com/malfet, https://github.com/janeyx99
2022-07-19 19:50:57 +00:00
Horace He
80231d0a72 Refactored prim utils into _prims_utils folder (#81088)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81088
Approved by: https://github.com/ngimel
2022-07-19 03:55:51 +00:00
Peter Bell
bf36d8b987 [primTorch] Implement one-dimensional fft transforms (#80570)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80570
Approved by: https://github.com/mruberry
2022-07-15 15:13:43 +00:00
Peter Bell
924b7951aa [primTorch] Implement conj and conj_physical (#80358)
This adds `prims.conj` and `prims.conj_physical` which only accept
complex tensors, as well as `refs.conj` and `refs.conj_physical` which
pass-through non-complex values and call the appropriate `prims` for
complex types.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80358
Approved by: https://github.com/mruberry
2022-07-14 15:29:41 +00:00
Nikita Karetnikov
1e3c6f2263 [primTorch] Add a ref for allclose (#81003)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81003
Approved by: https://github.com/mruberry
2022-07-12 15:08:01 +00:00
Richard Zou
9ee312023d [Composite compliance testing] Refactor check_forward_ad_formula to accept Callable (#81239)
Like https://github.com/pytorch/pytorch/pull/81059; this PR addresses
the review comments.

Test Plan:
- run tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81239
Approved by: https://github.com/ezyang
2022-07-11 20:48:18 +00:00
Richard Zou
d253cdd8ff [composite compliance testing] Refactor check_backward_formula to accept Callable (#81059)
Maybe niche, but for one-off debugging purposes, I want a variant of
check_backward_formula that accepts a callable rather than an OpInfo.
This is because when debugging, I try to create a repro that does not
involve OpInfos because OpInfos are difficult to deal with (they have
a lot of sample inputs, I may want to test my own sample inputs without
creating a new OpInfo, etc).

This PR refactors check_backward_formula so that it accepts a Callable
instead of an OpInfo. Example usage:

```
import torch
from torch.testing._internal.composite_compliance import check_backward_formula

x = torch.tensor([[1., 1.], [1., 0.]], requires_grad=True)
args = (x, 1)

check_backward_formula_callable(torch.prod, args, {})
```

Test Plan:
- run existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81059
Approved by: https://github.com/kshitij12345, https://github.com/ezyang
2022-07-11 18:37:50 +00:00
Mike Ruberry
8740c68c41 [primTorch] Adds contiguous and expand references (#79820)
I also filed  while creating this PR.

This PR...

**Filed issues**

- https://github.com/pytorch/pytorch/issues/79818
- https://github.com/pytorch/pytorch/issues/80154

**prims**

- Fixes prims.squeeze when called with an unsorted list of dimensions
- Removes the clone prim

**refs**
- adds contiguous
- adds expand
- updates clone to call empty_like and copy_to
- updates empty to accept a memory format
- updates empty_like to accept a memory_format

**utils**
- adds helper functions for working with memory formats and channels last tensors, in particular

**tests**

- removes unused clamp sample input functions (mooted by clamp's new reference inputs)
- extends the reference inputs for clone to include different memory formats
- creates reference inputs for contiguous
- xfails operators that depend on clone (including clone) on `test_python_ref` (see issues)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79820
Approved by: https://github.com/ngimel
2022-07-11 17:42:58 +00:00
Ryan Spring
d26516fd1b [primTorch] Implement loss function references (#80573)
Add Reference:

- mse_loss
- l1_loss
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80573
Approved by: https://github.com/mruberry
2022-07-09 03:31:20 +00:00
David Berard
4c57cf9a8b Register unregistered refs and add a test to check registration (#80497)
Added missing `register_decomposition`s which will register the refs so
they can be used for decompositions.

Also added a test for verifying that new refs are registered.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80497
Approved by: https://github.com/ezyang
2022-07-08 16:29:52 +00:00
Ivan Yashchuk
12dc410ff2 Fix nvFuser's where(tensor, python_scalar, tensor) type promotion (#80347)
This PR modifies the type promotion logic for nvFuser's `where` function when one of the arguments is a scalar. With the proposed change behavior now matches with ATen's type promotion.

The following script fails on master and passes with this PR:
```py
import torch
import torch._refs
from torch._prims.executor import make_traced
a = torch.ones(3, 3, dtype=torch.bool, device='cuda')
b = torch.randn(3, 3, device='cuda')
func = lambda a, b: torch._refs.where(a, 0.0, b)
assert make_traced(func)(a, b, executor="nvfuser").dtype == torch.float32
```

This PR allows to unskip nvFuser tests for `_refs.log_softmax`, it was failing with a dtype mismatch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80347
Approved by: https://github.com/ngimel
2022-06-28 08:42:16 +00:00
Ryan Spring
1d0d506e97 Add Div reference (#77936)
Add Prims:
-  trunc
-  Replace _wrap_scalar with scalar_tensor

Add Reference:
-  copysign
- div
- floor_divide
- trunc_divide

Other:
* Add support for `variant_test_name` in _find_referenced_opinfo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77936
Approved by: https://github.com/mruberry
2022-06-27 14:46:17 +00:00
Ivan Yashchuk
072311bb28 Enable torch._prims.amax/amin for nvFuser executor (#80070)
This PR adds nvFuser implementations for `torch._prims.amax` and `torch._prims.amin` reduction functions. Currently, nvFuser refuses to reduce the 0d tensor, so these inputs are skipped in tests for now.

An accompanying fix replaces `collections.Sequence` -> `collections.abc.Sequence` in refs because `collections.Sequence` is deprecated and removed in Python 3.10

Many ops that were skipped for the nvFuser executor test are now enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80070
Approved by: https://github.com/ngimel
2022-06-23 10:19:57 +00:00
Elias Ellison
268bbecf1c Add option for allowing non-fake inputs, add deepcopy impl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79580

Approved by: https://github.com/samdow
2022-06-17 19:36:26 +00:00
Kshiteej K
04b98df87a [fix] composite compliance: eig, eigh, symeig (#79698)
Ref: https://github.com/pytorch/pytorch/issues/69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698
Approved by: https://github.com/Lezcano, https://github.com/albanD
2022-06-17 14:13:04 +00:00
kshitij12345
d05fb78685 [chalf] enable skipped tests (#79376)
Ref: https://github.com/pytorch/pytorch/pull/79217#pullrequestreview-1002849962

Had to add a few `expectedFailures`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79376
Approved by: https://github.com/ngimel, https://github.com/mruberry
2022-06-13 17:31:45 +00:00
Michael Suo
c978b609f7 [ci] remove IN_CI env var
The conventional env var to set is CI. Both circle and GHA set it, so
IN_CI is unnecessary

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79229

Approved by: https://github.com/janeyx99
2022-06-11 17:16:30 +00:00
Mikayla Gawarecki
e727539c29 Support multi-dimensional lengths in segment_reduce to support pytorch_scatter.segment_* functionalities (CUDA)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77061

Approved by: https://github.com/cpuhrsch
2022-06-11 01:45:22 +00:00
anjali411
38350acf8f Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79322

Approved by: https://github.com/albanD
2022-06-11 00:29:32 +00:00
PyTorch MergeBot
fefff54cad Revert "Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads"""
This reverts commit a2d2981e8e.

Reverted https://github.com/pytorch/pytorch/pull/79224 on behalf of https://github.com/suo due to broke lots of things a2d2981e8e
2022-06-10 04:40:43 +00:00
Horace He
a2d2981e8e Revert "Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads""
This reverts commit d67309aefb.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79224

Approved by: https://github.com/mruberry
2022-06-10 03:07:14 +00:00
Elias Ellison
13a8867c01 Add Dynamic Output Shape Tagdfor ata-dependent ops, handle in FakeTensor
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79170

Approved by: https://github.com/ezyang
2022-06-09 22:16:16 +00:00
PyTorch MergeBot
d67309aefb Revert "Added {logical_not, trace} refs, moved logical ops to use method overloads"
This reverts commit 64b6bd8c1e.

Reverted https://github.com/pytorch/pytorch/pull/79000 on behalf of https://github.com/malfet due to Introduces test failure, see https://hud.pytorch.org/pr/79000
2022-06-09 13:11:23 +00:00
Horace He
64b6bd8c1e Added {logical_not, trace} refs, moved logical ops to use method overloads
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79000

Approved by: https://github.com/ezyang
2022-06-09 07:16:36 +00:00
Elias Ellison
3c5a3ca9e8 Make FakeTensors return meta within kerenl invocation, add FakeTensor op tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78972

Approved by: https://github.com/ezyang
2022-06-09 01:39:27 +00:00
Elias Ellison
290d0979f1 Migrate FakeTensors to always call into FakeTensorMode and have them hold a reference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78677

Approved by: https://github.com/ezyang
2022-06-08 22:30:34 +00:00
Ivan Yashchuk
ff39e3493a Test torch._refs with aten and nvfuser executors (#78926)
This PR adds testing of references with "aten" and "nvfuser" executors using `torch._prims.executor.make_traced`.

Many tests are skipped even for "aten" executor because of https://github.com/pytorch/pytorch/issues/78923.

I limited the dtypes for the nvfuser executor tests because it's slow due to compilation overhead (it took about 30 mins in total). With `float32` and `int32` types nvfuser tests take 5 minutes.
```
58 passed, 2507 skipped, 28162 deselected, 79 xfailed, 5 warnings in 297.58s (0:04:57)
```
58 tests passed means that 29 references work correctly with nvfuser executor now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78926
Approved by: https://github.com/mruberry
2022-06-08 12:45:27 +00:00
PyTorch MergeBot
c8a5f28fde Revert "Test torch._refs with aten and nvfuser executors (#78926)"
This reverts commit d4eebca7bc.

Reverted https://github.com/pytorch/pytorch/pull/78926 on behalf of https://github.com/malfet due to breaks rocms, see d4eebca7bc
2022-06-07 22:39:05 +00:00
Ivan Yashchuk
d4eebca7bc Test torch._refs with aten and nvfuser executors (#78926)
This PR adds testing of references with "aten" and "nvfuser" executors using `torch._prims.executor.make_traced`.

Many tests are skipped even for "aten" executor because of https://github.com/pytorch/pytorch/issues/78923.

I limited the dtypes for the nvfuser executor tests because it's slow due to compilation overhead (it took about 30 mins in total). With `float32` and `int32` types nvfuser tests take 5 minutes.
```
58 passed, 2507 skipped, 28162 deselected, 79 xfailed, 5 warnings in 297.58s (0:04:57)
```
58 tests passed means that 29 references work correctly with nvfuser executor now.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78926
Approved by: https://github.com/mruberry
2022-06-07 20:34:07 +00:00
Peter Bell
c936396af2 Always convert truthy booleans to 1
Ref #54789

A `bool` has only two valid values, 1 or 0. Any in-memory value
outside of those leads to undefined behavior. So, instead of
`reinterpret_cast`-ing to `bool*` I introduce `c10::load<scalar_t>`
which will read as `unsigned char` and convert to a valid `bool`.

This gets >90% of operators working, but the remaining operators where
skips and xfails have been added will require individual attention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77122

Approved by: https://github.com/mruberry
2022-06-07 16:00:30 +00:00
Horace He
e675dbadc4 Ported gelu decomp to ref (#78697)
Ugh... these are actually so painful to write without operator overloading lol.

Decided to just utilize operator overloading, and xfail the ref tests for now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78697
Approved by: https://github.com/mruberry
2022-06-06 22:30:20 +00:00
Edward Z. Yang
587efdb5fa Replace TensorMeta with FakeTensor
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78836

Approved by: https://github.com/albanD, https://github.com/mruberry
2022-06-05 11:51:27 +00:00
PyTorch MergeBot
954522a485 Revert "Autogen Tags enum, and allow specifying tags while defining an op"
This reverts commit 9476a78f37.

Reverted https://github.com/pytorch/pytorch/pull/77313 on behalf of https://github.com/malfet due to Broke OSS buck builds, see 9476a78f37
2022-06-03 01:53:53 +00:00
anjali411
9476a78f37 Autogen Tags enum, and allow specifying tags while defining an op
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77313

Approved by: https://github.com/ezyang, https://github.com/albanD
2022-06-03 01:13:44 +00:00
Kshiteej K
849b08f14b [reland][chalf] where(cpu and cuda), pow(cuda) (#78665)
Reland: https://github.com/pytorch/pytorch/pull/77640
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78665
Approved by: https://github.com/ngimel
2022-06-02 18:04:06 +00:00
PyTorch MergeBot
78824a7d54 Revert "Always convert truthy booleans to 1"
This reverts commit 3c3c6cd982.

Reverted https://github.com/pytorch/pytorch/pull/77122 on behalf of https://github.com/mruberry due to broke some jobs, like https://github.com/pytorch/pytorch/runs/6706333043?check_suite_focus=true
2022-06-02 13:45:54 +00:00
jjsjann123
fea909b43e [primTorch] Adds broadcast_shapes reference (#78612)
1. Added references `_refs.broadcast_shapes`
2. Added OpInfo test for `torch.broadcast_shapes`

A few minor changes:
- `test_python_ref_meta` and `_ref_test_helper` update to avoid non-tensor outputs
- type annotation update for `_resize_meta`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78612
Approved by: https://github.com/mruberry
2022-06-02 08:56:37 +00:00
Peter Bell
3c3c6cd982 Always convert truthy booleans to 1
Ref #54789

A `bool` has only two valid values, 1 or 0. Any in-memory value
outside of those leads to undefined behavior. So, instead of
`reinterpret_cast`-ing to `bool*` I introduce `c10::load<scalar_t>`
which will read as `unsigned char` and convert to a valid `bool`.

This gets >90% of operators working, but the remaining operators where
skips and xfails have been added will require individual attention.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77122

Approved by: https://github.com/mruberry
2022-06-02 04:18:34 +00:00
PyTorch MergeBot
4bb8db85e9 Revert "[chalf] where(cpu and cuda), pow(cuda) (#77640)"
This reverts commit 3697cf7f76.

Reverted https://github.com/pytorch/pytorch/pull/77640 on behalf of https://github.com/mruberry due to as it broke ROCM on trunk
2022-06-01 19:39:38 +00:00
kshitij12345
3697cf7f76 [chalf] where(cpu and cuda), pow(cuda) (#77640)
Ref: #74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77640
Approved by: https://github.com/anjali411, https://github.com/ngimel
2022-06-01 18:35:53 +00:00
Thomas J. Fan
a2ef1edb1f MNT Check that torch._refs are in python_ref_db (#78222)
Fixes #77688

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78222
Approved by: https://github.com/mruberry
2022-05-25 22:57:57 +00:00
Mike Ruberry
2738405a76 [primTorch] Adds any, all, equal, item references (#78072)
This PR adds the item, equal, any, and all references.

While doing this I found the following issues:
- https://github.com/pytorch/pytorch/issues/78070
- https://github.com/pytorch/pytorch/issues/78071

And I fixed a bug where the `convert_element_type` prim could not convert tensors requiring grad to datatypes that don't require grad.

Creating the item reference required adding item as a prim, but per @ngimel's suggestion I removed the prims for any and all and implemented them as references, so this is net negative one prim.

Reference OpInfos are added for any and all, but item and equal don't even have regular OpInfos.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78072
Approved by: https://github.com/ngimel
2022-05-23 12:49:04 +00:00
Mike Ruberry
d4345ed0a6 [primTorch] Adds random operations (#78026)
This PR...

**Issues Found**
- https://github.com/pytorch/pytorch/issues/78058
- https://github.com/pytorch/pytorch/issues/78054
- https://github.com/pytorch/pytorch/issues/78053
- https://github.com/pytorch/pytorch/issues/78050
- https://github.com/pytorch/pytorch/issues/77932

**Testing**
- disables stride consistency checks in test_ops and test_meta pending resolution of https://github.com/pytorch/pytorch/issues/78050
- skips chalf in reference tests (addressing https://github.com/pytorch/pytorch/issues/78054)
- splits test test_python_reference_consistency in one test for the ctx where torch.foo is torch.foo, and another for when torch.foo is refs.foo
- updates test names to be more natural and consistent:
  - test_python_reference_errors -> test_python_ref_errors
  - test_python_reference_consistency -> test_python_ref and test_python_ref_torch_fallback
  - test_python_reference_meta_functions -> test_python_ref_meta
  - test_reference_testing -> test_numpy_ref
- updates test_python_ref and test_python_ref_torch_fallback to check that the reference is more accurate than the torch op if the reference and torch op results are not close, a warning is raised when this occurs (addressing https://github.com/pytorch/pytorch/issues/77687)
- adds reference inputs for broadcast_tensors
- Updates the "fill_" OpInfo to "fill", adding a NumPy reference and making it an elementwise unary operator
- Adds 1D no element sample inputs to the cat OpInfo and updates the NumPy reference to handle them and type promotion correctly
- Adds reference inputs for elementwise ternary operations, like clamp
- Adds a NumPy reference for clamp
- Adds reference inputs to where's OpInfo
- Makes softplus an elementwise unary OpInfo
- Removes the great majority of Python reference OpInfo skips and xfails due to the above test changes
- Adds Python reference OpInfos for fill, dropout, clamp, broadcast_tensors, and where

**Prims**
- adds the fill, empty_strided, and uniform prims
- removes the empty, empty_like, full, and full_like prims -- these are now references that use empty_strided and fill
- renames the "concatenate" and "select" prims to "cat" and "where", respectively, to be consistent with PyTorch
- extends the `_elementwise_meta` operation to accepts tensors that don't participate in type promotion, like the `cond` tensor in `where`
- fixes a bug in the stride propagation of broadcast_in_dim
- moves some error checks from prims.cat to prims.where to refs.cat and refs.where, respectively, consistent with our new policy of doing as much error checking in the ref as possible

**Utils**
- adds the canoicalize_device, extract_shape, and extract_shape_from_varargs helpers
- adds the elementwise_unary_scalar_wrapper -- this allows elementwise unary operators to take and return scalar values (ex. refs.sin(1) will return .84...)

**Refs**
- adds the fill, broadcast_tensors, clamp, empty_strided, ones, zeros, and uniform references
- adds the nn.functional.dropout reference
- fixes refs.cat to handle 1D tensors with no inputs consistent with eager mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78026
Approved by: https://github.com/ngimel
2022-05-23 01:56:28 +00:00
PyTorch MergeBot
acfbc16b1c Revert "[primTorch] Adds random operations (#78026)"
This reverts commit 043cf1f9c7.

Reverted https://github.com/pytorch/pytorch/pull/78026 on behalf of https://github.com/suo due to This broke trunk: 043cf1f9c7
2022-05-22 18:11:14 +00:00
Mike Ruberry
043cf1f9c7 [primTorch] Adds random operations (#78026)
This PR...

**Issues Found**
- https://github.com/pytorch/pytorch/issues/78058
- https://github.com/pytorch/pytorch/issues/78054
- https://github.com/pytorch/pytorch/issues/78053
- https://github.com/pytorch/pytorch/issues/78050
- https://github.com/pytorch/pytorch/issues/77932

**Testing**
- disables stride consistency checks in test_ops and test_meta pending resolution of https://github.com/pytorch/pytorch/issues/78050
- skips chalf in reference tests (addressing https://github.com/pytorch/pytorch/issues/78054)
- splits test test_python_reference_consistency in one test for the ctx where torch.foo is torch.foo, and another for when torch.foo is refs.foo
- updates test names to be more natural and consistent:
  - test_python_reference_errors -> test_python_ref_errors
  - test_python_reference_consistency -> test_python_ref and test_python_ref_torch_fallback
  - test_python_reference_meta_functions -> test_python_ref_meta
  - test_reference_testing -> test_numpy_ref
- updates test_python_ref and test_python_ref_torch_fallback to check that the reference is more accurate than the torch op if the reference and torch op results are not close, a warning is raised when this occurs (addressing https://github.com/pytorch/pytorch/issues/77687)
- adds reference inputs for broadcast_tensors
- Updates the "fill_" OpInfo to "fill", adding a NumPy reference and making it an elementwise unary operator
- Adds 1D no element sample inputs to the cat OpInfo and updates the NumPy reference to handle them and type promotion correctly
- Adds reference inputs for elementwise ternary operations, like clamp
- Adds a NumPy reference for clamp
- Adds reference inputs to where's OpInfo
- Makes softplus an elementwise unary OpInfo
- Removes the great majority of Python reference OpInfo skips and xfails due to the above test changes
- Adds Python reference OpInfos for fill, dropout, clamp, broadcast_tensors, and where

**Prims**
- adds the fill, empty_strided, and uniform prims
- removes the empty, empty_like, full, and full_like prims -- these are now references that use empty_strided and fill
- renames the "concatenate" and "select" prims to "cat" and "where", respectively, to be consistent with PyTorch
- extends the `_elementwise_meta` operation to accepts tensors that don't participate in type promotion, like the `cond` tensor in `where`
- fixes a bug in the stride propagation of broadcast_in_dim
- moves some error checks from prims.cat to prims.where to refs.cat and refs.where, respectively, consistent with our new policy of doing as much error checking in the ref as possible

**Utils**
- adds the canoicalize_device, extract_shape, and extract_shape_from_varargs helpers
- adds the elementwise_unary_scalar_wrapper -- this allows elementwise unary operators to take and return scalar values (ex. refs.sin(1) will return .84...)

**Refs**
- adds the fill, broadcast_tensors, clamp, empty_strided, ones, zeros, and uniform references
- adds the nn.functional.dropout reference
- fixes refs.cat to handle 1D tensors with no inputs consistent with eager mode
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78026
Approved by: https://github.com/ngimel
2022-05-22 10:06:24 +00:00
Edward Z. Yang
6b273444c4 Add logit ref; allow non-refs to be called in refs.
Signed-off-by: Edward Z. Yang <ezyangfb.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77816

Approved by: https://github.com/mruberry
2022-05-21 02:35:14 +00:00
lezcano
c446f78ffd Use any_type in test_out
Previously, test_out used `OpDTypes.none` and then it pretty much
implemented `OpDtypes.any_type` inside. This PR changes it to use
`OpDTypes`. This has the advantage that the test now has a dtype, so it
can be used together with decorators that require a `dtype`, such as
`toleranceOverride`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77735

Approved by: https://github.com/mruberry
2022-05-18 16:24:32 +00:00
Mike Ruberry
580a053832 [primTorch] Enforces stride metadata (#77542)
This PR...

**Filed the Following Issues**
- https://github.com/pytorch/pytorch/issues/77553
- https://github.com/pytorch/pytorch/issues/77526
- https://github.com/pytorch/pytorch/issues/77600

**Testing**
- Updates test_dtypes to longer attempt to test the backward of sample inputs where no inputs require grad
- Adds a new test_python_reference_errors; it ensures the meta operations for references throw errors as expected
- Updates compare_tensor_meta to better handle CUDA devices, and (temporarily) restricts stride checking to the CUDA device type
- Elementwise unary and elementwise binary operators now have arbitrarily strided reference inputs
- Reference inputs for _like functions are added
- An OpInfo for torch.empty is added
- Reference inputs for torch.clone are added
- A NumPy reference for clone is added
- Adds OpInfos for refs.empty and refs.empty_like

**Prims**
- Renames the "max" and "min" prims have been renamed to "maximum" and "minimum," respectively, to better conform to their ATen names
- Adds the empty, empty_like, full, and full_like prims
- Fixes the elementwise meta function's stride propagation
- Fixes clone's meta function's stride propagation
- Fixes convert_element_type's meta's stride propagation
- Adds a (temporary) _to_dtype pprivate prim that casts a tensor while preserving its stride permutation
- Removes the _set prim comment
- Adds utils.compute_elementwise_output_strides, which computes the correct output strides for elementwise operations
- Corrects an issue where utils.make_contiguous_strides_for was creating the incorrect strides for tensors with no elements

**References**
- Adds the empty, empty_like, full, full_like, and ones_like refs
- Extends make_elementwise_unary_reference to accept an additional callable to perform extra input validation
- Adds an extra validation function to handle refs.neg(BoolTensor)
- Updates the isfinite ref to call ones_like when appropriate
- Models Python scalar handling for elementwise binary operations
- Added a 64 dim check for the amin and amax references
- opmath is now a flag that can be set separately for cpu and CUDA
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77542
Approved by: https://github.com/ezyang
2022-05-18 13:57:26 +00:00
Andrew M. James
e5a752a6ca Discover and check operator variants
Operator variants can now be explicitly specified in the OpInfo kwargs.
When the operator name is not the same as the method/function form this
will allow them to be discovered.

The OpInfo is extended to also accept/discover the inplace operator
variant.

Operator and inplace operator variants are exercised in consistency
tests when the sample does not contain any kwargs.

Operations which require explicit declarations of operator and inplace
operator variants have had them added to their OpInfos.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76901

Approved by: https://github.com/mruberry
2022-05-15 15:06:30 +00:00
Mike Ruberry
64c6a89bd6 [primTorch] reshape and view (#77220)
This PR makes the following changes...

Prims
- adds as_strided
- fixes errors in flatten meta

Testing
- enables view consistency checking (which can be opted out of, see issues below)
- adds reference inputs for view, reshape, and flatten
- adds error inputs for reshape

Refs
- adds as_strided, reshape, and view
- fixes an error in the flatten ref where it was not returning self on no-op
- fixes a bug in transpose where it was not retuning a view when the transposed tensor has 1 or fewer dims

Issues
- https://github.com/pytorch/pytorch/issues/77218
- https://github.com/pytorch/pytorch/issues/77216
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77220
Approved by: https://github.com/ngimel
2022-05-13 13:12:04 +00:00
Kshiteej K
39bd37f34f [complex32] sum, prod : cuda only (disable jiterator reduction on windows) (#77192)
Ref: https://github.com/pytorch/pytorch/issues/74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77192
Approved by: https://github.com/anjali411
2022-05-13 07:24:18 +00:00
kshitij12345
9e3eb329df [chalf] getitem (#77339)
Ref: https://github.com/pytorch/pytorch/issues/74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77339
Approved by: https://github.com/anjali411
2022-05-12 14:49:17 +00:00
Mike Ruberry
bb8baea932 [primTorch] flatten, squeeze, unsqueeze... (#77043)
This PR ...

Makes the following testing changes:

- Updates stride testing in test_python_reference_consistency to only check strides of dimensions with length > 1
- Creates reference inputs for reshape
- Creates reference inputs for chunk
- Extends the sample inputs for unsqueeze
- Extends the sample inputs for stack -- test_conj_view and test_neg_view are now xfailed
  - https://github.com/pytorch/pytorch/issues/77046

Makes the following architecture changes:
- Adds the refs.special (sub)module
- Adds the refs.nn.functional (sub)module

Adds the following prims:
- expand_dims
- view_of
- rev
- clone

Adds the following references:
  -  flatten
  - squeeze
  - unsqueeze
  - special.i0e
  - special.i1e
  - logical_or
  - logical_and
  - isclose
  - flip
  - stack
  - nn.functional.elu
  - chunk
  - clone
  - narrow

Identifies the following bugs in PyTorch today:
- https://github.com/pytorch/pytorch/issues/77054
- https://github.com/pytorch/pytorch/issues/77055

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77043
Approved by: https://github.com/ngimel
2022-05-09 11:24:55 +00:00
Natalia Gimelshein
362525724b type promote clamp (#77035)
Fixes #76630
When clamp(Tensor, Tensor) is structured, big parts of this PR won't be needed, but for now let's fix type promotion to make behavior more regular.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77035
Approved by: https://github.com/mruberry
2022-05-09 05:54:17 +00:00
Mike Ruberry
c031643e39 Adds decorators for Python References and extends Python Reference testing (#76945)
This PR does the following...

Tests:
- fixes test_type_promotion in test_binary_ufuncs to correctly generate scalar cpu tensors
- fixes test_python_reference_consistency to use the Python Reference's reference inputs
- extends Python reference testing to test_conj_view, test_neg_view, and test_neg_conj_view
- adds a NaN propagation sample input for elementwise unary and binary operations
- fixes the UnaryUfuncInfo class to properly register its reference inputs
- Updates the Python Reference OpInfos to skip error inputs when their behavior on scalar inputs is inconsistent with their reference operators

Code organization:
- moves elementwise type promotion functionality to prims.utils

Prims & Refs:
- fixes scalar cpu tensor handling by having them pass through broadcasting and device and shape checks
- adds two decorators, `elementwise_type_promotion_wrapper` and `out_wrapper`, the former allows for elementwise type promotion to be automated and the latter automatically adds the out kwarg and handles it properly

cc @ezyang who also had some thoughts on cpu scalar tensor handling
cc @chillee -- might want to use this new decorator as we converge decompositions and references
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76945
Approved by: https://github.com/ngimel
2022-05-07 03:42:24 +00:00
Natalia Gimelshein
1c776d209c Adds amax and amin references
Also extends reference testing to error inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76855
Approved by: https://github.com/mruberry
2022-05-05 15:53:09 +00:00
Natalia Gimelshein
c51b53d4ef [WIP] sum reference
Per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76714
Approved by: https://github.com/mruberry
2022-05-04 02:50:00 +00:00
Mike Ruberry
c9bd73878a adds elementwise opinfos and unary references, extends to out testing
This PR makes the following changes:

Prims:
- igamma and igammac are now correctly listed as elementwise binary operations, not elementwise unary operations
- elementwise prims now must specify their type promotion kind (this is currently unused)

Refs:
- complexhalf is now handled by opmath-style type promotion
- adds references for: abs, acos, acosh, asin, atan, ceil, cos, cosh, digamma, erf, erfinv, erfc, exp, expm1, isfinite, isnan, lgamma, log, log1p, neg, reciprocal, sign, sin, sinh, sqrt, square, tan, igamma, igammac
- adds "complex to float" and "bool to long" type promotion kinds
- updates out behavior to warn when resizing a non-empty tensor, consistent with current ops
- updates the elementwise unary reference template with type promotion

Tests:
- fixes torch.pow's OpInfo to correctly specify it only supports one scalar input, not two
- fixes elementwise binary reference inputs to not attempt generating certain tensors in complex half (for now, cc @kshitij12345)
- adds OpInfos for the following Python references: abs, acos, acosh, asin, atan, ceil, cos, cosh, digamma, erf, erfinv, erfc, exp, expm1, isfinite, isnan, lgamma, log, log1p, neg, reciprocal, round, sign, sin, sinh, sqrt, square, tan, atan2, bitwise_and, bitwise_left_shift, bitwise_or, bitwise_xor, eq, float_power, ge, gt, igamma, igammac, le, lt, maximum, minimum, mul, ne, nextafter, pow, sub, true_divide

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76647
Approved by: https://github.com/ngimel
2022-05-02 14:23:05 +00:00
Mike Ruberry
f6bbecf8b5 Adds python ref consistency test, elementwise unary reference inputs, and formats test files
Per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76626
Approved by: https://github.com/ngimel
2022-05-01 22:42:46 +00:00
Mike Ruberry
fe1968dea0 [primTorch] Prototype nvFuser integration and test_prims.py
This adds prototype nvFuser integration for the following prims:

- broadcast_in_dim
- convert_element_type
- add
- div
- ge
- gt
- le
- lt
- mul

Adding it for additional prims supported by nvFuser's prototype Python frontend should be easy.

This also adds a new sugar to run operations using the ATen or nvFuser trace executors. For example:

```
def foo(a, b):
  return torch.add(a, b)

traced_foo = make_traced(foo)

a = torch.randn((1, 2, 3, 4, 5), device='cuda')
b = torch.randn((1, 2, 3, 4, 5), device='cuda')
result = traced_foo(a, b, executor='nvfuser')
```

Currently only operations with tensor inputs and one tensor output are supported, and the operation must be composed exclusively of reference or prim operations.

Finally, this adds a new test, test_prims.py, that just tests the broadcast_in_dim prim for now. In the future we'll likely have OpInfos for each prim, but we'll need a reference implementation of broadcast_in_dim to make that interesting.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76560
Approved by: https://github.com/ngimel
2022-04-29 02:02:25 +00:00
Mike Ruberry
4048d4cdd2 [primTorch] Prototype tracer and elementwise unary reference opinfo class
Adds a prototype tracer with no caching support and the `ElementwiseUnaryPythonRefInfo` class. A reference for `floor` is added to test the latter, and the elementwise binary reference inputs are extended to also return noncontiguous inputs. The SampleInput transform operation has been updated to return an actual SampleInput instead of a tuple to facilitate uniform handling of (transformed) SampleInputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76388
Approved by: https://github.com/ngimel
2022-04-27 14:40:21 +00:00
kshitij12345
c1ced8ff72 [composite compliance] add test for fwd AD
Fixes https://github.com/pytorch/pytorch/issues/74678

Test timings:
```
======================================= 756 passed, 99 skipped, 13864 deselected, 76 xfailed, 16 warnings in 278.35s (0:04:38) =======================================
```

Slowest ops
```
======================================================================== slowest 20 durations ========================================================================
32.16s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32
30.51s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_nn_functional_instance_norm_cpu_float32
9.89s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_norm_cuda_float32
8.54s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad__masked_norm_cpu_float32
8.52s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diff_cuda_float32
8.33s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_triangular_cuda_float32
8.08s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_linalg_solve_triangular_cpu_float32
8.03s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_diff_cpu_float32
6.52s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cov_cuda_float32
5.77s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_cov_cpu_float32
4.12s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_solve_cuda_float32
3.78s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_std_cuda_float32
3.67s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gradient_cuda_float32
3.55s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_var_cuda_float32
3.47s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool2d_cuda_float32
3.42s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_without_cudnn_cuda_float32
3.40s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_nn_functional_max_pool2d_cpu_float32
3.30s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad__masked_std_cpu_float32
3.30s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_gradient_cpu_float32
3.28s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_cuda_float32
====================================================================== short test summary info =======================================================================
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75178
Approved by: https://github.com/zou3519
2022-04-25 15:15:48 +00:00
Mike Ruberry
28c3e0f77c Initial prims, references, and test architecture for them (#75095)
Summary:
This PR adds an initial set of experimental primitive operations and Python references that reimplement existing PyTorch operations using them. See https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-0/577 for additional context.

The following experimental primitives are added:

- Elementwise unary prims -- abs, acos, acosh, asin, atan, cos, cosh, bessel_i0e, bessel_i1e, cbrt, ceil, digamma, erf, erf_inv, erfc, exp, expm1, floor, igamma, igammac, is_finite, lgamma, log, log1p, neg, reciprocal, round, sign, sinh, sqrt, square, tan.
- Elementwise binary prims -- add, atan2, bitwise_and, bitwise_not, bitwise_or, bitwise_xor, div, eq, ge, gt, le, lt, max, min, mul, ne, nextafter, pow, rsqrt, shift_left, shift_right_arithmetic
- View prims -- brodcast_in_dim, collapse_view, split_dim, squeeze
- Shape prims -- collapse, concatenate, reshape
- Conditional prims -- select
- Data conversion & movement prims -- convert_element_type, device_put
- Inplace prims -- copy_to, resize

These primitives do not add any new functionality to PyTorch, but are intended to be the semantic building blocks for reference operators. We have tried to make them consistent with the operations in [jax.lax](https://jax.readthedocs.io/en/latest/jax.lax.html) where possible (because PyTorch prefers being consistent with other frameworks), although there are key differences between these prims and operations in jax.lax. Most notably is that these prims model view semantics and inplace operations.

In addition to these primitives the following elementwise binary Python references are added:

- Elementwise binary Python references -- add, atan2, bitwise_and, bitwise_left_shift, bitwise_or, bitwise_right_shift, bitwise_xor, eq, float_power, ge, gt, le, lt, maximum, minimum, mul, ne, nextafter, pow, sub, true_divide
- Conditional Python references - where
- Data conversion & movement references - copy_to

A Python reference implements the same behavior as its corresponding PyTorch operator (excepting slight numerical differences, bug fixes, and in some cases additional features).

The start of an OpInfo-based test architecture for these references is also included in this PR. A new list, `python_ref_db`, is added to `common_methods_invocations.py`. This list introduces the new `ElementwiseBinaryPythonRefInfo`, which inherits input arguments from the original operators' OpInfo, allows them to be overridden, and then constructs the OpInfo for the Python reference using the (potentially modified) arguments. OpInfo-based tests can opt-into testing references by including this new list in the Sequence passed to the `ops` decorator.

cc ngimel csarofeen kevinstephano Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75095

Reviewed By: ngimel

Differential Revision: D35888004

Pulled By: mruberry

fbshipit-source-id: 21e77c4456c2a02113367d4bdae168a3a2f33f25
(cherry picked from commit 1d5bcfa99d4e8cf36f60642803a0bfca50e2ea4e)
2022-04-25 09:57:20 +00:00
kshitij12345
e23cbd633f [complex32] jiterator support
Reference #74537

Support for jiterating with `c10::complex<Half>`. Note that computation will take place in `complex<float>` by allowing implicit casting in JITerated code (similar to Half and BFloat16 which upcast to float for computation).

We add `complex32` support for `sigmoid` and `sigmoid_backward` in this PR. This is tested with `test_ops.py::test_dtypes and test_ops.py::test_complex_half_reference_testing`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75656
Approved by: https://github.com/ngimel
2022-04-19 16:33:18 +00:00
Jeff Daily
e587c8bc57 [ROCm] enable composite compliance backward tests
Follow up to #74646.  Do not skip the entire TestCompositeCompliance test_backward for ROCm, only skip the the two unexpected successes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75810
Approved by: https://github.com/zou3519
2022-04-19 13:47:06 +00:00
Mike Ruberry
de949a0e59 Various OpInfo architecture improvements
This PR makes the following improvements:

- moves the custom skip list for test_normalize_operator_exhaustive in test_fx_experimental to use the typical OpInfo skip architecture. The skips were updated to xfails, and that identified some operators which were no longer failing the test
- redundant tests with OpInfo-based testing in test_jit.py were removed
- test_dtypes was improved so its error messages are clear and it makes test_nondifferentiable redundant; the latter test has been removed
- OpInfo.supports_complex_autograd() is removed in favor of a more accurate and general test for whether the particular dtype is in the backward dtypes of the operator
- gradchecks have been improved to verify that an operator doesn't support grad if it claims not to
- gradchecks have been improved to test the gradient of all input tensors that require gradient
- the concept of "default test dtypes" has been removed
- excessive and mostly redundant out testing for elementwise unary operators has been removed
- metadata for whether an op supports nuanced "safe casting" to out behavior has been removed from OpInfos
- numerous skips have been converted to xfails
- numerous OpInfos have had their metadata fixed based on the new checks
- jit-specific utilities in common_methods_invocations.py have been moved to jit_programming_utils.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75951
Approved by: https://github.com/ngimel
2022-04-18 21:55:32 +00:00
Mike Ruberry
b09769992f Improves the OpInfo out= tests
Edit: OpInfos separated into their own PRs to debug an ASAN failure that doesn't identify the failing test properly. This PR now just updates the out tests.

Adds OpInfos for:

- nn.functional.smooth_l1_loss
- nn.functional.l1_loss
- nn.functional.pdist
- nn.functional.binary_cross_entropy
- nn.functional.triplet_margin_loss
- nn.functional.triplet_margin_with_distance_loss
- nn.functional.max_unpool{1, 2, 3}D
- nn.functional.alpha_dropout
- nn.functional.soft_margin_loss
- nn.functional.multilabel_soft_margin_loss
- nn.functional.multilabel_margin_loss
- nn.functional.multi_margin_loss
- nn.functional.margin_ranking_loss

These OpInfos were taken from https://github.com/pytorch/pytorch/pull/67560, https://github.com/pytorch/pytorch/pull/67823, https://github.com/pytorch/pytorch/pull/68625, and https://github.com/pytorch/pytorch/pull/67079. The sample input update from https://github.com/pytorch/pytorch/pull/67017 is also rolled into this PR.

cc @zou3519 @nikitaved @pmeier @vfdev-5 @dagitses
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75782
Approved by: https://github.com/ngimel
2022-04-15 06:16:01 +00:00
PyTorch MergeBot
9312ee8cd6 Revert "remove fp16 support from cpu linalg functions"
This reverts commit 29af58db51.

Reverted https://github.com/pytorch/pytorch/pull/75647 on behalf of https://github.com/ngimel
2022-04-14 21:06:48 +00:00
Natalia Gimelshein
29af58db51 remove fp16 support from cpu linalg functions
fp16 on cpu produces slow and inaccurate results, see #69969

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75647
Approved by: https://github.com/Lezcano, https://github.com/mruberry
2022-04-14 18:45:59 +00:00
PyTorch MergeBot
495c5aebb1 Revert "remove fp16 support from cpu linalg functions"
This reverts commit de18c28a4c.

Reverted https://github.com/pytorch/pytorch/pull/75647 on behalf of https://github.com/suo
2022-04-13 18:34:35 +00:00
Natalia Gimelshein
de18c28a4c remove fp16 support from cpu linalg functions
fp16 on cpu produces slow and inaccurate results, see #69969

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75647
Approved by: https://github.com/Lezcano, https://github.com/mruberry
2022-04-13 17:24:22 +00:00
kshitij12345
65b65af236 [complex32] cat, fill_(partial), item
Reference : #74537

`cat_backwards` (on CUDA) requires support for `fill`, have added support for `fill`. (Also `fill` requires `item` support)

Now `fill` backward requires `sum` (will add it in later PR).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75010
Approved by: https://github.com/anjali411
2022-04-01 15:19:05 +00:00
Nikita Shulga
bfac65dfe5
[testing] Update dispatch macros (#74977)
This PR is reland of #74289 
Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>
2022-03-30 14:13:21 -07:00
PyTorch MergeBot
2e4152b118 Revert "[testing] Update dispatch macros"
This reverts commit eed19a0f38.

Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet
2022-03-30 19:52:37 +00:00
Khushi Agrawal
eed19a0f38 [testing] Update dispatch macros
Hi,
This PR is the follow-up PR of #71561. (the previous PR had a couple of merge conflicts and was reverted, this PR resolves that).
Please take a look. Thanks!

cc: @pmeier @mruberry @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289
Approved by: https://github.com/pmeier, https://github.com/mruberry
2022-03-30 16:10:16 +00:00
Richard Zou
e832eedd29 Composite Compliance testing for backward formulas (#74646)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74646

The OpInfo-based test, given an operator and sample inputs,
checks all permutations of {inputs, grad_output} being either
{CompositeCompliantTensor, regular Tensor}, running them through a
forward pass and a backward pass.

Test Plan: - wait for tests

Reviewed By: albanD

Differential Revision: D35186860

Pulled By: zou3519

fbshipit-source-id: 8b2577dd6106c05db2ab583bbefd10545fdd8adf
(cherry picked from commit 3f5c3793715af9a8d4db06690c5faa7256a82645)
2022-03-28 22:12:41 +00:00
Richard Zou
80d64b365a Test case where some inputs are Tensor Subclasses in CompositeCompiance (#74645)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74645

This PR adds tests for when only some inputs are Tensor Subclasses.

Why is this important to test?
==============================

Consider the following hypothetical out-of-place operation:
```
def my_add(x, y):
  result = x.clone()
  result.add_(y)
  return result
```

You may expect this to work the same as torch.add. If x is not a Tensor
Subclass, but y is a Tensor subclass, then this returns us a regular
Tensor, NOT a Tensor subclass!

This is exactly the type of in-place operations that causes `vmap` to
fail and will be problematic for certain Tensor Subclasses in the future
so we're adding tests to make sure Composite pytorch operations don't do
this.

What exactly does this PR do?
=============================
Composite compliance now takes a sample input and produces a test case
where some of the sample inputs are Tensor Subclasses. It then sends
this through the original operation, once with Python Mode and one
without.

(Why once with Python Mode? Because we want to use it to detect the
pattern of "create a Tensor and call resize_ on it")

Finally, it repeats this process for all possiblities where the inputs
are Tensor subclasses. For example, if the sample input is (x, y), then
we test all four of the following cases:
- Subclass(x), y
- x, Subclass(y)
- Subclass(x), Subclass(y)
- x, y

Test Plan
=========
- run tests

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D35186862

Pulled By: zou3519

fbshipit-source-id: 102477507b56583463668db7523a6586d92b357d
(cherry picked from commit bfcb087244b0598abb270f7c26d472482f00b5e2)
2022-03-28 22:12:41 +00:00
Richard Zou
c96f321804 Move CompositeCompliance tests to their own TestCase (#74644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74644

This is in preparation for me adding additional tests for:
1. composite compliance of autograd formulas
2. composite compliance of forward-mode AD formulas

This PR also changes these tests to run on both CPU and CUDA. Previously
they were just run on CPU, but it turns out there's a lot of branching
on the device in composite operations in PyTorch today :/

Test Plan: - wait for tests

Reviewed By: albanD

Differential Revision: D35186861

Pulled By: zou3519

fbshipit-source-id: d974592a7547f71ef26ff0740bf453f7d335d55a
(cherry picked from commit 773b43394c2406502a6e386a30eb003a73861f13)
2022-03-28 22:12:40 +00:00
Jane Xu
a1e284d9c8 Remove high priority as an owner for tests (#74555)
Summary:
Following triage review discussion, it would be best for these tests to not be triaged high priority by automation, but by the triagers in the oncall.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74555

Reviewed By: albanD

Differential Revision: D35099202

Pulled By: janeyx99

fbshipit-source-id: 657a0317141de3a598476a6f601ec26cc26231b1
(cherry picked from commit 057519cb2494d0f9a0b169f359ac87ba9e89f088)
2022-03-24 14:29:52 +00:00
Mike Ruberry
0aa3c39e5f Extends OpInfo architecture with reference inputs, adds them for elementwise binary operators
This PR extends our OpInfo test architecture with "reference inputs," an optional expansion of typical sample inputs that allows for more thorough testing. Currently only the elementwise binary operations implement an extended set of reference inputs. This PR also cleans up some smaller OpInfo-related issues, including several bugs, and it identified https://github.com/pytorch/pytorch/issues/74279.

A reference inputs function can be specified for an OpInfo by filling in its "reference_inputs_func" metadata. If this is done it's recommended that the reference inputs function first call the sample inputs function, then produce additional sample inputs. See `reference_inputs_elementwise_binary` for an example of this pattern.

In addition to implementing reference inputs for the elementwise binary operations, this PR improves their consistency and simplifies how their metadata is represented. The great majority now use a generic sample input function, and those that want extensions start by calling the generic sample input function and then adding additional samples. This removes many older sample input functions. The BinaryUfuncInfo subclass also now allows specifying scalar support more precisely, and reference inputs and error inputs are generated based on this metadata to ensure it's correct.

cc @kshitij12345 @pmeier @zou3519 @Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74280
Approved by: https://github.com/ngimel
2022-03-21 03:24:16 +00:00
atalman
ebca80ed08 Move test ops gradients and test ops jit to separate files
Fixes #72368

As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs:

Reference : pytorch_test_times.json

```
{
    "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed",
    "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan",
    "job_times": {
        "test_ops": 14928.355000000636, <- This test group is over 4hrs alone
```
----

Hence separating  test_ops into following parts:
1. TestGradients
2. TestJit
3.  TestCommon and TestMathBits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297
Approved by: https://github.com/malfet
2022-03-17 02:07:50 +00:00
PyTorch MergeBot
232faeacf8 Revert "Move test ops gradients and test ops jit to separate files"
This reverts commit 7cf9b942da.

Reverted https://github.com/pytorch/pytorch/pull/74297 on behalf of https://github.com/atalman
2022-03-16 20:08:23 +00:00
atalman
7cf9b942da Move test ops gradients and test ops jit to separate files
Fixes #72368

As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs:

Reference : pytorch_test_times.json

```
{
    "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed",
    "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan",
    "job_times": {
        "test_ops": 14928.355000000636, <- This test group is over 4hrs alone
```
----

Hence separating  test_ops into following parts:
1. TestGradients
2. TestJit
3.  TestCommon and TestMathBits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297
Approved by: https://github.com/malfet
2022-03-16 19:30:22 +00:00
Nikita Shulga
ef066f0832 Revert D34856571: [pytorch][PR] Replace get_all_ type macros with the ATen dispatch macros.
Test Plan: revert-hammer

Differential Revision:
D34856571 (3ded7b1da3)

Original commit changeset: 0dca038bcad5

Original Phabricator Diff: D34856571 (3ded7b1da3)

fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386
(cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)
2022-03-15 22:07:11 +00:00
Khushi Agrawal
3ded7b1da3 Replace get_all_ type macros with the ATen dispatch macros. (#71561)
Summary:
Hi, Team!
The PR is motivated from https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738. It aims to replace `get_all` type macros with the ATen dispatch macros.

The files it iterates over are: (Thanks, Lezcano, for the idea!!)

<details>
<summary>

`test/test_autograd.py`</summary>

<p>

```python
43:from torch.testing._internal.common_dtype import get_all_dtypes
8506:        floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point]
```

</p>
</details>

<details>
<summary>

`test/test_binary_ufuncs.py`</summary>

<p>

```python
26:    all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes,
27:    get_all_complex_dtypes, get_all_fp_dtypes,
935:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1035:    dtypes(*get_all_dtypes(
1488:    dtypes(*(get_all_dtypes(include_bool=False, include_bfloat16=False)))
1879:    dtypes(*product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False)))
1887:    dtypes(*(get_all_int_dtypes() + [torch.bool]))
1913:    dtypes(*(get_all_fp_dtypes()))
1941:    dtypes(*(get_all_fp_dtypes()))
1977:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
2019:    dtypes(*product(get_all_fp_dtypes(), get_all_fp_dtypes()))
2048:    dtypes(*get_all_dtypes())
2110:    dtypes(*product(get_all_dtypes(include_complex=False),
2111:                     get_all_dtypes(include_complex=False)))
2128:            types = [torch.bool, torch.bfloat16] + get_all_int_dtypes()
2173:        if dtypes[1] in get_all_fp_dtypes():
2178:    dtypes(*product(get_all_fp_dtypes(),
2179:                     get_all_fp_dtypes()))
2260:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2261:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2273:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2274:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2307:    dtypes(*get_all_math_dtypes('cpu'))
2319:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
2331:    dtypes(*get_all_int_dtypes())
2356:    dtypes(*get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False))
2393:        if dtype in get_all_int_dtypes():
2614:    dtypes(*get_all_dtypes())
2624:    dtypes(*tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2)))
2806:    dtypes(*list(product(get_all_dtypes(include_complex=False),
2807:                          get_all_dtypes(include_complex=False))))
2866:    dtypes(*list(product(get_all_complex_dtypes(),
2867:                          get_all_complex_dtypes())))
2902:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2906:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2910:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
3019:        dtypes = [torch.float, torch.double] + get_all_complex_dtypes()
3221:    dtypes(*get_all_dtypes(include_complex=False))
3407:    dtypes(*list(product(get_all_dtypes(include_bool=False),
3408:                          get_all_dtypes(include_bool=False))))
3504:    dtypes(*product(get_all_dtypes(include_complex=False, include_bfloat16=False),
3505:                     get_all_dtypes(include_complex=False, include_bfloat16=False)))
3516:            if x.dtype in get_all_int_dtypes() + [torch.bool]:
3643:    dtypes(*product(get_all_dtypes(include_complex=False,
3645:                     get_all_dtypes(include_complex=False,
```

</p>
</details>

<details>
<summary>

`test/test_complex.py`</summary>

<p>

```python
6:from torch.testing._internal.common_dtype import get_all_complex_dtypes
11:    dtypes(*get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_foreach.py`</summary>

<p>

```python
18:    get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
142:            if dtype in get_all_int_dtypes():
179:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
201:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
205:                disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
211:                disable_fastpath |= dtype not in get_all_complex_dtypes()
241:                bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
246:                    disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
248:                    disable_fastpath |= dtype not in get_all_complex_dtypes()
250:                    disable_fastpath |= True and dtype not in get_all_complex_dtypes()
307:        disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool]
365:        if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes():
376:    ops(foreach_unary_op_db, dtypes=get_all_dtypes())
393:         dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False))
401:    ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True))
426:            if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes():
439:    dtypes(*get_all_dtypes())
449:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
481:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
536:            if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div:
545:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
637:    ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False))
```

</p>
</details>

<details>
<summary>

`test/test_linalg.py`</summary>

<p>

```python
29:    all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes,
30:    get_all_fp_dtypes,
111:    dtypes(*(get_all_dtypes()))
794:        float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes()
807:    dtypes(*(get_all_int_dtypes()))
828:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
841:        if dtype in get_all_complex_dtypes():
844:    dtypes(*itertools.product(get_all_dtypes(),
845:                               get_all_dtypes()))
855:        for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3):
5607:                  *get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater)))
5608:    dtypes(*(set(get_all_dtypes()) - {torch.half, torch.bool}))
5644:    dtypes(*(get_all_complex_dtypes() + get_all_fp_dtypes()))
6255:    dtypesIfCUDA(*get_all_complex_dtypes(),
6256:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)),
6292:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6323:    dtypesIfCUDA(*get_all_complex_dtypes(),
6324:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6325:    dtypes(*get_all_complex_dtypes(), *get_all_fp_dtypes())
6358:    dtypesIfCUDA(*([torch.float, torch.double] + get_all_complex_dtypes()))
6556:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6668:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6741:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_nn.py`</summary>

<p>

```python
37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes
50:    onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \
8862:                for device in get_all_device_types():
9629:            for dt1 in get_all_math_dtypes(device):
9630:                for dt2 in get_all_math_dtypes(device):
9631:                    for dt3 in get_all_math_dtypes(device):
9648:            for input_dtype in get_all_math_dtypes(device):
9664:            for input_dtype in get_all_math_dtypes(device):
13015:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13034:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13159:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17400:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17768:    dtypesIfCUDA(*get_all_fp_dtypes())
17773:    dtypesIfCUDA(*get_all_fp_dtypes())
17778:    dtypesIfCUDA(*get_all_fp_dtypes())
17783:    dtypesIfCUDA(*get_all_fp_dtypes())
17788:    dtypesIfCUDA(*get_all_fp_dtypes())
17793:    dtypesIfCUDA(*get_all_fp_dtypes())
17798:    dtypesIfCUDA(*get_all_fp_dtypes())
17963:    dtypesIfCUDA(*get_all_fp_dtypes())
17977:    dtypesIfCUDA(*get_all_fp_dtypes())
18684:    def test_cross_entropy_loss_prob_target_all_reductions(self, device):
```

</p>
</details>

<details>
<summary>

`test/test_numpy_interop.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import get_all_dtypes
399:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_ops.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes
86:        for dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_reductions.py`</summary>

<p>

```python
16:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
360:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
366:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
394:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
750:        for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]:
1404:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1457:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1458:              get_all_complex_dtypes()))
1465:            return dtype in get_all_int_dtypes()
1494:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1501:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1507:    dtypes(*(get_all_complex_dtypes()))
1514:        dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))
1523:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1531:        if dtype in get_all_fp_dtypes():
1608:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
1837:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1855:    dtypes(*(set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8}))
3219:        for dtype in get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_serialization.py`</summary>

<p>

```python
26:from torch.testing._internal.common_dtype import get_all_dtypes
586:        for device, dtype in product(devices, get_all_dtypes()):
589:            for other_dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_shape_ops.py`</summary>

<p>

```python
18:from torch.testing._internal.common_dtype import get_all_dtypes
230:    dtypes(*get_all_dtypes(include_complex=False, include_bool=False, include_half=False,
232:    dtypesIfCUDA(*get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False))
344:    dtypes(*get_all_dtypes())
443:    dtypes(*get_all_dtypes())
461:    dtypes(*get_all_dtypes())
570:    dtypes(*get_all_dtypes(include_complex=False))
```

</p>
</details>

<details>
<summary>

`test/test_sort_and_select.py`</summary>

<p>

```python
12:    all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes,
136:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
231:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
296:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
647:    dtypesIfCUDA(*get_all_fp_dtypes())
678:    dtypesIfCUDA(*(get_all_dtypes(include_complex=False,
682:    dtypes(*(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False)))
739:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
740:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
799:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
800:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
```

</p>
</details>

<details>
<summary>

`test/test_sparse.py`</summary>

<p>

```python
20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes
29:    floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes,
1963:            return dtype in get_all_int_dtypes()
1994:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2103:            return dtype in get_all_int_dtypes()
2138:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2626:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
2633:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
3230:    dtypes(*get_all_complex_dtypes(),
3231:            *get_all_fp_dtypes(include_half=False, include_bfloat16=False))
3234:                  *get_all_fp_dtypes(
```

</p>
</details>

<details>
<summary>

`test/test_sparse_csr.py`</summary>

<p>

```python
7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor
17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes
120:    dtypes(*get_all_dtypes())
133:    dtypes(*get_all_dtypes())
150:    dtypes(*get_all_dtypes())
180:    dtypes(*get_all_dtypes())
201:    dtypes(*get_all_dtypes())
210:    dtypes(*get_all_dtypes())
225:    dtypes(*get_all_dtypes())
244:    dtypes(*get_all_dtypes())
263:    dtypes(*get_all_dtypes())
285:    dtypes(*get_all_dtypes())
411:    dtypes(*get_all_dtypes())
482:    dtypes(*get_all_dtypes())
502:    dtypes(*get_all_dtypes())
562:    dtypes(*get_all_dtypes())
588:    dtypesIfCUDA(*get_all_complex_dtypes(),
589:                  *get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater))
745:    dtypesIfCUDA(*get_all_complex_dtypes(),
746:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
765:    dtypesIfCUDA(*get_all_complex_dtypes(),
766:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
801:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
841:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
1182:    dtypes(*get_all_dtypes())
1276:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False))
1286:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_tensor_creation_ops.py`</summary>

<p>

```python
21:    onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types)
23:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
150:        for dt in get_all_dtypes():
160:        for dt in get_all_dtypes():
314:        dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16]
1012:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1013:              get_all_complex_dtypes()))
1032:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1033:              get_all_complex_dtypes()))
1050:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1051:              get_all_complex_dtypes()))
1745:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1779:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1868:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1926:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1954:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
1956:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None)
1957:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
2538:        for device in get_all_device_types():
2645:        for dtype in get_all_dtypes():
2678:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False) +
2679:              get_all_complex_dtypes()))
2716:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
2827:            for dt in get_all_dtypes():
2913:    dtypes(*get_all_dtypes(include_bool=False, include_half=False))
2914:    dtypesIfCUDA(*get_all_dtypes(include_bool=False, include_half=True))
3028:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3033:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3074:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_complex=False))
3075:    dtypesIfCUDA(*((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16])
3077:                    else get_all_dtypes(include_bool=False, include_half=True, include_complex=False)))
3873:    dtypes(*get_all_dtypes())
3884:    dtypes(*get_all_dtypes(include_bool=False))
3916:            for other in get_all_dtypes():
3922:    dtypes(*get_all_dtypes())
3932:    dtypes(*get_all_dtypes(include_bool=False))
3955:    dtypes(*get_all_dtypes(include_bool=False))
3961:    dtypes(*get_all_dtypes(include_bool=False))
3965:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_testing.py`</summary>

<p>

```python
25:from torch.testing._internal.common_dtype import get_all_dtypes
31:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_torch.py`</summary>

<p>

```python
51:    expectedAlertNondeterministic, get_all_device_types, skipXLA)
57:    get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes
296:            for d in get_all_device_types():
323:            for device in get_all_device_types():
324:                for dt1 in get_all_dtypes():
325:                    for dt2 in get_all_dtypes():
343:            all_dtypes = get_all_dtypes()
350:            all_dtypes = get_all_dtypes()
781:            for dtype in get_all_dtypes():
986:            for device in get_all_device_types():
1017:            for device in get_all_device_types():
1018:                for dtype in get_all_math_dtypes(device):
2792:            for device in get_all_device_types():
3186:    dtypes(*get_all_dtypes())
3195:        for error_dtype in get_all_dtypes():
3203:    dtypes(*get_all_dtypes())
3212:        for error_dtype in get_all_dtypes():
4539:    dtypes(*get_all_fp_dtypes())
4545:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
4577:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
4578:    dtypesIfCPU(*(get_all_fp_dtypes(include_half=False, include_bfloat16=True)))
4579:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4599:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4600:    dtypesIfCPU(*(get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False)))
4601:    dtypesIfCUDA(*(get_all_dtypes(include_bfloat16=False, include_complex=False)))
4613:        for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False):
4628:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4629:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4640:    dtypes(*get_all_fp_dtypes())
4723:    dtypes(*get_all_fp_dtypes())
4735:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
4736:    dtypesIfCUDA(*get_all_fp_dtypes())
4747:    dtypes(*get_all_fp_dtypes())
4761:    dtypes(*get_all_fp_dtypes())
4771:    dtypes(*get_all_fp_dtypes())
4792:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
5302:    dtypes(*get_all_dtypes(include_bfloat16=False))
5322:    dtypes(*get_all_dtypes(include_half=False, include_bfloat16=False))
5323:    dtypesIfCPU(*get_all_dtypes(include_bfloat16=False))
5324:    dtypesIfCUDA(*get_all_dtypes(include_bfloat16=False))
5591:        for dt in get_all_dtypes():
5611:        for dt in get_all_dtypes():
5678:        for dt in get_all_dtypes():
5696:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
5697:    dtypes(*set(get_all_math_dtypes('cpu')))
5746:    dtypes(*get_all_dtypes())
5780:    dtypes(*get_all_dtypes())
5885:    dtypes(*get_all_dtypes())
5902:    dtypes(*get_all_dtypes())
5945:    dtypes(*get_all_dtypes())
5979:    dtypes(*get_all_dtypes(include_bool=False))
6049:    dtypes(*get_all_dtypes(include_bool=False))
6092:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6093:              get_all_complex_dtypes()))
6094:    dtypesIfCPU(*get_all_dtypes())
6095:    dtypesIfCUDA(*get_all_dtypes())
6122:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6123:              get_all_complex_dtypes()))
6124:    dtypesIfCPU(*get_all_dtypes())
6125:    dtypesIfCUDA(*get_all_dtypes())
6163:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6164:              get_all_complex_dtypes()))
6165:    dtypesIfCPU(*get_all_dtypes())
6166:    dtypesIfCUDA(*get_all_dtypes())
6190:    dtypes(*(get_all_complex_dtypes() +
6191:              get_all_int_dtypes()))
6238:    dtypes(*get_all_dtypes())
6323:    dtypes(*get_all_dtypes())
6389:    dtypes(*product(get_all_dtypes(), (torch.uint8, torch.bool)))
6699:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
6700:    dtypes(*set(get_all_math_dtypes('cpu')))
7452:    dtypes(*get_all_dtypes(include_bool=False))
7461:    dtypes(*get_all_dtypes(include_bool=False))
7477:    dtypes(*get_all_dtypes(include_bool=False))
7496:    dtypes(*get_all_dtypes(include_bool=False))
7538:    dtypes(*get_all_dtypes(include_bool=False))
8162:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8163:              get_all_complex_dtypes()))
8175:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8176:              get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_type_promotion.py`</summary>

<p>

```python
14:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes
187:        for dtype in get_all_dtypes():
262:        dtypes1 = get_all_math_dtypes('cuda')
263:        dtypes2 = get_all_math_dtypes(device)
339:    dtypes(*itertools.product(get_all_dtypes(), get_all_dtypes()))
468:            for dt1 in get_all_math_dtypes(device):
469:                for dt2 in get_all_math_dtypes(device):
519:            for dt1 in get_all_math_dtypes(device):
520:                for dt2 in get_all_math_dtypes(device):
528:        for dt in get_all_math_dtypes(device):
561:        for dtype in get_all_dtypes():
766:                                          dtypes=get_all_math_dtypes(device))
771:                                          dtypes=get_all_math_dtypes(device))
782:                                          dtypes=get_all_math_dtypes(device))
879:        dtypes = get_all_dtypes(include_bfloat16=False)
898:        dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False)
965:    dtypesIfCUDA(*itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False),
966:                                     get_all_dtypes(include_bfloat16=False, include_complex=False)))
967:    dtypes(*itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False,
969:                               get_all_dtypes(include_half=False, include_bfloat16=False,
976:            return dtype in get_all_int_dtypes() + [torch.bool]
979:            return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False)
```

</p>
</details>

<details>
<summary>

`test/test_unary_ufuncs.py`</summary>

<p>

```python
24:    floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes,
25:    get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
517:    dtypes(*(get_all_int_dtypes() + [torch.bool] +
518:              get_all_fp_dtypes(include_bfloat16=False)))
596:    dtypes(*get_all_fp_dtypes(include_half=True, include_bfloat16=False))
611:        invalid_input_dtypes = get_all_int_dtypes() + \
612:            get_all_complex_dtypes() + \
619:        for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False):
1048:    dtypes(*get_all_math_dtypes('cpu'))
1182:    dtypesIfCUDA(*get_all_fp_dtypes())
1190:    dtypesIfCUDA(*get_all_fp_dtypes())
1205:    dtypesIfCUDA(*get_all_fp_dtypes())
1215:    dtypesIfCUDA(*get_all_fp_dtypes())
1307:    dtypes(*(get_all_dtypes(include_bool=False)))
1349:    dtypes(*(get_all_fp_dtypes(include_half=False) +
1350:              get_all_complex_dtypes()))
1351:    dtypesIfCUDA(*(get_all_fp_dtypes(include_half=True) +
1352:                    get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_view_ops.py`</summary>

<p>

```python
19:    get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
124:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
131:    dtypes(*get_all_dtypes(include_bfloat16=False))
213:            for view_dtype in [*get_all_fp_dtypes(), *get_all_complex_dtypes()]:
220:    dtypes(*get_all_dtypes())
224:        for view_dtype in get_all_dtypes():
305:    dtypes(*get_all_complex_dtypes(include_complex32=True))
343:    dtypes(*get_all_dtypes())
354:    dtypes(*get_all_dtypes())
364:    dtypes(*get_all_dtypes())
374:    dtypes(*get_all_dtypes())
384:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
395:    dtypes(*get_all_complex_dtypes())
426:    dtypes(*get_all_complex_dtypes())
451:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
1263:    dtypes(*(torch.testing.get_all_dtypes()))
1279:    dtypes(*(torch.testing.get_all_dtypes()))
1405:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1406:              get_all_complex_dtypes()))
1471:    dtypes(*get_all_dtypes(include_bfloat16=False))
1574:    dtypes(*get_all_dtypes())
1601:    dtypes(*get_all_dtypes(include_bfloat16=False))
1632:    dtypes(*get_all_dtypes(include_bfloat16=False))
1711:        for dt in get_all_dtypes():
1717:        for dt in get_all_dtypes():
1724:        for dt in get_all_dtypes():
```

</p>
</details>

I'm looking forward to your viewpoints. Thanks :)

cc: mruberry kshitij12345 anjali411

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561

Reviewed By: samdow

Differential Revision: D34856571

Pulled By: mruberry

fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335
(cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)
2022-03-15 20:31:41 +00:00
Douglas Lehr
28549b618a [ROCm] Enable skipped ROCm unit tests (#67706)
Summary:
A number of ROCm tests were skipped via the skipCUDAIfRocm flag.
A majority of the testcases are now supported on the ROCm platform. This fix enabled all of the test_ops tests for ROCm and enables most Operators in  common_methods_invocations.py minus the SpectralFuncInfo class which still has some fft issues.

Partially Fixes https://github.com/pytorch/pytorch/issues/51303

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67706

Reviewed By: seemethere, janeyx99

Differential Revision: D34153457

Pulled By: malfet

fbshipit-source-id: 95f4420f306ca7580cd438d3b5cc0b24efbfae99
(cherry picked from commit 0d178fffd3)
2022-02-11 22:14:54 +00:00
David Berard
bbd42c605a [JIT] Opinfo tests for nnc fusion - retry (#72486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72486

Retry #70465.

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D34061628

Pulled By: davidberard98

fbshipit-source-id: e27ed315bc4ad57cdbfbc9cedffcbb7886004524
(cherry picked from commit 7937808d2e)
2022-02-09 19:01:22 +00:00
Nikita Shulga
bb101ec78d Revert D33595240: [JIT] Opinfo tests for nnc fusion
Test Plan: revert-hammer

Differential Revision:
D33595240 (0b57bd4c66)

Original commit changeset: e2e17a921bc3

Original Phabricator Diff: D33595240 (0b57bd4c66)

fbshipit-source-id: 172a3ffd19d180b1b3617956b1f881be62f37bc9
(cherry picked from commit 324cfaea86)
2022-02-08 01:28:42 +00:00
David Berard
0b57bd4c66 [JIT] Opinfo tests for nnc fusion (#70465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70465

These tests check to ensure that
(a) the result after nnc fusion (of a single op) is the same as the
unfused op
(b) for certain ops where fusion is expected to occur, ensure that
fusion does actually occur

Test Plan: Imported from OSS

Reviewed By: wenleix

Differential Revision: D33595240

Pulled By: davidberard98

fbshipit-source-id: e2e17a921bc30c313e92e8e5bbc6c1b5fcd14bc1
(cherry picked from commit b1ba221acc)
2022-02-07 20:56:21 +00:00
lezcano
6cb128c8dd Generalize noncontiguous tests to several outputs (#67996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67996

This is necessary for most matrix decompositions in `linalg`.

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774418

Pulled By: mruberry

fbshipit-source-id: 576f2dda9d484808b4acf0621514c0ffe26834e6
(cherry picked from commit fb07c50aa9)
2022-01-27 23:13:17 +00:00
lezcano
e2011b29aa Add OpInfo test to check that floating point inputs in OpInfos have requires_grad set to True (#69909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69909

This test detected a number of sampling methods that were not generating
the samples as expected, e.g. `index_put`, `cosine_embedding`, `stft`, but
perhaps most notably the generator for `BinOps`.

It also detected that `reminder` and `fmod` did not have implemented the
backward formula for the second input. I added this in the previous PR.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774422

Pulled By: mruberry

fbshipit-source-id: 76cfc75b1fdfd72ee64aa524665f83a75fe52509
(cherry picked from commit 13ea7b436b)
2022-01-27 23:13:17 +00:00
lezcano
8ff1a8fdca Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of svd_backward.
More importantly, I also added a lax check in the complex case making sure the loss
function just depends on the subspaces spanned by the pairs of singular
vectors, and not their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
(cherry picked from commit 391319ed8f)
2022-01-27 18:38:30 +00:00
soulitzer
5ccf28d066 Do not use ZeroTensor for inplace ops (#69998)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69998

Fixes: https://github.com/pytorch/pytorch/issues/69855

The check for undefined grads for forward AD was not being run because `check_undefined_grads` was only passed as True by OpInfo for backward AD. This PR updates gradcheck to interpret `check_undefined_grads` as possibly for forward or backward AD.

This PR also updates codegen to 1) not use ZeroTensor for `self` when the op is inplace. 2) only create zeros (either through ZeroTensor or at::zeros) if the tensor itself is not undefined. Previously we would error in this case when we call `.options` on the undefined tensor.

~TODO: undo the skips that are due to the original issue~

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33235973

Pulled By: soulitzer

fbshipit-source-id: 5769b6d6ca123b2bed31dc2bc6bc8e4701581891
2021-12-23 15:52:34 -08:00
Peter Bell
7cdfd86a72 TestMathBits: test with neg and conj bit set (#68948)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68948

The case where both the negative and conjugate bits are set
isn't tested currently despite being handled explicitly by `copy`.
In theory this shouldn't matter because neg_bit is only used for real
values, but it does mean the code in copy is untested. So, this just
runs it with a single sample as a sanity check.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33064371

Pulled By: anjali411

fbshipit-source-id: e90c65e311507c4fc618ff74fecc4929599c4fa3
2021-12-22 14:30:35 -08:00
soulitzer
47f11730ec Add testing for forward over reverse gradgrad (#69740)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69740

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33031727

Pulled By: soulitzer

fbshipit-source-id: 2bcba422b4bcea3bbc936d07ba45171a6531e578
2021-12-14 23:35:10 -08:00
Peter Bell
1188d89a1d TestMathBits: Call functions with original sample input values (#68947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68947

`_test_math_view` currently calls the operator with different values
than those specified in the `SampleInput`. This is undesirable as it
could break mathematical properties required by the operator. Instead,
this calls `math_op_view(math_op_physical(sample.input))` to get a
view that represents the same value as the original input.

`test_neg_view` already did this by returning `torch._neg_view(-x)`
from `math_op_view` but this moves the handling into `_test_math_view`
to make it apply to all view op tests.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33064327

Pulled By: anjali411

fbshipit-source-id: 4d87e0c04fc39b95f8dc30dcabda0d554d16a1d8
2021-12-14 11:10:13 -08:00
soulitzer
db32daf4b2 Do not test batched forward grad for inplace ops (#69558)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69558

Currently we skip batched forward grad checks completely for certain views that also have inplace variants. This PR allow us to decouple the check.

Alternative: just skip the batched forward checks for inplace ops entirely. I'm okay with this because it was surprising to me these checks are being run in the first place.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020599

Pulled By: soulitzer

fbshipit-source-id: f8012aadc0e775f80da0ab62b2c11f6645bb1f51
2021-12-12 00:09:45 -08:00
Peter Bell
6de9f0fc94 OpInfo: Allow sample_inputs_func to be any iterable (#69256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69256

Closes #52486

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32942008

Pulled By: mruberry

fbshipit-source-id: f5b01b0298c0160b0bec6e86e2b6db8cfe746206
2021-12-09 08:37:26 -08:00
Mike Ruberry
b6f41bb848 The Jiterator (#69439)
Summary:
This PR:

- creates the "jiterator" pattern, allowing elementwise unary and binary kernels that don't accept scalars to be jit compiled when called
- ports the gcd and i1 CUDA kernels to use the jiterator
- extends elementwise binary systemic testing to be comparable to elementwise unary systemic testing
- separates one test case from test_out in test_ops.py
- updates more OpInfos to use expected failures instead of skips

The jiterator currently does not support half, bfloat16 or complex dtypes. It also (as mentioned above) doesn't support scalar inputs. In the future we expect to add support for those datatypes and scalars.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69439

Reviewed By: ngimel

Differential Revision: D32874968

Pulled By: mruberry

fbshipit-source-id: d44bb9cde4f602703e75400ec5a0b209f085e9b3
2021-12-06 07:32:48 -08:00
Ivan Yashchuk
219db3b4e1 Add OpInfo for torch.linalg.tensorsolve (#68810)
Summary:
This PR adds an OpInfo entry for tensorsolve function.
The keyword argument is different from NumPy so a lambda function is needed to be passed to `ref=`.
I had to change the dtypes for `test_reference_testing` because NumPy does computation internally using double for all linear algebra functions and maybe for some other functions. Using `torch.float64` and `torch.complex128` is more reliable for NumPy comparisons.

cc mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/68810

Reviewed By: soulitzer

Differential Revision: D32696065

Pulled By: mruberry

fbshipit-source-id: a4305065d3e7d0097503dc05938b3c4784e14996
2021-11-30 20:31:12 -08:00
Richard Zou
6fea7499c2 CompositeImplicitAutograd compliance testing (#65819)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65819

Related to #61669.

Functions registered as CompositeImplicitAutograd MUST work for most, if
not all, backends. This includes Tensor subclasses.

To achieve this, we (PyTorch) impose a set of constraints on how a
CompositeImplicitAutograd function can be written.

Concretely, this PR adds tests for all OpInfos that checks for
compliance. The things that get tested in this PR apply to composite
ops and are that:
- the op does not change the metadata of a Tensor without performing
dispatches
- the op does not call set_ or resize_
- the op does not directly access the data ptr

The mechanism for the test is to create a new __torch_dispatch__
object, CompositeCompliantTensor. For each operator, we wrap all inputs
in CompositeCompliantTensor, turn on python mode for it,
and send it through the operator.

Non-CompositeImplicitAutograd operators will pass the test because they
perform a dispatch to backend code. Here's how CompositeCompliantTensor
catches problems:

- If it sees set_ or resize_ getting called, it will directly error
out
- After each operation, CompositeCompliantTensor checks to make sure
that its metadata is consistent with that of the thing it is wrapping.
If the CompositeImplicitAutograd op modifes the metadata directly
(through e.g. the TensorImpl API) then the metadata will go out of sync.
- If data_ptr gets called, that returns a nice error (because the
storage is meta).

CompositeCompliantTensor is written in an interesting way. First off,
if a view operation occurs (e.g. `B = A.view_op(...)`), then B.storage()
must alias A.storage() where B.storage() is CompositeCompliantTensor's
storage, NOT the storage of the tensor it is wrapping. This is an
invariant in autograd, see #62182 for details. To handle
this we replay the view on A's storage and set it as B's storage.

Secondly, there are cases where the metadata is allowed to go out of
sync. I believe this is only possible with in-place view functions, like
transpose_, t_, squeeze_, unsqueeze_. Those are special cased.

Finally, I added a new section to aten/src/ATen/native/README.md about
what it means to be CompositeImplicitAutograd Compliant

Test Plan: - run tests

Reviewed By: ezyang, bdhirsh

Differential Revision: D31268369

Pulled By: zou3519

fbshipit-source-id: 31634b1cbe1778ab30196013cfc376ef9bd2e8b1
2021-11-30 07:35:22 -08:00
soulitzer
e358c49a5b Add OpInfo test and fix a couple cases (#66294)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66294

In this PR:
- OpInfo for forward AD now checks batched forward grad when `op.check_batched_grad=True`
- Adds setting to disable the test for individual ops `check_batched_forward_grad` and disable for the ops here: https://github.com/pytorch/pytorch/issues/66357

Fixes some more failures:
- Make Forward AD metadata less strict by allowing stride to differ when size is 1
- Fix sum batching rule when logical tensor is a scalar and dim is unspecified
- Batching rule for `_reshape_alias`
- ~Batching rules now preserve storage offset for view operator that return non-zero storage offset~ (moved to previous PR)

Test Plan: Imported from OSS

Reviewed By: zou3519, albanD

Differential Revision: D31842020

Pulled By: soulitzer

fbshipit-source-id: 3517a8fb9d6291fccb53c0b1631eab5bbb24ebd1
2021-11-19 14:31:03 -08:00
Mike Ruberry
613c1aca6d Adds support for automated error and warning testing (#67354)
Summary:
Adds a new class `ErrorOrWarningInput` that is a `SampleInput` with some additional metadata for validating that `SampleInput` throws the desired warning or error. The architecture to support these new tests is modeled after the existing reference tests and sample input functions.

Existing invalid input tests for neg and kthvalue are ported to the new scheme to validate it.

There may be a simpler/clearer naming scheme we can use here.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67354

Reviewed By: jbschlosser

Differential Revision: D31989888

Pulled By: mruberry

fbshipit-source-id: 4fa816e1e8d0eef21b81c2f80813d42b2c26714e
2021-11-11 19:28:47 -08:00
kshitij12345
885a8e53ba replace onlyOnCPUAndCUDA with onlyNativeDeviceTypes (#65201)
Summary:
Reference https://github.com/pytorch/pytorch/issues/53849

Replace `onlyOnCPUandCUDA` with `onlyNativeDeviceTypes` which includes `cpu, cuda and meta`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65201

Reviewed By: mrshenli

Differential Revision: D31299718

Pulled By: mruberry

fbshipit-source-id: 2d8356450c035d6a314209ab51b2c237583920fd
2021-11-01 09:22:34 -07:00
Jane Xu
c19cda5782 [skip ci] Add test owners for a special hi-pri class of tests (#67553)
Summary:
Action following https://github.com/pytorch/pytorch/issues/66232

This change does require some context: there were several suggestions regarding what to do about this group of tests: tests that are core and crucial to all of PyTorch and are too broad to be owned by one team.
1. Let's add a "module: core" and put people behind it! This idea sounds appealing unless you are one of the people backing the label. From talking to albanD among others, this idea of putting all these core tests on the shoulder of a few people or one team isn't super fair and I have not yet found anyone willing to take on this job.
2. Taking advantage of the fact that we already have a triaging oncall that takes turns triaging issues, we can leave these tests essentially unlabeled and allow the oncall to triage these tests. Since these tests are crucial to PyTorch, we'll add the "high priority" label to mark them different from other unowned tests (see https://github.com/pytorch/pytorch/issues/67552).
3. I _could_ still create an unbacked label "module: core" and attribute these tests there, but I don't like the idea of creating a facade that the tests are "triaged" to a label when no one is actually taking a look.

Now we could potentially break these tests down into smaller files so that each piece _could_ be owned by a team, but 1. I don't know if this is currently feasible and 2. This approach does not prevent that from happening in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67553

Reviewed By: albanD

Differential Revision: D32025004

Pulled By: janeyx99

fbshipit-source-id: 1fb1aa4c27e305695ab6e80ae3d02f90519939c0
2021-10-29 12:17:21 -07:00
Mike Ruberry
ddc9bd335b Adds reference vs. noncontiguous OpInfo test (#67434)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/63341.

This PR adds a new test, `test_noncontigous_samples`, that runs ops forward and backward and compares their outputs and grads between "normal" contiguous SampleInputs and noncontiguous SampleInputs. This test should preclude the need for noncontiguous SampleInputs going forward.

The test was added by generalizing the `.numpy()` transform on SampleInputs to support a new `.noncontiguous()` transform and copying forward/backward patterns from other tests in test_ops.py. It also discovered that many SampleInputs were incorrectly reusing tensors, so those have been revised. SampleInputs creating noncontiguous tensors for testing have also been altered to no longer do so.

In addition, this test discovered the following high priority silent correctness issues:

- https://github.com/pytorch/pytorch/issues/67432
- https://github.com/pytorch/pytorch/issues/67517
- https://github.com/pytorch/pytorch/issues/67513
- https://github.com/pytorch/pytorch/issues/67512
- https://github.com/pytorch/pytorch/issues/67470

It also identified the following issues:
- https://github.com/pytorch/pytorch/issues/67539

The pow OpInfo also incorrectly specified that pow supported the bool datatype, and this has been fixed. Its SampleInputs were written in a way that made requests for boolean SampleInputs return type promoting inputs that never actually tried to compute pow in bool.

This PR suggests we should add the following guidance for writing SampleInputs:

- ensure that all SampleInputs are independent of each other (don't reuse tensors)
- ensure that all SampleInput tensors have no grad or backward functions (no autograd history) -- they should be leaves
- prefer keeping sample inputs simple where possible, a good set of handwritten samples that test interesting cases may be better than an exhaustive but hard to read and maintain programmatic enumeration
- keep code readable by using functools.partial and writing simple inline helpers; break up large statements into a more readable series of smaller statements; especially don't write complicated generator expressions with a `for` at the end!

fyi kshitij12345 krshrimali pmeier anjali411 saketh-are zou3519 dagitses

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67434

Reviewed By: ngimel

Differential Revision: D32014557

Pulled By: mruberry

fbshipit-source-id: b17e19adc1d41e24441f0765af13d381fef5e3c1
2021-10-29 09:55:56 -07:00
Ivan Yashchuk
383c0a3858 Fix internal assert failure for torch.all and torch.any with requires_grad=True (#65714)
Summary:
This PR fixes https://github.com/pytorch/pytorch/issues/58547.
I added an OpInfo-based test that fails on master and passes with the
proposed changes.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 mruberry

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65714

Reviewed By: saketh-are, mruberry

Differential Revision: D31248307

Pulled By: albanD

fbshipit-source-id: 041eaa9b744c3043f78dd8ae5f457f67c311df4f
2021-10-01 07:32:44 -07:00
soulitzer
91611fe1d1 Decouple forward AD checks from backward AD in OpInfo tests and gradcheck (#65040)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64999

- Adds a flag to gradcheck `check_backward_ad` that can be used to disable gradcheck for backward ad
  - This is a bit bc-breaking in terms of positional args, but I prefer this ordering
- In OpInfo tests for forward ad:
  - set `check_backward_ad` False
- In test_ops treat `supports_autograd` as if it is `supports_backward_ad` (it basically already is)
  - the only modification needed is to no longer skip forward ad tests if `supports_autograd` is false
  - test_dtype, test_variant_consistency, etc behave correctly as-is
  - In a follow-up PR, we can rename it to actually be `supports_backward_ad`
- Testing
  - https://github.com/pytorch/pytorch/pull/65060

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65040

Reviewed By: albanD

Differential Revision: D31238177

Pulled By: soulitzer

fbshipit-source-id: f068d4cbe7ffb094930b16cddb210583b9b7b2c4
2021-09-29 17:01:34 -07:00
Max Ren
0eaf081018 [JIT] canonicalize aten::rsub (#65014)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/65014

ghstack-source-id: 138656948

Test Plan:
```
(pytorch) [maxren@devvm3115.atn0 ~/pytorch] python3 test/test_jit.py TestPeephole
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
........s......................
----------------------------------------------------------------------
Ran 31 tests in 0.393s

OK (skipped=1)
(pytorch) [maxren@devvm3115.atn0 ~/pytorch] python3 test/test_jit.py TestPeephole.test_normalized_rsub
CUDA not available, skipping tests
monkeytype is not installed. Skipping tests for Profile-Directed Typing
.
----------------------------------------------------------------------
Ran 1 test in 0.015s

OK
```

Reviewed By: eellison

Differential Revision: D30941389

fbshipit-source-id: 03f0416d99090845c9bfb1e5fcf771d5f1d7a050
2021-09-22 17:20:46 -07:00
Peter Bell
f90d9b48db test_neg_view: preseve sign of sample input (#63010)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63010

This changes `test_neg_view` to call the operator with the same numeric values as the original sample input.

Test Plan: Imported from OSS

Reviewed By: pbelevich

Differential Revision: D31082824

Pulled By: anjali411

fbshipit-source-id: 7d50f99dc0d1343247e366cbe9b0ca081bd0a9b1
2021-09-22 07:47:42 -07:00
Elias Ellison
29514bfcdb Max Pool with indices (#64121)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64121

Add support for aten operators which return multiple outputs

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30738142

Pulled By: eellison

fbshipit-source-id: 0d7e51187bd5e3e9b43f0fdb5178366a97aec943
2021-09-15 13:45:46 -07:00
Ansley Ussery
c60075d4b5 Preserve types during empty container assignment (#58911)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58911

Stack from [ghstack](https://github.com/ezyang/ghstack):
* __->__ #58911

Test Plan: Imported from OSS

Reviewed By: gmagogsfm

Differential Revision: D30785623

Pulled By: ansley

fbshipit-source-id: 4e05d6369318974290fea02ad2bc148293c25090
2021-09-10 16:49:21 -07:00
Elias Ellison
cf2d15bf84 Add support for slice, selec twith int, index_select (#63365)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/63365

Test Plan: Imported from OSS

Reviewed By: driazati

Differential Revision: D30738144

Pulled By: eellison

fbshipit-source-id: 7e0c572209bdc6e62ecb4fd1f06f80291de69803
2021-09-07 18:22:22 -07:00
Elias Ellison
c8a608b197 Add squeeze, unsqueeze, transpose shape functins (#63099)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63099

These are checked by OpInfos, which represent all of the inputs and semantics of the operators so it should be an easy stamp

Test Plan: Imported from OSS

Reviewed By: desertfire, astaff

Differential Revision: D30347514

Pulled By: eellison

fbshipit-source-id: 37b4c9ecd8c222cc12bf39166181464b43218830
2021-09-07 18:22:19 -07:00
Philip Meier
26b7ff5aea deprecate dtype getters from torch.testing namespace (#63554)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554

Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this twofold:

1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example `torch.testing.floating_types()` will only give you `float32` and `float64` skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.

We thought about [providing an replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: by keeping it, either the namespace is getting messy again after a new dtype is added or we need to somehow version the return values of the getters.

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D30662206

Pulled By: mruberry

fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
2021-09-07 08:58:51 -07:00
Ansley Ussery
6831d8e379 Support Union in TorchScript (#64234)
Summary:
This PR is created to replace https://github.com/pytorch/pytorch/pull/53180 PR stack, which has all the review discussions. Reason for needing a replacement is due to a messy Sandcastle issue.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64234

Reviewed By: gmagogsfm

Differential Revision: D30656444

Pulled By: ansley

fbshipit-source-id: 77536c8bcc88162e2c72636026ca3c16891d669a
2021-09-03 06:12:24 -07:00
Kushashwa Ravi Shrimali
d37636901e [Doc] make_tensor to torch.testing module (#63925)
Summary:
This PR aims to add `make_tensor` to the `torch.testing` module in PyTorch docs.

TODOs:

* [x] Add examples

cc: pmeier mruberry brianjo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63925

Reviewed By: ngimel

Differential Revision: D30633487

Pulled By: mruberry

fbshipit-source-id: 8e5a1f880c6ece5925b4039fee8122bd739538af
2021-08-30 12:25:40 -07:00