339 Commits

Author SHA1 Message Date
lezcano
fd27246c16 Fix decomposition for std (#87181)
The previous implementation was lacking a few features and incurred a
pretty large error

cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181
Approved by: https://github.com/ngimel, https://github.com/peterbell10
2022-10-28 00:50:29 +00:00
Natalia Gimelshein
f1b78224ca Fix type promotion for 2 wrapped scalar args (#87845)
Fixes #76801

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87845
Approved by: https://github.com/SherlockNoMad, https://github.com/mruberry
2022-10-27 15:53:11 +00:00
Nikita Karetnikov
59b9d29260 [primTorch] Check error_regex in test_python_ref_errors (#86987)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86987
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-26 23:34:34 +00:00
Bin Bao
2c1efe7472 Enable some PyTorch core tests with inductor (#87490)
Summary:
1) Graph break on torch.random.set_rng_state since it blocks running
inductor core tests;
2) Add several inductor-specific skips;
3) Enable several core tests for inductor CI;

cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87490
Approved by: https://github.com/eellison
2022-10-26 18:58:33 +00:00
Sherlock Huang
eb99c1efce Prefer python meta function over c++ meta function (#87426)
This is a policy update for meta registration. **We now prefer the python meta implementation over the C++ meta function.** This is a flip of the previous policy, where we preferred the C++ meta function over the python meta function if they both existed.

Here's the meta registration process:
1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`.  However, they will NOT register them into dispatcher.
2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd.
3. We will unconditionally register all of them into the python dispatcher, and register them into the C++ dispatcher unless it is one of the following 3 cases
- 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta
- 2. the op is a view op, as the MetaTensor doesn't support aliased storage
- 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op)
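
The selection and registration rules above can be sketched in plain Python. This is only an illustration under stated assumptions: the table layout, the string kind names, and the helper names `build_active_meta_table` and `should_register_in_cpp` are hypothetical and are not PyTorch's actual implementation.

```python
# Hypothetical sketch of the policy described above; not PyTorch's real code.
# Preference order from the commit message: Meta > PostAutograd > PreAutograd.
PREFERENCE = ("Meta", "PostAutograd", "PreAutograd")


def build_active_meta_table(global_decomp_table):
    """For each op, keep the first function found in preference order."""
    active = {}
    for op, by_kind in global_decomp_table.items():
        for kind in PREFERENCE:
            if kind in by_kind:
                active[op] = by_kind[kind]
                break
    return active


def should_register_in_cpp(op, *, composite_implicit, view_ops, blocklist):
    """Python dispatcher registration is unconditional; the C++ dispatcher
    skips the three cases listed above."""
    return not (op in composite_implicit or op in view_ops or op in blocklist)


# Assumed toy table: op name -> {kind: function name}.
table = {
    "aten.foo": {"PreAutograd": "pre_foo", "Meta": "meta_foo"},
    "aten.bar": {"PostAutograd": "post_bar", "PreAutograd": "pre_bar"},
}
active = build_active_meta_table(table)
```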

Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads using cpp meta. The exact list can be found here: https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5

cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426
Approved by: https://github.com/ezyang, https://github.com/jansel
2022-10-25 16:49:02 +00:00
Nikita Karetnikov
1b8af28fe8 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
cc @ezyang @mruberry @ngimel @Lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-20 12:29:04 +00:00
PyTorch MergeBot
cd21613526 Revert "[primTorch] Add refs for softmax, softmin, log_softmax (#84956)"
This reverts commit c09ca93e47.

Reverted https://github.com/pytorch/pytorch/pull/84956 on behalf of https://github.com/ZainRizvi due to This is causing the MPS test test_output_match_log_softmax_with_dtype_cpu_float32 (__main__.TestConsistencyCPU) to fail
2022-10-19 20:36:55 +00:00
Nikita Karetnikov
c09ca93e47 [primTorch] Add refs for softmax, softmin, log_softmax (#84956)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-19 18:45:40 +00:00
Nikita Karetnikov
b886cd15f5 [primTorch] Add a ref for NumPy-style T (#86850)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86850
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-18 10:19:47 +00:00
Nikita Karetnikov
841995d53b [primTorch] Add refs for data conversion ops (#86561)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86561
Approved by: https://github.com/lezcano, https://github.com/mruberry, https://github.com/zou3519
2022-10-18 08:38:51 +00:00
Sean Ross-Ross
1bb609ad47 Added new test test_compare_cpu that checks if cpu and gpu results are consistent (#85011)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85011
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-14 20:15:16 +00:00
Ivan Yashchuk
fd80684784 Add nvFuser support for torch.Tensor.view (#84634)
This is an alternative to https://github.com/pytorch/pytorch/pull/83739. While PrimTorch has `view` as a reference, we would like to use nvFuser's implementation for `view` for now. Later we might transition to PrimTorch's `torch._refs.view`.

See `test_nvprims_view` for examples of things that are now sent to nvFuser. Note that nvFuser's `view` is a copy-like operation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84634
Approved by: https://github.com/kevinstephano, https://github.com/mruberry
2022-10-14 12:08:02 +00:00
Brian Hirsh
0feccda7d7 fix aliasing bug in pixel shuffle/unshuffle (#86608)
Fixes https://github.com/pytorch/pytorch/issues/82235

cc @albanD - `at::pixel_shuffle` and `at::pixel_unshuffle` advertise as being non-aliasing, but they have a C++ decomposition that internally uses reshape(), which means that it might return an alias.

I happened to notice this because a bunch of tests in `test/test_ops.py` failed when I ran locally with a `DEBUG=1` build.

(P.S.: when are we finally gonna get a debug build test in CI? 😃)

I fixed this by adding an extra clone, which... is going to be an unnecessary perf hit in the case where the `reshape()` already properly cloned the input. My hope is that this is fine, because this only impacts the composite kernel - we already have a "fast" CPU kernel that does the right thing. Is `pixel_shuffle/unshuffle` commonly used with cuda? Maybe we should just add a fast cuda kernel for it if that's the case.

Alternatively, it seems like it would be nice if `reshape()` accepted an optional argument to unconditionally return a copy. That seems like a rabbit hole that isn't worth going down for now though - I remember a discussion a while ago about making `reshape()` copy-on-write

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86608
Approved by: https://github.com/albanD
2022-10-13 14:14:26 +00:00
Peter Bell
73c43ce2e2 Display unexpected exceptions raised from test_dtypes (#86599)
Currently `test_dtypes` swallows all exceptions which can make debugging failures more tricky.
This changes the test to save the exceptions and print only the unexpected ones at the end e.g.
```
AssertionError: The supported dtypes for nn.functional._scaled_dot_product_attention on device type cuda are incorrect!
The following dtypes did not work in backward but are listed by the OpInfo: {torch.bfloat16}.
Unexpected failures raised the following errors:
torch.bfloat16 - CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling [...]
```
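
A minimal sketch of the "save the exceptions, report only the unexpected ones" pattern, assuming a hypothetical helper `check_dtypes` and dtypes represented as strings. It is not the actual `test_dtypes` code, only an illustration of the idea.

```python
# Hypothetical sketch; the real test_dtypes is more involved.
def check_dtypes(fn, candidate_dtypes, claimed_dtypes):
    """Run fn on each dtype, keeping exceptions instead of swallowing them."""
    supported, errors = set(), {}
    for dtype in candidate_dtypes:
        try:
            fn(dtype)
            supported.add(dtype)
        except Exception as exc:
            errors[dtype] = exc

    # A failure is unexpected only if the op claims to support that dtype.
    unexpected = {d: e for d, e in errors.items() if d in claimed_dtypes}
    if unexpected:
        lines = [
            "The supported dtypes are incorrect!",
            "Unexpected failures raised the following errors:",
        ]
        lines += [f"{d} - {unexpected[d]}" for d in sorted(unexpected)]
        raise AssertionError("\n".join(lines))
    return supported


def fake_op(dtype):
    """Assumed stand-in for an operator that rejects some dtypes."""
    if dtype == "bfloat16":
        raise RuntimeError("CUBLAS_STATUS_NOT_SUPPORTED")
    if dtype == "complex64":
        raise TypeError("unsupported")
```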

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86599
Approved by: https://github.com/mruberry
2022-10-12 19:51:58 +00:00
Nikita Karetnikov
d56017a14f [primTorch] Add ref for triplet_margin_loss, improve triplet_margin_with_distance_loss (#85614)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85614
Approved by: https://github.com/lezcano, https://github.com/mruberry
2022-10-12 18:37:58 +00:00
Khushi
2344135179 [primTorch] special: entr, expit (#86592)
Add _refs for `entr` & `expit`.

cc @mruberry @kshitij12345!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86592
Approved by: https://github.com/mruberry
2022-10-12 07:00:40 +00:00
Elias Ellison
b409d1f65b Turn on Data Dependent Throwing (#86480)
This was already enabled in TorchDynamo, but was staged to make sure things don't break. Also makes backward single threaded for tests to fix a memory leak.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86480
Approved by: https://github.com/bdhirsh
2022-10-10 21:58:29 +00:00
Elias Ellison
d3f7c34cb3 Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-08 05:12:42 +00:00
PyTorch MergeBot
7ec12a559c Revert "Enable aten-aten decomps (#85921)"
This reverts commit 62e4f51efd.

Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk 62e4f51efd
2022-10-08 01:59:54 +00:00
Elias Ellison
62e4f51efd Enable aten-aten decomps (#85921)
Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921
Approved by: https://github.com/ezyang
2022-10-07 21:04:39 +00:00
Elias Ellison
9ceadcadb2 Fix unfold backward decomp aliasing for 0 dim input (#86428)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86428
Approved by: https://github.com/ngimel, https://github.com/ezyang
2022-10-07 03:55:31 +00:00
lezcano
c609768896 Add refs for torch.unfold and a decomposition for its backward. (#85629)
It's not clear to me what the difference between `unfold` and `unfold_copy` is, as the latter is codegen'd

I also took this chance to clean the implementation of unfold and its reference
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85629
Approved by: https://github.com/mruberry
2022-10-05 12:15:49 +00:00
Elias Ellison
6a2b12dd65 Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471
Approved by: https://github.com/ezyang
2022-09-28 23:06:59 +00:00
Elias Ellison
0b93afb112 add amp tests (#85434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434
Approved by: https://github.com/ngimel
2022-09-28 19:34:46 +00:00
samdow
18d8c548f4 [Modes] remove enable and rewrite mode stack (squashed) (#84774)
Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}

This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily

Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup

### Background
Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like

```python
## PRE-PR UX
def f(mode):
  with mode.restore():  # user needs to understand this restore thing?
    ...

with Mode() as m:
  pass
f(m)
```

Many users were getting errors from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write
```python
## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR
def f(mode):
  with mode:
    ...
f(Mode())
```

**Technical Details**
With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-09-27 01:04:35 +00:00
Elias Ellison
bcc544e9d7 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-26 17:08:14 +00:00
PyTorch MergeBot
d10de31cc8 Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 78afa0cf0c.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk 78afa0cf0c
2022-09-23 17:21:43 +00:00
PyTorch MergeBot
eb570ab7d0 Revert "add amp tests (#85434)"
This reverts commit c2f4bbe669.

Reverted https://github.com/pytorch/pytorch/pull/85434 on behalf of https://github.com/clee2000 due to broke rocm and slow tests on trunk c2f4bbe669
2022-09-23 17:19:06 +00:00
PyTorch MergeBot
3b195fd33e Revert "Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)"
This reverts commit 1e92eb8068.

Reverted https://github.com/pytorch/pytorch/pull/85471 on behalf of https://github.com/clee2000 due to stacked prs https://github.com/pytorch/pytorch/pull/85417 and https://github.com/pytorch/pytorch/pull/85434 broke trunk, reverting this so i can revert the others
2022-09-23 17:13:35 +00:00
Elias Ellison
1e92eb8068 Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471
Approved by: https://github.com/ezyang
2022-09-23 16:02:15 +00:00
Elias Ellison
c2f4bbe669 add amp tests (#85434)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434
Approved by: https://github.com/ngimel
2022-09-23 15:57:37 +00:00
Elias Ellison
78afa0cf0c Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-23 15:50:03 +00:00
PyTorch MergeBot
5043457a8e Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)"
This reverts commit 9c77083965.

Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) 9c77083965
2022-09-22 15:44:38 +00:00
Elias Ellison
9c77083965 Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417
Approved by: https://github.com/ezyang
2022-09-22 13:03:57 +00:00
Thomas Viehmann
764cba6848 add Python ref for isreal (#85361)
Dipping my toes into prims waters

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85361
Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-09-21 18:53:34 +00:00
Ivan Yashchuk
35943f30cb Reference implementation for torch.Tensor.sum_to_size (#85338)
New ref: `torch._refs.sum_to_size`.

View consistency validation is disabled because the ref returns a view instead of returning the input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85338
Approved by: https://github.com/mruberry
2022-09-21 18:12:52 +00:00
Horace He
2f4a517d67 Ported matmul compositeimplicitautograd impl into core (#85239)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85239
Approved by: https://github.com/ezyang, https://github.com/lezcano
2022-09-21 09:25:24 +00:00
Elias Ellison
a3afb2c2f6 Fake: fix conv_transpose2d striding (#82846)
The output striding channels-last preservation logic differs between cuda and cpu. For the meta kernel, we can peek at the fake tensor device and use that to determine whether to do cpu or cuda.

You could argue there's a leaking of abstraction here but this seems like a pretty minimal leak and I'm not sure there's a much cleaner way forward for device-specific striding tracing logic.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82846
Approved by: https://github.com/ezyang
2022-09-20 18:00:59 +00:00
lezcano
5dd9610e9d Refs and decompositions for index_{add,copy,select,fill} (#85002)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002
Approved by: https://github.com/ngimel
2022-09-17 19:57:34 +00:00
PyTorch MergeBot
e33b464ffc Revert "Refs and decompositions for index_{add,copy,select,fill} (#85002)"
This reverts commit 2f0b3de443.

Reverted https://github.com/pytorch/pytorch/pull/85002 on behalf of https://github.com/huydhn due to Broke trunk slow tests
2022-09-17 04:26:04 +00:00
lezcano
2f0b3de443 Refs and decompositions for index_{add,copy,select,fill} (#85002)
As per title
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002
Approved by: https://github.com/ngimel
2022-09-16 23:59:35 +00:00
Horace He
4bdc0af53d Added support for symbolic is_contiguous (#84829)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84829
Approved by: https://github.com/ezyang
2022-09-16 04:54:01 +00:00
Sherlock Huang
17925122d0 Rewrite new_zeros, new_ones, new_full decomp with aten.full (#84946)
We should **NOT** introduce non-functional ops in decomps of functional ops.

For example
```
make_fx(functionalize(lambda x: x.new_zeros(3)), decomposition_table=decomposition_table)(x)
```
is producing
```
def forward(self, x_1):
    empty = torch.ops.aten.empty.memory_format([3, 4], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False)
    zero_ = torch.ops.aten.zero_.default(empty);  empty = None
    return zero_
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84946
Approved by: https://github.com/ngimel
2022-09-15 05:45:40 +00:00
Ivan Yashchuk
6750946b82 Skip validate_view_consistency for nvFuser tests (#84858)
nvFuser's execute function always returns a copy for now.

Ref. https://github.com/pytorch/pytorch/pull/84629#discussion_r966375582
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84858
Approved by: https://github.com/mruberry, https://github.com/ngimel
2022-09-14 12:03:11 +00:00
Ryan Spring
d09e8b23bf [primTorch] Add repeat and unfold_copy references (#81374)
Add References:

- repeat
- unfold
- expand_as
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81374
Approved by: https://github.com/mruberry, https://github.com/ngimel
2022-09-12 22:19:06 +00:00
kshitij12345
4f6027b78a [opinfo] narrow: add new sample for Tensor overload (#84785)
`narrow` accepts a Tensor for its `start` argument. We add a sample to test this overload.

NOTE: This leads to a bunch of failed tests and hence the skips and xfails
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84785
Approved by: https://github.com/zou3519
2022-09-12 16:59:08 +00:00
Elias Ellison
15c5baf878 Throw on data dependent ops (#83567)
Previously, we would trace through the following with no error:
```
from torch.fx.experimental.proxy_tensor import make_fx
import torch

def f(x, y):
    return x[0, y:]
```

Even though the output shape is dependent on the data of `y`.  Now, throw on the conversion of `y` to an integer.

It would be nice to not break on constant tensors but I'll do that as the next PR (Edit: done with https://github.com/pytorch/pytorch/pull/84387). Sketching out how that would work (and keep in mind this is applicable to Dynamo tracing and not just AOT Autograd)

I think to do that you would need to:
- hold strong refs to a set of constant tensors, and only allow them to be captured from `lift_fresh.copy`
- when you run a mutable op, either remove it from the set of constant tensors or run the operator for real
- limit to small constant tensors

Anything else?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83567
Approved by: https://github.com/ezyang
2022-09-07 02:37:00 +00:00
Nikita Karetnikov
85b889fa5f [primTorch] Add ref for poisson_nll_loss (#83805)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83805
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-31 17:39:34 +00:00
Nikita Karetnikov
305af90d0f [primTorch] Add docstring and promotion for l1_loss ref (#83803)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83803
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-31 17:39:31 +00:00
Elias Ellison
9c452abcf1 Use reentrant mode when invoking prims, delete global prim_fake_mode (#84090)
Maybe I should be using the meta_impl instead of the prim_impl, but it's not terribly clear why, since the prim impl will be better tested and should work under the re-entrant FakeTensorMode.

Fixes https://github.com/pytorch/pytorch/issues/78613 in the process
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84090
Approved by: https://github.com/ezyang, https://github.com/samdow
2022-08-31 01:58:44 +00:00
samdow
7532d5b125 [Modes] remove inner constructor kwarg (#83925)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83925
Approved by: https://github.com/ezyang, https://github.com/zou3519
2022-08-31 00:05:56 +00:00
jjsjann123
b078d242c4 Nvfuser to copy decomp to prim (#83782)
Conditionally decompose aten::_to_copy to nvprim::convert_element_type to allow fusion with type casting, which is introduced during the type promotion phase of torch decomposition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83782
Approved by: https://github.com/ngimel
2022-08-28 04:26:36 +00:00
Horace He
9a236c7ab4 Made some minor cleanups to decompositions (#83814)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83814
Approved by: https://github.com/ngimel
2022-08-26 10:55:31 +00:00
jjsjann123
1407e6728c Nvfuser python api patch take 2 (#83684)
landing #83645 again.

Previously we were breaking on codegen of the bf16 kernel for CUDA TK 10.2. Added a short-cut to disable bf16 tests on pre-CUDA 11 builds.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83684
Approved by: https://github.com/ngimel
2022-08-19 16:05:39 +00:00
Nikita Karetnikov
1a49eea301 [primTorch] Add ref for diag_embed (#82322)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82322
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-17 20:32:56 +00:00
Fabio Rocha
2a096e940d [primTorch] support for a few magic methods (#83524)
Added support for mapping __rsub__, __rtruediv__,
__rfloordiv__, __floordiv__, __pow__,
and __rpow__ in TorchRefsMode.
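
To illustrate what mapping reflected magic methods involves, here is a small sketch using only the standard `operator` module. The table and the helper names `_reflected` and `dispatch_magic` are hypothetical and do not reflect how `TorchRefsMode` is actually written; the point is only that `x.__rop__(y)` computes `op(y, x)`.

```python
import operator


def _reflected(fn):
    """Build the reflected form: x.__rop__(y) computes fn(y, x)."""
    def wrapper(self, other):
        return fn(other, self)
    return wrapper


# Hypothetical lookup table from magic-method name to an implementation.
MAGIC_METHOD_TABLE = {
    "__floordiv__": operator.floordiv,
    "__pow__": operator.pow,
    "__rsub__": _reflected(operator.sub),
    "__rtruediv__": _reflected(operator.truediv),
    "__rfloordiv__": _reflected(operator.floordiv),
    "__rpow__": _reflected(operator.pow),
}


def dispatch_magic(name, self, other):
    """Look up the magic method by name and apply it to (self, other)."""
    return MAGIC_METHOD_TABLE[name](self, other)
```
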
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83524
Approved by: https://github.com/ngimel
2022-08-17 09:48:15 +00:00
Nikita Karetnikov
b156f3329e [primTorch] Add ref for movedim (#83278)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83278
Approved by: https://github.com/ngimel
2022-08-16 18:38:28 +00:00
Ivan Yashchuk
2e8e386d6f Add refs for real and imag to __all__ (#83057)
`imag` and `real` were missing from the ref's `__all__` list.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83057
Approved by: https://github.com/ngimel
2022-08-16 13:40:43 +00:00
soulitzer
ba53efa6e7 Unskip CompositeCompliance tests for ARM (#83089)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83089
Approved by: https://github.com/albanD
2022-08-11 20:01:51 +00:00
Peter Bell
5e3d1ef49f Allow ufunc OpInfos to have no reference (#82348)
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
2022-08-09 04:38:17 +00:00
PyTorch MergeBot
814c19b266 Revert "Allow ufunc OpInfos to have no reference (#82348)"
This reverts commit 566d734396.

Reverted https://github.com/pytorch/pytorch/pull/82348 on behalf of https://github.com/peterbell10 due to This stack broke macos tests on trunk
2022-08-06 21:09:09 +00:00
Peter Bell
566d734396 Allow ufunc OpInfos to have no reference (#82348)
The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into
`OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no
reference is available while the others use `_NOTHING`. This makes
everything consistently use `None`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348
Approved by: https://github.com/ngimel
2022-08-06 20:01:39 +00:00
albanD
2255911f8a Make M1 tests green (#82213)
This skips all the failing tests and adds a new master job to test on M1

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82213
Approved by: https://github.com/seemethere, https://github.com/soulitzer, https://github.com/malfet
2022-08-05 16:12:08 +00:00
Peter Bell
4d405517e4 Move OpInfo class into new opinfo folder (#82540)
Ref #82518

Starting small to minimize merge conflicts, this moves the top-level
class definitions and some helper functions into the `opinfos` folder.
It also brings `common_methods_invocations.py` to just below 1MB.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82540
Approved by: https://github.com/albanD
2022-08-05 15:10:17 +00:00
Fabio Rocha
ff753cbc12 [primTorch] Added unbind OpInfo and ref (#81776)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81776
Approved by: https://github.com/Lezcano, https://github.com/ngimel
2022-08-04 17:03:24 +00:00
Natalia Gimelshein
112ec24f09 Fix device behavior for masked_fill (#82737)
Fixes #81018, based on #81036.
It will create a graph break for a cpu 0d tensor value due to the .item() call (we could maybe specialize on that instead of breaking?), but otherwise it would create a graph break due to the synchronizing `to` call, so there's no way around it :-(. For a number `value` argument we should already be specializing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82737
Approved by: https://github.com/Chillee
2022-08-04 15:47:56 +00:00
Fabio Rocha
22fea8f654 [primTorch] Added reference for unflatten (#81231)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81231
Approved by: https://github.com/ngimel
2022-08-03 15:20:46 +00:00
Elias Ellison
9b46737fca Add tests for fake tensor striding (#82571)
Add tests for fake tensor striding in OpInfos. I know primtorch is not strictly committing to consistent stride propagation with ATen (see https://github.com/pytorch/pytorch/issues/78050), whereas in fake tensor/meta the goal is to be completely consistent. This is a little awkward because by default prim refs will register a meta implementation.

In any case, I think we can add the tests for fake with a disclaimer in the tests that a failure is non-blocking for adding prims. At least as far as OpInfo tests go, the prims seem to do a pretty good job with stride propagation already.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82571
Approved by: https://github.com/ezyang
2022-08-01 22:01:23 +00:00
Elias Ellison
b2f6aa666e Add tests for aliasing in fake tensor (#82337)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82337
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2022-08-01 21:58:54 +00:00
Elias Ellison
642aed8b99 Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449)
From PR:
```
Note: [Fake Tensor Dispatch Keys]
In order to model the behavior of device-specific autocast
and autograd logic, we update the dispatch keys of FakeTensors
to reflect their fake device. This includes the BackendComponent
(DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent
related Autocast and Autograd keys. __torch_dispatch__ sits below
Autocast and Autograd, and is only invoked when we are at the
kernel for the BackendComponent. Then, we add Meta to the
thread-local dispatch include set to hit the meta kernel
instead of the kernel of the BackendComponent for the fake device.
```

Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge.

See: https://github.com/pytorch/pytorch/issues/81608

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449
Approved by: https://github.com/ezyang
2022-08-01 21:40:36 +00:00
soulitzer
16093a1d81 Fix primtorch out_wrapper semantics for factory functions (#82375)
This PR:
- introduces new OpInfo attribute `is_factory_function`
- updates OpInfo test_out to handle the case when `is_factory_function=True`
- corrects primtorch out_wrapper
- updates sample inputs for arange, linspace, logspace to not explicitly pass in dtype or device (having this sample is necessary for the test to get triggered)

Fixes https://github.com/pytorch/pytorch/issues/82364

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82375
Approved by: https://github.com/ezyang, https://github.com/ngimel
2022-07-29 00:57:57 +00:00
Elias Ellison
688b971876 Extend fake tensor tests to cuda, add support for index put (#82281)
Testing CUDA exposes some failures, such as `index_put` with CUDA self tensor and cpu value tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82281
Approved by: https://github.com/ezyang
2022-07-28 16:07:15 +00:00
Edward Z. Yang
3f740f6d7f Move test_dtypes so it runs later (#82169)
The error messages it gives are very unhelpful (because a failure
gets translated into "dtype was not supported" rather than the
actual backtrace), so I'd rather get error messages about this after
I've tested basic functionality.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82169
Approved by: https://github.com/zou3519, https://github.com/Chillee
2022-07-27 18:08:17 +00:00
soulitzer
80e2d5704b Add OpInfo and ref for linspace and logspace (#81826)
Implements linspace with arange, and logspace with linspace.
- Implements a more precise path in linspace's ref when dtype is integral to avoid off-by-one issues when the output of the computation is cast to int. The trade-off is that there's an increased chance of overflow.
- Files several issues (#82242, #82230, #81996) on preexisting problems with linspace and logspace. These mainly concern integral dtypes; the affected tests are xfailed in this PR.
- Fixes the check that the reference implementation is closer to the precise implementation than the torch implementation, so that it also updates the dtype kwarg to the precise dtype.

TODO:
- ~support negative bases~ (not in this PR)
- ~support complex. Since arange does not support complex, but linspace does, one solution is to just call linspace separately on the real and imag components and sum the results in the end~ (not in this PR)
- ~default dtypes need to be explicitly handled since computation is done in a different dtype than result~ (done)
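
A plain-Python sketch of the idea, using lists in place of tensors. The function names are hypothetical, and the integer formula (multiply first, then floor-divide, all in integer arithmetic) is an assumption about what the "more precise path" looks like, not a copy of the actual ref.

```python
def linspace_float(start, end, steps):
    """Float path: start + i * step, with i taken from an arange-like range."""
    if steps == 1:
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


def linspace_int(start, end, steps):
    """Assumed integer path: multiply before dividing, entirely in integers.

    This avoids rounding a float result to int, but the product i * span is
    larger than any output value, which is where fixed-width integer dtypes
    could overflow (Python ints are unbounded, so it cannot happen here).
    """
    if steps == 1:
        return [start]
    span = end - start
    return [start + (i * span) // (steps - 1) for i in range(steps)]


def logspace(start, end, steps, base=10.0):
    """logspace expressed through linspace: base ** each exponent."""
    return [base ** x for x in linspace_float(start, end, steps)]
```
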
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81826
Approved by: https://github.com/ngimel
2022-07-27 05:53:06 +00:00
Ryan Spring
801f0d24bb [primTorch] Add rsub reference (#80421)
Add Reference:
- rsub
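
For readers unfamiliar with the op: `rsub` is subtraction with the operands reversed. A minimal scalar sketch (the real reference works on tensors and handles type promotion, which is omitted here):

```python
def rsub(a, b, alpha=1):
    """Reversed subtraction: computes b - alpha * a."""
    return b - alpha * a
```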

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80421
Approved by: https://github.com/mruberry
2022-07-26 20:31:44 +00:00
lezcano
11fe277b62 [PrimTorch] Add reference for torch.norm (#81765)
This ref does more things than `torch.norm`, and it fixes a few bugs
that `torch.norm` has. This implementation and the `torch.norm`
implementation come to terms in the next PR of this stack

We put this PR before, as otherwise `test_decomp.py` was failing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81765
Approved by: https://github.com/ngimel
2022-07-25 19:57:21 +00:00
samdow
2ac24675cc get rid of push_torch_{dispatch, function}_mode (#78215)
Currently we have two ways of doing the same thing for torch dispatch and function modes:
`with push_torch_dispatch_mode(X)` or `with X.push(...)`
is now equivalent to
`with X()`

This removes the first API (which is older and private so we don't need to go through a deprecation cycle)

There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed.

EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215
Approved by: https://github.com/ezyang
2022-07-22 18:56:37 +00:00
soulitzer
f595467e5c Reenable slow gradcheck and make it pass (#80514)
Context: For a while slow gradcheck CI was skipping nearly all tests and this hid the fact that it should've been failing and timing out (10+h runtime for TestGradients). The CI configuration has since been fixed to correct this, revealing the test failures. This PR reenables slow gradcheck CI and makes it pass again.

This PR:
- makes slow and failing tests run in fast gradcheck mode only
- reduces the input size for slow gradcheck only for unary/binary ufuncs (alternatively, skip the test entirely)
- skips entire test files on the slow gradcheck runner if they don't use gradcheck (test_ops, test_meta, test_decomp, test_ops_jit)
- reduces the input size for some ops
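
To illustrate why input size matters so much in slow mode, here is a simplified pure-Python sketch. It assumes slow mode builds the numerical Jacobian one input element at a time, while fast mode compares a single random projection `u^T J v`; the function names are hypothetical and this is not the gradcheck implementation itself.

```python
import random
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def slow_numerical_jacobian(
    f: Callable[[Vector], Vector], x: Vector, eps: float = 1e-6
) -> Matrix:
    """Slow mode: perturb one input element at a time (2 * len(x) calls to f)."""
    columns = []
    for j in range(len(x)):
        hi = x[:j] + [x[j] + eps] + x[j + 1:]
        lo = x[:j] + [x[j] - eps] + x[j + 1:]
        columns.append([(a - b) / (2 * eps) for a, b in zip(f(hi), f(lo))])
    # columns[j][i] holds d f_i / d x_j; transpose so rows index outputs.
    return [list(row) for row in zip(*columns)]


def fast_check(
    f: Callable[[Vector], Vector],
    analytic_jacobian: Matrix,
    x: Vector,
    eps: float = 1e-6,
    tol: float = 1e-4,
    seed: int = 0,
) -> bool:
    """Fast mode: compare u^T J v for random u, v using only 2 calls to f."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in x]
    hi = [xi + eps * vi for xi, vi in zip(x, v)]
    lo = [xi - eps * vi for xi, vi in zip(x, v)]
    numeric_jv = [(a - b) / (2 * eps) for a, b in zip(f(hi), f(lo))]
    u = [rng.uniform(-1.0, 1.0) for _ in numeric_jv]
    numeric = sum(ui * di for ui, di in zip(u, numeric_jv))
    analytic = sum(
        u[i] * analytic_jacobian[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    )
    return abs(numeric - analytic) <= tol
```

The slow path costs a number of function evaluations proportional to the input size, which is why shrinking inputs helps there and barely matters for the fast path.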

Follow ups:
1. Investigate slow mode failures https://github.com/pytorch/pytorch/issues/80411
2. See if we can re-enable slow gradcheck tests for some of the slow tests by reducing the sizes of their inputs

The following are failing in slow mode, they are now running in fast mode only.
```
test_fn_fwgrad_bwgrad___rmod___cuda_float64
test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_complex128
test_fn_fwgrad_bwgrad__masked_prod_cuda_complex128
test_fn_fwgrad_bwgrad__masked_prod_cuda_float64
test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_complex128
test_fn_fwgrad_bwgrad_cat_cuda_complex128
test_fn_fwgrad_bwgrad_linalg_lu_factor_ex_cuda_float64
test_fn_fwgrad_bwgrad_copysign_cuda_float64
test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_complex128
test_fn_fwgrad_bwgrad_float_power_cuda_complex128
test_fn_fwgrad_bwgrad_fmod_cuda_float64
test_fn_fwgrad_bwgrad_float_power_cuda_float64
test_fn_fwgrad_bwgrad_linalg_lu_cuda_float64
test_fn_fwgrad_bwgrad_remainder_cuda_float64
test_fn_fwgrad_bwgrad_repeat_cuda_complex128
test_fn_fwgrad_bwgrad_prod_cuda_complex128
test_fn_fwgrad_bwgrad_slice_scatter_cuda_float64
test_fn_fwgrad_bwgrad_tile_cuda_complex128
test_fn_fwgrad_bwgrad_pow_cuda_float64
test_fn_fwgrad_bwgrad_pow_cuda_complex128
test_fn_fwgrad_bwgrad_fft_*
test_fn_fwgrad_bwgrad_zero__cuda_complex128
test_fn_gradgrad_linalg_lu_factor_cuda_float64
test_fn_grad_div_trunc_rounding_cuda_float64
test_fn_grad_div_floor_rounding_cuda_float64
```

Marks the OpInfos for the following ops that run slowly in slow gradcheck as `fast_gradcheck` only (the left column represents runtime in seconds):
```
0  918.722  test_fn_fwgrad_bwgrad_nn_functional_conv_transpose3d_cuda_float64
1  795.042  test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_complex128
2  583.63  test_fn_fwgrad_bwgrad_nn_functional_max_pool3d_cuda_float64
3  516.946  test_fn_fwgrad_bwgrad_svd_cuda_complex128
4  503.179  test_fn_fwgrad_bwgrad_linalg_svd_cuda_complex128
5  460.985  test_fn_fwgrad_bwgrad_linalg_lu_cuda_complex128
6  401.04  test_fn_fwgrad_bwgrad_linalg_lstsq_grad_oriented_cuda_complex128
7  353.671  test_fn_fwgrad_bwgrad_nn_functional_max_pool2d_cuda_float64
8  321.903  test_fn_fwgrad_bwgrad_nn_functional_gaussian_nll_loss_cuda_float64
9  307.951  test_fn_fwgrad_bwgrad_stft_cuda_complex128
10  266.104  test_fn_fwgrad_bwgrad_svd_lowrank_cuda_float64
11  221.032  test_fn_fwgrad_bwgrad_istft_cuda_complex128
12  183.741  test_fn_fwgrad_bwgrad_lu_unpack_cuda_complex128
13  132.019  test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_float64
14  125.343  test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_complex128
15  124.2  test_fn_fwgrad_bwgrad_kron_cuda_complex128
16  123.721  test_fn_fwgrad_bwgrad_pca_lowrank_cuda_float64
17  121.074  test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64
18  119.387  test_fn_fwgrad_bwgrad_rot90_cuda_complex128
19  112.889  test_fn_fwgrad_bwgrad__masked_normalize_cuda_complex128
20  107.541  test_fn_fwgrad_bwgrad_dist_cuda_complex128
21  106.727  test_fn_fwgrad_bwgrad_diff_cuda_complex128
22  104.588  test_fn_fwgrad_bwgrad__masked_cumprod_cuda_complex128
23  100.135  test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64
24  88.359  test_fn_fwgrad_bwgrad_mH_cuda_complex128
25  86.214  test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64
26  83.037  test_fn_fwgrad_bwgrad_nn_functional_bilinear_cuda_float64
27  79.987  test_fn_fwgrad_bwgrad__masked_cumsum_cuda_complex128
28  77.822  test_fn_fwgrad_bwgrad_diag_embed_cuda_complex128
29  76.256  test_fn_fwgrad_bwgrad_mT_cuda_complex128
30  74.039  test_fn_fwgrad_bwgrad_linalg_lu_solve_cuda_complex128
```
```
0  334.142  test_fn_fwgrad_bwgrad_unfold_cuda_complex128
1  312.791  test_fn_fwgrad_bwgrad_linalg_lu_factor_cuda_complex128
2  121.963  test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64
3  108.085  test_fn_fwgrad_bwgrad_diff_cuda_complex128
4  89.418  test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64
5  72.231  test_fn_fwgrad_bwgrad___rdiv___cuda_complex128
6  69.433  test_fn_fwgrad_bwgrad___getitem___cuda_complex128
7  68.582  test_fn_fwgrad_bwgrad_ldexp_cuda_complex128
8  68.572  test_fn_fwgrad_bwgrad_linalg_pinv_cuda_complex128
9  67.585  test_fn_fwgrad_bwgrad_nn_functional_glu_cuda_float64
10  66.567  test_fn_fwgrad_bwgrad_lu_cuda_float64
```
```
0  630.13  test_fn_gradgrad_nn_functional_conv2d_cuda_complex128
1  81.086  test_fn_gradgrad_linalg_solve_triangular_cuda_complex128
2  71.332  test_fn_gradgrad_norm_cuda_complex128
3  64.308  test_fn_gradgrad__masked_std_cuda_complex128
4  59.519  test_fn_gradgrad_div_no_rounding_mode_cuda_complex128
5  58.836  test_fn_gradgrad_nn_functional_adaptive_avg_pool3
```

Reduces the sizes of the inputs for:
- diff
- diag_embed

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80514
Approved by: https://github.com/albanD
2022-07-22 02:05:37 +00:00
Horace He
a5fb41e3d3 Revert "Revert "Refactored prim utils into _prims_utils folder (#81746)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81746
Approved by: https://github.com/anijain2305, https://github.com/Krovatkin
2022-07-20 23:43:57 +00:00
Kshiteej K
8b5685da12 [composite compliance] test_operator correctness (#81600)
Time Before PR:
```
= 1111 passed, 45 skipped, 41020 deselected, 17 xfailed, 33 warnings in 52.55s =
```

Time After PR:
```
= 1105 passed, 51 skipped, 41020 deselected, 17 xfailed, 33 warnings in 70.03s (0:01:10) =
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81600
Approved by: https://github.com/zou3519
2022-07-20 21:18:56 +00:00
Ivan Yashchuk
a3d5d2ddf1 Add partitioned nvFuser executor with ATen fallbacks (#81043)
This PR introduces a new nvFuser executor for FX graphs containing different kinds of nodes, not just the `torch.ops.prims` supported by nvFuser. The FX graph is partitioned based on whether each node is supported by nvFuser, and supported nodes are fused into subgraphs. All of this builds on Sherlock's work on the partitioner.

This new partition-based executor with fallbacks to ATen is used by default with `executor="nvfuser"`. The previous executor can still be used with `executor="strictly_nvfuser"`; naming suggestions are welcome!
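
A toy sketch of the partitioning idea, assuming a linear list of node names rather than a real FX graph (the actual partitioner works on graph dependencies; `partition` and the node names below are hypothetical):

```python
from itertools import groupby
from typing import Callable, List, Tuple


def partition(
    nodes: List[str], is_supported: Callable[[str], bool]
) -> List[Tuple[bool, List[str]]]:
    """Group consecutive nodes into runs that are all supported or all not.

    Supported runs would be handed to the fusing backend as one subgraph;
    unsupported runs would fall back to eager (ATen) execution.
    """
    return [
        (supported, list(group))
        for supported, group in groupby(nodes, key=is_supported)
    ]


# Usage: treat anything in the "prims." namespace as supported.
plan = partition(
    ["prims.add", "prims.mul", "aten.nonzero", "prims.sin"],
    lambda name: name.startswith("prims."),
)
```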
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81043
Approved by: https://github.com/jjsjann123, https://github.com/SherlockNoMad
2022-07-20 19:51:20 +00:00
Kshiteej K
706b420a52 [composite compliance] check output of forward-ad with subclass args against regular tensor (#81464)
Time Before PR
```
= 880 passed, 274 skipped, 38170 deselected, 17 xfailed, 21 warnings in 808.96s (0:13:28) =
```

Time After PR
```
= 875 passed, 274 skipped, 38170 deselected, 22 xfailed, 21 warnings in 880.61s (0:14:40) =
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81464
Approved by: https://github.com/zou3519
2022-07-20 17:38:11 +00:00
PyTorch MergeBot
e43a02c314 Revert "Refactored prim utils into _prims_utils folder (#81088)"
This reverts commit 80231d0a72.

Reverted https://github.com/pytorch/pytorch/pull/81088 on behalf of https://github.com/jeanschmidt due to breaking internal tests
2022-07-19 19:56:41 +00:00
Catherine Lee
06a0cfc0ea pytest to run test_ops, test_ops_gradients, test_ops_jit in non linux cuda environments (#79898)
This PR uses pytest to run test_ops, test_ops_gradients, and test_ops_jit in parallel in non linux cuda environments to decrease TTS.  I am excluding linux cuda because running in parallel results in errors due to running out of memory

Notes:
* update hypothesis version for compatibility with pytest
* use rerun-failures to rerun tests (similar to flaky tests, although these test files generally don't have flaky tests)
  * reruns are denoted by a rerun tag in the xml.  Failed reruns also have the failure tag.  Successes (meaning that the test is flaky) do not have the failure tag.
* see https://docs.google.com/spreadsheets/d/1aO0Rbg3y3ch7ghipt63PG2KNEUppl9a5b18Hmv2CZ4E/edit#gid=602543594 for info on speedup (or slowdown in the case of slow tests)
  * expecting windows tests to decrease by 60 minutes total
* slow test infra is expected to stay the same - verified by running pytest and unittest on the same job and checking the number of skipped/run tests
* test reports to s3 changed - add entirely new table to keep track of invoking_file times
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79898
Approved by: https://github.com/malfet, https://github.com/janeyx99
2022-07-19 19:50:57 +00:00
Horace He
80231d0a72 Refactored prim utils into _prims_utils folder (#81088)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81088
Approved by: https://github.com/ngimel
2022-07-19 03:55:51 +00:00
Peter Bell
bf36d8b987 [primTorch] Implement one-dimensional fft transforms (#80570)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80570
Approved by: https://github.com/mruberry
2022-07-15 15:13:43 +00:00
Peter Bell
924b7951aa [primTorch] Implement conj and conj_physical (#80358)
This adds `prims.conj` and `prims.conj_physical` which only accept
complex tensors, as well as `refs.conj` and `refs.conj_physical` which
pass-through non-complex values and call the appropriate `prims` for
complex types.
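
The prim/ref split described above can be sketched on Python scalars as follows. `prim_conj` and `ref_conj` are illustrative stand-ins, not the tensor implementations:

```python
def prim_conj(x: complex) -> complex:
    """Strict primitive: only accepts complex values."""
    if not isinstance(x, complex):
        raise TypeError(f"prim_conj expects a complex value, got {type(x).__name__}")
    return x.conjugate()


def ref_conj(x):
    """Reference: pass non-complex values through, defer to the prim otherwise."""
    if isinstance(x, complex):
        return prim_conj(x)
    return x
```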
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80358
Approved by: https://github.com/mruberry
2022-07-14 15:29:41 +00:00
Nikita Karetnikov
1e3c6f2263 [primTorch] Add a ref for allclose (#81003)
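
For context, `allclose` checks `|a - b| <= atol + rtol * |b|` elementwise. A pure-Python sketch over flat lists, assuming the documented defaults `rtol=1e-05`, `atol=1e-08` (the list-based signature is a simplification, not the ref added here):

```python
import math
from typing import Sequence


def isclose(a: float, b: float, rtol: float = 1e-5, atol: float = 1e-8,
            equal_nan: bool = False) -> bool:
    if math.isnan(a) or math.isnan(b):
        return equal_nan and math.isnan(a) and math.isnan(b)
    if a == b:  # also covers infinities of the same sign
        return True
    if math.isinf(a) or math.isinf(b):
        return False
    return abs(a - b) <= atol + rtol * abs(b)


def allclose(xs: Sequence[float], ys: Sequence[float], rtol: float = 1e-5,
             atol: float = 1e-8, equal_nan: bool = False) -> bool:
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    return all(isclose(a, b, rtol, atol, equal_nan) for a, b in zip(xs, ys))
```

Note that the tolerance scales with `|b|` only, so the check is not symmetric in its arguments.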
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81003
Approved by: https://github.com/mruberry
2022-07-12 15:08:01 +00:00
Richard Zou
9ee312023d [Composite compliance testing] Refactor check_forward_ad_formula to accept Callable (#81239)
Like https://github.com/pytorch/pytorch/pull/81059; this PR addresses
the review comments.

Test Plan:
- run tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81239
Approved by: https://github.com/ezyang
2022-07-11 20:48:18 +00:00
Richard Zou
d253cdd8ff [composite compliance testing] Refactor check_backward_formula to accept Callable (#81059)
Maybe niche, but for one-off debugging purposes, I want a variant of
check_backward_formula that accepts a callable rather than an OpInfo.
This is because when debugging, I try to create a repro that does not
involve OpInfos because OpInfos are difficult to deal with (they have
a lot of sample inputs, I may want to test my own sample inputs without
creating a new OpInfo, etc).

This PR refactors check_backward_formula so that it accepts a Callable
instead of an OpInfo. Example usage:

```
import torch
from torch.testing._internal.composite_compliance import check_backward_formula

x = torch.tensor([[1., 1.], [1., 0.]], requires_grad=True)
args = (x, 1)

check_backward_formula(torch.prod, args, {})
```

Test Plan:
- run existing tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81059
Approved by: https://github.com/kshitij12345, https://github.com/ezyang
2022-07-11 18:37:50 +00:00
Mike Ruberry
8740c68c41 [primTorch] Adds contiguous and expand references (#79820)
I also filed  while creating this PR.

This PR...

**Filed issues**

- https://github.com/pytorch/pytorch/issues/79818
- https://github.com/pytorch/pytorch/issues/80154

**prims**

- Fixes prims.squeeze when called with an unsorted list of dimensions
- Removes the clone prim

**refs**
- adds contiguous
- adds expand
- updates clone to call empty_like and copy_to
- updates empty to accept a memory format
- updates empty_like to accept a memory_format

**utils**
- adds helper functions for working with memory formats and channels last tensors, in particular
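
As an illustration of what such memory-format helpers compute, here is a sketch of the stride layouts for contiguous and channels-last 4D tensors. The function names are hypothetical and are not the helpers added in this PR:

```python
from typing import Sequence, Tuple


def contiguous_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides: the last dimension is densest."""
    strides = []
    running = 1
    for size in reversed(shape):
        strides.append(running)
        running *= max(size, 1)
    return tuple(reversed(strides))


def channels_last_strides(shape: Sequence[int]) -> Tuple[int, int, int, int]:
    """Strides for an (N, C, H, W) shape stored physically as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)
```

For shape `(2, 3, 4, 5)` the channel dimension has stride 1 in the channels-last layout, which is the property such helpers typically test for.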

**tests**

- removes unused clamp sample input functions (mooted by clamp's new reference inputs)
- extends the reference inputs for clone to include different memory formats
- creates reference inputs for contiguous
- xfails operators that depend on clone (including clone) on `test_python_ref` (see issues)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79820
Approved by: https://github.com/ngimel
2022-07-11 17:42:58 +00:00
Ryan Spring
d26516fd1b [primTorch] Implement loss function references (#80573)
Add Reference:

- mse_loss
- l1_loss
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80573
Approved by: https://github.com/mruberry
2022-07-09 03:31:20 +00:00
David Berard
4c57cf9a8b Register unregistered refs and add a test to check registration (#80497)
Added missing `register_decomposition`s which will register the refs so
they can be used for decompositions.

Also added a test for verifying that new refs are registered.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80497
Approved by: https://github.com/ezyang
2022-07-08 16:29:52 +00:00
Ivan Yashchuk
12dc410ff2 Fix nvFuser's where(tensor, python_scalar, tensor) type promotion (#80347)
This PR modifies the type promotion logic for nvFuser's `where` function when one of the arguments is a scalar. With the proposed change, the behavior now matches ATen's type promotion.

The following script fails on master and passes with this PR:
```py
import torch
import torch._refs
from torch._prims.executor import make_traced
a = torch.ones(3, 3, dtype=torch.bool, device='cuda')
b = torch.randn(3, 3, device='cuda')
func = lambda a, b: torch._refs.where(a, 0.0, b)
assert make_traced(func)(a, b, executor="nvfuser").dtype == torch.float32
```
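
The rule the assertion relies on can be modelled roughly as follows: a Python scalar only changes the result dtype when it belongs to a higher category (bool < integer < floating point) than the tensor. This is a deliberately simplified model with string dtypes and no complex support; the table and function names are assumptions for illustration.

```python
# Category rank per dtype name (simplified; complex dtypes are left out).
CATEGORY = {
    "bool": 0,
    "uint8": 1, "int32": 1, "int64": 1,
    "float16": 2, "float32": 2, "float64": 2,
}

# Dtype a bare Python scalar would take if it had to decide the result,
# assuming the default floating-point dtype is float32.
SCALAR_DEFAULT = {bool: "bool", int: "int64", float: "float32"}


def promote_tensor_with_scalar(tensor_dtype: str, scalar) -> str:
    """Result dtype of combining a tensor with a Python scalar."""
    scalar_dtype = SCALAR_DEFAULT[type(scalar)]
    if CATEGORY[scalar_dtype] > CATEGORY[tensor_dtype]:
        return scalar_dtype  # the scalar lifts the result to a higher category
    return tensor_dtype      # otherwise the tensor's dtype wins
```

Under this model, combining the float scalar `0.0` with a `float32` tensor stays `float32`, matching the assertion in the script above.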

This PR makes it possible to unskip the nvFuser tests for `_refs.log_softmax`, which were failing with a dtype mismatch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80347
Approved by: https://github.com/ngimel
2022-06-28 08:42:16 +00:00
Ryan Spring
1d0d506e97 Add Div reference (#77936)
Add Prims:
-  trunc
-  Replace _wrap_scalar with scalar_tensor

Add Reference:
-  copysign
- div
- floor_divide
- trunc_divide
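
The three division flavours differ only in how the quotient is rounded. A scalar sketch, assuming the `rounding_mode` values `None`, `"trunc"` and `"floor"` (rounding the float quotient is a simplification that can lose precision for very large integers):

```python
import math


def div(a, b, rounding_mode=None):
    """True division, optionally rounded toward zero or toward -inf."""
    q = a / b
    if rounding_mode is None:
        return q
    if rounding_mode == "trunc":
        return math.trunc(q)
    if rounding_mode == "floor":
        return math.floor(q)
    raise ValueError(f"unknown rounding_mode: {rounding_mode!r}")


def trunc_divide(a, b):
    return div(a, b, rounding_mode="trunc")


def floor_divide(a, b):
    return div(a, b, rounding_mode="floor")
```

The two rounding modes only disagree when the quotient is negative and not an integer.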

Other:
* Add support for `variant_test_name` in _find_referenced_opinfo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77936
Approved by: https://github.com/mruberry
2022-06-27 14:46:17 +00:00
Ivan Yashchuk
072311bb28 Enable torch._prims.amax/amin for nvFuser executor (#80070)
This PR adds nvFuser implementations for the `torch._prims.amax` and `torch._prims.amin` reduction functions. Currently, nvFuser refuses to reduce 0d tensors, so these inputs are skipped in tests for now.

An accompanying fix replaces `collections.Sequence` with `collections.abc.Sequence` in refs, because the `collections.Sequence` alias is deprecated and was removed in Python 3.10.
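
The portable spelling looks like this; `normalize_dims` is a hypothetical helper used only to show the `isinstance` check:

```python
from collections.abc import Sequence  # works on every supported Python 3 version


def normalize_dims(dims) -> tuple:
    """Accept a single int or a sequence of ints and return a tuple."""
    if isinstance(dims, Sequence) and not isinstance(dims, str):
        return tuple(dims)
    return (dims,)
```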

Many ops that were skipped for the nvFuser executor test are now enabled.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80070
Approved by: https://github.com/ngimel
2022-06-23 10:19:57 +00:00
Elias Ellison
268bbecf1c Add option for allowing non-fake inputs, add deepcopy impl
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79580

Approved by: https://github.com/samdow
2022-06-17 19:36:26 +00:00
Kshiteej K
04b98df87a [fix] composite compliance: eig, eigh, symeig (#79698)
Ref: https://github.com/pytorch/pytorch/issues/69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698
Approved by: https://github.com/Lezcano, https://github.com/albanD
2022-06-17 14:13:04 +00:00
kshitij12345
d05fb78685 [chalf] enable skipped tests (#79376)
Ref: https://github.com/pytorch/pytorch/pull/79217#pullrequestreview-1002849962

Had to add a few `expectedFailures`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79376
Approved by: https://github.com/ngimel, https://github.com/mruberry
2022-06-13 17:31:45 +00:00
Michael Suo
c978b609f7 [ci] remove IN_CI env var
The conventional env var to set is CI. Both circle and GHA set it, so
IN_CI is unnecessary
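
A minimal sketch of a check that relies on the conventional variable only. The helper name is hypothetical, and treating any non-empty value as true is an assumption:

```python
import os
from typing import Mapping


def running_in_ci(environ: Mapping[str, str] = os.environ) -> bool:
    """True when the conventional CI variable is set to a non-empty value."""
    return bool(environ.get("CI"))
```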

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79229

Approved by: https://github.com/janeyx99
2022-06-11 17:16:30 +00:00