pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 12:21:27 +01:00

Author	SHA1	Message	Date
lezcano	fd27246c16	Fix decomposition for std (#87181 ) The previous implementation was lacking a few features and incurred a pretty large error cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/87181 Approved by: https://github.com/ngimel, https://github.com/peterbell10	2022-10-28 00:50:29 +00:00
Natalia Gimelshein	f1b78224ca	Fix type promotion for 2 wrapped scalar args (#87845 ) Fixes #76801 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87845 Approved by: https://github.com/SherlockNoMad, https://github.com/mruberry	2022-10-27 15:53:11 +00:00
Nikita Karetnikov	59b9d29260	[primTorch] Check `error_regex` in `test_python_ref_errors` (#86987 ) cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/86987 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-26 23:34:34 +00:00
Bin Bao	2c1efe7472	Enable some PyTorch core tests with inductor (#87490 ) Summary: 1) Graph break on torch.random.set_rng_state since it blocks running inductor core tests; 2) Add several inductor-specific skips; 3) Enable several core tests for inductor CI; cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87490 Approved by: https://github.com/eellison	2022-10-26 18:58:33 +00:00
Sherlock Huang	eb99c1efce	Prefer python meta function over c++ meta function (#87426 ) This is a policy update for meta registration. We now prefer python meta implementation over C++ meta function. This is a flip of the previous policy, where we preferred C++ meta function over python meta function if they both exist. Here's the meta registration process: 1. register_meta and register_decomposition will place the python meta/decomp functions into the `global_decomp_table`. However, they will NOT register them into dispatcher. 2. After global_decomp_table is populated, we will compile an `active_meta_table`. For a given op, we pick the most specific decomp function from `global_decomp_table` in the preference order of Meta > PostAutograd > PreAutograd. 3. We will unconditionally register all of them into python dispatcher. And register them into C++ dispatcher, unless it is one of the following 3 cases - 1. the op is a CompositeImplicitAutograd, and should rely on decomposed op's meta - 2. the op is a view op, as the MetaTensor doesn't support aliased storage - 3. the op is in the blocklist (due to UT failures, and we will burn down this list op by op) Over the long run, we wish to implement all meta functions in python. With this PR, 321 op_overloads will have cpp meta overridden by python meta. There are still 400 op_overloads using cpp meta. The exact list can be found here https://gist.github.com/SherlockNoMad/d20bb736178df8eebd3b054c8bb7cdc5 cc @ngimel @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang Pull Request resolved: https://github.com/pytorch/pytorch/pull/87426 Approved by: https://github.com/ezyang, https://github.com/jansel	2022-10-25 16:49:02 +00:00
Nikita Karetnikov	1b8af28fe8	[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 ) cc @ezyang @mruberry @ngimel @Lezcano @fdrocha Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-20 12:29:04 +00:00
PyTorch MergeBot	cd21613526	Revert "[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 )" This reverts commit `c09ca93e47`. Reverted https://github.com/pytorch/pytorch/pull/84956 on behalf of https://github.com/ZainRizvi due to This is causing the MPS test test_output_match_log_softmax_with_dtype_cpu_float32 (__main__.TestConsistencyCPU) to fail	2022-10-19 20:36:55 +00:00
Nikita Karetnikov	c09ca93e47	[primTorch] Add refs for `softmax`, `softmin`, `log_softmax` (#84956 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84956 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-19 18:45:40 +00:00
Nikita Karetnikov	b886cd15f5	[primTorch] Add a ref for NumPy-style `T` (#86850 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86850 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-18 10:19:47 +00:00
Nikita Karetnikov	841995d53b	[primTorch] Add refs for data conversion ops (#86561 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86561 Approved by: https://github.com/lezcano, https://github.com/mruberry, https://github.com/zou3519	2022-10-18 08:38:51 +00:00
Sean Ross-Ross	1bb609ad47	Added new test test_compare_cpu that checks if cpu and gpu results are consistent (#85011 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85011 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-14 20:15:16 +00:00
Ivan Yashchuk	fd80684784	Add nvFuser support for torch.Tensor.view (#84634 ) This is an alternative to https://github.com/pytorch/pytorch/pull/83739. While PrimTorch has `view` as a reference, we would like to use nvFuser's implementation for `view` for now. Later we might transition to PrimTorch's `torch._refs.view`. See `test_nvprims_view` for examples of things that are now sent to nvFuser. Note that nvFuser's `view` is a copy-like operation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84634 Approved by: https://github.com/kevinstephano, https://github.com/mruberry	2022-10-14 12:08:02 +00:00
Brian Hirsh	0feccda7d7	fix aliasing bug in pixel shuffle/unshuffle (#86608 ) Fixes https://github.com/pytorch/pytorch/issues/82235 cc @albanD - `at::pixel_shuffle` and `at::pixel_unshuffle` advertise as being non-aliasing, but they have a C++ decomposition that internally uses reshape(), which means that it might return an alias. I happened to notice this because a bunch of tests in `test/test_ops.py` failed when I ran locally with a `DEBUG=1` build. (P.S.: when are we finally gonna get a debug build test in CI? 😃) I fixed by adding an extra clone, which... is going to be an unnecessary perf hit in the case where the `reshape()` already properly cloned the input. My hope is that this is fine, because this only impacts the composite kernel- we already have a "fast" CPU kernel that does the right thing. Is `pixel_shuffle/unshuffle` commonly used with cuda? Maybe we should just add a fast cuda kernel for it if that's the case. Alternatively, it seems like it would be nice if `reshape()` accepted an optional argument to unconditionally return a copy. That seems like a rabbit hole that isn't worth going down for now though - I remember a discussion a while ago about making `reshape()` copy-on-write Pull Request resolved: https://github.com/pytorch/pytorch/pull/86608 Approved by: https://github.com/albanD	2022-10-13 14:14:26 +00:00
Peter Bell	73c43ce2e2	Display unexpected exceptions raised from test_dtypes (#86599 ) Currently `test_dtypes` swallows all exceptions which can make debugging failures more tricky. This changes the test to save the exceptions and print only the unexpected ones at the end e.g. ``` AssertionError: The supported dtypes for nn.functional._scaled_dot_product_attention on device type cuda are incorrect! The following dtypes did not work in backward but are listed by the OpInfo: {torch.bfloat16}. Unexpected failures raised the following errors: torch.bfloat16 - CUDA error: CUBLAS_STATUS_NOT_SUPPORTED when calling [...] ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/86599 Approved by: https://github.com/mruberry	2022-10-12 19:51:58 +00:00
Nikita Karetnikov	d56017a14f	[primTorch] Add ref for `triplet_margin_loss`, improve `triplet_margin_with_distance_loss` (#85614 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85614 Approved by: https://github.com/lezcano, https://github.com/mruberry	2022-10-12 18:37:58 +00:00
Khushi	2344135179	[primTorch] special: entr, expit (#86592 ) Add _refs for `entr` & `expit`. cc @mruberry @kshitij12345! Pull Request resolved: https://github.com/pytorch/pytorch/pull/86592 Approved by: https://github.com/mruberry	2022-10-12 07:00:40 +00:00
Elias Ellison	b409d1f65b	Turn on Data Dependent Throwing (#86480 ) This was already enabled in TorchDynamo, but was staged to make sure things don't break. Also makes backward single threaded for tests to fix a memory leak. Pull Request resolved: https://github.com/pytorch/pytorch/pull/86480 Approved by: https://github.com/bdhirsh	2022-10-10 21:58:29 +00:00
Elias Ellison	d3f7c34cb3	Enable aten-aten decomps (#85921 ) Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921 Approved by: https://github.com/ezyang	2022-10-08 05:12:42 +00:00
PyTorch MergeBot	7ec12a559c	Revert "Enable aten-aten decomps (#85921 )" This reverts commit `62e4f51efd`. Reverted https://github.com/pytorch/pytorch/pull/85921 on behalf of https://github.com/huydhn due to Sorry for reverting your PR. I think it breaks a dynamo test in trunk `62e4f51efd`	2022-10-08 01:59:54 +00:00
Elias Ellison	62e4f51efd	Enable aten-aten decomps (#85921 ) Invokes aten-aten decomps with re-entrant FakeMode. These decomps are being used in other places, so it's good to unify the path static fake tensor takes / get additional testing etc. There is also an instance where we return different devices with cpu/cuda which this fixes ([batch_norm](https://github.com/pytorch/pytorch/blob/master/torch/_decomp/decompositions.py#L1374)) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85921 Approved by: https://github.com/ezyang	2022-10-07 21:04:39 +00:00
Elias Ellison	9ceadcadb2	Fix unfold backward decomp aliasing for 0 dim input (#86428 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/86428 Approved by: https://github.com/ngimel, https://github.com/ezyang	2022-10-07 03:55:31 +00:00
lezcano	c609768896	Add refs for torch.unfold and a decomposition for its backward. (#85629 ) It's not clear to me what the difference is between `unfold` and `unfold_copy`, as the latter is codegen'd. I also took this chance to clean up the implementation of unfold and its reference Pull Request resolved: https://github.com/pytorch/pytorch/pull/85629 Approved by: https://github.com/mruberry	2022-10-05 12:15:49 +00:00
Elias Ellison	6a2b12dd65	Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471 Approved by: https://github.com/ezyang	2022-09-28 23:06:59 +00:00
Elias Ellison	0b93afb112	add amp tests (#85434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434 Approved by: https://github.com/ngimel	2022-09-28 19:34:46 +00:00
samdow	18d8c548f4	[Modes] remove enable and rewrite mode stack (squashed) (#84774 ) Based on @ezyang's suggestion, mode stack now has "one true mode" which is the _only_ mode that can ever be active at the C++ level. That mode's torch dispatch is just to take the top mode in the stack, reenable itself (if we aren't at the end of the mode stack), and run the top mode's torch_{dispatch|function}. This maintains that in the middle of a mode's torch dispatch, the mode itself will not be active. It changes the function the user has to call to see what the current mode is (no longer queries the C++, it's python only) but allows the user to also see the entire mode stack easily. Removes `enable_torch_dispatch_mode` and `.restore()` since neither makes sense in this new setup ### Background Why do we want this? Well, a pretty common pattern that was coming up was that users had to do something like ```python ## PRE-PR UX def f(mode): with mode.restore(): # user needs to understand this restore thing? ... with Mode() as m: pass f(m) ``` Many users were getting errors from forgetting to call `.restore` or from forgetting to add the (tbh weird) "mode instantiation" step where they use the mode as a context manager with an empty body. Really, they wanted to treat modes like context managers and just write ```python ## FROM FEEDBACK, USER DESIRED CODE. POSSIBLE POST-PR def f(mode): with mode: ... f(Mode()) ``` Technical Details With the old mode stack, we basically had a linked list so the mode itself could only be used once and had a fixed parent. In this new design, the mode stack is just a python list that we're pushing to and popping from. There's only one mode that's ever active at the C++ level and it runs the next mode in the Python list. The modes don't have state on them anymore Pull Request resolved: https://github.com/pytorch/pytorch/pull/84774 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-09-27 01:04:35 +00:00
Elias Ellison	bcc544e9d7	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-26 17:08:14 +00:00
PyTorch MergeBot	d10de31cc8	Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 )" This reverts commit `78afa0cf0c`. Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk `78afa0cf0c`	2022-09-23 17:21:43 +00:00
PyTorch MergeBot	eb570ab7d0	Revert "add amp tests (#85434 )" This reverts commit `c2f4bbe669`. Reverted https://github.com/pytorch/pytorch/pull/85434 on behalf of https://github.com/clee2000 due to broke rocm and slow tests on trunk `c2f4bbe669`	2022-09-23 17:19:06 +00:00
PyTorch MergeBot	3b195fd33e	Revert "Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 )" This reverts commit `1e92eb8068`. Reverted https://github.com/pytorch/pytorch/pull/85471 on behalf of https://github.com/clee2000 due to stacked prs https://github.com/pytorch/pytorch/pull/85417 and https://github.com/pytorch/pytorch/pull/85434 broke trunk, reverting this so i can revert the others	2022-09-23 17:13:35 +00:00
Elias Ellison	1e92eb8068	Turn on aliasing tests for fake backwards, Fix Batch norm running mean/var decomp aliasing (#85471 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85471 Approved by: https://github.com/ezyang	2022-09-23 16:02:15 +00:00
Elias Ellison	c2f4bbe669	add amp tests (#85434 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85434 Approved by: https://github.com/ngimel	2022-09-23 15:57:37 +00:00
Elias Ellison	78afa0cf0c	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-23 15:50:03 +00:00
PyTorch MergeBot	5043457a8e	Revert "Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 )" This reverts commit `9c77083965`. Reverted https://github.com/pytorch/pytorch/pull/85417 on behalf of https://github.com/clee2000 due to broke tests on trunk (and pull somehow) `9c77083965`	2022-09-22 15:44:38 +00:00
Elias Ellison	9c77083965	Add FakeCrossRef tests for backwards, Fix Layer Norm Backward Decomp (#85417 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85417 Approved by: https://github.com/ezyang	2022-09-22 13:03:57 +00:00
Thomas Viehmann	764cba6848	add Python ref for isreal (#85361 ) Dipping my toes into prims waters Pull Request resolved: https://github.com/pytorch/pytorch/pull/85361 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-09-21 18:53:34 +00:00
Ivan Yashchuk	35943f30cb	Reference implementation for torch.Tensor.sum_to_size (#85338 ) New ref: `torch._refs.sum_to_size`. View consistency validation is disabled because the ref returns a view instead of returning the input. Pull Request resolved: https://github.com/pytorch/pytorch/pull/85338 Approved by: https://github.com/mruberry	2022-09-21 18:12:52 +00:00
Horace He	2f4a517d67	Ported matmul compositeimplicitautograd impl into core (#85239 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/85239 Approved by: https://github.com/ezyang, https://github.com/lezcano	2022-09-21 09:25:24 +00:00
Elias Ellison	a3afb2c2f6	Fake: fix conv_transpose2d striding (#82846 ) The output striding channels-last preservation logic differs between cuda and cpu. For the meta kernel, we can peek at the fake tensor device and use that to determine whether to do cpu or cuda. You could argue there's a leaking of abstraction here but this seems like a pretty minimal leak and I'm not sure there's a much cleaner way forward for device-specific striding tracing logic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82846 Approved by: https://github.com/ezyang	2022-09-20 18:00:59 +00:00
lezcano	5dd9610e9d	Refs and decompositions for index_{add,copy,select,fill} (#85002 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002 Approved by: https://github.com/ngimel	2022-09-17 19:57:34 +00:00
PyTorch MergeBot	e33b464ffc	Revert "Refs and decompositions for index_{add,copy,select,fill} (#85002 )" This reverts commit `2f0b3de443`. Reverted https://github.com/pytorch/pytorch/pull/85002 on behalf of https://github.com/huydhn due to Broke trunk slow tests	2022-09-17 04:26:04 +00:00
lezcano	2f0b3de443	Refs and decompositions for index_{add,copy,select,fill} (#85002 ) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/85002 Approved by: https://github.com/ngimel	2022-09-16 23:59:35 +00:00
Horace He	4bdc0af53d	Added support for symbolic is_contiguous (#84829 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84829 Approved by: https://github.com/ezyang	2022-09-16 04:54:01 +00:00
Sherlock Huang	17925122d0	Rewrite new_zeros, new_ones, new_full decomp with aten.full (#84946 ) We should NOT introduce non-functional ops for decomps of functional ops. For example ``` make_fx(functionalize(lambda x: x.new_zeros(3)), decomposition_table=decomposition_table)(x) ``` is producing ``` def forward(self, x_1): empty = torch.ops.aten.empty.memory_format([3, 4], dtype = torch.float32, layout = torch.strided, device = device(type='cpu'), pin_memory = False) zero_ = torch.ops.aten.zero_.default(empty); empty = None return zero_ ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/84946 Approved by: https://github.com/ngimel	2022-09-15 05:45:40 +00:00
Ivan Yashchuk	6750946b82	Skip validate_view_consistency for nvFuser tests (#84858 ) nvFuser's execute function always returns a copy for now. Ref. https://github.com/pytorch/pytorch/pull/84629#discussion_r966375582 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84858 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-09-14 12:03:11 +00:00
Ryan Spring	d09e8b23bf	[primTorch] Add repeat and unfold_copy references (#81374 ) Add References: - repeat - unfold - expand_as Pull Request resolved: https://github.com/pytorch/pytorch/pull/81374 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-09-12 22:19:06 +00:00
kshitij12345	4f6027b78a	[opinfo] narrow: add new sample for Tensor overload (#84785 ) `narrow` accepts `start` argument to be a Tensor. We add a sample to test this overload. NOTE: This leads to a bunch of failed tests and hence the skips and xfails Pull Request resolved: https://github.com/pytorch/pytorch/pull/84785 Approved by: https://github.com/zou3519	2022-09-12 16:59:08 +00:00
Elias Ellison	15c5baf878	Throw on data dependent ops (#83567 ) Previously, we would trace through the following with no error: ``` from torch.fx.experimental.proxy_tensor import make_fx import torch def f(x, y): return x[0, y:] ``` Even though the output shape is dependent on the data of `y`. Now, throw on the conversion of `y` to an integer. It would be nice to not break on constant tensors but I'll do that as the next PR (Edit: done with https://github.com/pytorch/pytorch/pull/84387). Sketching out how that would work (and keep in mind this is applicable to Dynamo tracing and not just AOT Autograd) I think to do that you would need to: - hold strong refs to a set of constant tensors, and only allow them to be captured from `lift_fresh.copy` - when you run a mutable op, either remove it from the set of constant tensors or run the operator for real - limit to small constant tensors Anything else? Pull Request resolved: https://github.com/pytorch/pytorch/pull/83567 Approved by: https://github.com/ezyang	2022-09-07 02:37:00 +00:00
Nikita Karetnikov	85b889fa5f	[primTorch] Add ref for `poisson_nll_loss` (#83805 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83805 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:34 +00:00
Nikita Karetnikov	305af90d0f	[primTorch] Add docstring and promotion for `l1_loss` ref (#83803 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83803 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:31 +00:00
Elias Ellison	9c452abcf1	Use reentrant mode when invoking prims, delete global prim_fake_mode (#84090 ) Maybe I should be using the meta_impl instead of the prim_impl, but it's not terribly clear why, since the prim impl will be better tested and should work under the re-entrant FakeTensorMode. Fixes https://github.com/pytorch/pytorch/issues/78613 in the process Pull Request resolved: https://github.com/pytorch/pytorch/pull/84090 Approved by: https://github.com/ezyang, https://github.com/samdow	2022-08-31 01:58:44 +00:00
samdow	7532d5b125	[Modes] remove inner constructor kwarg (#83925 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83925 Approved by: https://github.com/ezyang, https://github.com/zou3519	2022-08-31 00:05:56 +00:00
jjsjann123	b078d242c4	Nvfuser to copy decomp to prim (#83782 ) Conditional decomposing aten::_to_copy to nvprim::convert_element_type to allow fusion with type casting, which is introduced during type promotion phase at torch decomposition. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83782 Approved by: https://github.com/ngimel	2022-08-28 04:26:36 +00:00
Horace He	9a236c7ab4	Made some minor cleanups to decompositions (#83814 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83814 Approved by: https://github.com/ngimel	2022-08-26 10:55:31 +00:00
jjsjann123	1407e6728c	Nvfuser python api patch take 2 (#83684 ) landing #83645 again. Previously we were breaking on codegen bf16 kernel for cuda TK 10.2. Added a short-cut to disable bf tests on pre cuda 11 build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83684 Approved by: https://github.com/ngimel	2022-08-19 16:05:39 +00:00
Nikita Karetnikov	1a49eea301	[primTorch] Add ref for diag_embed (#82322 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82322 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-17 20:32:56 +00:00
Fabio Rocha	2a096e940d	[primTorch] support for a few magic methods (#83524 ) Added support for mapping __rsub__, __rtruediv__, __rfloordiv__, __floordiv__, __pow__, and __rpow__ in TorchRefsMode. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83524 Approved by: https://github.com/ngimel	2022-08-17 09:48:15 +00:00
Nikita Karetnikov	b156f3329e	[primTorch] Add ref for movedim (#83278 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83278 Approved by: https://github.com/ngimel	2022-08-16 18:38:28 +00:00
Ivan Yashchuk	2e8e386d6f	Add refs for real and imag to __all__ (#83057 ) `imag` and `real` were missing from the ref's `__all__` list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83057 Approved by: https://github.com/ngimel	2022-08-16 13:40:43 +00:00
soulitzer	ba53efa6e7	Unskip CompositeCompliance tests for ARM (#83089 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83089 Approved by: https://github.com/albanD	2022-08-11 20:01:51 +00:00
Peter Bell	5e3d1ef49f	Allow ufunc OpInfos to have no reference (#82348 ) The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into `OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no reference is available while the others use `_NOTHING`. This makes everything consistently use `None`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348 Approved by: https://github.com/ngimel	2022-08-09 04:38:17 +00:00
PyTorch MergeBot	814c19b266	Revert "Allow ufunc OpInfos to have no reference (#82348 )" This reverts commit `566d734396`. Reverted https://github.com/pytorch/pytorch/pull/82348 on behalf of https://github.com/peterbell10 due to This stack broke macos tests on trunk	2022-08-06 21:09:09 +00:00
Peter Bell	566d734396	Allow ufunc OpInfos to have no reference (#82348 ) The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into `OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no reference is available while the others use `_NOTHING`. This makes everything consistently use `None`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348 Approved by: https://github.com/ngimel	2022-08-06 20:01:39 +00:00
albanD	2255911f8a	Make M1 tests green (#82213 ) This is skipping all the failing tests and add a new master job to test on M1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82213 Approved by: https://github.com/seemethere, https://github.com/soulitzer, https://github.com/malfet	2022-08-05 16:12:08 +00:00
Peter Bell	4d405517e4	Move OpInfo class into new opinfo folder (#82540 ) Ref #82518 Starting small to minimize merge conflicts, this moves the top-level class definitions and some helper functions into the `opinfos` folder. It also brings `common_methods_invocations.py` to just below 1MB. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82540 Approved by: https://github.com/albanD	2022-08-05 15:10:17 +00:00
Fabio Rocha	ff753cbc12	[primTorch] Added unbind OpInfo and ref (#81776 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81776 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-04 17:03:24 +00:00
Natalia Gimelshein	112ec24f09	Fix device behavior for masked_fill (#82737 ) Fixes #81018, based on #81036. It will create graph break for cpu 0d tensor value due to .item() call (we could maybe specialize on that instead of breaking?), but otherwise it would create graph break due to synchronizing `to` call, so there's no way around :-(, and for number `value` argument we already should be specializing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82737 Approved by: https://github.com/Chillee	2022-08-04 15:47:56 +00:00
Fabio Rocha	22fea8f654	[primTorch] Added reference for unflatten (#81231 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81231 Approved by: https://github.com/ngimel	2022-08-03 15:20:46 +00:00
Elias Ellison	9b46737fca	Add tests for fake tensor striding (#82571 ) Add tests for fake tensor striding in OpInfos. I know primtorch is not strictly committing to consistent stride propagation with ATen (see https://github.com/pytorch/pytorch/issues/78050), whereas in fake tensor/meta the goal is to be completely consistent. This is a little awkward because by default prim refs will register a meta implementation. In any case, I think we can add the tests for fake with a disclaimer in the tests that the failure is non-blocking for adding prims. At least as far as OpInfo tests get, the prims seem to do a pretty good job with stride propagation already. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82571 Approved by: https://github.com/ezyang	2022-08-01 22:01:23 +00:00
Elias Ellison	b2f6aa666e	Add tests for aliasing in fake tensor (#82337 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82337 Approved by: https://github.com/ezyang, https://github.com/bdhirsh	2022-08-01 21:58:54 +00:00
Elias Ellison	642aed8b99	Add Autocast Support for FakeTensors / use fake device dispatch keys (#82449 ) From PR: ``` Note: [Fake Tensor Dispatch Keys] In order to model the behavior of device-specific autocast and autograd logic, we update the dispatch keys of FakeTensors to reflect their fake device. This includes the BackendComponent (DispatchKey::Meta -> DispatchKey::CUDA), and also the BackendComponent related Autocast and Autograd keys. __torch__dispatch__ sits below Autocast and Autograd, and is only invoked when we are at the kernel for the BackendComponent. Then, we add Meta to the thread-local dispatch include set to hit the meta kernel instead of the kernel of the BackendComponent for the fake device. ``` Also adds the `conv1/2/3d.padding` operators to the Autocast rule set. Without that fix, the FakeTensor dtype would diverge. See: https://github.com/pytorch/pytorch/issues/81608 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82449 Approved by: https://github.com/ezyang	2022-08-01 21:40:36 +00:00
soulitzer	16093a1d81	Fix primtorch out_wrapper semantics for factory functions (#82375 ) This PR: - introduces new OpInfo attribute `is_factory_function` - updates OpInfo test_out to handle case when `is_factory_function=True`: - correct primtorch out_wrapper - update sample inputs for arange, linspace, logspace to not explicitly pass in dtype or device (having this sample is necessary for the test to get triggered) Fixes https://github.com/pytorch/pytorch/issues/82364 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82375 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-07-29 00:57:57 +00:00
Elias Ellison	688b971876	Extend fake tensor tests to cuda, add support for index put (#82281 ) Testing CUDA exposes some failures, such as `index_put` with CUDA self tensor and cpu value tensors Pull Request resolved: https://github.com/pytorch/pytorch/pull/82281 Approved by: https://github.com/ezyang	2022-07-28 16:07:15 +00:00
Edward Z. Yang	3f740f6d7f	Move test_dtypes so it runs later (#82169 ) The error messages it gives are very unhelpful (because a failure gets translated into "dtype was not supported" rather than the actual backtrace), so I'd rather get error messages about this after I've tested basic functionality. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Pull Request resolved: https://github.com/pytorch/pytorch/pull/82169 Approved by: https://github.com/zou3519, https://github.com/Chillee	2022-07-27 18:08:17 +00:00
soulitzer	80e2d5704b	Add OpInfo and ref for linspace and logspace (#81826 ) Implements linspace with arange, and logspace with linspace. - Implements a more precise path in linspace's ref when dtype is integral to avoid off-by-one issues when output of computation is cast to int. The trade off is that there's an increased chance of overflow. - Files several issues #82242, #82230, #81996, on preexisting issues with the linspace and logspace. These mainly concern when dtype is integral - the affected tests are xfailed in this PR. - Fixes the check that the reference implementation is closer to precise implementation than torch implementation to also update the dtype kwarg to the precise dtype. TODO: - ~support negative bases~ (not in this PR) - ~support complex. Since arange does not support complex, but linspace does, one solution is to just call linspace separately on the real and imag components and sum the results in the end~ (not in this PR) - ~default dtypes need to be explicitly handled since computation is done in a different dtype than result~ (done) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81826 Approved by: https://github.com/ngimel	2022-07-27 05:53:06 +00:00
Ryan Spring	801f0d24bb	[primTorch] Add rsub reference (#80421 ) Add Reference: - rsub Pull Request resolved: https://github.com/pytorch/pytorch/pull/80421 Approved by: https://github.com/mruberry	2022-07-26 20:31:44 +00:00
lezcano	11fe277b62	[PrimTorch] Add reference for torch.norm (#81765 ) This ref does more things than `torch.norm`, and it fixes a few bugs that `torch.norm` has. This implementation and the `torch.norm` implementation come to terms in the next PR of this stack We put this PR before, as otherwise `test_decomp.py` was failing. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81765 Approved by: https://github.com/ngimel	2022-07-25 19:57:21 +00:00
samdow	2ac24675cc	get rid of push_torch_{dispatch, function}_mode (#78215 ) Currently we have 2 ways of doing the same thing for torch dispatch and function modes: `with push_torch_dispatch_mode(X)` or `with X.push(...)` is now the equivalent of doing `with X()` This removes the first API (which is older and private so we don't need to go through a deprecation cycle) There is some risk here that this might land race with a PR that uses the old API but in general it seems like most are using the `with X()` API or `enable_torch_dispatch_mode(X())` which isn't getting removed. EDIT: left the `with X.push(...)` API since there were ~3 land races with that over the past day or so. But made it give a warning and ask users to use the other API Pull Request resolved: https://github.com/pytorch/pytorch/pull/78215 Approved by: https://github.com/ezyang	2022-07-22 18:56:37 +00:00
soulitzer	f595467e5c	Reenable slow gradcheck and make it pass (#80514 ) Context: For a while slow gradcheck CI was skipping nearly all tests and this hid the fact that it should've been failing and timing out (10+h runtime for TestGradients). The CI configuration has since been fixed to correct this, revealing the test failures. This PR reenables slow gradcheck CI and makes it pass again. This PR: - makes slow and failing tests run in fast gradcheck mode only - reduces the input size for slow gradcheck only for unary/binary ufuncs (alternatively, skip the test entirely) - skips entire test files on slow gradcheck runner if they don't use gradcheck (test_ops, test_meta, test_decomp, test_ops_jit) - reduces the input size for some ops Follow ups: 1. Investigate slow mode failures https://github.com/pytorch/pytorch/issues/80411 2. See if we can re-enable slow gradcheck tests for some of the slow tests by reducing the sizes of their inputs The following are failing in slow mode, they are now running in fast mode only. 
``` test_fn_fwgrad_bwgrad___rmod___cuda_float64 test_fn_fwgrad_bwgrad_linalg_householder_product_cuda_complex128 test_fn_fwgrad_bwgrad__masked_prod_cuda_complex128 test_fn_fwgrad_bwgrad__masked_prod_cuda_float64 test_fn_fwgrad_bwgrad_linalg_matrix_power_cuda_complex128 test_fn_fwgrad_bwgrad_cat_cuda_complex128 test_fn_fwgrad_bwgrad_linalg_lu_factor_ex_cuda_float64 test_fn_fwgrad_bwgrad_copysign_cuda_float64 test_fn_fwgrad_bwgrad_cholesky_inverse_cuda_complex128 test_fn_fwgrad_bwgrad_float_power_cuda_complex128 test_fn_fwgrad_bwgrad_fmod_cuda_float64 test_fn_fwgrad_bwgrad_float_power_cuda_float64 test_fn_fwgrad_bwgrad_linalg_lu_cuda_float64 test_fn_fwgrad_bwgrad_remainder_cuda_float64 test_fn_fwgrad_bwgrad_repeat_cuda_complex128 test_fn_fwgrad_bwgrad_prod_cuda_complex128 test_fn_fwgrad_bwgrad_slice_scatter_cuda_float64 test_fn_fwgrad_bwgrad_tile_cuda_complex128 test_fn_fwgrad_bwgrad_pow_cuda_float64 test_fn_fwgrad_bwgrad_pow_cuda_complex128 test_fn_fwgrad_bwgrad_fft_* test_fn_fwgrad_bwgrad_zero__cuda_complex128 test_fn_gradgrad_linalg_lu_factor_cuda_float64 test_fn_grad_div_trunc_rounding_cuda_float64 test_fn_grad_div_floor_rounding_cuda_float64 ``` Marks the OpInfos for the following ops that run slowly in slow gradcheck as `fast_gradcheck` only (the left column represents runtime in seconds): ``` 0 918.722 test_fn_fwgrad_bwgrad_nn_functional_conv_transpose3d_cuda_float64 1 795.042 test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_complex128 2 583.63 test_fn_fwgrad_bwgrad_nn_functional_max_pool3d_cuda_float64 3 516.946 test_fn_fwgrad_bwgrad_svd_cuda_complex128 4 503.179 test_fn_fwgrad_bwgrad_linalg_svd_cuda_complex128 5 460.985 test_fn_fwgrad_bwgrad_linalg_lu_cuda_complex128 6 401.04 test_fn_fwgrad_bwgrad_linalg_lstsq_grad_oriented_cuda_complex128 7 353.671 test_fn_fwgrad_bwgrad_nn_functional_max_pool2d_cuda_float64 8 321.903 test_fn_fwgrad_bwgrad_nn_functional_gaussian_nll_loss_cuda_float64 9 307.951 test_fn_fwgrad_bwgrad_stft_cuda_complex128 10 266.104 
test_fn_fwgrad_bwgrad_svd_lowrank_cuda_float64 11 221.032 test_fn_fwgrad_bwgrad_istft_cuda_complex128 12 183.741 test_fn_fwgrad_bwgrad_lu_unpack_cuda_complex128 13 132.019 test_fn_fwgrad_bwgrad_nn_functional_unfold_cuda_float64 14 125.343 test_fn_fwgrad_bwgrad_nn_functional_pad_constant_cuda_complex128 15 124.2 test_fn_fwgrad_bwgrad_kron_cuda_complex128 16 123.721 test_fn_fwgrad_bwgrad_pca_lowrank_cuda_float64 17 121.074 test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64 18 119.387 test_fn_fwgrad_bwgrad_rot90_cuda_complex128 19 112.889 test_fn_fwgrad_bwgrad__masked_normalize_cuda_complex128 20 107.541 test_fn_fwgrad_bwgrad_dist_cuda_complex128 21 106.727 test_fn_fwgrad_bwgrad_diff_cuda_complex128 22 104.588 test_fn_fwgrad_bwgrad__masked_cumprod_cuda_complex128 23 100.135 test_fn_fwgrad_bwgrad_nn_functional_feature_alpha_dropout_with_train_cuda_float64 24 88.359 test_fn_fwgrad_bwgrad_mH_cuda_complex128 25 86.214 test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64 26 83.037 test_fn_fwgrad_bwgrad_nn_functional_bilinear_cuda_float64 27 79.987 test_fn_fwgrad_bwgrad__masked_cumsum_cuda_complex128 28 77.822 test_fn_fwgrad_bwgrad_diag_embed_cuda_complex128 29 76.256 test_fn_fwgrad_bwgrad_mT_cuda_complex128 30 74.039 test_fn_fwgrad_bwgrad_linalg_lu_solve_cuda_complex128 ``` ``` 0 334.142 test_fn_fwgrad_bwgrad_unfold_cuda_complex128 1 312.791 test_fn_fwgrad_bwgrad_linalg_lu_factor_cuda_complex128 2 121.963 test_fn_fwgrad_bwgrad_nn_functional_max_unpool3d_cuda_float64 3 108.085 test_fn_fwgrad_bwgrad_diff_cuda_complex128 4 89.418 test_fn_fwgrad_bwgrad_nn_functional_max_unpool2d_cuda_float64 5 72.231 test_fn_fwgrad_bwgrad___rdiv___cuda_complex128 6 69.433 test_fn_fwgrad_bwgrad___getitem___cuda_complex128 7 68.582 test_fn_fwgrad_bwgrad_ldexp_cuda_complex128 8 68.572 test_fn_fwgrad_bwgrad_linalg_pinv_cuda_complex128 9 67.585 test_fn_fwgrad_bwgrad_nn_functional_glu_cuda_float64 10 66.567 test_fn_fwgrad_bwgrad_lu_cuda_float64 ``` ``` 0 630.13 
test_fn_gradgrad_nn_functional_conv2d_cuda_complex128 1 81.086 test_fn_gradgrad_linalg_solve_triangular_cuda_complex128 2 71.332 test_fn_gradgrad_norm_cuda_complex128 3 64.308 test_fn_gradgrad__masked_std_cuda_complex128 4 59.519 test_fn_gradgrad_div_no_rounding_mode_cuda_complex128 5 58.836 test_fn_gradgrad_nn_functional_adaptive_avg_pool3 ``` Reduces the sizes of the inputs for: - diff - diag_embed Pull Request resolved: https://github.com/pytorch/pytorch/pull/80514 Approved by: https://github.com/albanD	2022-07-22 02:05:37 +00:00
Horace He	a5fb41e3d3	Revert "Revert "Refactored prim utils into _prims_utils folder (#81746 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81746 Approved by: https://github.com/anijain2305, https://github.com/Krovatkin	2022-07-20 23:43:57 +00:00
Kshiteej K	8b5685da12	[composite compliance] test_operator correctness (#81600 ) Time Before PR: ``` = 1111 passed, 45 skipped, 41020 deselected, 17 xfailed, 33 warnings in 52.55s = ``` Time After PR: ``` = 1105 passed, 51 skipped, 41020 deselected, 17 xfailed, 33 warnings in 70.03s (0:01:10) = ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/81600 Approved by: https://github.com/zou3519	2022-07-20 21:18:56 +00:00
Ivan Yashchuk	a3d5d2ddf1	Add partitioned nvFuser executor with ATen fallbacks (#81043 ) This PR introduces a new nvFuser executor for FX graphs containing different kinds of nodes, not just `torch.ops.prims` supported by nvFuser. The FX graph is partitioned based on whether nodes are supported or not by nvFuser and supported nodes are fused into subgraphs, that's all using Sherlock's work on the partitioner. This new partitions-based executor with fallbacks to ATen is used by default with `executor="nvfuser"`. And the previous executor can be used with `executor="strictly_nvfuser"`, naming suggestions are welcome! Pull Request resolved: https://github.com/pytorch/pytorch/pull/81043 Approved by: https://github.com/jjsjann123, https://github.com/SherlockNoMad	2022-07-20 19:51:20 +00:00
Kshiteej K	706b420a52	[composite compliance] check output of forward-ad with subclass args against regular tensor (#81464 ) Time Before PR ``` = 880 passed, 274 skipped, 38170 deselected, 17 xfailed, 21 warnings in 808.96s (0:13:28) = ``` Time After PR ``` = 875 passed, 274 skipped, 38170 deselected, 22 xfailed, 21 warnings in 880.61s (0:14:40) = ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/81464 Approved by: https://github.com/zou3519	2022-07-20 17:38:11 +00:00
PyTorch MergeBot	e43a02c314	Revert "Refactored prim utils into _prims_utils folder (#81088 )" This reverts commit `80231d0a72`. Reverted https://github.com/pytorch/pytorch/pull/81088 on behalf of https://github.com/jeanschmidt due to breaking internal tests	2022-07-19 19:56:41 +00:00
Catherine Lee	06a0cfc0ea	pytest to run test_ops, test_ops_gradients, test_ops_jit in non linux cuda environments (#79898 ) This PR uses pytest to run test_ops, test_ops_gradients, and test_ops_jit in parallel in non linux cuda environments to decrease TTS. I am excluding linux cuda because running in parallel results in errors due to running out of memory. Notes: * update hypothesis version for compatibility with pytest * use rerun-failures to rerun tests (similar to flaky tests, although these test files generally don't have flaky tests) * reruns are denoted by a rerun tag in the xml. Failed reruns also have the failure tag. Successes (meaning that the test is flaky) do not have the failure tag. * see https://docs.google.com/spreadsheets/d/1aO0Rbg3y3ch7ghipt63PG2KNEUppl9a5b18Hmv2CZ4E/edit#gid=602543594 for info on speedup (or slowdown in the case of slow tests) * expecting windows tests to decrease by 60 minutes total * slow test infra is expected to stay the same - verified by running pytest and unittest on the same job and checking the number of skipped/run tests * test reports to s3 changed - add entirely new table to keep track of invoking_file times Pull Request resolved: https://github.com/pytorch/pytorch/pull/79898 Approved by: https://github.com/malfet, https://github.com/janeyx99	2022-07-19 19:50:57 +00:00
Horace He	80231d0a72	Refactored prim utils into _prims_utils folder (#81088 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81088 Approved by: https://github.com/ngimel	2022-07-19 03:55:51 +00:00
Peter Bell	bf36d8b987	[primTorch] Implement one-dimensional fft transforms (#80570 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/80570 Approved by: https://github.com/mruberry	2022-07-15 15:13:43 +00:00
Peter Bell	924b7951aa	[primTorch] Implement conj and conj_physical (#80358 ) This adds `prims.conj` and `prims.conj_physical` which only accept complex tensors, as well as `refs.conj` and `refs.conj_physical` which pass-through non-complex values and call the appropriate `prims` for complex types. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80358 Approved by: https://github.com/mruberry	2022-07-14 15:29:41 +00:00
Nikita Karetnikov	1e3c6f2263	[primTorch] Add a ref for allclose (#81003 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/81003 Approved by: https://github.com/mruberry	2022-07-12 15:08:01 +00:00
Richard Zou	9ee312023d	[Composite compliance testing] Refactor check_forward_ad_formula to accept Callable (#81239 ) Like https://github.com/pytorch/pytorch/pull/81059; this PR addresses the review comments. Test Plan: - run tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/81239 Approved by: https://github.com/ezyang	2022-07-11 20:48:18 +00:00
Richard Zou	d253cdd8ff	[composite compliance testing] Refactor check_backward_formula to accept Callable (#81059 ) Maybe niche, but for one-off debugging purposes, I want a variant of check_backward_formula that accepts a callable rather than an OpInfo. This is because when debugging, I try to create a repro that does not involve OpInfos because OpInfos are difficult to deal with (they have a lot of sample inputs, I may want to test my own sample inputs without creating a new OpInfo, etc). This PR refactors check_backward_formula so that it accepts a Callable instead of an OpInfo. Example usage: ``` import torch from torch.testing._internal.composite_compliance import check_backward_formula x = torch.tensor([[1., 1.], [1., 0.]], requires_grad=True) args = (x, 1) check_backward_formula(torch.prod, args, {}) ``` Test Plan: - run existing tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/81059 Approved by: https://github.com/kshitij12345, https://github.com/ezyang	2022-07-11 18:37:50 +00:00
Mike Ruberry	8740c68c41	[primTorch] Adds contiguous and expand references (#79820 ) I also filed while creating this PR. This PR... Filed issues - https://github.com/pytorch/pytorch/issues/79818 - https://github.com/pytorch/pytorch/issues/80154 prims - Fixes prims.squeeze when called with an unsorted list of dimensions - Removes the clone prim refs - adds contiguous - adds expand - updates clone to call empty_like and copy_to - updates empty to accept a memory format - updates empty_like to accept a memory_format utils - adds helper functions for working with memory formats and channels last tensors, in particular tests - removes unused clamp sample input functions (mooted by clamp's new reference inputs) - extends the reference inputs for clone to include different memory formats - creates reference inputs for contiguous - xfails operators that depend on clone (including clone) on `test_python_ref` (see issues) Pull Request resolved: https://github.com/pytorch/pytorch/pull/79820 Approved by: https://github.com/ngimel	2022-07-11 17:42:58 +00:00
Ryan Spring	d26516fd1b	[primTorch] Implement loss function references (#80573 ) Add Reference: - mse_loss - l1_loss Pull Request resolved: https://github.com/pytorch/pytorch/pull/80573 Approved by: https://github.com/mruberry	2022-07-09 03:31:20 +00:00
David Berard	4c57cf9a8b	Register unregistered refs and add a test to check registration (#80497 ) Added missing `register_decomposition`s which will register the refs so they can be used for decompositions. Also added a test for verifying that new refs are registered. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80497 Approved by: https://github.com/ezyang	2022-07-08 16:29:52 +00:00
Ivan Yashchuk	12dc410ff2	Fix nvFuser's where(tensor, python_scalar, tensor) type promotion (#80347 ) This PR modifies the type promotion logic for nvFuser's `where` function when one of the arguments is a scalar. With the proposed change, the behavior now matches ATen's type promotion. The following script fails on master and passes with this PR: ```py import torch import torch._refs from torch._prims.executor import make_traced a = torch.ones(3, 3, dtype=torch.bool, device='cuda') b = torch.randn(3, 3, device='cuda') func = lambda a, b: torch._refs.where(a, 0.0, b) assert make_traced(func)(a, b, executor="nvfuser").dtype == torch.float32 ``` This PR allows unskipping the nvFuser tests for `_refs.log_softmax`, which were failing with a dtype mismatch. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80347 Approved by: https://github.com/ngimel	2022-06-28 08:42:16 +00:00
Ryan Spring	1d0d506e97	Add Div reference (#77936 ) Add Prims: - trunc - Replace _wrap_scalar with scalar_tensor Add Reference: - copysign - div - floor_divide - trunc_divide Other: * Add support for `variant_test_name` in _find_referenced_opinfo Pull Request resolved: https://github.com/pytorch/pytorch/pull/77936 Approved by: https://github.com/mruberry	2022-06-27 14:46:17 +00:00
Ivan Yashchuk	072311bb28	Enable torch._prims.amax/amin for nvFuser executor (#80070 ) This PR adds nvFuser implementations for `torch._prims.amax` and `torch._prims.amin` reduction functions. Currently, nvFuser refuses to reduce the 0d tensor, so these inputs are skipped in tests for now. An accompanying fix replaces `collections.Sequence` -> `collections.abc.Sequence` in refs because `collections.Sequence` is deprecated and removed in Python 3.10 Many ops that were skipped for the nvFuser executor test are now enabled. Pull Request resolved: https://github.com/pytorch/pytorch/pull/80070 Approved by: https://github.com/ngimel	2022-06-23 10:19:57 +00:00
Elias Ellison	268bbecf1c	Add option for allowing non-fake inputs, add deepcopy impl Pull Request resolved: https://github.com/pytorch/pytorch/pull/79580 Approved by: https://github.com/samdow	2022-06-17 19:36:26 +00:00
Kshiteej K	04b98df87a	[fix] composite compliance: eig, eigh, symeig (#79698 ) Ref: https://github.com/pytorch/pytorch/issues/69991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698 Approved by: https://github.com/Lezcano, https://github.com/albanD	2022-06-17 14:13:04 +00:00
kshitij12345	d05fb78685	[chalf] enable skipped tests (#79376 ) Ref: https://github.com/pytorch/pytorch/pull/79217#pullrequestreview-1002849962 Had to add a few `expectedFailures` Pull Request resolved: https://github.com/pytorch/pytorch/pull/79376 Approved by: https://github.com/ngimel, https://github.com/mruberry	2022-06-13 17:31:45 +00:00
Michael Suo	c978b609f7	[ci] remove IN_CI env var The conventional env var to set is CI. Both circle and GHA set it, so IN_CI is unnecessary Pull Request resolved: https://github.com/pytorch/pytorch/pull/79229 Approved by: https://github.com/janeyx99	2022-06-11 17:16:30 +00:00

339 Commits