Commit Graph

157 Commits

Author SHA1 Message Date
Mike Ruberry
64c6a89bd6 [primTorch] reshape and view (#77220)
This PR makes the following changes...

Prims
- adds as_strided
- fixes errors in flatten meta

Testing
- enables view consistency checking (which can be opted out of, see issues below)
- adds reference inputs for view, reshape, and flatten
- adds error inputs for reshape

Refs
- adds as_strided, reshape, and view
- fixes an error in the flatten ref where it was not returning self on no-op
- fixes a bug in transpose where it was not retuning a view when the transposed tensor has 1 or fewer dims

Issues
- https://github.com/pytorch/pytorch/issues/77218
- https://github.com/pytorch/pytorch/issues/77216
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77220
Approved by: https://github.com/ngimel
2022-05-13 13:12:04 +00:00
Kshiteej K
39bd37f34f [complex32] sum, prod : cuda only (disable jiterator reduction on windows) (#77192)
Ref: https://github.com/pytorch/pytorch/issues/74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77192
Approved by: https://github.com/anjali411
2022-05-13 07:24:18 +00:00
kshitij12345
9e3eb329df [chalf] getitem (#77339)
Ref: https://github.com/pytorch/pytorch/issues/74537
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77339
Approved by: https://github.com/anjali411
2022-05-12 14:49:17 +00:00
Mike Ruberry
bb8baea932 [primTorch] flatten, squeeze, unsqueeze... (#77043)
This PR ...

Makes the following testing changes:

- Updates stride testing in test_python_reference_consistency to only check strides of dimensions with length > 1
- Creates reference inputs for reshape
- Creates reference inputs for chunk
- Extends the sample inputs for unsqueeze
- Extends the sample inputs for stack -- test_conj_view and test_neg_view are now xfailed
  - https://github.com/pytorch/pytorch/issues/77046

Makes the following architecture changes:
- Adds the refs.special (sub)module
- Adds the refs.nn.functional (sub)module

Adds the following prims:
- expand_dims
- view_of
- rev
- clone

Adds the following references:
  -  flatten
  - squeeze
  - unsqueeze
  - special.i0e
  - special.i1e
  - logical_or
  - logical_and
  - isclose
  - flip
  - stack
  - nn.functional.elu
  - chunk
  - clone
  - narrow

Identifies the following bugs in PyTorch today:
- https://github.com/pytorch/pytorch/issues/77054
- https://github.com/pytorch/pytorch/issues/77055

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77043
Approved by: https://github.com/ngimel
2022-05-09 11:24:55 +00:00
Natalia Gimelshein
362525724b type promote clamp (#77035)
Fixes #76630
When clamp(Tensor, Tensor) is structured, big parts of this PR won't be needed, but for now let's fix type promotion to make behavior more regular.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77035
Approved by: https://github.com/mruberry
2022-05-09 05:54:17 +00:00
Mike Ruberry
c031643e39 Adds decorators for Python References and extends Python Reference testing (#76945)
This PR does the following...

Tests:
- fixes test_type_promotion in test_binary_ufuncs to correctly generate scalar cpu tensors
- fixes test_python_reference_consistency to use the Python Reference's reference inputs
- extends Python reference testing to test_conj_view, test_neg_view, and test_neg_conj_view
- adds a NaN propagation sample input for elementwise unary and binary operations
- fixes the UnaryUfuncInfo class to properly register its reference inputs
- Updates the Python Reference OpInfos to skip error inputs when their behavior on scalar inputs is inconsistent with their reference operators

Code organization:
- moves elementwise type promotion functionality to prims.utils

Prims & Refs:
- fixes scalar cpu tensor handling by having them pass through broadcasting and device and shape checks
- adds two decorators, `elementwise_type_promotion_wrapper` and `out_wrapper`, the former allows for elementwise type promotion to be automated and the latter automatically adds the out kwarg and handles it properly

cc @ezyang who also had some thoughts on cpu scalar tensor handling
cc @chillee -- might want to use this new decorator as we converge decompositions and references
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76945
Approved by: https://github.com/ngimel
2022-05-07 03:42:24 +00:00
Natalia Gimelshein
1c776d209c Adds amax and amin references
Also extends reference testing to error inputs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76855
Approved by: https://github.com/mruberry
2022-05-05 15:53:09 +00:00
Natalia Gimelshein
c51b53d4ef [WIP] sum reference
Per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76714
Approved by: https://github.com/mruberry
2022-05-04 02:50:00 +00:00
Mike Ruberry
c9bd73878a adds elementwise opinfos and unary references, extends to out testing
This PR makes the following changes:

Prims:
- igamma and igammac are now correctly listed as elementwise binary operations, not elementwise unary operations
- elementwise prims now must specify their type promotion kind (this is currently unused)

Refs:
- complexhalf is now handled by opmath-style type promotion
- adds references for: abs, acos, acosh, asin, atan, ceil, cos, cosh, digamma, erf, erfinv, erfc, exp, expm1, isfinite, isnan, lgamma, log, log1p, neg, reciprocal, sign, sin, sinh, sqrt, square, tan, igamma, igammac
- adds "complex to float" and "bool to long" type promotion kinds
- updates out behavior to warn when resizing a non-empty tensor, consistent with current ops
- updates the elementwise unary reference template with type promotion

Tests:
- fixes torch.pow's OpInfo to correctly specify it only supports one scalar input, not two
- fixes elementwise binary reference inputs to not attempt generating certain tensors in complex half (for now, cc @kshitij12345)
- adds OpInfos for the following Python references: abs, acos, acosh, asin, atan, ceil, cos, cosh, digamma, erf, erfinv, erfc, exp, expm1, isfinite, isnan, lgamma, log, log1p, neg, reciprocal, round, sign, sin, sinh, sqrt, square, tan, atan2, bitwise_and, bitwise_left_shift, bitwise_or, bitwise_xor, eq, float_power, ge, gt, igamma, igammac, le, lt, maximum, minimum, mul, ne, nextafter, pow, sub, true_divide

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76647
Approved by: https://github.com/ngimel
2022-05-02 14:23:05 +00:00
Mike Ruberry
f6bbecf8b5 Adds python ref consistency test, elementwise unary reference inputs, and formats test files
Per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76626
Approved by: https://github.com/ngimel
2022-05-01 22:42:46 +00:00
Mike Ruberry
fe1968dea0 [primTorch] Prototype nvFuser integration and test_prims.py
This adds prototype nvFuser integration for the following prims:

- broadcast_in_dim
- convert_element_type
- add
- div
- ge
- gt
- le
- lt
- mul

Adding it for additional prims supported by nvFuser's prototype Python frontend should be easy.

This also adds a new sugar to run operations using the ATen or nvFuser trace executors. For example:

```
def foo(a, b):
  return torch.add(a, b)

traced_foo = make_traced(foo)

a = torch.randn((1, 2, 3, 4, 5), device='cuda')
b = torch.randn((1, 2, 3, 4, 5), device='cuda')
result = traced_foo(a, b, executor='nvfuser')
```

Currently only operations with tensor inputs and one tensor output are supported, and the operation must be composed exclusively of reference or prim operations.

Finally, this adds a new test, test_prims.py, that just tests the broadcast_in_dim prim for now. In the future we'll likely have OpInfos for each prim, but we'll need a reference implementation of broadcast_in_dim to make that interesting.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76560
Approved by: https://github.com/ngimel
2022-04-29 02:02:25 +00:00
Mike Ruberry
4048d4cdd2 [primTorch] Prototype tracer and elementwise unary reference opinfo class
Adds a prototype tracer with no caching support and the `ElementwiseUnaryPythonRefInfo` class. A reference for `floor` is added to test the latter, and the elementwise binary reference inputs are extended to also return noncontiguous inputs. The SampleInput transform operation has been updated to return an actual SampleInput instead of a tuple to facilitate uniform handling of (transformed) SampleInputs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76388
Approved by: https://github.com/ngimel
2022-04-27 14:40:21 +00:00
kshitij12345
c1ced8ff72 [composite compliance] add test for fwd AD
Fixes https://github.com/pytorch/pytorch/issues/74678

Test timings:
```
======================================= 756 passed, 99 skipped, 13864 deselected, 76 xfailed, 16 warnings in 278.35s (0:04:38) =======================================
```

Slowest ops
```
======================================================================== slowest 20 durations ========================================================================
32.16s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_instance_norm_cuda_float32
30.51s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_nn_functional_instance_norm_cpu_float32
9.89s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_norm_cuda_float32
8.54s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad__masked_norm_cpu_float32
8.52s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_diff_cuda_float32
8.33s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_linalg_solve_triangular_cuda_float32
8.08s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_linalg_solve_triangular_cpu_float32
8.03s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_diff_cpu_float32
6.52s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_cov_cuda_float32
5.77s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_cov_cpu_float32
4.12s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_lu_solve_cuda_float32
3.78s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_std_cuda_float32
3.67s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_gradient_cuda_float32
3.55s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad__masked_var_cuda_float32
3.47s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_max_pool2d_cuda_float32
3.42s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_without_cudnn_cuda_float32
3.40s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_nn_functional_max_pool2d_cpu_float32
3.30s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad__masked_std_cpu_float32
3.30s call     test/test_ops.py::TestCompositeComplianceCPU::test_forward_ad_gradient_cpu_float32
3.28s call     test/test_ops.py::TestCompositeComplianceCUDA::test_forward_ad_nn_functional_batch_norm_cuda_float32
====================================================================== short test summary info =======================================================================
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75178
Approved by: https://github.com/zou3519
2022-04-25 15:15:48 +00:00
Mike Ruberry
28c3e0f77c Initial prims, references, and test architecture for them (#75095)
Summary:
This PR adds an initial set of experimental primitive operations and Python references that reimplement existing PyTorch operations using them. See https://dev-discuss.pytorch.org/t/tracing-with-primitives-update-0/577 for additional context.

The following experimental primitives are added:

- Elementwise unary prims -- abs, acos, acosh, asin, atan, cos, cosh, bessel_i0e, bessel_i1e, cbrt, ceil, digamma, erf, erf_inv, erfc, exp, expm1, floor, igamma, igammac, is_finite, lgamma, log, log1p, neg, reciprocal, round, sign, sinh, sqrt, square, tan.
- Elementwise binary prims -- add, atan2, bitwise_and, bitwise_not, bitwise_or, bitwise_xor, div, eq, ge, gt, le, lt, max, min, mul, ne, nextafter, pow, rsqrt, shift_left, shift_right_arithmetic
- View prims -- brodcast_in_dim, collapse_view, split_dim, squeeze
- Shape prims -- collapse, concatenate, reshape
- Conditional prims -- select
- Data conversion & movement prims -- convert_element_type, device_put
- Inplace prims -- copy_to, resize

These primitives do not add any new functionality to PyTorch, but are intended to be the semantic building blocks for reference operators. We have tried to make them consistent with the operations in [jax.lax](https://jax.readthedocs.io/en/latest/jax.lax.html) where possible (because PyTorch prefers being consistent with other frameworks), although there are key differences between these prims and operations in jax.lax. Most notably is that these prims model view semantics and inplace operations.

In addition to these primitives the following elementwise binary Python references are added:

- Elementwise binary Python references -- add, atan2, bitwise_and, bitwise_left_shift, bitwise_or, bitwise_right_shift, bitwise_xor, eq, float_power, ge, gt, le, lt, maximum, minimum, mul, ne, nextafter, pow, sub, true_divide
- Conditional Python references - where
- Data conversion & movement references - copy_to

A Python reference implements the same behavior as its corresponding PyTorch operator (excepting slight numerical differences, bug fixes, and in some cases additional features).

The start of an OpInfo-based test architecture for these references is also included in this PR. A new list, `python_ref_db`, is added to `common_methods_invocations.py`. This list introduces the new `ElementwiseBinaryPythonRefInfo`, which inherits input arguments from the original operators' OpInfo, allows them to be overridden, and then constructs the OpInfo for the Python reference using the (potentially modified) arguments. OpInfo-based tests can opt-into testing references by including this new list in the Sequence passed to the `ops` decorator.

cc ngimel csarofeen kevinstephano Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75095

Reviewed By: ngimel

Differential Revision: D35888004

Pulled By: mruberry

fbshipit-source-id: 21e77c4456c2a02113367d4bdae168a3a2f33f25
(cherry picked from commit 1d5bcfa99d4e8cf36f60642803a0bfca50e2ea4e)
2022-04-25 09:57:20 +00:00
kshitij12345
e23cbd633f [complex32] jiterator support
Reference #74537

Support for jiterating with `c10::complex<Half>`. Note that computation will take place in `complex<float>` by allowing implicit casting in JITerated code (similar to Half and BFloat16 which upcast to float for computation).

We add `complex32` support for `sigmoid` and `sigmoid_backward` in this PR. This is tested with `test_ops.py::test_dtypes and test_ops.py::test_complex_half_reference_testing`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75656
Approved by: https://github.com/ngimel
2022-04-19 16:33:18 +00:00
Jeff Daily
e587c8bc57 [ROCm] enable composite compliance backward tests
Follow up to #74646.  Do not skip the entire TestCompositeCompliance test_backward for ROCm, only skip the the two unexpected successes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75810
Approved by: https://github.com/zou3519
2022-04-19 13:47:06 +00:00
Mike Ruberry
de949a0e59 Various OpInfo architecture improvements
This PR makes the following improvements:

- moves the custom skip list for test_normalize_operator_exhaustive in test_fx_experimental to use the typical OpInfo skip architecture. The skips were updated to xfails, and that identified some operators which were no longer failing the test
- redundant tests with OpInfo-based testing in test_jit.py were removed
- test_dtypes was improved so its error messages are clear and it makes test_nondifferentiable redundant; the latter test has been removed
- OpInfo.supports_complex_autograd() is removed in favor of a more accurate and general test for whether the particular dtype is in the backward dtypes of the operator
- gradchecks have been improved to verify that an operator doesn't support grad if it claims not to
- gradchecks have been improved to test the gradient of all input tensors that require gradient
- the concept of "default test dtypes" has been removed
- excessive and mostly redundant out testing for elementwise unary operators has been removed
- metadata for whether an op supports nuanced "safe casting" to out behavior has been removed from OpInfos
- numerous skips have been converted to xfails
- numerous OpInfos have had their metadata fixed based on the new checks
- jit-specific utilities in common_methods_invocations.py have been moved to jit_programming_utils.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75951
Approved by: https://github.com/ngimel
2022-04-18 21:55:32 +00:00
Mike Ruberry
b09769992f Improves the OpInfo out= tests
Edit: OpInfos separated into their own PRs to debug an ASAN failure that doesn't identify the failing test properly. This PR now just updates the out tests.

Adds OpInfos for:

- nn.functional.smooth_l1_loss
- nn.functional.l1_loss
- nn.functional.pdist
- nn.functional.binary_cross_entropy
- nn.functional.triplet_margin_loss
- nn.functional.triplet_margin_with_distance_loss
- nn.functional.max_unpool{1, 2, 3}D
- nn.functional.alpha_dropout
- nn.functional.soft_margin_loss
- nn.functional.multilabel_soft_margin_loss
- nn.functional.multilabel_margin_loss
- nn.functional.multi_margin_loss
- nn.functional.margin_ranking_loss

These OpInfos were taken from https://github.com/pytorch/pytorch/pull/67560, https://github.com/pytorch/pytorch/pull/67823, https://github.com/pytorch/pytorch/pull/68625, and https://github.com/pytorch/pytorch/pull/67079. The sample input update from https://github.com/pytorch/pytorch/pull/67017 is also rolled into this PR.

cc @zou3519 @nikitaved @pmeier @vfdev-5 @dagitses
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75782
Approved by: https://github.com/ngimel
2022-04-15 06:16:01 +00:00
PyTorch MergeBot
9312ee8cd6 Revert "remove fp16 support from cpu linalg functions"
This reverts commit 29af58db51.

Reverted https://github.com/pytorch/pytorch/pull/75647 on behalf of https://github.com/ngimel
2022-04-14 21:06:48 +00:00
Natalia Gimelshein
29af58db51 remove fp16 support from cpu linalg functions
fp16 on cpu produces slow and inaccurate results, see #69969

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75647
Approved by: https://github.com/Lezcano, https://github.com/mruberry
2022-04-14 18:45:59 +00:00
PyTorch MergeBot
495c5aebb1 Revert "remove fp16 support from cpu linalg functions"
This reverts commit de18c28a4c.

Reverted https://github.com/pytorch/pytorch/pull/75647 on behalf of https://github.com/suo
2022-04-13 18:34:35 +00:00
Natalia Gimelshein
de18c28a4c remove fp16 support from cpu linalg functions
fp16 on cpu produces slow and inaccurate results, see #69969

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75647
Approved by: https://github.com/Lezcano, https://github.com/mruberry
2022-04-13 17:24:22 +00:00
kshitij12345
65b65af236 [complex32] cat, fill_(partial), item
Reference : #74537

`cat_backwards` (on CUDA) requires support for `fill`, have added support for `fill`. (Also `fill` requires `item` support)

Now `fill` backward requires `sum` (will add it in later PR).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75010
Approved by: https://github.com/anjali411
2022-04-01 15:19:05 +00:00
Nikita Shulga
bfac65dfe5
[testing] Update dispatch macros (#74977)
This PR is reland of #74289 
Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>
2022-03-30 14:13:21 -07:00
PyTorch MergeBot
2e4152b118 Revert "[testing] Update dispatch macros"
This reverts commit eed19a0f38.

Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet
2022-03-30 19:52:37 +00:00
Khushi Agrawal
eed19a0f38 [testing] Update dispatch macros
Hi,
This PR is the follow-up PR of #71561. (the previous PR had a couple of merge conflicts and was reverted, this PR resolves that).
Please take a look. Thanks!

cc: @pmeier @mruberry @kshitij12345
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289
Approved by: https://github.com/pmeier, https://github.com/mruberry
2022-03-30 16:10:16 +00:00
Richard Zou
e832eedd29 Composite Compliance testing for backward formulas (#74646)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74646

The OpInfo-based test, given an operator and sample inputs,
checks all permutations of {inputs, grad_output} being either
{CompositeCompliantTensor, regular Tensor}, running them through a
forward pass and a backward pass.

Test Plan: - wait for tests

Reviewed By: albanD

Differential Revision: D35186860

Pulled By: zou3519

fbshipit-source-id: 8b2577dd6106c05db2ab583bbefd10545fdd8adf
(cherry picked from commit 3f5c3793715af9a8d4db06690c5faa7256a82645)
2022-03-28 22:12:41 +00:00
Richard Zou
80d64b365a Test case where some inputs are Tensor Subclasses in CompositeCompiance (#74645)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74645

This PR adds tests for when only some inputs are Tensor Subclasses.

Why is this important to test?
==============================

Consider the following hypothetical out-of-place operation:
```
def my_add(x, y):
  result = x.clone()
  result.add_(y)
  return result
```

You may expect this to work the same as torch.add. If x is not a Tensor
Subclass, but y is a Tensor subclass, then this returns us a regular
Tensor, NOT a Tensor subclass!

This is exactly the type of in-place operations that causes `vmap` to
fail and will be problematic for certain Tensor Subclasses in the future
so we're adding tests to make sure Composite pytorch operations don't do
this.

What exactly does this PR do?
=============================
Composite compliance now takes a sample input and produces a test case
where some of the sample inputs are Tensor Subclasses. It then sends
this through the original operation, once with Python Mode and one
without.

(Why once with Python Mode? Because we want to use it to detect the
pattern of "create a Tensor and call resize_ on it")

Finally, it repeats this process for all possiblities where the inputs
are Tensor subclasses. For example, if the sample input is (x, y), then
we test all four of the following cases:
- Subclass(x), y
- x, Subclass(y)
- Subclass(x), Subclass(y)
- x, y

Test Plan
=========
- run tests

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D35186862

Pulled By: zou3519

fbshipit-source-id: 102477507b56583463668db7523a6586d92b357d
(cherry picked from commit bfcb087244b0598abb270f7c26d472482f00b5e2)
2022-03-28 22:12:41 +00:00
Richard Zou
c96f321804 Move CompositeCompliance tests to their own TestCase (#74644)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74644

This is in preparation for me adding additional tests for:
1. composite compliance of autograd formulas
2. composite compliance of forward-mode AD formulas

This PR also changes these tests to run on both CPU and CUDA. Previously
they were just run on CPU, but it turns out there's a lot of branching
on the device in composite operations in PyTorch today :/

Test Plan: - wait for tests

Reviewed By: albanD

Differential Revision: D35186861

Pulled By: zou3519

fbshipit-source-id: d974592a7547f71ef26ff0740bf453f7d335d55a
(cherry picked from commit 773b43394c2406502a6e386a30eb003a73861f13)
2022-03-28 22:12:40 +00:00
Jane Xu
a1e284d9c8 Remove high priority as an owner for tests (#74555)
Summary:
Following triage review discussion, it would be best for these tests to not be triaged high priority by automation, but by the triagers in the oncall.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74555

Reviewed By: albanD

Differential Revision: D35099202

Pulled By: janeyx99

fbshipit-source-id: 657a0317141de3a598476a6f601ec26cc26231b1
(cherry picked from commit 057519cb2494d0f9a0b169f359ac87ba9e89f088)
2022-03-24 14:29:52 +00:00
Mike Ruberry
0aa3c39e5f Extends OpInfo architecture with reference inputs, adds them for elementwise binary operators
This PR extends our OpInfo test architecture with "reference inputs," an optional expansion of typical sample inputs that allows for more thorough testing. Currently only the elementwise binary operations implement an extended set of reference inputs. This PR also cleans up some smaller OpInfo-related issues, including several bugs, and it identified https://github.com/pytorch/pytorch/issues/74279.

A reference inputs function can be specified for an OpInfo by filling in its "reference_inputs_func" metadata. If this is done it's recommended that the reference inputs function first call the sample inputs function, then produce additional sample inputs. See `reference_inputs_elementwise_binary` for an example of this pattern.

In addition to implementing reference inputs for the elementwise binary operations, this PR improves their consistency and simplifies how their metadata is represented. The great majority now use a generic sample input function, and those that want extensions start by calling the generic sample input function and then adding additional samples. This removes many older sample input functions. The BinaryUfuncInfo subclass also now allows specifying scalar support more precisely, and reference inputs and error inputs are generated based on this metadata to ensure it's correct.

cc @kshitij12345 @pmeier @zou3519 @Chillee

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74280
Approved by: https://github.com/ngimel
2022-03-21 03:24:16 +00:00
atalman
ebca80ed08 Move test ops gradients and test ops jit to separate files
Fixes #72368

As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs:

Reference : pytorch_test_times.json

```
{
    "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed",
    "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan",
    "job_times": {
        "test_ops": 14928.355000000636, <- This test group is over 4hrs alone
```
----

Hence separating  test_ops into following parts:
1. TestGradients
2. TestJit
3.  TestCommon and TestMathBits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297
Approved by: https://github.com/malfet
2022-03-17 02:07:50 +00:00
PyTorch MergeBot
232faeacf8 Revert "Move test ops gradients and test ops jit to separate files"
This reverts commit 7cf9b942da.

Reverted https://github.com/pytorch/pytorch/pull/74297 on behalf of https://github.com/atalman
2022-03-16 20:08:23 +00:00
atalman
7cf9b942da Move test ops gradients and test ops jit to separate files
Fixes #72368

As per reference issue, the test_ops in single file takes around 3:30-4:00Hrs to execute on asan jobs:

Reference : pytorch_test_times.json

```
{
    "commit": "39535fec6c3ff5bf7c2d322d096c59571c3295ed",
    "JOB_BASE_NAME": "linux-xenial-py3.7-clang7-asan",
    "job_times": {
        "test_ops": 14928.355000000636, <- This test group is over 4hrs alone
```
----

Hence separating  test_ops into following parts:
1. TestGradients
2. TestJit
3.  TestCommon and TestMathBits

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74297
Approved by: https://github.com/malfet
2022-03-16 19:30:22 +00:00
Nikita Shulga
ef066f0832 Revert D34856571: [pytorch][PR] Replace get_all_ type macros with the ATen dispatch macros.
Test Plan: revert-hammer

Differential Revision:
D34856571 (3ded7b1da3)

Original commit changeset: 0dca038bcad5

Original Phabricator Diff: D34856571 (3ded7b1da3)

fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386
(cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)
2022-03-15 22:07:11 +00:00
Khushi Agrawal
3ded7b1da3 Replace get_all_ type macros with the ATen dispatch macros. (#71561)
Summary:
Hi, Team!
The PR is motivated from https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738. It aims to replace `get_all` type macros with the ATen dispatch macros.

The files it iterates over are: (Thanks, Lezcano, for the idea!!)

<details>
<summary>

`test/test_autograd.py`</summary>

<p>

```python
43:from torch.testing._internal.common_dtype import get_all_dtypes
8506:        floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point]
```

</p>
</details>

<details>
<summary>

`test/test_binary_ufuncs.py`</summary>

<p>

```python
26:    all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes,
27:    get_all_complex_dtypes, get_all_fp_dtypes,
935:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1035:    dtypes(*get_all_dtypes(
1488:    dtypes(*(get_all_dtypes(include_bool=False, include_bfloat16=False)))
1879:    dtypes(*product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False)))
1887:    dtypes(*(get_all_int_dtypes() + [torch.bool]))
1913:    dtypes(*(get_all_fp_dtypes()))
1941:    dtypes(*(get_all_fp_dtypes()))
1977:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
2019:    dtypes(*product(get_all_fp_dtypes(), get_all_fp_dtypes()))
2048:    dtypes(*get_all_dtypes())
2110:    dtypes(*product(get_all_dtypes(include_complex=False),
2111:                     get_all_dtypes(include_complex=False)))
2128:            types = [torch.bool, torch.bfloat16] + get_all_int_dtypes()
2173:        if dtypes[1] in get_all_fp_dtypes():
2178:    dtypes(*product(get_all_fp_dtypes(),
2179:                     get_all_fp_dtypes()))
2260:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2261:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2273:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128})
2274:    dtypes(*set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128})
2307:    dtypes(*get_all_math_dtypes('cpu'))
2319:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
2331:    dtypes(*get_all_int_dtypes())
2356:    dtypes(*get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False))
2393:        if dtype in get_all_int_dtypes():
2614:    dtypes(*get_all_dtypes())
2624:    dtypes(*tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2)))
2806:    dtypes(*list(product(get_all_dtypes(include_complex=False),
2807:                          get_all_dtypes(include_complex=False))))
2866:    dtypes(*list(product(get_all_complex_dtypes(),
2867:                          get_all_complex_dtypes())))
2902:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2906:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
2910:    dtypes(*product(get_all_dtypes(), get_all_dtypes()))
3019:        dtypes = [torch.float, torch.double] + get_all_complex_dtypes()
3221:    dtypes(*get_all_dtypes(include_complex=False))
3407:    dtypes(*list(product(get_all_dtypes(include_bool=False),
3408:                          get_all_dtypes(include_bool=False))))
3504:    dtypes(*product(get_all_dtypes(include_complex=False, include_bfloat16=False),
3505:                     get_all_dtypes(include_complex=False, include_bfloat16=False)))
3516:            if x.dtype in get_all_int_dtypes() + [torch.bool]:
3643:    dtypes(*product(get_all_dtypes(include_complex=False,
3645:                     get_all_dtypes(include_complex=False,
```

</p>
</details>

<details>
<summary>

`test/test_complex.py`</summary>

<p>

```python
6:from torch.testing._internal.common_dtype import get_all_complex_dtypes
11:    dtypes(*get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_foreach.py`</summary>

<p>

```python
18:    get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
142:            if dtype in get_all_int_dtypes():
179:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
201:            disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
205:                disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
211:                disable_fastpath |= dtype not in get_all_complex_dtypes()
241:                bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool]
246:                    disable_fastpath |= dtype in get_all_int_dtypes() + [torch.bool]
248:                    disable_fastpath |= dtype not in get_all_complex_dtypes()
250:                    disable_fastpath |= True and dtype not in get_all_complex_dtypes()
307:        disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool]
365:        if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes():
376:    ops(foreach_unary_op_db, dtypes=get_all_dtypes())
393:         dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False))
401:    ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True))
426:            if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes():
439:    dtypes(*get_all_dtypes())
449:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
481:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
536:            if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div:
545:    ops(foreach_binary_op_db, dtypes=get_all_dtypes())
637:    ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False))
```

</p>
</details>

<details>
<summary>

`test/test_linalg.py`</summary>

<p>

```python
29:    all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes,
30:    get_all_fp_dtypes,
111:    dtypes(*(get_all_dtypes()))
794:        float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes()
807:    dtypes(*(get_all_int_dtypes()))
828:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
841:        if dtype in get_all_complex_dtypes():
844:    dtypes(*itertools.product(get_all_dtypes(),
845:                               get_all_dtypes()))
855:        for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3):
5607:                  *get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater)))
5608:    dtypes(*(set(get_all_dtypes()) - {torch.half, torch.bool}))
5644:    dtypes(*(get_all_complex_dtypes() + get_all_fp_dtypes()))
6255:    dtypesIfCUDA(*get_all_complex_dtypes(),
6256:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)),
6292:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6323:    dtypesIfCUDA(*get_all_complex_dtypes(),
6324:                  *get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater))))
6325:    dtypes(*get_all_complex_dtypes(), *get_all_fp_dtypes())
6358:    dtypesIfCUDA(*([torch.float, torch.double] + get_all_complex_dtypes()))
6556:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6668:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
6741:    dtypes(*get_all_fp_dtypes(), *get_all_complex_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_nn.py`</summary>

<p>

```python
37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes
50:    onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \
8862:                for device in get_all_device_types():
9629:            for dt1 in get_all_math_dtypes(device):
9630:                for dt2 in get_all_math_dtypes(device):
9631:                    for dt3 in get_all_math_dtypes(device):
9648:            for input_dtype in get_all_math_dtypes(device):
9664:            for input_dtype in get_all_math_dtypes(device):
13015:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13034:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
13159:    dtypes(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17400:    dtypesIfCUDA(*get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM))
17768:    dtypesIfCUDA(*get_all_fp_dtypes())
17773:    dtypesIfCUDA(*get_all_fp_dtypes())
17778:    dtypesIfCUDA(*get_all_fp_dtypes())
17783:    dtypesIfCUDA(*get_all_fp_dtypes())
17788:    dtypesIfCUDA(*get_all_fp_dtypes())
17793:    dtypesIfCUDA(*get_all_fp_dtypes())
17798:    dtypesIfCUDA(*get_all_fp_dtypes())
17963:    dtypesIfCUDA(*get_all_fp_dtypes())
17977:    dtypesIfCUDA(*get_all_fp_dtypes())
18684:    def test_cross_entropy_loss_prob_target_all_reductions(self, device):
```

</p>
</details>

<details>
<summary>

`test/test_numpy_interop.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import get_all_dtypes
399:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_ops.py`</summary>

<p>

```python
12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes
86:        for dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_reductions.py`</summary>

<p>

```python
16:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes,
360:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
366:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
394:         allowed_dtypes=get_all_dtypes(include_bfloat16=False))
750:        for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]:
1404:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1457:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1458:              get_all_complex_dtypes()))
1465:            return dtype in get_all_int_dtypes()
1494:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1501:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1507:    dtypes(*(get_all_complex_dtypes()))
1514:        dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))
1523:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)))
1531:        if dtype in get_all_fp_dtypes():
1608:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
1837:    dtypes(*get_all_dtypes(include_bool=False, include_complex=False))
1855:    dtypes(*(set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8}))
3219:        for dtype in get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_serialization.py`</summary>

<p>

```python
26:from torch.testing._internal.common_dtype import get_all_dtypes
586:        for device, dtype in product(devices, get_all_dtypes()):
589:            for other_dtype in get_all_dtypes():
```

</p>
</details>

<details>
<summary>

`test/test_shape_ops.py`</summary>

<p>

```python
18:from torch.testing._internal.common_dtype import get_all_dtypes
230:    dtypes(*get_all_dtypes(include_complex=False, include_bool=False, include_half=False,
232:    dtypesIfCUDA(*get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False))
344:    dtypes(*get_all_dtypes())
443:    dtypes(*get_all_dtypes())
461:    dtypes(*get_all_dtypes())
570:    dtypes(*get_all_dtypes(include_complex=False))
```

</p>
</details>

<details>
<summary>

`test/test_sort_and_select.py`</summary>

<p>

```python
12:    all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes,
136:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
231:    dtypes(*set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128})
296:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
647:    dtypesIfCUDA(*get_all_fp_dtypes())
678:    dtypesIfCUDA(*(get_all_dtypes(include_complex=False,
682:    dtypes(*(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False)))
739:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
740:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
799:    dtypesIfCPU(*set(get_all_dtypes()) - {torch.complex64, torch.complex128})
800:    dtypes(*set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128})
```

</p>
</details>

<details>
<summary>

`test/test_sparse.py`</summary>

<p>

```python
20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes
29:    floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes,
1963:            return dtype in get_all_int_dtypes()
1994:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2103:            return dtype in get_all_int_dtypes()
2138:    dtypes(*get_all_dtypes(include_bool=False, include_half=False,
2626:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
2633:        all_sparse_dtypes = get_all_dtypes(include_complex=True)
3230:    dtypes(*get_all_complex_dtypes(),
3231:            *get_all_fp_dtypes(include_half=False, include_bfloat16=False))
3234:                  *get_all_fp_dtypes(
```

</p>
</details>

<details>
<summary>

`test/test_sparse_csr.py`</summary>

<p>

```python
7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor
17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes
120:    dtypes(*get_all_dtypes())
133:    dtypes(*get_all_dtypes())
150:    dtypes(*get_all_dtypes())
180:    dtypes(*get_all_dtypes())
201:    dtypes(*get_all_dtypes())
210:    dtypes(*get_all_dtypes())
225:    dtypes(*get_all_dtypes())
244:    dtypes(*get_all_dtypes())
263:    dtypes(*get_all_dtypes())
285:    dtypes(*get_all_dtypes())
411:    dtypes(*get_all_dtypes())
482:    dtypes(*get_all_dtypes())
502:    dtypes(*get_all_dtypes())
562:    dtypes(*get_all_dtypes())
588:    dtypesIfCUDA(*get_all_complex_dtypes(),
589:                  *get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater))
745:    dtypesIfCUDA(*get_all_complex_dtypes(),
746:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
765:    dtypesIfCUDA(*get_all_complex_dtypes(),
766:                  *get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC,
801:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
841:                  *torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater,
1182:    dtypes(*get_all_dtypes())
1276:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False))
1286:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_tensor_creation_ops.py`</summary>

<p>

```python
21:    onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types)
23:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
150:        for dt in get_all_dtypes():
160:        for dt in get_all_dtypes():
314:        dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16]
1012:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1013:              get_all_complex_dtypes()))
1032:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1033:              get_all_complex_dtypes()))
1050:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1051:              get_all_complex_dtypes()))
1745:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1779:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1868:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1926:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
1954:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
1956:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None)
1957:            do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device)
2538:        for device in get_all_device_types():
2645:        for dtype in get_all_dtypes():
2678:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False) +
2679:              get_all_complex_dtypes()))
2716:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
2827:            for dt in get_all_dtypes():
2913:    dtypes(*get_all_dtypes(include_bool=False, include_half=False))
2914:    dtypesIfCUDA(*get_all_dtypes(include_bool=False, include_half=True))
3028:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3033:    dtypes(*(get_all_fp_dtypes() + get_all_complex_dtypes()))
3074:    dtypes(*get_all_dtypes(include_bool=False, include_half=False, include_complex=False))
3075:    dtypesIfCUDA(*((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16])
3077:                    else get_all_dtypes(include_bool=False, include_half=True, include_complex=False)))
3873:    dtypes(*get_all_dtypes())
3884:    dtypes(*get_all_dtypes(include_bool=False))
3916:            for other in get_all_dtypes():
3922:    dtypes(*get_all_dtypes())
3932:    dtypes(*get_all_dtypes(include_bool=False))
3955:    dtypes(*get_all_dtypes(include_bool=False))
3961:    dtypes(*get_all_dtypes(include_bool=False))
3965:    dtypes(*get_all_dtypes())
```

</p>
</details>

<details>
<summary>

`test/test_testing.py`</summary>

<p>

```python
25:from torch.testing._internal.common_dtype import get_all_dtypes
31:    dtypes(*(get_all_dtypes(include_half=True, include_bfloat16=False,
```

</p>
</details>

<details>
<summary>

`test/test_torch.py`</summary>

<p>

```python
51:    expectedAlertNondeterministic, get_all_device_types, skipXLA)
57:    get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes
296:            for d in get_all_device_types():
323:            for device in get_all_device_types():
324:                for dt1 in get_all_dtypes():
325:                    for dt2 in get_all_dtypes():
343:            all_dtypes = get_all_dtypes()
350:            all_dtypes = get_all_dtypes()
781:            for dtype in get_all_dtypes():
986:            for device in get_all_device_types():
1017:            for device in get_all_device_types():
1018:                for dtype in get_all_math_dtypes(device):
2792:            for device in get_all_device_types():
3186:    dtypes(*get_all_dtypes())
3195:        for error_dtype in get_all_dtypes():
3203:    dtypes(*get_all_dtypes())
3212:        for error_dtype in get_all_dtypes():
4539:    dtypes(*get_all_fp_dtypes())
4545:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
4577:    dtypes(*get_all_fp_dtypes(include_half=False, include_bfloat16=False))
4578:    dtypesIfCPU(*(get_all_fp_dtypes(include_half=False, include_bfloat16=True)))
4579:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4599:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4600:    dtypesIfCPU(*(get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False)))
4601:    dtypesIfCUDA(*(get_all_dtypes(include_bfloat16=False, include_complex=False)))
4613:        for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False):
4628:    dtypes(*(get_all_fp_dtypes(include_half=False, include_bfloat16=False)))
4629:    dtypesIfCUDA(*(get_all_fp_dtypes(include_bfloat16=False)))
4640:    dtypes(*get_all_fp_dtypes())
4723:    dtypes(*get_all_fp_dtypes())
4735:    dtypes(*get_all_fp_dtypes(include_bfloat16=False))
4736:    dtypesIfCUDA(*get_all_fp_dtypes())
4747:    dtypes(*get_all_fp_dtypes())
4761:    dtypes(*get_all_fp_dtypes())
4771:    dtypes(*get_all_fp_dtypes())
4792:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
5302:    dtypes(*get_all_dtypes(include_bfloat16=False))
5322:    dtypes(*get_all_dtypes(include_half=False, include_bfloat16=False))
5323:    dtypesIfCPU(*get_all_dtypes(include_bfloat16=False))
5324:    dtypesIfCUDA(*get_all_dtypes(include_bfloat16=False))
5591:        for dt in get_all_dtypes():
5611:        for dt in get_all_dtypes():
5678:        for dt in get_all_dtypes():
5696:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
5697:    dtypes(*set(get_all_math_dtypes('cpu')))
5746:    dtypes(*get_all_dtypes())
5780:    dtypes(*get_all_dtypes())
5885:    dtypes(*get_all_dtypes())
5902:    dtypes(*get_all_dtypes())
5945:    dtypes(*get_all_dtypes())
5979:    dtypes(*get_all_dtypes(include_bool=False))
6049:    dtypes(*get_all_dtypes(include_bool=False))
6092:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6093:              get_all_complex_dtypes()))
6094:    dtypesIfCPU(*get_all_dtypes())
6095:    dtypesIfCUDA(*get_all_dtypes())
6122:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6123:              get_all_complex_dtypes()))
6124:    dtypesIfCPU(*get_all_dtypes())
6125:    dtypesIfCUDA(*get_all_dtypes())
6163:    dtypes(*(get_all_fp_dtypes(include_bfloat16=False, include_half=False) +
6164:              get_all_complex_dtypes()))
6165:    dtypesIfCPU(*get_all_dtypes())
6166:    dtypesIfCUDA(*get_all_dtypes())
6190:    dtypes(*(get_all_complex_dtypes() +
6191:              get_all_int_dtypes()))
6238:    dtypes(*get_all_dtypes())
6323:    dtypes(*get_all_dtypes())
6389:    dtypes(*product(get_all_dtypes(), (torch.uint8, torch.bool)))
6699:    dtypesIfCUDA(*set(get_all_math_dtypes('cuda')))
6700:    dtypes(*set(get_all_math_dtypes('cpu')))
7452:    dtypes(*get_all_dtypes(include_bool=False))
7461:    dtypes(*get_all_dtypes(include_bool=False))
7477:    dtypes(*get_all_dtypes(include_bool=False))
7496:    dtypes(*get_all_dtypes(include_bool=False))
7538:    dtypes(*get_all_dtypes(include_bool=False))
8162:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8163:              get_all_complex_dtypes()))
8175:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes() +
8176:              get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_type_promotion.py`</summary>

<p>

```python
14:    get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes
187:        for dtype in get_all_dtypes():
262:        dtypes1 = get_all_math_dtypes('cuda')
263:        dtypes2 = get_all_math_dtypes(device)
339:    dtypes(*itertools.product(get_all_dtypes(), get_all_dtypes()))
468:            for dt1 in get_all_math_dtypes(device):
469:                for dt2 in get_all_math_dtypes(device):
519:            for dt1 in get_all_math_dtypes(device):
520:                for dt2 in get_all_math_dtypes(device):
528:        for dt in get_all_math_dtypes(device):
561:        for dtype in get_all_dtypes():
766:                                          dtypes=get_all_math_dtypes(device))
771:                                          dtypes=get_all_math_dtypes(device))
782:                                          dtypes=get_all_math_dtypes(device))
879:        dtypes = get_all_dtypes(include_bfloat16=False)
898:        dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False)
965:    dtypesIfCUDA(*itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False),
966:                                     get_all_dtypes(include_bfloat16=False, include_complex=False)))
967:    dtypes(*itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False,
969:                               get_all_dtypes(include_half=False, include_bfloat16=False,
976:            return dtype in get_all_int_dtypes() + [torch.bool]
979:            return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False)
```

</p>
</details>

<details>
<summary>

`test/test_unary_ufuncs.py`</summary>

<p>

```python
24:    floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes,
25:    get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
517:    dtypes(*(get_all_int_dtypes() + [torch.bool] +
518:              get_all_fp_dtypes(include_bfloat16=False)))
596:    dtypes(*get_all_fp_dtypes(include_half=True, include_bfloat16=False))
611:        invalid_input_dtypes = get_all_int_dtypes() + \
612:            get_all_complex_dtypes() + \
619:        for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False):
1048:    dtypes(*get_all_math_dtypes('cpu'))
1182:    dtypesIfCUDA(*get_all_fp_dtypes())
1190:    dtypesIfCUDA(*get_all_fp_dtypes())
1205:    dtypesIfCUDA(*get_all_fp_dtypes())
1215:    dtypesIfCUDA(*get_all_fp_dtypes())
1307:    dtypes(*(get_all_dtypes(include_bool=False)))
1349:    dtypes(*(get_all_fp_dtypes(include_half=False) +
1350:              get_all_complex_dtypes()))
1351:    dtypesIfCUDA(*(get_all_fp_dtypes(include_half=True) +
1352:                    get_all_complex_dtypes()))
```

</p>
</details>

<details>
<summary>

`test/test_view_ops.py`</summary>

<p>

```python
19:    get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes
124:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
131:    dtypes(*get_all_dtypes(include_bfloat16=False))
213:            for view_dtype in [*get_all_fp_dtypes(), *get_all_complex_dtypes()]:
220:    dtypes(*get_all_dtypes())
224:        for view_dtype in get_all_dtypes():
305:    dtypes(*get_all_complex_dtypes(include_complex32=True))
343:    dtypes(*get_all_dtypes())
354:    dtypes(*get_all_dtypes())
364:    dtypes(*get_all_dtypes())
374:    dtypes(*get_all_dtypes())
384:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes()))
395:    dtypes(*get_all_complex_dtypes())
426:    dtypes(*get_all_complex_dtypes())
451:    dtypes(*product(get_all_complex_dtypes(), get_all_dtypes()))
1263:    dtypes(*(torch.testing.get_all_dtypes()))
1279:    dtypes(*(torch.testing.get_all_dtypes()))
1405:    dtypes(*(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) +
1406:              get_all_complex_dtypes()))
1471:    dtypes(*get_all_dtypes(include_bfloat16=False))
1574:    dtypes(*get_all_dtypes())
1601:    dtypes(*get_all_dtypes(include_bfloat16=False))
1632:    dtypes(*get_all_dtypes(include_bfloat16=False))
1711:        for dt in get_all_dtypes():
1717:        for dt in get_all_dtypes():
1724:        for dt in get_all_dtypes():
```

</p>
</details>

I'm looking forward to your viewpoints. Thanks :)

cc: mruberry kshitij12345 anjali411

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561

Reviewed By: samdow

Differential Revision: D34856571

Pulled By: mruberry

fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335
(cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)
2022-03-15 20:31:41 +00:00
Douglas Lehr
28549b618a [ROCm] Enable skipped ROCm unit tests (#67706)
Summary:
A number of ROCm tests were skipped via the skipCUDAIfRocm flag.
A majority of the testcases are now supported on the ROCm platform. This fix enabled all of the test_ops tests for ROCm and enables most Operators in  common_methods_invocations.py minus the SpectralFuncInfo class which still has some fft issues.

Partially Fixes https://github.com/pytorch/pytorch/issues/51303

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67706

Reviewed By: seemethere, janeyx99

Differential Revision: D34153457

Pulled By: malfet

fbshipit-source-id: 95f4420f306ca7580cd438d3b5cc0b24efbfae99
(cherry picked from commit 0d178fffd3)
2022-02-11 22:14:54 +00:00
David Berard
bbd42c605a [JIT] Opinfo tests for nnc fusion - retry (#72486)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72486

Retry #70465.

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D34061628

Pulled By: davidberard98

fbshipit-source-id: e27ed315bc4ad57cdbfbc9cedffcbb7886004524
(cherry picked from commit 7937808d2e)
2022-02-09 19:01:22 +00:00
Nikita Shulga
bb101ec78d Revert D33595240: [JIT] Opinfo tests for nnc fusion
Test Plan: revert-hammer

Differential Revision:
D33595240 (0b57bd4c66)

Original commit changeset: e2e17a921bc3

Original Phabricator Diff: D33595240 (0b57bd4c66)

fbshipit-source-id: 172a3ffd19d180b1b3617956b1f881be62f37bc9
(cherry picked from commit 324cfaea86)
2022-02-08 01:28:42 +00:00
David Berard
0b57bd4c66 [JIT] Opinfo tests for nnc fusion (#70465)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70465

These tests check to ensure that
(a) the result after nnc fusion (of a single op) is the same as the
unfused op
(b) for certain ops where fusion is expected to occur, ensure that
fusion does actually occur

Test Plan: Imported from OSS

Reviewed By: wenleix

Differential Revision: D33595240

Pulled By: davidberard98

fbshipit-source-id: e2e17a921bc30c313e92e8e5bbc6c1b5fcd14bc1
(cherry picked from commit b1ba221acc)
2022-02-07 20:56:21 +00:00
lezcano
6cb128c8dd Generalize noncontiguous tests to several outputs (#67996)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67996

This is necessary for most matrix decompositions in `linalg`.

cc mruberry

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774418

Pulled By: mruberry

fbshipit-source-id: 576f2dda9d484808b4acf0621514c0ffe26834e6
(cherry picked from commit fb07c50aa9)
2022-01-27 23:13:17 +00:00
lezcano
e2011b29aa Add OpInfo test to check that floating point inputs in OpInfos have requires_grad set to True (#69909)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69909

This test detected a number of sampling methods that were not generating
the samples as expected, e.g. `index_put`, `cosine_embedding`, `stft`, but
perhaps most notably the generator for `BinOps`.

It also detected that `reminder` and `fmod` did not have implemented the
backward formula for the second input. I added this in the previous PR.

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D33774422

Pulled By: mruberry

fbshipit-source-id: 76cfc75b1fdfd72ee64aa524665f83a75fe52509
(cherry picked from commit 13ea7b436b)
2022-01-27 23:13:17 +00:00
lezcano
8ff1a8fdca Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of svd_backward.
More importantly, I also added a lax check in the complex case making sure the loss
function just depends on the subspaces spanned by the pairs of singular
vectors, and not their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
(cherry picked from commit 391319ed8f)
2022-01-27 18:38:30 +00:00
soulitzer
5ccf28d066 Do not use ZeroTensor for inplace ops (#69998)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69998

Fixes: https://github.com/pytorch/pytorch/issues/69855

The check for undefined grads for forward AD was not being run because `check_undefined_grads` was only passed as True by OpInfo for backward AD. This PR updates gradcheck to interpret `check_undefined_grads` as possibly for forward or backward AD.

This PR also updates codegen to 1) not use ZeroTensor for `self` when the op is inplace. 2) only create zeros (either through ZeroTensor or at::zeros) if the tensor itself is not undefined. Previously we would error in this case when we call `.options` on the undefined tensor.

~TODO: undo the skips that are due to the original issue~

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D33235973

Pulled By: soulitzer

fbshipit-source-id: 5769b6d6ca123b2bed31dc2bc6bc8e4701581891
2021-12-23 15:52:34 -08:00
Peter Bell
7cdfd86a72 TestMathBits: test with neg and conj bit set (#68948)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68948

The case where both the negative and conjugate bits are set
isn't tested currently despite being handled explicitly by `copy`.
In theory this shouldn't matter because neg_bit is only used for real
values, but it does mean the code in copy is untested. So, this just
runs it with a single sample as a sanity check.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33064371

Pulled By: anjali411

fbshipit-source-id: e90c65e311507c4fc618ff74fecc4929599c4fa3
2021-12-22 14:30:35 -08:00
soulitzer
47f11730ec Add testing for forward over reverse gradgrad (#69740)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69740

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33031727

Pulled By: soulitzer

fbshipit-source-id: 2bcba422b4bcea3bbc936d07ba45171a6531e578
2021-12-14 23:35:10 -08:00
Peter Bell
1188d89a1d TestMathBits: Call functions with original sample input values (#68947)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68947

`_test_math_view` currently calls the operator with different values
than those specified in the `SampleInput`. This is undesirable as it
could break mathematical properties required by the operator. Instead,
this calls `math_op_view(math_op_physical(sample.input))` to get a
view that represents the same value as the original input.

`test_neg_view` already did this by returning `torch._neg_view(-x)`
from `math_op_view` but this moves the handling into `_test_math_view`
to make it apply to all view op tests.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D33064327

Pulled By: anjali411

fbshipit-source-id: 4d87e0c04fc39b95f8dc30dcabda0d554d16a1d8
2021-12-14 11:10:13 -08:00
soulitzer
db32daf4b2 Do not test batched forward grad for inplace ops (#69558)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69558

Currently we skip batched forward grad checks completely for certain views that also have inplace variants. This PR allow us to decouple the check.

Alternative: just skip the batched forward checks for inplace ops entirely. I'm okay with this because it was surprising to me these checks are being run in the first place.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020599

Pulled By: soulitzer

fbshipit-source-id: f8012aadc0e775f80da0ab62b2c11f6645bb1f51
2021-12-12 00:09:45 -08:00
Peter Bell
6de9f0fc94 OpInfo: Allow sample_inputs_func to be any iterable (#69256)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69256

Closes #52486

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32942008

Pulled By: mruberry

fbshipit-source-id: f5b01b0298c0160b0bec6e86e2b6db8cfe746206
2021-12-09 08:37:26 -08:00
Mike Ruberry
b6f41bb848 The Jiterator (#69439)
Summary:
This PR:

- creates the "jiterator" pattern, allowing elementwise unary and binary kernels that don't accept scalars to be jit compiled when called
- ports the gcd and i1 CUDA kernels to use the jiterator
- extends elementwise binary systemic testing to be comparable to elementwise unary systemic testing
- separates one test case from test_out in test_ops.py
- updates more OpInfos to use expected failures instead of skips

The jiterator currently does not support half, bfloat16 or complex dtypes. It also (as mentioned above) doesn't support scalar inputs. In the future we expect to add support for those datatypes and scalars.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/69439

Reviewed By: ngimel

Differential Revision: D32874968

Pulled By: mruberry

fbshipit-source-id: d44bb9cde4f602703e75400ec5a0b209f085e9b3
2021-12-06 07:32:48 -08:00