Commit Graph

1250 Commits

Author SHA1 Message Date
Nikita Shulga
8811e4d00d Add/fix typing annotations to some functions (#39075)
Summary:
Add missing typing imports to some jit tests
Add typing annotations to `torch.testing._compare_scalars_internal` and `torch.testing._internal.assertTrue`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39075

Differential Revision: D21882468

Pulled By: malfet

fbshipit-source-id: dd9858eb8e11a38411544cc64daf36fced807d76
2020-06-04 13:40:04 -07:00
Xiong Wei
fe684679b0 Fix overflow issues when unpacking large numbers (#39140)
Summary:
Resolve https://github.com/pytorch/pytorch/issues/33111

Relax the overflow and precision-loss checks when unpacking doubles.

Signed-off-by: Xiong Wei <xiongw.fnst@cn.fujitsu.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39140

Differential Revision: D21885217

Pulled By: ezyang

fbshipit-source-id: e2bbe90d719443ea2e1c6b7b2c637f9a943fa5c0
2020-06-04 12:24:24 -07:00
krshrimali
335e4a1e3b Add arcosh, arcsinh and arctanh to unary ops (#38388)
Summary:
This PR aims to add `arcosh`, `arcsinh` and `arctanh` support. Please see issue https://github.com/pytorch/pytorch/issues/38349 for more details.

**TODOs:**

* [x] Add test cases for `arcosh`, `arcsinh` and `arctanh`. (need help)
* [x] Overload ops if `std::op` does not work with `thrust::complex` types (like for `sinh`, `cosh`).

Note: `std::acosh`, `std::asinh`, and `std::atanh` do not support `thrust::complex` types, so complex support was added for these three ops (`arccosh`, `arcsinh`, `arctanh`).
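
For illustration, a minimal sketch of these inverse hyperbolic ops, using the names as they exist in current PyTorch (`torch.acosh`/`torch.asinh`/`torch.atanh`), which may be spelled differently than in this PR:

```python
import torch

x = torch.tensor([1.5, 2.0])
print(torch.acosh(x))                        # defined for x >= 1
print(torch.atanh(torch.tensor([0.5])))
print(torch.asinh(torch.tensor([1 + 1j])))   # complex input, as enabled here
```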

cc: mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38388

Differential Revision: D21882055

Pulled By: mruberry

fbshipit-source-id: d334590b47c5a89e491a002c3e41e6ffa89000e3
2020-06-04 11:40:55 -07:00
Aayush Naik
0829cadca3 Implement rad2deg, deg2rad (#38852)
Summary:
Resolves https://github.com/pytorch/pytorch/issues/38372.
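
A quick usage sketch (not from the PR) of the two new ops:

```python
import math
import torch

print(torch.rad2deg(torch.tensor([0.0, math.pi / 2, math.pi])))  # 0., 90., 180.
print(torch.deg2rad(torch.tensor([180.0])))                      # ~3.1416
```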

cc mruberry
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38852

Differential Revision: D21868935

Pulled By: mruberry

fbshipit-source-id: ae6ded11b743c9d1cdc032984b4abe0a115290d6
2020-06-03 22:21:54 -07:00
anjali411
3370c045ae Remove copy_imag and copy_real methods (#39065)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39065

Test Plan: Imported from OSS

Differential Revision: D21803939

Pulled By: anjali411

fbshipit-source-id: c7313c527eb6b54d49ef46aa0a839a3418fa8d7e
2020-06-03 18:22:50 -07:00
ShawnZhong
cb530fcd3c Enable some test cases in test_memory_format_operators (#38648)
Summary:
Re-enable some test cases in `test_memory_format_operators`, since their corresponding issues have been fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38648

Differential Revision: D21689085

Pulled By: VitalyFedyunin

fbshipit-source-id: 0aa09e0bf31ba98c8ad0191ac3afd31dda0f1d42
2020-06-03 16:02:49 -07:00
Mike Ruberry
9ed5efda47 Adds TestCase.compare_with_numpy (#39179)
Summary:
Cut from https://github.com/pytorch/pytorch/pull/38994.

This is a helper function for comparing torch and NumPy behavior. It updates the existing and increasingly popular _np_compare function and moves it to be a method on TestCase.
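
A sketch of how such a helper might be called inside a test; the signature here is assumed from the description, not taken from the PR:

```python
import numpy as np
import torch
from torch.testing._internal.common_utils import TestCase, run_tests

class TestExp(TestCase):
    def test_exp_matches_numpy(self):
        t = torch.randn(10)
        # assumed signature: (torch_fn, np_fn, tensor_like)
        self.compare_with_numpy(torch.exp, np.exp, t)

if __name__ == "__main__":
    run_tests()
```
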
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39179

Differential Revision: D21855082

Pulled By: mruberry

fbshipit-source-id: edca3b78ae392d32243b02bf61960898b6ba590f
2020-06-03 15:27:32 -07:00
JackCaoG
46447045ea Replace torch.allClose with self.assertEqual (#39424)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/39424

Reviewed By: Krovatkin

Differential Revision: D21854870

Pulled By: ailzhang

fbshipit-source-id: eb68f1775596e4c963169033444d6d6f4f818d4f
2020-06-03 12:40:50 -07:00
kshitij12345
884e16b41a as_strided : add size and stride length check (#39301)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/39281
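
For illustration, a minimal sketch of the new length check (the exact error message is assumed):

```python
import torch

t = torch.randn(4)
print(t.as_strided((2, 2), (2, 1)))  # fine: len(size) == len(stride)
try:
    t.as_strided((2, 2), (1,))       # mismatched lengths now raise
except RuntimeError as e:
    print(e)
```
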
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39301

Differential Revision: D21849082

Pulled By: gchanan

fbshipit-source-id: 5d30ef10767c4d35c6cb59c5e6a9acbfe0270a40
2020-06-03 09:17:54 -07:00
Peter Bell
7417b4c66f Fix index overflow in ConvTranspose3d [attempt 2] (#39198)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/32866, resubmit of https://github.com/pytorch/pytorch/issues/38970

The memory error in the issue is caused by `int` overflowing in `col2vol`. This version, using mixed 32-bit and 64-bit indexing calculation, lifts the maximum indexing possible without compromising the performance of `ConvTranspose3d`, versus a 20-30% regression with pure 64-bit indexing.

This requires that `input.numel() <= UINT_MAX` and `channels * kernel.numel() <= UINT_MAX`; otherwise it raises an error. Previously, the code would crash or give incorrect results unless `input.numel() * kernel.numel() <= INT_MAX`.

Note that the test is a minimised reproducer for the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39198

Differential Revision: D21817836

Pulled By: ezyang

fbshipit-source-id: b9adfe9f9dd00f04435be132966b33ac6b9efbef
2020-06-03 07:06:54 -07:00
kshitij12345
09bea13981 support flip and rot90 for complex dtype (#37826)
Summary:
Closes https://github.com/pytorch/pytorch/issues/37698
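
A small sketch (not from the PR) of the newly supported dtype:

```python
import torch

z = torch.tensor([[1 + 1j, 2 + 2j],
                  [3 + 3j, 4 + 4j]])
print(torch.flip(z, dims=[0]))         # reverse the rows of a complex tensor
print(torch.rot90(z, 1, dims=[0, 1]))  # rotate 90 degrees in the (0, 1) plane
```
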
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37826

Differential Revision: D21657697

Pulled By: mruberry

fbshipit-source-id: 16a3899d5de280da692a52bd0ce85d5ebe14cc31
2020-06-02 13:03:14 -07:00
Xiang Gao
48e66859c1 Check illegal output dtype for torch.{min, max} (#38850)
Summary:
The test is currently only enabled for CPU, and it will be enabled for CUDA after the migration of `min` and `max` from THC to ATen is done.
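
A sketch of the rejected pattern (error text assumed):

```python
import torch

t = torch.randn(3, 4)
values = torch.empty(4, dtype=torch.int32)  # illegal dtype for the values output
indices = torch.empty(4, dtype=torch.long)
try:
    torch.max(t, 0, out=(values, indices))  # now rejected on CPU
except RuntimeError as e:
    print(e)
```
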
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38850

Differential Revision: D21819388

Pulled By: ngimel

fbshipit-source-id: 406343e96bccbf9139eb1f8f2d49ed530dd83d62
2020-06-01 16:09:39 -07:00
guol-fnst
7773a45c0d Division by zero crashes for fmod operator (#32699) (#38919)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38919

Differential Revision: D21791648

Pulled By: anjali411

fbshipit-source-id: 447ded74fa52377b04c1b2271a0b3eb5b8e4eeed
2020-06-01 07:48:52 -07:00
anjali411
a50d781c03 Added real and imag views as tensor attributes (#39033)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39033

Added `real` and `imag` views as tensor attributes. Right now, `tensor.imag` is disabled for real tensors: if we returned a new tensor of zeros, the user would be able to update it, which should not be allowed. NumPy returns a read-only array in this case, and PyTorch doesn't support read-only tensors yet. (A usage sketch follows the TODO list below.)

TODO in follow-up PRs:
1. add a setter for `real` and `imag`
2. add special case in codegen for `real` and `imag` backward functions.
3. remove `copy_real` and `copy_imag` methods.
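
For illustration, how the attributes behave on complex tensors; this is a sketch against current PyTorch, where the setter from TODO 1 has since landed:

```python
import torch

z = torch.tensor([1 + 2j, 3 + 4j])
print(z.real)  # tensor([1., 3.])
print(z.imag)  # tensor([2., 4.])

z.real += 10   # the views share storage with z, so this mutates z
print(z)       # tensor([11.+2.j, 13.+4.j])
```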

Test Plan: Imported from OSS

Differential Revision: D21767542

Pulled By: anjali411

fbshipit-source-id: 539febf01f01ff055e3fbc7e9ff01fd3fe729056
2020-05-29 12:31:51 -07:00
kshitij12345
10e2126b10 support complex types for cumsum, cumprod (#39063)
Summary:
Adds complex support to `cumsum` and `cumprod`, and updates the relevant tests in `test_torch::tensor_op_tests`.
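
A small sketch (not from the PR) of the newly supported dtype:

```python
import torch

z = torch.tensor([1 + 1j, 2 + 2j, 3 + 3j])
print(torch.cumsum(z, dim=0))   # tensor([1.+1.j, 3.+3.j, 6.+6.j])
print(torch.cumprod(z, dim=0))  # tensor([1.+1.j, 0.+4.j, -12.+12.j])
```
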
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39063

Differential Revision: D21771186

Pulled By: anjali411

fbshipit-source-id: 632916d4bdbd1c0941001898ab8146be2b7884fc
2020-05-29 09:36:26 -07:00
Natalia Gimelshein
4b5e87f94a Revert D21751663: [pytorch][PR] Fix argmin/max bug
Test Plan: revert-hammer

Differential Revision:
D21751663

Original commit changeset: 6d55e4bb7834

fbshipit-source-id: 5473af5650b8a14f1da32d660be43ccf027513e1
2020-05-29 09:08:46 -07:00
ShawnZhong
f7a8851e9e Fix argmin/max bug (#38946)
Summary:
Fix https://github.com/pytorch/pytorch/issues/38922

# Reproduction

-  This is correct
```py
>>> torch.zeros(1, 32767).argmax(dim=0)
tensor([0, 0, 0,  ..., 0, 0, 0])
```

- But this is not
```py
>>> torch.zeros(1, 32768).argmax(dim=0)
tensor([    0,     0,     0,  ..., 31141, 31141, 31141])
```

- Only occurs when the size of the reduced dimension is 1

```py
>>> torch.zeros(2, 327680).argmax(dim=0)
tensor([1, 1, 1,  ..., 1, 1, 1])
>>> torch.zeros(3, 327680).argmax(dim=0)
tensor([2, 2, 2,  ..., 2, 2, 2])
```

- The incorrect values also depend on the sizes of the other dims
```py
>>> torch.zeros(1, 327680).argmax(dim=0)
tensor([     0,      0,      0,  ..., 311296, 311296, 311296])
```
```py
>>> torch.zeros(1, 32768, 10).argmax(dim=0)
tensor([[     0,      0,      0,  ...,      0,      0,      0],
        [     0,      0,      0,  ...,      0,      0,      0],
        [     0,      0,      0,  ...,      0,      0,      0],
        ...,
        [311296, 311296, 311296,  ..., 311296, 311296, 311296],
        [311296, 311296, 311296,  ..., 311296, 311296, 311296],
        [311296, 311296, 311296,  ..., 311296, 311296, 311296]])
```

# Reason

- `resize_outputs_` is set to `false` in `reduce_op`, but the dimension is still coalesced during `TensorIterator::build()`

899a075b25/aten/src/ATen/native/TensorIterator.cpp (L703-L715)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38946

Differential Revision: D21751663

Pulled By: ngimel

fbshipit-source-id: 6d55e4bb783423b4c2df09cd3e8b87147efcbfdb
2020-05-28 19:42:07 -07:00
Mike Ruberry
ee3bd10445 Moves angle/abs test to test_torch (#39154)
Summary:
Moves test (per request).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39154

Differential Revision: D21769706

Pulled By: mruberry

fbshipit-source-id: a09d0d0a47fbcf8f0e798d57230f2fe6a9ebf6b9
2020-05-28 14:55:40 -07:00
Mike Ruberry
5e975cf8d6 Stops cross-device data movement in tensor iterator (#38998)
Summary:
**BC-breaking note:**

In previous versions of PyTorch zero dimensional CUDA tensors could be moved across devices implicitly. For example,

```
torch.tensor(5, device='cuda:0') + torch.tensor((1, 1), device='cuda:1')
```

would work, even though the tensors are on different CUDA devices. This is a frequent source of user confusion, however, and PyTorch generally does not move data across devices without it being explicit. This functionality is removed in PyTorch 1.6.

**PR Summary:**

Today in PyTorch we allow implicit data movement of zero dimensional CUDA tensors. For example, we allow:

```
torch.tensor(5, device='cuda:0') + torch.tensor((1, 1), device='cuda:1')
```

and

```
torch.tensor(2, device='cuda') + torch.tensor((3, 5))
```

In both of these cases TensorIterator would move the zero dim CUDA tensor to the device of the non-scalar tensor (cuda:1 in the first snippet, the CPU in the second snippet).

One of PyTorch's fundamental rules, however, is that it does not perform implicit data movement like this, and this change causes these cases to throw an error. New tests for this behavior are added to test_torch.py, and tests of the old behavior are removed in test_torch.py and test_autograd.py. A cpp test in tensor_iterator_test.cpp is modified to account for the new behavior.

This addresses https://github.com/pytorch/pytorch/issues/36722.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38998

Differential Revision: D21757617

Pulled By: mruberry

fbshipit-source-id: 2498f07f4938d6de691fdbd5155ad2e881ff7fdb
2020-05-28 13:53:57 -07:00
Rohan Varma
5267b17a96 Revert D21748644: [pytorch][PR] Fix index overflow in ConvTranspose3d
Test Plan: revert-hammer

Differential Revision:
D21748644

Original commit changeset: 95060423219d

fbshipit-source-id: 73c53c8a27a29bc8edd5b9b8c80f0f938b04a845
2020-05-28 13:08:35 -07:00
Peter Bell
5702a28b26 Fix index overflow in ConvTranspose3d (#38970)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/32866

The memory error in the issue is caused by `int` overflowing in `col2vol`. This version, using mixed 32-bit and 64-bit indexing calculation, lifts the maximum indexing possible without compromising the performance of `ConvTranspose3d`, versus a 20-30% regression with pure 64-bit indexing.

This requires that `input.numel() <= UINT_MAX` and `channels * kernel.numel() <= UINT_MAX`; otherwise it raises an error. Previously, the code would crash or give incorrect results unless `input.numel() * kernel.numel() <= INT_MAX`.

Note that the test is a minimised reproducer for the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38970

Differential Revision: D21748644

Pulled By: ezyang

fbshipit-source-id: 95060423219dc647595e1a24b3dcac520d3aecba
2020-05-28 07:28:15 -07:00
Nikita Shulga
f5bc91f851 Get rid of multiple inheritance in test_torch (#39110)
Summary:
`_TestTorchMixin` is a base class that is instantiated across multiple types.
It was inherited from `object` in order to hide it from the unittest test-discovery mechanism,
but this approach makes it almost impossible to run a static code analyzer on the class.
This PR implements an alternative approach by hiding the base class inside an inner class, per https://stackoverflow.com/a/25695512
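
A minimal sketch of the inner-class trick described above (class names here are illustrative, not the ones in the PR):

```python
import unittest

class TestTorchContainer:
    # Nested, so unittest discovery never instantiates the mixin directly,
    # while static analyzers still see a real TestCase subclass.
    class Base(unittest.TestCase):
        def test_add(self):
            self.assertEqual(1 + 1, 2)

class TestTorchCPU(TestTorchContainer.Base):  # only the subclass is discovered
    pass

if __name__ == "__main__":
    unittest.main()
```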

Change imported class access path in `test_cuda.py`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39110

Test Plan:
run `test_torch.py --discover-tests` and `test_cuda.py --discover-tests` before and after change:
```
$ python test_torch.py --discover-tests|md5sum
2ca437bb5d65700763ce04cdacf6de3e  -
$ python test_cuda.py --discover-tests|md5sum
b17df916fb0eeb6f0dd7222d7dae392c  -
```

Differential Revision: D21759265

Pulled By: malfet

fbshipit-source-id: b01b06111469e551f7b78387449975e5248f6b9e
2020-05-27 22:45:06 -07:00
Cloud Han
05f097b5bb Implement logaddexp (#38384)
Summary:
Resolve https://github.com/pytorch/pytorch/issues/38377
Related https://github.com/pytorch/pytorch/issues/38349

This op should not be confused with `logsumexp`, which performs a reduction over a specific axis of a tensor.
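
A small sketch (not from the PR) of the distinction:

```python
import torch

a = torch.tensor([-100.0, -1.0, 0.0])
b = torch.tensor([-100.0, -2.0, 0.0])
print(torch.logaddexp(a, b))  # elementwise log(exp(a) + exp(b)), computed stably
# logsumexp, by contrast, reduces over an axis:
print(torch.logsumexp(torch.stack([a, b]), dim=0))  # same values here
```
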
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38384

Differential Revision: D21737336

Pulled By: mruberry

fbshipit-source-id: 7864d04ca304c0fb2937bb083583e3e3d6ef205d
2020-05-27 20:27:31 -07:00
Natalia Gimelshein
d92ef9268d Revert D21728402: Simplify precision-specification in tests.
Test Plan: revert-hammer

Differential Revision:
D21728402

Original commit changeset: 85f3daf63f1b

fbshipit-source-id: 4e2a36aca15cd8d842985173395b4e1cac7135d8
2020-05-27 17:34:28 -07:00
Ailing
20397285c6 Replace use of np.allclose in tests. (#34287)
Summary:
fixes https://github.com/pytorch/pytorch/issues/34096
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34287

Differential Revision: D21735525

Pulled By: ailzhang

fbshipit-source-id: 611da17cfc5a3fee77d482abccf8f9854f504263
2020-05-27 15:29:35 -07:00
Mike Ruberry
4239416c72 Throws runtime error on attempted addcdiv integer division (#38762)
Summary:
1.6 Deprecation Note:

In 1.6 attempting to perform integer division using addcdiv will throw a RuntimeError, and in 1.7 the behavior will change so that addcdiv always performs a true division of its tensor1 and tensor2 inputs. See the warning in torch.addcdiv's documentation for more information.

PR Summary:

This PR updates the warning that appears when addcdiv performs integer division to throw a RuntimeError instead. This is intended to prevent silent errors when torch.addcdiv's behavior is changed to always perform true division in 1.7. The documentation is updated (slightly) to reflect this, as are the addcdiv tests in test_torch and test_type_promotion.
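
A sketch of the 1.6 behavior described above (values are arbitrary):

```python
import torch

t = torch.zeros(3)
a = torch.tensor([1, 2, 3])
b = torch.tensor([2, 2, 2])
try:
    torch.addcdiv(t, a, b)  # integer division via addcdiv: RuntimeError in 1.6
except RuntimeError as e:
    print(e)

# float inputs keep working and perform true division
print(torch.addcdiv(t, a.float(), b.float()))  # tensor([0.5000, 1.0000, 1.5000])
```
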
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38762

Differential Revision: D21657585

Pulled By: mruberry

fbshipit-source-id: c514b44409706f2bcfeca4473424b30cc48aafbc
2020-05-27 14:40:07 -07:00
chengjinfang
c835dedce9 Fix the issue that PyTorch doesn't construct bool tensors from non-bool values correctly (#38392)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/37398.
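
For illustration, the expected construction semantics (a sketch, assuming the fixed behavior):

```python
import torch

# any nonzero value is truthy, matching Python/NumPy semantics
print(torch.tensor([0.0, 0.5, -2.0], dtype=torch.bool))  # tensor([False,  True,  True])
```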

Signed-off-by: chengjinfang <chengjf@cn.fujitsu.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38392

Differential Revision: D21737009

Pulled By: mruberry

fbshipit-source-id: c77d8c940af95f5011fe008b48ea0d16c3f501d1
2020-05-27 13:59:28 -07:00
Brian
df4066bbb6 Simplify precision-specification in tests. (#37181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37181

Now that assertEqual considers dtypes in determining tolerance, most
tests don't need explicitly set precision.

Those that do are a few half precision tests on cuda. In this PR, those
are broken out to be handled explicitly, though we may also want to
consider further loosening the tolerance on half-precision.

Test Plan: Imported from OSS

Differential Revision: D21728402

Pulled By: nairbv

fbshipit-source-id: 85f3daf63f1bdbb5101e8dea8c125f13448ca228
2020-05-27 12:05:33 -07:00
Mike Ruberry
13120bf677 Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
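
A sketch of the resulting call pattern (tolerances here are illustrative, not taken from the PR):

```python
import torch
from torch.testing._internal.common_utils import TestCase, run_tests

class TestTolerances(TestCase):
    def test_close(self):
        a = torch.tensor([1.0, 2.0])
        b = a + 1e-6
        # atol and rtol must now be passed together (or not at all)
        self.assertEqual(a, b, atol=1e-5, rtol=0)
        # msg is keyword-only, matching unittest's argument name
        self.assertEqual(a, a, msg="exact comparison failed")

if __name__ == "__main__":
    run_tests()
```
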
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21740237

Pulled By: mruberry

fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042
2020-05-27 06:31:07 -07:00
Rohan Varma
63e545e0fe Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol
Test Plan: revert-hammer

Differential Revision:
D21717199

Original commit changeset: 9feb856f94ee

fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259
2020-05-26 18:23:59 -07:00
ShawnZhong
12c219de54 Fix histc with empty tensor error (#38987)
Summary:
Fix https://github.com/pytorch/pytorch/issues/38979

The error mentioned in https://github.com/pytorch/pytorch/issues/38979 is a [`cudaErrorInvalidConfiguration` error](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html#group__CUDART__TYPES_1g3f51e3575c2178246db0a94a430e0038):
> This indicates that a kernel launch is requesting resources that can never be satisfied by the current device. Requesting more shared memory per block than the device supports will trigger this error, as will requesting too many threads or blocks. See cudaDeviceProp for more device limitations.

This is because we are trying to launch a kernel with block size 0.
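
A sketch of the fixed case; the CPU path below already worked, and after the fix the same call is assumed to behave identically on a CUDA tensor:

```python
import torch

# an empty input yields all-zero counts rather than launching a size-0 kernel
print(torch.histc(torch.empty(0), bins=4, min=0, max=1))  # tensor([0., 0., 0., 0.])
```
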
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38987

Differential Revision: D21722993

Pulled By: ezyang

fbshipit-source-id: 2c283e0a9f542b4acb96e895a43b991ccac808fe
2020-05-26 13:19:13 -07:00
Mike Ruberry
6ddca30b2d Updates assertEqual to require atol and rtol, removes positional atol (#38872)
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replaced with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.

In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872

Differential Revision: D21717199

Pulled By: mruberry

fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
2020-05-26 08:30:23 -07:00
Brian
389e16c33b torch.pow Add type promotion support and fix issue with __rpow__ (#37098)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37098

### **Cherry-picked from another stack:**
Some code review already occurred here: https://github.com/pytorch/pytorch/pull/32582

### Summary:

Fixes: https://github.com/pytorch/pytorch/issues/32436

The issue caused incorrect handling of dtypes for scalar ** tensor.
e.g. before this change:
```
>>> 5.5 ** torch.ones(5, dtype=torch.int32)
tensor([5, 5, 5, 5, 5], dtype=torch.int32)
```
should return a float tensor.

Also fixes a number of incorrect cases:
 * tensors to negative powers were giving incorrect results (1 instead
    of 0 or error)
 * Behavior wasn't consistent between cuda/cpu
 * large_value ** 1 in some cases gave a result not equal
    to large_value because of truncation in conversion to double and back.

BC-breaking:

Previously incorrect behavior (in 1.4):
```
>>> a
tensor([1, 1, 1, 1, 1], dtype=torch.int32)
>>> a.pow_(.5)
tensor([1, 1, 1, 1, 1], dtype=torch.int32)
```

After this change:
`RuntimeError: result type Float can't be cast to the desired output type Int`

Test Plan: Imported from OSS

Differential Revision: D21686207

Pulled By: nairbv

fbshipit-source-id: e797e7b195d224fa46404f668bb714e312ea78ac
2020-05-26 08:29:51 -07:00
Xiang Gao
7e6f6f522f [PATCH] Migrate min from THC to ATen and remove _min (#38440)
Summary:
Related issue: https://github.com/pytorch/pytorch/issues/36900

Since I feel this PR is already large enough, I didn't migrate max in this PR. Legacy code is not cleaned up either. All this remaining work will be done in later PRs after this one is merged.

Benchmark on an extreme case
```python
import torch
print(torch.__version__)

t = torch.randn(100000, 2, device='cuda')

warmup = torch.arange(100000000)
torch.cuda.synchronize()

%timeit t.min(dim=0); torch.cuda.synchronize()
```
Before: 4ms; After: 24.5us.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38440

Differential Revision: D21560691

Pulled By: ngimel
2020-05-26 08:10:38 -07:00
kshitij12345
3487744821 Add torch.logcumsumexp (#36308)
Summary:
Creating a new PR, as I am unable to push to pandeykartikey's branch because I don't have the permissions.

Closes https://github.com/pytorch/pytorch/issues/26411

Based on https://github.com/pytorch/pytorch/issues/32876. Thanks pandeykartikey for starting this out.

Have addressed the comments.
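
A small sketch (not from the PR) of what the op computes:

```python
import torch

x = torch.tensor([0.1, 0.2, 0.3])
y = torch.logcumsumexp(x, dim=0)
# numerically stable version of:
ref = torch.log(torch.cumsum(torch.exp(x), dim=0))
print(torch.allclose(y, ref))  # True
```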

anjali411 agadetsky albanD
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36308

Differential Revision: D21648573

Pulled By: albanD

fbshipit-source-id: bc1a8fc4ab474a1148298117a1549b0e46f7c3ff
2020-05-21 09:12:31 -07:00
rohithkrn
1ea80b4234 [ROCm] Set correct tolerance values for bfloat16 div tests (#38823)
Summary:
This PR fixes the tolerance values for some of the bfloat16 div tests that were enabled on ROCm with incorrect tolerances in PR https://github.com/pytorch/pytorch/pull/38621

Also disabled (to unblock CI) `test_addcdiv*`, for which the error is large when the absolute values in the tensor are higher. This will have to be investigated further.

ezyang jeffdaily sunway513
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38823

Differential Revision: D21686290

Pulled By: ezyang

fbshipit-source-id: 85472680e1886bdc7c227ed2656e0b4fd5328e46
2020-05-21 07:29:49 -07:00
Nik Ved
f80df4ca79 port scatter_add to ATen (CUDA) (#38262)
Summary:
Fixes [https://github.com/pytorch/pytorch/issues/24622 ](https://github.com/pytorch/pytorch/issues/24622).
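
For reference, a small sketch of the op's semantics (example values adapted from the public docs):

```python
import torch

x = torch.zeros(3, 5)
src = torch.ones(2, 5)
index = torch.tensor([[0, 1, 2, 0, 0],
                      [2, 0, 0, 1, 2]])
x.scatter_add_(0, index, src)  # for dim=0: x[index[i][j]][j] += src[i][j]
print(x)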
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38262

Differential Revision: D21656729

Pulled By: ngimel

fbshipit-source-id: 63dcbf8eeaf59d8295bf4e5c8bb9d28ad165d4eb
2020-05-20 19:03:41 -07:00
kshitij12345
3b254acd99 support complex types for tanh_cuda and tanh_backward_cuda (#38786)
Summary:
Builds on https://github.com/pytorch/pytorch/issues/37791
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38786

Differential Revision: D21666138

Pulled By: anjali411

fbshipit-source-id: cbd313b8fd21109aadd614c60259b9dc505771a5
2020-05-20 12:57:40 -07:00
Mingfei Ma
fe66bdb498 port masked_select from TH to ATen and optimize perf on CPU (#33269)
Summary:
This PR ports `masked_select` from TH to ATen and optimizes its performance on CPU with TensorIterator (usage sketch after the list below).

https://github.com/pytorch/pytorch/issues/33053

1. single socket run: up to **5.4x** speedup;
2. single core run: up to **1.16x** speedup.
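
A minimal usage sketch (not from the PR):

```python
import torch

x = torch.randn(3, 4)
mask = x > 0.5
print(torch.masked_select(x, mask))  # 1-D tensor of the selected elements
```
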
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33269

Differential Revision: D20922288

Pulled By: ngimel

fbshipit-source-id: 38e183a4e3599bba29bbbebe36264026abe1c50e
2020-05-20 11:36:29 -07:00
nuka137
c78691b4a6 [CPU] torch.gather for complex dtypes (#36430)
Summary:
This PR resolves https://github.com/pytorch/pytorch/issues/36340.
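
A small sketch (not from the PR) of the newly supported dtype:

```python
import torch

z = torch.tensor([[1 + 1j, 2 + 2j],
                  [3 + 3j, 4 + 4j]])
idx = torch.tensor([[0, 0], [1, 0]])
print(torch.gather(z, 1, idx))  # out[i][j] = z[i][idx[i][j]] for dim=1
```
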
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36430

Differential Revision: D21662139

Pulled By: anjali411

fbshipit-source-id: 361d064c1144b368afae3059c19f77abe26080a3
2020-05-20 09:15:14 -07:00
Mike Ruberry
7587188037 Skips test_float_to_int_conversion_finite on MacOS (#38753)
Summary:
See https://github.com/pytorch/pytorch/issues/38752.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38753

Differential Revision: D21656330

Pulled By: mruberry

fbshipit-source-id: f1f97228f31b8a0b0535b3168a7d209fefff2769
2020-05-19 21:56:48 -07:00
Mike Ruberry
64584573f9 Updates tests for integer division deprecation (#38621)
Summary:
Updates our tests in preparation of integer division using torch.div and torch.addcdiv throwing a runtime error by avoiding integer division using torch.div. This creates a brief period where integer division using torch.div is untested, but that should be OK (since it will soon throw a runtime error).

These callsites were identified using https://github.com/pytorch/pytorch/issues/36897.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38621

Differential Revision: D21612823

Pulled By: mruberry

fbshipit-source-id: 749c03a69feae02590b4395335163d9bf047e162
2020-05-19 19:28:00 -07:00
Mike Ruberry
819da00b3d Fixes floordiv dunder registrations (#38695)
Summary:
floordiv was missing a couple of dunder registrations, which caused __ifloordiv__ not to be called when it should be. This adds the appropriate registrations and adds a test verifying that the inplace dunders actually operate in place.
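
For illustration, a sketch of the in-place check the test performs (not the PR's actual test code):

```python
import torch

t = torch.arange(6)
ptr = t.data_ptr()
t //= 2                     # dispatches to __ifloordiv__
assert t.data_ptr() == ptr  # storage unchanged: the op really ran in place
print(t)                    # tensor([0, 0, 1, 1, 2, 2])
```
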
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38695

Differential Revision: D21633980

Pulled By: mruberry

fbshipit-source-id: a423f5ec327cdc062fd6d9d56abd36fe44ac8198
2020-05-19 12:11:38 -07:00
Pavel Belevich
b14734d92e Add bfloat16 to CPU cauchy_kernel, log_normal_kernel, exponential_kernel (#38427)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38427

Test Plan: Imported from OSS

Differential Revision: D21640640

Pulled By: pbelevich

fbshipit-source-id: 9cff8f6b5c33b3b31753c76fc8033d329b218019
2020-05-19 10:21:36 -07:00
Pavel Belevich
35beff0b9f RNG infrastructure improvements (#37984)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37984

- `NumericUtils.h`
CUDA distribution kernels had two variants of transformation lambdas (`uniform`/`normal` -> `lognormal`/`exponential`/`cauchy`/`geometric`...): one for double precision and one optimized for CUDA single precision. This was done by using `::log`/`__logf`, `::exp`/`__expf` and `::tan`/`__tanf`. I moved them to `NumericUtils.h` and called them `at::exp`, `at::log` and `at::tan`, which allowed unifying the CPU/CUDA transformation templates in `TransformationHelper.h`.

- `DistributionsHelper.h`
Made `normal_distribution`, `geometric_distribution`, `exponential_distribution`, `cauchy_distribution`, `lognormal_distribution` C10_HOST_DEVICE compatible to reuse them in CPU/CUDA distribution kernels.
Replaced explicit math with transformations from `TransformationHelper.h`

- `TransformationHelper.h`
Renamed `*_transformation` to `transformation::*`
Added clear, unified host/device transformation templates `normal`, `cauchy`, `exponential`, `geometric`, `log_normal`, which are used by both CPU and CUDA distribution kernels and custom PRNG distribution kernels.

- `cpu/DistributionTemplates.h`
Unified `normal_kernel`, `cauchy_kernel`, `log_normal_kernel`, `geometric_kernel`, `exponential_kernel`.

- `cuda/DistributionTemplates.h`
Extracted `UNIFORM_AND_TRANSFORM` and `NORMAL_AND_TRANSFORM` macros to reuse code between distribution kernel templates.
Unified the transformation lambdas (`uniform`/`normal` -> `lognormal`/`exponential`/`cauchy`/`geometric`...)

- `test_torch.py`
Added `scipy.stats.kstest` [Kolmogorov–Smirnov](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test) tests for the `uniform`/`normal`/`lognormal`/`exponential`/`cauchy` distributions and a [Chi-squared](https://en.wikipedia.org/wiki/Chi-squared_test) test for the `geometric` one, to make sure that our distributions are correct (see the sketch after this list).

- `cpu_rng_test.cpp`, `rng_test.h`
Fixed `random_()`'s `from` and `to` bounds issue for floating-point types; fixed cast/overflow warnings

- `THTensorRandom.h`, `THVector.h`
Moved unnecessary includes to `THTensorRandom.cpp`
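
For illustration, a sketch of the kind of statistical check described for `test_torch.py` above (the threshold is illustrative, not the one the tests use):

```python
import torch
from scipy import stats

samples = torch.empty(10000).normal_(mean=0.0, std=1.0).numpy()
statistic, pvalue = stats.kstest(samples, "norm")  # KS test vs. standard normal
assert pvalue > 1e-4  # illustrative threshold; the real tests choose their own
```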

Test Plan: Imported from OSS

Differential Revision: D21477955

Pulled By: pbelevich

fbshipit-source-id: 7b793d1761a7a921c4b4a4a7d21d5d6c48f03e72
2020-05-19 10:20:39 -07:00
kshitij12345
fc19747d64 handle grad with stride=0 on GPU MvBackward (#38321)
Summary:
References: https://github.com/pytorch/pytorch/issues/38315, https://github.com/pytorch/pytorch/issues/29984

cuBLAS expects strides to be greater than 0. Cloning `grad` allocates a new vector with non-zero strides.

For CPU, we don't clone and allocate a new vector, as the CPU implementation works with stride=0.
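
For illustration, a minimal repro sketch of the failure mode described above (assumes a CUDA device; shapes are arbitrary):

```python
import torch

m = torch.randn(3, 4, device="cuda", requires_grad=True)
v = torch.randn(4, device="cuda")
out = torch.mv(m, v)
grad = torch.tensor(1.0, device="cuda").expand(3)  # stride-0 gradient vector
out.backward(grad)  # previously tripped cuBLAS's stride > 0 requirement
```
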
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38321

Differential Revision: D21628966

Pulled By: ngimel

fbshipit-source-id: 390caf835af6d1d77ed537b7fcc113a22c3ec301
2020-05-18 20:53:36 -07:00
anjali411
f3048609d3 [CUDA] torch.roll for complex dtypes (#38664)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38664

Test Plan: Imported from OSS

Differential Revision: D21630498

Pulled By: anjali411

fbshipit-source-id: bf43a812f3d8dd984785256bad41131410435965
2020-05-18 18:19:22 -07:00
Xiang Gao
83df3beaca Add complex support for torch.sum (#38382)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38382

Test Plan: Imported from OSS

Differential Revision: D21600127

Pulled By: anjali411

fbshipit-source-id: c5338ab10bdcebe4a281b03f78e6f2063186bc32
2020-05-15 19:49:38 -07:00
Mike Ruberry
9cfc10d52e Updates assertEqual to use torch.isclose-like logic (#37294)
Summary:
Edit: this has been updated to reflect the PR's current status, which has changed after review.

This PR updates the behavior of assertEqual, assertNotEqual, and assert_allclose to be consistent with each other and with torch.isclose. It corrects several additional bugs in the current implementations and adds extensive testing and comments, too.

These updates follow from changes to assertEqual like https://github.com/pytorch/pytorch/pull/34258 and https://github.com/pytorch/pytorch/pull/37069, and from our discussion of torch.isclose for complex tensors (see https://github.com/pytorch/pytorch/issues/36462), where we decided to implement a NumPy-compatible mathematical notion of "closeness" for complex tensors that is not a great fit for our testing framework.

The detailed changelist is:

- New test framework functions for comparing tensors and scalars
  - Tensors are compared using isclose; the real and imaginary parts of complex tensors are compared independently
  - Scalars are compared using the same algorithm
  - assertEqual and assert_allclose now use this common comparison function, instead of each implementing their own with divergent behavior
  - assertEqual-like debug messages are now available for all tensor and scalar comparisons, with additional context when comparing the components of sparse, quantized, and complex tensors
- Extensive testing of the comparison behavior and debug messages
- Small Updates
  - assertEqual now takes an "exact_device" argument, analogous to "exact_dtype", which should be useful in multidevice tests
  - assertEqual now takes an "equal_nan" argument for argument consistency with torch.isclose
  - assertEqual no longer takes the "allow_inf" keyword, which misleadingly only applied to scalar comparisons, was only ever set (rarely) to true, and is not supported by torch.isclose
- Bug fixes:
  - the exact_dtype attribute has been removed (no longer needed after https://github.com/pytorch/pytorch/pull/38103)
  - message arguments passed to assertEqual are now handled correctly
  - bool x other dtype comparisons are now supported
  - uint8 and int8 tensor comparisons now function properly
  - rtol for integer comparisons is now supported (default is zero)
  - rtol and atol for scalar comparisons are now supported
  - complex scalar comparisons are now supported, analogous to complex tensor comparisons
  - assertNotEqual is now equivalent to the logical negation of assertEqual
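
For reference, a sketch of the `torch.isclose` predicate these comparisons build on (tolerances here are illustrative):

```python
import torch

a = torch.tensor([1.0, 2.0])
b = torch.tensor([1.0 + 1e-6, 2.1])
# elementwise closeness: |a - b| <= atol + rtol * |b|
print(torch.isclose(a, b, rtol=1e-5, atol=1e-8))  # tensor([ True, False])
```
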
Pull Request resolved: https://github.com/pytorch/pytorch/pull/37294

Differential Revision: D21596830

Pulled By: mruberry

fbshipit-source-id: f2576669f7113a06f82581fc71883e6b772de19b
2020-05-15 16:24:03 -07:00
Gregory Chanan
70ef9f5124 Improve testing of logical_not. (#38505)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38505

This takes the testing of https://github.com/pytorch/pytorch/pull/38275, but doesn't include the kernel changes which are still being worked out.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D21580574

Pulled By: gchanan

fbshipit-source-id: f12317259cb7373989f6c9ad345b19aaac524851
2020-05-15 10:51:35 -07:00