462 Commits

Author SHA1 Message Date
Natalia Gimelshein
ecc6358dbe Port nonzero cuda from THC to ATen (#44259)
Summary:
1) Ports nonzero from THC to ATen
2) Replaces most thrust uses with cub, to avoid synchronization and to improve performance. One synchronization point is still necessary: communicating the number of nonzero elements from GPU to CPU
3) Slightly changes the algorithm: we now first compute the number of nonzeros and then allocate a correctly sized output, instead of allocating a full-sized output as was done before to account for possibly all elements being nonzero
4) Unfortunately, since the last transforms are still done with thrust, 2) is slightly beside the point; however, it is a step towards a future without thrust
5) Hard-limits the number of elements in the input tensor to MAX_INT. The previous implementation allocated a Long tensor of size ndim*nelements, which would be at least 16 GB for a tensor with MAX_INT elements. It is reasonable to say that larger tensors could not be used anyway.
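
The count-then-allocate idea from the list above can be sketched in plain Python. This is only an illustration of the two-pass structure, not the CUDA/cub implementation; the function name `nonzero_two_pass` and the list-based "tensor" are assumptions made for the example.

```python
from itertools import product


def nonzero_two_pass(flat_values, shape):
    """Return the multi-dimensional indices of nonzero entries, row-major."""
    # Pass 1: count the nonzeros so the output can be sized exactly.
    count = sum(1 for v in flat_values if v != 0)

    # Allocate count entries instead of one entry per input element.
    out = [None] * count

    # Pass 2: write the index tuple of every nonzero element.
    write_pos = 0
    all_indices = product(*(range(s) for s in shape))
    for index, value in zip(all_indices, flat_values):
        if value != 0:
            out[write_pos] = index
            write_pos += 1
    return out


print(nonzero_two_pass([0, 3, 0, 0, 5, 7], (2, 3)))
# [(0, 1), (1, 1), (1, 2)]
```

The extra counting pass is the added work that the results section below attributes the small-tensor regression to; the payoff is that the output never has to be allocated for the worst case.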

Benchmarking is done for tensors with approximately half non-zeros
<details><summary>Benchmarking script</summary>
<p>

```
import torch
from torch.utils._benchmark import Timer
from torch.utils._benchmark import Compare
import sys

device = "cuda"
results = []
for numel in (1024 * 128,):#, 1024 * 1024, 1024 * 1024 * 128):
    inp = torch.randint(2, (numel,), device="cuda", dtype=torch.float)
    for ndim in range(2,3):#(1,4):
        if ndim == 1:
            shape = (numel,)
        elif ndim == 2:
            shape = (1024, numel // 1024)
        else:
            shape = (1024, 128, numel // 1024 // 128)
        inp = inp.reshape(shape)
        repeats = 3
        timer = Timer(stmt="torch.nonzero(inp, as_tuple=False)", label="Nonzero", sub_label=f"number of elts {numel}",
        description = f"ndim {ndim}", globals=globals())
        for i in range(repeats):
            results.append(timer.blocked_autorange())
        print(f"\rnumel {numel} ndim {ndim}", end="")
        sys.stdout.flush()

comparison = Compare(results)
comparison.print()
```
</p>
</details>

### Results
Before:
```
[--------------------------- Nonzero ---------------------------]
                                 |  ndim 1  |   ndim 2  |   ndim 3
 1 threads: ------------------------------------------------------
       number of elts 131072     |    55.2  |     71.7  |     90.5
       number of elts 1048576    |   113.2  |    250.7  |    497.0
       number of elts 134217728  |  8353.7  |  23809.2  |  54602.3

 Times are in microseconds (us).
```
After:
```
[-------------------------- Nonzero --------------------------]
                                |  ndim 1  |  ndim 2  |  ndim 3
1 threads: ----------------------------------------------------
      number of elts 131072     |    48.6  |    79.1  |    90.2
      number of elts 1048576    |    64.7  |   134.2  |   161.1
      number of elts 134217728  |  3748.8  |  7881.3  |  9953.7

Times are in microseconds (us).

```
There's a real regression for smallish 2D tensors due to the added work of computing the number of nonzero elements; however, for other sizes there are significant gains, and memory requirements are drastically lower. Perf gains would be even larger for tensors with fewer nonzeros.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/44259

Reviewed By: izdeby

Differential Revision: D23581955

Pulled By: ngimel

fbshipit-source-id: 0b99a767fd60d674003d83f0848dc550d7a363dc
2020-09-08 20:52:51 -07:00
Mike Ruberry
bb861e1d69 Ports CUDA var and std reduce all (with no out argument) to ATen, fixes var docs (#43858)
Summary:
When var and std are called without args (other than unbiased) they currently call into TH or THC. This PR:

- Removes the THC var_all and std_all functions and updates CUDA var and std to use the ATen reduction
- Fixes var's docs, which listed its arguments in the incorrect order
- Adds new tests comparing var and std with their NumPy counterparts

Performance appears to have improved as a result of this change. I ran experiments on 1D tensors, 1D tensors with every other element viewed ([::2]), 2D tensors and 2D transposed tensors. Some notable datapoints:

- torch.randn((8000, 8000))
  - var measured 0.0022215843200683594s on CUDA before the change
  - var measured 0.0020322799682617188s on CUDA after the change
- torch.randn((8000, 8000)).T
  - var measured 0.015128850936889648s on CUDA before the change
  - var measured 0.001912832260131836s on CUDA after the change
- torch.randn(8000 ** 2)
  - std measured 0.11031460762023926s on CUDA before the change
  - std measured 0.0017833709716796875s on CUDA after the change

Timings for var and std are, as expected, similar.

On the CPU, however, the performance change from making the analogous update was more complicated, and ngimel and I decided not to remove CPU var_all and std_all. ngimel wrote the following script that showcases how single-threaded CPU inference would suffer from this change:

```
import torch
import numpy as np
from torch.utils._benchmark import Timer
from torch.utils._benchmark import Compare
import sys
base = 8
multiplier = 1

def stdfn(a):
    meanv = a.mean()
    ac = a-meanv
    return torch.sqrt(((ac*ac).sum())/a.numel())

results = []
num_threads=1
for _ in range(7):
    size = base*multiplier
    input = torch.randn(size)

    tasks = [("torch.var(input)", "torch_var"),
             ("torch.var(input, dim=0)", "torch_var0"),
             ("stdfn(input)", "stdfn"),
             ("torch.sum(input, dim=0)", "torch_sum0")
            ]
    timers = [Timer(stmt=stmt, num_threads=num_threads, label="Index", sub_label=f"{size}",
    description=label, globals=globals()) for stmt, label in tasks]
    repeats = 3

    for i, timer in enumerate(timers * repeats):
        results.append(
            timer.blocked_autorange()
        )
        print(f"\r{i + 1} / {len(timers) * repeats}", end="")
        sys.stdout.flush()
    multiplier *=10
print()

comparison = Compare(results)

comparison.print()
```

The TH timings using this script on my devfair are:

```
[------------------------------ Index ------------------------------]
               |  torch_var  |  torch_var0  |   stdfn   |  torch_sum0
1 threads: ----------------------------------------------------------
      8        |      16.0   |       5.6    |     40.9  |       5.0
      80       |      15.9   |       6.1    |     41.6  |       4.9
      800      |      16.7   |      12.0    |     42.3  |       5.0
      8000     |      27.2   |      72.7    |     51.5  |       6.2
      80000    |     129.0   |     715.0    |    133.0  |      18.0
      800000   |    1099.8   |    6961.2    |    842.0  |     112.6
      8000000  |   11879.8   |   68948.5    |  20138.4  |    1750.3
```

and the ATen timings are:

```
[------------------------------ Index ------------------------------]
               |  torch_var  |  torch_var0  |   stdfn   |  torch_sum0
1 threads: ----------------------------------------------------------
      8        |       4.3   |       5.4    |     41.4  |       5.4
      80       |       4.9   |       5.7    |     42.6  |       5.4
      800      |      10.7   |      11.7    |     43.3  |       5.5
      8000     |      69.3   |      72.2    |     52.8  |       6.6
      80000    |     679.1   |     676.3    |    129.5  |      18.1
      800000   |    6770.8   |    6728.8    |    819.8  |     109.7
      8000000  |   65928.2   |   65538.7    |  19408.7  |    1699.4
```

which demonstrates that performance is analogous to calling the existing var and std with `dim=0` on a 1D tensor. This would be a significant performance hit. Another simple script shows the performance is mixed when using multiple threads, too:

```
import torch
import time

# Benchmarking var and std, 1D with varying sizes
base = 8
multiplier = 1

op = torch.var
reps = 1000

for _ in range(7):
    size = base * multiplier
    t = torch.randn(size)
    elapsed = 0
    for _ in range(reps):
        start = time.time()
        op(t)
        end = time.time()
        elapsed += end - start
    multiplier *= 10

    print("Size: ", size)
    print("Avg. elapsed time: ", elapsed / reps)
```

```
var cpu TH vs ATen timings

Size:  8
Avg. elapsed time:  1.7853736877441406e-05 vs 4.9788951873779295e-06 (ATen wins)
Size:  80
Avg. elapsed time:  1.7803430557250977e-05 vs 6.156444549560547e-06 (ATen wins)
Size:  800
Avg. elapsed time:  1.8569469451904296e-05 vs 1.2302875518798827e-05 (ATen wins)
Size:  8000
Avg. elapsed time:  2.8756141662597655e-05 vs 6.97789192199707e-05 (TH wins)
Size:  80000
Avg. elapsed time:  0.00026622867584228516 vs 0.0002447957992553711 (ATen wins)
Size:  800000
Avg. elapsed time:  0.0010556647777557374 vs 0.00030616092681884767 (ATen wins)
Size:  8000000
Avg. elapsed time:  0.009990205764770508 vs 0.002938544034957886 (ATen wins)

std cpu TH vs ATen timings

Size:  8
Avg. elapsed time:  1.6681909561157225e-05 vs 4.659652709960938e-06 (ATen wins)
Size:  80
Avg. elapsed time:  1.699185371398926e-05 vs 5.431413650512695e-06 (ATen wins)
Size:  800
Avg. elapsed time:  1.768803596496582e-05 vs 1.1279821395874023e-05 (ATen wins)
Size:  8000
Avg. elapsed time:  2.7791500091552735e-05 vs 7.031106948852539e-05 (TH wins)
Size:  80000
Avg. elapsed time:  0.00018650460243225096 vs 0.00024368906021118164 (TH wins)
Size:  800000
Avg. elapsed time:  0.0010522041320800782 vs 0.0003039860725402832 (ATen wins)
Size:  8000000
Avg. elapsed time:  0.009976618766784668 vs 0.0029211788177490234 (ATen wins)
```

These results show the TH solution still performs better than the ATen solution with default threading for some sizes.

It seems like removing CPU var_all and std_all will require an improvement in ATen reductions. https://github.com/pytorch/pytorch/issues/40570 has been updated with this information.
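
To put a number on the single-threaded hit, the `torch_var` columns of the two single-threaded tables above can be turned into ratios. The snippet below only does that arithmetic on figures copied from those tables; the variable names are made up for the example.

```python
# torch_var column, in microseconds, copied from the TH and ATen tables.
sizes = [8, 80, 800, 8000, 80000, 800000, 8000000]
th_us = [16.0, 15.9, 16.7, 27.2, 129.0, 1099.8, 11879.8]
aten_us = [4.3, 4.9, 10.7, 69.3, 679.1, 6770.8, 65928.2]

# A ratio above 1 means the ATen reduction is slower than TH for that size.
slowdown = {n: a / t for n, t, a in zip(sizes, th_us, aten_us)}

for n, ratio in slowdown.items():
    print(f"{n:>8}: ATen/TH = {ratio:.2f}x")
```

By these figures ATen is faster up to 800 elements and roughly 2.5x to 6.2x slower from 8000 elements upward, which is the regime the single-threaded inference concern is about.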

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43858

Reviewed By: zou3519

Differential Revision: D23498981

Pulled By: mruberry

fbshipit-source-id: 34bee046c4872d11c3f2ffa1b5beee8968b22050
2020-09-06 09:40:54 -07:00
Mike Ruberry
83a6e7d342 Adds inequality testing aliases for better NumPy compatibility (#43870)
Summary:
This PR adds the following aliases:

- not_equal for torch.ne
- greater for torch.gt
- greater_equal for torch.ge
- less for torch.lt
- less_equal for torch.le

These aliases are consistent with NumPy's naming for these functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43870

Reviewed By: zou3519

Differential Revision: D23498975

Pulled By: mruberry

fbshipit-source-id: 78560df98c9f7747e804a420c1e53fd1dd225002
2020-09-06 09:36:23 -07:00
Muthu Arivoli
719d29dab5 Implement torch.i0 and torch.kaiser_window (#43132)
Summary:
Related to https://github.com/pytorch/pytorch/issues/38349

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43132

Reviewed By: smessmer

Differential Revision: D23479072

Pulled By: mruberry

fbshipit-source-id: 4fb1de44830771c6a7222cf19f7728d9ac7c043b
2020-09-05 23:11:47 -07:00
Milind Yishu Ujjawal
ab7606702c Rectified a few grammatical errors in documentation (#43695)
Summary:
Rectified a few grammatical errors in documentation of pytorch.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43695

Reviewed By: anjali411

Differential Revision: D23451600

Pulled By: ezyang

fbshipit-source-id: bc7b34c240fde1b31cac811080befa2ff2989395
2020-09-02 23:59:45 -07:00
anjali411
129f406062 Make torch.conj() a composite function and return self for real tensors (#43270)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43270

`torch.conj` is a very commonly used operator for complex tensors, but it is mathematically a no-op for real tensors. Switching to TensorFlow-style gradients for complex tensors (as discussed in #41857) would involve adding `torch.conj()` to the backward definitions of many operators. To preserve autograd performance for real tensors and maintain NumPy compatibility for `torch.conj`, this PR updates `torch.conj()` so that it behaves the same for complex tensors but performs a view/returns the `self` tensor for tensors of non-complex dtypes. The documentation states that the returned tensor for a real input shouldn't be mutated. We could perhaps return an immutable tensor for this case in the future, when that functionality is available (zdevito ezyang).
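
A plain-Python sketch of the "return self for real inputs" behavior described above may help show why the result must not be mutated. Lists stand in for tensors here, and the helper name `conj` is purely illustrative; this is not PyTorch's implementation.

```python
def conj(values):
    """Conjugate a list of numbers, returning the input itself if all are real."""
    if not any(isinstance(v, complex) for v in values):
        # Real input: conjugation is a no-op, so skip the copy entirely.
        return values
    # int, float and complex all provide .conjugate() in Python.
    return [v.conjugate() for v in values]


real = [1.0, -2.0]
cplx = [1 + 2j, 3 - 4j]

print(conj(real) is real)   # True
print(conj(cplx))           # [(1-2j), (3+4j)]
```

Because the real case hands back the very same object, writing into the result would silently modify the input, which is the hazard the documentation note guards against.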

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D23460493

Pulled By: anjali411

fbshipit-source-id: 3b3bf0af55423b77ff2d0e29f5d2c160291ae3d9
2020-09-02 17:06:04 -07:00
kshitij12345
b6b5ebc345 Add torch.vdot (#43004)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/42747

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43004

Reviewed By: mruberry

Differential Revision: D23318935

Pulled By: anjali411

fbshipit-source-id: 12d4824b7cb42bb9ca703172c54ec5c663d9e325
2020-09-02 09:00:30 -07:00
Hong Xu
6da26cf0d9 Update torch.range warning message regarding the removal version number (#43569)
Summary:
`torch.range` still hasn't been removed way after version 0.5. This PR fixes the warning message. Alternatively, we can remove `torch.range`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43569

Reviewed By: ngimel

Differential Revision: D23408233

Pulled By: mruberry

fbshipit-source-id: 86c4f9f018ea5eddaf80b78a3c54dfa41cfc6fa6
2020-08-31 22:23:32 -07:00
kiyosora
3682df77db Implementing NumPy-like function torch.heaviside() (#42523)
Summary:
- Related to https://github.com/pytorch/pytorch/issues/38349
- Implements the NumPy-like function `torch.heaviside()`.
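
For readers unfamiliar with the function, here is a scalar sketch of the semantics NumPy documents for `heaviside`: 0 for negative inputs, 1 for positive inputs, and a caller-supplied value at exactly zero. The helper below is illustrative only and leaves NaN handling out.

```python
def heaviside(x, value_at_zero):
    """Heaviside step function for a single real number."""
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    # x == 0: the second argument decides the result.
    return value_at_zero


print([heaviside(v, 0.5) for v in (-1.5, 0.0, 2.0)])
# [0.0, 0.5, 1.0]
```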

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42523

Reviewed By: ngimel

Differential Revision: D23416743

Pulled By: mruberry

fbshipit-source-id: 9975bd9c9fa73bd0958fe9879f79a692aeb722d5
2020-08-31 15:54:56 -07:00
Xiang Gao
a860be898e [resubmit] Add amax/amin (#43819)
Summary:
Resubmit for landing next week.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43819

Reviewed By: ngimel

Differential Revision: D23421906

Pulled By: mruberry

fbshipit-source-id: 23dd60d1e365bb1197d660c3bfad7ee07ba3e97f
2020-08-31 04:54:48 -07:00
Mike Ruberry
3aeb70db0b Documents sub properly, adds subtract alias (#43850)
Summary:
`torch.sub` was undocumented, so this PR adds its documentation, analogous to `torch.add`'s documentation, and adds the alias `torch.subtract` for `torch.sub`, too. This alias comes from NumPy (see https://numpy.org/doc/stable/reference/generated/numpy.subtract.html?highlight=subtract#numpy.subtract)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43850

Reviewed By: ngimel

Differential Revision: D23416908

Pulled By: mruberry

fbshipit-source-id: 6c4d2ebaf6ecae91f3a6efe484ce6c4dad96f016
2020-08-30 15:44:56 -07:00
Xiang Gao
5021ec826b Fix docs for kwargs, f-p (#43586)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43586

Reviewed By: glaringlee

Differential Revision: D23390667

Pulled By: mruberry

fbshipit-source-id: dd51a4a48ff4e2fc10675ec817a206041957982f
2020-08-30 10:13:36 -07:00
Gao, Xiang
7f967c08b8 Document the beta=0 behavior of BLAS functions (#43823)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43823

Reviewed By: mruberry

Differential Revision: D23413899

Pulled By: ngimel

fbshipit-source-id: d3c4e5631db729a3f3d5eb9290c76cb1aa529f74
2020-08-29 13:03:16 -07:00
Nikita Shulga
64906497cd Revert D23391941: [pytorch][PR] Implementing NumPy-like function torch.heaviside()
Test Plan: revert-hammer

Differential Revision:
D23391941 (a1eae6d158)

Original commit changeset: 7b942321a625

fbshipit-source-id: c2a7418a1fedaa9493300945c30e2392fc0d08ee
2020-08-28 19:16:58 -07:00
kiyosora
a1eae6d158 Implementing NumPy-like function torch.heaviside() (#42523)
Summary:
- Related to https://github.com/pytorch/pytorch/issues/38349
- Implements the NumPy-like function `torch.heaviside()`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42523

Reviewed By: glaringlee

Differential Revision: D23391941

Pulled By: mruberry

fbshipit-source-id: 7b942321a62567a5fc0a3679a289f4c4c19e6134
2020-08-28 18:11:20 -07:00
Nikita Shulga
3f0120edb4 Revert D23360705: [pytorch][PR] Add amax/amin
Test Plan: revert-hammer

Differential Revision:
D23360705 (bcec8cc3f9)

Original commit changeset: 5bdeb08a2465

fbshipit-source-id: 76a9e199823c7585e55328bad0778bcd8cd49381
2020-08-28 18:01:25 -07:00
Mike Ruberry
20abfc21e4 Adds arctanh, arcsinh aliases, simplifies arc* alias dispatch (#43762)
Summary:
Adds two more "missing" NumPy aliases: arctanh and arcsinh, and simplifies the dispatch of other arc* aliases.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43762

Reviewed By: ngimel

Differential Revision: D23396370

Pulled By: mruberry

fbshipit-source-id: 43eb0c62536615fed221d460c1dec289526fb23c
2020-08-28 13:59:19 -07:00
Gao, Xiang
bcec8cc3f9 Add amax/amin (#43092)
Summary:
Add max/min operators that return only values.

## Some important decisions to discuss
| **Question**                          | **Current State** |
|---------------------------------------|-------------------|
| Expose torch.max_values to python?    | No                |
| Remove max_values and only keep amax? | Yes               |
| Should amax support named tensors?    | Not in this PR    |

## Numpy compatibility

Reference: https://numpy.org/doc/stable/reference/generated/numpy.amax.html

| Parameter                                                                                                                                                                                                                                              | PyTorch Behavior                                                                  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------|
| `axis`:  None or int or tuple of ints, optional. Axis or axes along which to operate. By default, flattened input is used. If this is a tuple of ints, the maximum is selected over multiple axes, instead of a single axis or all the axes as before. | Named `dim`, behavior same as `torch.sum` (https://github.com/pytorch/pytorch/issues/29137)                                |
| `out`: ndarray, optional. Alternative output array in which to place the result. Must be of the same shape and buffer length as the expected output.                                                                                                   | Same                                                                              |
| `keepdims`: bool, optional. If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.                                      | implemented as `keepdim`                                                          |
| `initial`: scalar, optional. The minimum value of an output element. Must be present to allow computation on empty slice.                                                                                                                              | Not implemented in this PR. Better to implement for all reductions in the future. |
| `where`: array_like of bool, optional. Elements to compare for the maximum.                                                                                                                                                                            | Not implemented in this PR. Better to implement for all reductions in the future. |

**Note from numpy:**
> NaN values are propagated, that is if at least one item is NaN, the corresponding max value will be NaN as well. To ignore NaN values (MATLAB behavior), please use nanmax.

PyTorch has the same behavior.
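
The `dim`/`keepdim` behavior and the NaN propagation described in the table and note above can be sketched for the 2-D case in plain Python. The helper `amax_2d` and its nested-list input are assumptions for illustration, not the operator's implementation.

```python
import math


def amax_2d(rows, dim, keepdim=False):
    """Maximum of a 2-D list of floats along `dim` (0 or 1), values only."""
    # Reducing dim=0 walks the columns, reducing dim=1 walks the rows.
    lines = list(zip(*rows)) if dim == 0 else rows

    def reduce(line):
        # If any element is NaN, the reduced value is NaN as well.
        if any(math.isnan(v) for v in line):
            return math.nan
        return max(line)

    result = [reduce(line) for line in lines]
    if not keepdim:
        return result
    # keepdim leaves a size-one axis where the reduced dimension was.
    return [result] if dim == 0 else [[v] for v in result]


data = [[1.0, 5.0, 2.0],
        [4.0, 3.0, 6.0]]
print(amax_2d(data, dim=0))                 # [4.0, 5.0, 6.0]
print(amax_2d(data, dim=1, keepdim=True))   # [[5.0], [6.0]]
```

Unlike `torch.max(input, dim)`, nothing here tracks where the maximum was found, which is the point of a values-only reduction.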

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43092

Reviewed By: ngimel

Differential Revision: D23360705

Pulled By: mruberry

fbshipit-source-id: 5bdeb08a2465836764a5a6fc1a6cc370ae1ec09d
2020-08-28 12:51:03 -07:00
Aadesh
a76184fe1e grammatical error fix (#43697)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/43697

Reviewed By: malfet

Differential Revision: D23397655

Pulled By: mrshenli

fbshipit-source-id: fb447dcde4f83bc6650f0faa0728a1867cfa5213
2020-08-28 10:38:46 -07:00
kshitij12345
c7787f7fbf [numpy compatibility]Fix argmin/argmax when multiple max/min values (#42004)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/41998
Fixes https://github.com/pytorch/pytorch/issues/22853

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42004

Reviewed By: ngimel

Differential Revision: D23049003

Pulled By: mruberry

fbshipit-source-id: a6fddbadfec4b8696730550859395ce4f0cf50d6
2020-08-28 06:42:42 -07:00
Xiang Gao
f63d06a57b Fix docs for kwargs, a-e (#43583)
Summary:
To reduce the chance of conflicts, not all ops are fixed. Ops starting with letter `f` will be fixed in separate PR.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43583

Reviewed By: ZolotukhinM

Differential Revision: D23330347

Pulled By: mruberry

fbshipit-source-id: 3387cb1e495faebd16fb183039197c6d90972ad4
2020-08-27 00:14:05 -07:00
Xiong Wei
033b7ae3ef implement NumPy-like functionality maximum, minimum (#42579)
Summary:
Related to https://github.com/pytorch/pytorch/issues/38349

Implement NumPy-like functions `maximum` and `minimum`.
The `maximum` and `minimum` functions compare two input tensors element-wise and return a new tensor containing the element-wise maxima/minima.

If one of the elements being compared is a NaN, then that element is returned. Neither `maximum` nor `minimum` supports complex inputs.

This PR also promotes the overloaded versions of torch.max and torch.min by re-dispatching binary `torch.max` and `torch.min` to `torch.maximum` and `torch.minimum`.
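
As a rough illustration of the binary, NaN-returning semantics described above, here is a list-based sketch. The function name mirrors the operator only for readability; it is not the ATen kernel.

```python
import math


def maximum(a, b):
    """Element-wise maximum of two equal-length lists of floats."""
    out = []
    for x, y in zip(a, b):
        if math.isnan(x) or math.isnan(y):
            # A NaN on either side wins, regardless of argument order.
            out.append(math.nan)
        else:
            out.append(x if x >= y else y)
    return out


result = maximum([1.0, math.nan, 3.0], [2.0, 0.0, -1.0])
print(result)
```

The explicit check matters because Python's built-in `max` gives an answer that depends on which side the NaN is on, so it cannot be used directly to model this behavior.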

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42579

Reviewed By: mrshenli

Differential Revision: D23153081

Pulled By: mruberry

fbshipit-source-id: 803506c912440326d06faa1b71964ec06775eac1
2020-08-26 16:56:12 -07:00
Nikita Vedeneev
3df398a3a8 Update the QR documentation to include a warning about when the QR.backward is well-defined. (#43547)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43547

Reviewed By: mruberry

Differential Revision: D23318829

Pulled By: albanD

fbshipit-source-id: 4764ebe1ad440e881b1c4c88b16fb569ef8eb0fa
2020-08-25 13:19:25 -07:00
Hameer Abbasi
c4e841654d Add alias torch.negative to torch.neg. (#43400)
Summary:
xref https://github.com/pytorch/pytorch/issues/42515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43400

Reviewed By: albanD

Differential Revision: D23266011

Pulled By: mruberry

fbshipit-source-id: ca20b30d99206a255cf26438b09c3ca1f99445c6
2020-08-24 01:15:04 -07:00
Mike Ruberry
e57b89c8dc Adds arccos, arcsin, arctan aliases (#43319)
Summary:
These aliases are consistent with NumPy (see, for example, https://numpy.org/doc/stable/reference/generated/numpy.arccos.html?highlight=acos).

Note that PyTorch's existing names are consistent with Python (see https://docs.python.org/3.10/library/math.html?highlight=acos#math.acos) and C++ (see, for example, https://en.cppreference.com/w/cpp/numeric/math/acos).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43319

Reviewed By: pbelevich

Differential Revision: D23260426

Pulled By: mruberry

fbshipit-source-id: 98a6c97f69d1f718a396c2182e938a7a260c0889
2020-08-21 10:53:17 -07:00
Hameer Abbasi
e31cd46278 Add alias torch.fix for torch.trunc to be compatible with NumPy. (#43326)
Summary:
xref https://github.com/pytorch/pytorch/issues/42515

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43326

Reviewed By: pbelevich

Differential Revision: D23249089

Pulled By: mruberry

fbshipit-source-id: 6afa9eb20493983d084e0676022c6245e7463e05
2020-08-20 21:47:39 -07:00
Nikita Vedeneev
888ae1b3d8 Introducing Matrix exponential (#40161)
Summary:
Implements (batched) matrix exponential. Fixes [https://github.com/pytorch/pytorch/issues/9983](https://github.com/pytorch/pytorch/issues/9983).

The algorithm follows:
```
 Bader, P.; Blanes, S.; Casas, F.
 Computing the Matrix Exponential with an Optimized Taylor Polynomial Approximation.
 Mathematics 2019, 7, 1174.
```
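
For orientation, the sketch below computes a matrix exponential with a plain truncated Taylor series plus scaling and squaring. It is deliberately the naive textbook variant, not the optimized polynomial evaluation of the cited paper and not the code in this PR; `matmul`, `expm` and the `terms` parameter are names chosen for the example.

```python
import math


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(a, terms=18):
    """Approximate exp(a) for a square matrix given as a list of lists."""
    n = len(a)
    # Scale a by 2**-s so that its infinity norm is at most 1.
    norm = max(sum(abs(v) for v in row) for row in a)
    s = max(0, math.ceil(math.log2(norm))) if norm > 0 else 0
    scaled = [[v / 2 ** s for v in row] for row in a]

    # Truncated Taylor series: I + A + A^2/2! + ... + A^terms/terms!
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    result = [row[:] for row in identity]
    term = [row[:] for row in identity]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]

    # Undo the scaling: exp(a) = exp(a / 2**s) ** (2**s).
    for _ in range(s):
        result = matmul(result, result)
    return result


print(expm([[0.0, 1.0], [0.0, 0.0]]))
```

The paper's contribution is evaluating polynomials of this kind with far fewer matrix products; the scaling-and-squaring frame around it is the same idea.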

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40161

Reviewed By: zhangguanheng66

Differential Revision: D22951372

Pulled By: ezyang

fbshipit-source-id: aa068cb76d5cf71696b333d3e72cee287b3089e3
2020-08-18 14:15:10 -07:00
Mike Ruberry
6db0b8785d Adds movedim method, fixes movedim docs, fixes view doc links (#43122)
Summary:
This PR:

- Adds a method variant to movedim
- Fixes the movedim docs so it will actually appear in the documentation
- Fixes three view doc links which were broken

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43122

Reviewed By: ngimel

Differential Revision: D23166222

Pulled By: mruberry

fbshipit-source-id: 14971585072bbc04b5366d4cc146574839e79cdb
2020-08-17 14:24:52 -07:00
Mike Ruberry
e2eb0cb1a9 Adds arccosh alias for acosh and adds an alias consistency test (#43107)
Summary:
This adds the torch.arccosh alias and updates alias testing to validate the consistency of the aliased and original operations. The alias testing is also updated to run on CPU and CUDA, which revealed a memory leak when tracing (see https://github.com/pytorch/pytorch/issues/43119).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43107

Reviewed By: ngimel

Differential Revision: D23156472

Pulled By: mruberry

fbshipit-source-id: 6155fac7954fcc49b95e7c72ed917c85e0eabfcd
2020-08-16 22:12:25 -07:00
Mike Ruberry
d4c5f561ec Updates torch.clone documentation to be consistent with other functions (#43098)
Summary:
`torch.clone` exists but was undocumented, and the method incorrectly listed `memory_format` as a positional argument. This:

- documents `torch.clone`
- lists `memory_format` as a keyword-only argument
- wordsmiths the documentation

Pull Request resolved: https://github.com/pytorch/pytorch/pull/43098

Reviewed By: ngimel

Differential Revision: D23153397

Pulled By: mruberry

fbshipit-source-id: c2ea781cdcb8b5ad3f04987c2b3a2f1fe0eaf18b
2020-08-16 04:18:49 -07:00
Muthu Arivoli
5bcf9b017a Implement hstack, vstack, dstack (#42799)
Summary:
Related to https://github.com/pytorch/pytorch/issues/38349

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42799

Reviewed By: izdeby

Differential Revision: D23140704

Pulled By: mruberry

fbshipit-source-id: 6a36363562c50d0abce87021b84b194bb32825fb
2020-08-15 20:39:14 -07:00
ita
91b090ceaf Add polygamma where n >= 2 (#42499)
Summary:
https://github.com/pytorch/pytorch/issues/40980

I had a few questions while implementing the polygamma function, so I opened this PR before completing it.

1. Some code blocks in PyTorch were brought from the cephes library (and I did the same):
```
/*
 * The following function comes with the following copyright notice.
 * It has been released under the BSD license.
 *
 * Cephes Math Library Release 2.8:  June, 2000
 * Copyright 1984, 1987, 1992, 2000 by Stephen L. Moshier
 */
```
Is it okay for me to use cephes code with this same copyright notice (already in the PyTorch codebase)?

2. There is no linting in the internal ATen library (as far as I know; I read https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md).
How can I be sure my code follows the appropriate guidelines of this library?

3. There are already digamma and trigamma functions.
digamma is needed; however, the trigamma function becomes redundant if the polygamma function is added.
Is it okay for trigamma to stay, or should it be removed?

By the way, the CPU version now works fine with 3rd-order polygamma (which is what we need for variational inference with beta/gamma distributions), and I'm going to finish the GPU version soon.
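
As background for question 3, the standard series definition of polygamma makes the redundancy concrete: trigamma is simply the n = 1 case. The sketch below sums that series naively in Python; it is a slow reference for illustration (the `terms` cutoff is an arbitrary assumption) and is unrelated to the cephes-derived code in this PR.

```python
import math


def polygamma(n, x, terms=200_000):
    """Naive series for the n-th derivative of digamma, for n >= 1 and x > 0."""
    # psi^(n)(x) = (-1)^(n+1) * n! * sum_{k>=0} 1 / (x + k)^(n+1)
    total = sum(1.0 / (x + k) ** (n + 1) for k in range(terms))
    return (-1) ** (n + 1) * math.factorial(n) * total


# trigamma(1) is polygamma(1, 1), whose exact value is pi^2 / 6.
print(polygamma(1, 1.0), math.pi ** 2 / 6)
```

The series converges slowly for n = 1 (the truncation error is on the order of 1/terms), which is one reason production implementations rely on recurrences and asymptotic expansions instead.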

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42499

Reviewed By: gchanan

Differential Revision: D23110016

Pulled By: albanD

fbshipit-source-id: 246f4c2b755a99d9e18a15fcd1a24e3df5e0b53e
2020-08-14 17:00:24 -07:00
Muthu Arivoli
b8102b1550 Implement torch.nextafter (#42580)
Summary:
Related to https://github.com/pytorch/pytorch/issues/38349.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42580

Reviewed By: smessmer

Differential Revision: D23012260

Pulled By: mruberry

fbshipit-source-id: ce82a63c4ad407ec6ffea795f575ca7c58cd6137
2020-08-14 00:35:30 -07:00
Will Gan
e4373083a2 torch.complex and torch.polar (#39617)
Summary:
For https://github.com/pytorch/pytorch/issues/35312 and https://github.com/pytorch/pytorch/issues/38458#issuecomment-636066256.
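
For context, the two constructors correspond to the two usual ways of writing a complex number: from real and imaginary parts, and from magnitude and angle. A scalar sketch using only the standard library (the helper `polar` is illustrative, not the tensor implementation):

```python
import cmath
import math


def polar(abs_value, angle):
    """Build a complex number from polar coordinates (angle in radians)."""
    return complex(abs_value * math.cos(angle), abs_value * math.sin(angle))


z_cartesian = complex(3.0, 4.0)        # real part 3, imaginary part 4
z_polar = polar(2.0, math.pi / 2)      # magnitude 2, pointing along +i

print(z_cartesian, abs(z_cartesian))
print(z_polar)
```

`cmath.rect` performs the same polar-to-rectangular conversion and can serve as a cross-check.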

Pull Request resolved: https://github.com/pytorch/pytorch/pull/39617

Reviewed By: zhangguanheng66

Differential Revision: D23083926

Pulled By: anjali411

fbshipit-source-id: 1874378001efe2ff286096eaf1e92afe91c55b29
2020-08-14 00:30:11 -07:00
Muthu Arivoli
92885ebe16 Implement hypot (#42291)
Summary:
Related to https://github.com/pytorch/pytorch/issues/38349
Closes https://github.com/pytorch/pytorch/issues/22764

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42291

Reviewed By: malfet

Differential Revision: D22951859

Pulled By: mruberry

fbshipit-source-id: d0118f2b6437e5c3f775f699ec46e946a8da50f0
2020-08-12 13:18:26 -07:00
kshitij12345
ab0a04dc9c Add torch.nansum (#38628)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/38349
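
For readers who have not used the NumPy counterpart, `nansum` sums while treating NaN entries as zero. A list-based sketch of that behavior (illustrative only):

```python
import math


def nansum(values):
    """Sum a list of floats, treating NaN entries as zero."""
    return sum(v for v in values if not math.isnan(v))


data = [1.0, math.nan, 2.5]
print(nansum(data))   # 3.5
print(sum(data))      # nan
```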

Pull Request resolved: https://github.com/pytorch/pytorch/pull/38628

Reviewed By: VitalyFedyunin

Differential Revision: D22860549

Pulled By: mruberry

fbshipit-source-id: 87fcbfd096d83fc14b3b5622f2301073729ce710
2020-08-11 22:26:04 -07:00
Mike Ruberry
bee174dc3f Adds linalg.det alias, fixes outer alias, updates alias testing (#42802)
Summary:
This PR:

- updates test_op_normalization.py, which verifies that aliases are correctly translated in the JIT
- adds torch.linalg.det as an alias for torch.det
- moves the torch.linalg.outer alias to torch.outer (to be consistent with NumPy)

The torch.linalg.outer alias was put in the linalg namespace erroneously as a placeholder, since it's a "linear algebra op" according to NumPy but is actually still in the main NumPy namespace.

The updates to test_op_normalization are necessary. Previously it was using method_tests to generate tests, and method_tests assumes test suites using it also use the device generic framework, which test_op_normalization did not. For example, some ops require decorators like `skipCPUIfNoLapack`, which only works in device generic test classes. Moving test_op_normalization to the device generic framework also lets these tests run on CPU and CUDA.

Continued reliance on method_tests() is excessive since the test suite is only interested in testing aliasing, and a simpler and more readable `AliasInfo` class is used for the required information. One impedance mismatch between method_tests and the new tests, for example, was how to handle ops in namespaces like torch.linalg.det. In the future this information will likely be folded into a common 'OpInfo' registry in the test suite.

The actual tests performed are similar to what they were previously: a scripted and traced version of the op is run and the test verifies that both graphs do not contain the alias name and do contain the aliased name.

The guidance for adding an alias has been updated accordingly.

cc mattip

Note:

ngimel suggests:
- deprecating and then removing the `torch.ger` name
- reviewing the implementation of `torch.outer`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42802

Reviewed By: zou3519

Differential Revision: D23059883

Pulled By: mruberry

fbshipit-source-id: 11321c2a7fb283a6e7c0d8899849ad7476be42d1
2020-08-11 21:48:31 -07:00
Kurt Mohler
5edd9aa95a Fix manual seed to unpack unsigned long (#42206)
Summary:
`torch.manual_seed` was unpacking its argument as an `int64_t`. This fix changes it to a `uint64_t`.
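
The difference between the two integer types can be illustrated without PyTorch. The sketch below uses Python's `struct` module as a stand-in for the argument unpacking; it is not the code changed by this PR, and the helper names `fits_int64` and `fits_uint64` are assumptions made for the example:

```python
import struct


def fits_int64(seed: int) -> bool:
    """Return True if `seed` can be packed as a signed 64-bit integer."""
    try:
        struct.pack("<q", seed)
        return True
    except (struct.error, OverflowError):
        return False


def fits_uint64(seed: int) -> bool:
    """Return True if `seed` can be packed as an unsigned 64-bit integer."""
    try:
        struct.pack("<Q", seed)
        return True
    except (struct.error, OverflowError):
        return False


large_seed = 2**63  # one past the largest signed 64-bit value
print(fits_int64(large_seed), fits_uint64(large_seed))
```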

Fixes https://github.com/pytorch/pytorch/issues/33546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42206

Reviewed By: ezyang

Differential Revision: D22822098

Pulled By: albanD

fbshipit-source-id: 97c978139c5cb2d5b62cc2c963550c758ee994f7
2020-08-11 18:05:34 -07:00
Heitor Schueroff de Souza
c660d2a9ae Initial quantile operator implementation (#42755)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/42755

Attempting to land quantile again after it was landed in https://github.com/pytorch/pytorch/pull/39417 and reverted in https://github.com/pytorch/pytorch/pull/41616.

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D23030338

Pulled By: heitorschueroff

fbshipit-source-id: 124a86eea3aee1fdaa0aad718b04863935be26c7
2020-08-11 12:08:17 -07:00
Mike Ruberry
87970b70a7 Adds 'clip' alias for clamp (#42770)
Summary:
Per title. Also updates our guidance for adding aliases to clarify interned_string and method_test requirements. The alias is tested by extending test_clamp to also test clip.
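
A short usage sketch, assuming a PyTorch build that includes this change; the input values are arbitrary:

```python
import torch

t = torch.tensor([-2.0, -0.5, 0.5, 2.0])

clamped = torch.clamp(t, min=-1.0, max=1.0)
clipped = torch.clip(t, min=-1.0, max=1.0)  # alias added by this PR

# Both calls are expected to produce the same tensor.
print(torch.equal(clamped, clipped))
```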

Pull Request resolved: https://github.com/pytorch/pytorch/pull/42770

Reviewed By: ngimel

Differential Revision: D23020655

Pulled By: mruberry

fbshipit-source-id: f1d8e751de9ac5f21a4f95d241b193730f07b5dc
2020-08-09 02:46:02 -07:00
Mike Ruberry
ccfce9d4a9 Adds fft namespace (#41911)
Summary:
This PR creates a new namespace, torch.fft (torch::fft), and puts a single function, fft, in it. This function is a simplified version of NumPy's [numpy.fft.fft](https://numpy.org/doc/1.18/reference/generated/numpy.fft.fft.html?highlight=fft#numpy.fft.fft) that accepts no optional arguments. It is intended to demonstrate how to add and document functions in the namespace, and is not intended to deprecate the existing torch.fft function.

Adding this namespace was complicated by the existence of the torch.fft function in Python. Creating a torch.fft Python module makes this name ambiguous: does it refer to a function or module? If the JIT didn't exist, a solution to this problem would have been to make torch.fft refer to a callable class that mimicked both the function and module. The JIT, however, cannot understand this pattern. As a workaround it's required to explicitly `import torch.fft` to access the torch.fft.fft function in Python:

```
import torch.fft

t = torch.randn(128, dtype=torch.cdouble)
torch.fft.fft(t)
```

See https://github.com/pytorch/pytorch/issues/42175 for future work. Another possible future PR is to get the JIT to understand torch.fft as a callable class so it need not be imported explicitly to be used.
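
For illustration, the callable-class idea mentioned above might look like the following in plain Python. This is only a sketch of the general pattern (the one the JIT cannot understand, per the description); `CallableModule`, `fake_fft`, and the two stand-in functions are hypothetical names, and no PyTorch code is involved:

```python
import types


class CallableModule(types.ModuleType):
    """A module object that can also be called like a function."""

    def __init__(self, name, legacy_function):
        super().__init__(name)
        self._legacy_function = legacy_function

    def __call__(self, *args, **kwargs):
        # Calling the module forwards to the function that used to own the name.
        return self._legacy_function(*args, **kwargs)


def legacy_fft(signal):
    """Stand-in for the pre-existing function-style API."""
    return f"legacy fft of {signal}"


def namespaced_fft(signal):
    """Stand-in for the new function living inside the namespace."""
    return f"namespaced fft of {signal}"


fake_fft = CallableModule("fake_fft", legacy_fft)
fake_fft.fft = namespaced_fft  # attribute access behaves like a normal module

print(fake_fft("x"))      # function-style call
print(fake_fft.fft("x"))  # module-style attribute call
```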

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41911

Reviewed By: glaringlee

Differential Revision: D22941894

Pulled By: mruberry

fbshipit-source-id: c8e0b44cbe90d21e998ca3832cf3a533f28dbe8d
2020-08-06 00:20:50 -07:00
Mike Ruberry
e54f268a7a Enables torch.full bool and integer type inference (#41912)
Summary:
After being deprecated in 1.5 and throwing a runtime error in 1.6, we can now enable torch.full inferring its dtype when given bool and integer fill values. This PR enables that inference and updates the tests and docs to reflect this.
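
A brief sketch of the inference described above, assuming a PyTorch build that includes this change and an unmodified default dtype; the shapes and fill values are arbitrary:

```python
import torch

int_filled = torch.full((2,), 1)       # integer fill value
bool_filled = torch.full((2,), True)   # bool fill value
float_filled = torch.full((2,), 1.0)   # float fill value

# The dtype is inferred from the fill value when no dtype is passed.
print(int_filled.dtype, bool_filled.dtype, float_filled.dtype)
```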

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41912

Reviewed By: albanD

Differential Revision: D22836802

Pulled By: mruberry

fbshipit-source-id: 33dfbe4d4067800c418b314b1f60fab8adcab4e7
2020-07-30 22:39:13 -07:00
kshitij12345
31d41f987a torch.where : Scalar Support (#40336)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/38349 #9190

TODO
* [x] Add Tests
* [x] Update Docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/40336

Reviewed By: albanD

Differential Revision: D22813834

Pulled By: mruberry

fbshipit-source-id: 67c1693c059a301b249213afee3c25cea9f64fec
2020-07-30 22:36:53 -07:00
kiyosora
26d58503c2 Implementing NumPy-like function torch.signbit() (#41589)
Summary:
- Related to https://github.com/pytorch/pytorch/issues/38349
- Implementing the NumPy-like function `torch.signbit()` .

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41589

Reviewed By: albanD

Differential Revision: D22835249

Pulled By: mruberry

fbshipit-source-id: 7988f7fa8f591ce4b6a23ac884ee7b3aa718bcfd
2020-07-30 11:21:15 -07:00
Alban Desmaison
460970483d Revert D22790718: [pytorch][PR] Enables torch.full bool and integer type inference
Test Plan: revert-hammer

Differential Revision:
D22790718 (6b3f335641)

Original commit changeset: 8d1eb01574b1

fbshipit-source-id: c321177cce129a6c83f1a7b26bd5ed94a343ac0f
2020-07-29 07:52:04 -07:00
Xiong Wei
90074bbfa6 implement numpy-like functionality isposinf, isneginf (#41588)
Summary:
Related https://github.com/pytorch/pytorch/issues/38349

Numpy-like functionalities `isposinf` and `isneginf` are implemented.
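
A short usage sketch, assuming a PyTorch build that includes this change; the sample tensor is arbitrary:

```python
import torch

t = torch.tensor([float("-inf"), -1.0, 0.0, float("inf"), float("nan")])

pos_mask = torch.isposinf(t)  # True only where the element is +inf
neg_mask = torch.isneginf(t)  # True only where the element is -inf

print(pos_mask)
print(neg_mask)
```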

Test-Plan:
- pytest test/test_torch.py -k "test_isposinf_isneginf"

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41588

Reviewed By: ngimel

Differential Revision: D22770732

Pulled By: mruberry

fbshipit-source-id: 7448653e8fb8df6b9cd4604a4739fe18a1135578
2020-07-29 03:29:31 -07:00
Mike Ruberry
6b3f335641 Enables torch.full bool and integer type inference (#41912)
Summary:
After being deprecated in 1.5 and throwing a runtime error in 1.6, we can now enable torch.full inferring its dtype when given bool and integer fill values. This PR enables that inference and updates the tests and docs to reflect this.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41912

Reviewed By: pbelevich

Differential Revision: D22790718

Pulled By: mruberry

fbshipit-source-id: 8d1eb01574b1977f00bc0696974ac38ffdd40d9e
2020-07-28 23:11:08 -07:00
kshitij12345
266657182a Add torch.movedim (#41480)
Summary:
https://github.com/pytorch/pytorch/issues/38349 #36048

TODO:
* [x] Tests
* [x] Docs

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41480

Reviewed By: zhangguanheng66

Differential Revision: D22649917

Pulled By: zou3519

fbshipit-source-id: a7f3920a24bae16ecf2ad731698ca65ca3e8c1ce
2020-07-23 09:41:01 -07:00
Jeremy Reizenstein
6ceb65f98c Document default dim for cross being None (#41850)
Summary:
The function torch.cross is a bit confusing, in particular the defaulting of the dim argument.

The default `dim` has been documented as -1 but it is actually `None`. This increases the confusion, in two possible ways depending on how carefully you read the rest. I also add a final warning to the final sentence.

This partially addresses https://github.com/pytorch/pytorch/issues/39310.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41850

Reviewed By: izdeby

Differential Revision: D22664625

Pulled By: albanD

fbshipit-source-id: b8669e026fd01de9e4ec16da1414b9edfaa76bdd
2020-07-22 13:31:47 -07:00
Wojciech Baranowski
48569cc330 Reland split (#41567)
Summary:
Take 3

Pull Request resolved: https://github.com/pytorch/pytorch/pull/41567

Reviewed By: zou3519

Differential Revision: D22586331

Pulled By: albanD

fbshipit-source-id: ca08199da716d64a335455610edbce752fee224b
2020-07-21 08:06:27 -07:00