Commit Graph

135 Commits

Author SHA1 Message Date
Edward Yang
4b929e5466 Revert D20193196: [pytorch][PR] PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem
Test Plan: revert-hammer

Differential Revision:
D20193196

Original commit changeset: 78a487991242

fbshipit-source-id: 8da4f8cb17c45af41e8c0ce80bc72581eb10dbb8
2020-03-11 09:24:34 -07:00
Pearu Peterson
2ec779d46c PCA and SVD for low-rank matrices, LOBPCG for positive-defined generalized eigenvalue problem (#29488)
Summary:
This PR implements the following linear algebra algorithms for low-rank matrices:
- [x] Approximate `A` as `Q Q^H A` - using Algorithm 4.4 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + exposed as `torch.lowrank.get_approximate_basis(A, q, niter=2, M=None) -> Q`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] SVD - using Algorithm 5.1 from [Halko et al, 2009](http://arxiv.org/abs/0909.4061).
  + uses `torch.lowrank.get_approximate_basis`
  + exposed as `torch.svd_lowrank(A, q=6, niter=2, M=None) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] PCA - using `torch.svd_lowrank`
  + uses `torch.svd_lowrank`
  + exposed as `torch.pca_lowrank(A, center=True, q=None, niter=2) -> (U, S, V)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices, uses non-centered sparse matrix algorithm
  + [x] documentation
- [x] generalized eigenvalue solver using the original LOBPCG algorithm [Knyazev, 2001](https://epubs.siam.org/doi/abs/10.1137/S1064827500366124)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="basic", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using robust LOBPCG with orthogonal basis selection [Stathopoulos, 2002](https://epubs.siam.org/doi/10.1137/S1064827500370883)
  + exposed as `torch.lobpcg(A, B=None, k=1, method="ortho", ...)`
  + [x] dense matrices
  + [x] batches of dense matrices
  + [x] sparse matrices
  + [x] documentation
- [x] generalized eigenvalue solver using the robust and efficient LOBPCG Algorithm 8 from [Duersch et al, 2018](https://epubs.siam.org/doi/abs/10.1137/17M1129830) that switches to orthogonal basis selection automatically
  + the "ortho" method improves iterations so rapidly that in the current test cases it does not make sense to use the basic iterations at all. If users will have matrices for which basic iterations could improve convergence then the `tracker` argument allows breaking the iteration process at user choice so that the user can switch to the orthogonal basis selection if needed. In conclusion, there is no need to implement Algorithm 8 at this point.
- [x] benchmarks
  + [x] `torch.svd` vs `torch.svd_lowrank`, see notebook [Low-rank SVD](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/Low-rank%20SVD.ipynb). In conclusion, the low-rank SVD is going to be useful only for large sparse matrices where the full-rank SVD will fail due to memory limitations.
  + [x] `torch.lobpcg` vs `scipy.sparse.linalg.lobpcg`, see notebook [LOBPCG - pytorch vs scipy](https://github.com/Quansight/pearu-sandbox/blob/master/pytorch/LOBPCG%20-%20pytorch%20vs%20scipy.ipynb). In conculsion, both implementations give the same results (up to numerical errors from different methods), scipy lobpcg implementation is generally faster.
  + [x] On very small tolerance cases, `torch.lobpcg` is more robust than `scipy.sparse.linalg.lobpcg` (see `test_lobpcg_scipy` results)

Resolves https://github.com/pytorch/pytorch/issues/8049.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29488

Differential Revision: D20193196

Pulled By: vincentqb

fbshipit-source-id: 78a4879912424595e6ea95a95e483a37487a907e
2020-03-11 07:33:49 -07:00
vishwakftw
e025677e3c Remove **kwargs from torch.meshgrid (#34356)
Summary:
Changelog:
- Remove **kwargs from torch.meshgrid as they serve no purpose

Closes https://github.com/pytorch/pytorch/issues/34206
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34356

Differential Revision: D20310971

Pulled By: zou3519

fbshipit-source-id: 97250051504aa3ec1e2a9af9296e7cc71872e5bf
2020-03-09 12:07:43 -07:00
Shen Li
30680196e4 Revert D20121915: [JIT] Add support for list()
Test Plan: revert-hammer

Differential Revision:
D20121915

Original commit changeset: c6c4ef444dbf

fbshipit-source-id: 829adb58780f4d0f41acebb3e7640a9c68bdbc1b
2020-03-06 07:16:40 -08:00
Elias Ellison
f218842f2e [JIT] Add support for list() (#33818)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33818

Test Plan: Imported from OSS

Differential Revision: D20121915

Pulled By: eellison

fbshipit-source-id: c6c4ef444dbf1d4134dccb28c13315e225945b64
2020-03-05 14:48:20 -08:00
Elias Ellison
479c3b0aa5 [JIT] add support for torch.norm (#33783)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33783

Fix for https://github.com/pytorch/pytorch/issues/20113

Test Plan: Imported from OSS

Differential Revision: D20121917

Pulled By: eellison

fbshipit-source-id: ffedcc40678cd80f5529ff9323088eed544e5158
2020-03-05 14:46:24 -08:00
Elias Ellison
857eb4145e [JIT] add support for torch.cdist (#33737)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33737

Test Plan: Imported from OSS

Differential Revision: D20121916

Pulled By: eellison

fbshipit-source-id: b0427bbfd3ade1f3129c4a95a542fbc32c3abd76
2020-02-26 18:37:37 -08:00
Elias Ellison
f31b1d3453 [JIT] add support for lu_unpack (#33736)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/33736

Test Plan: Imported from OSS

Differential Revision: D20121914

Pulled By: eellison

fbshipit-source-id: 1136f4d7678a2233129aefe3e30234af385b8353
2020-02-26 18:37:33 -08:00
Elias Ellison
4543cf4eb1 [JIT] add support for torch.lu to torchscript (#33724)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33724

Fix for https://github.com/pytorch/pytorch/issues/33381, partial fix of https://github.com/pytorch/pytorch/issues/30786

Test Plan: Imported from OSS

Differential Revision: D20077321

Pulled By: eellison

fbshipit-source-id: a1e6a0370712b36c9f66979098ac2f9d500ca5f6
2020-02-26 18:37:28 -08:00
Elias Ellison
fddf73250d [JIT] fix resolving of functions in torch/functional. fix compilation of torch.stft (#33504)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/33504

Fix resolution fo functions that are bound onto torch in torch/functional.py. This does not fix compilation of all of those functions, those will be done in follow ups. Does torch.stft as a start.

Fixes #21478

Test Plan: Imported from OSS

Differential Revision: D20014591

Pulled By: eellison

fbshipit-source-id: bb362f1b5479adbb890e72a54111ef716679d127
2020-02-26 18:35:43 -08:00
Nathan Goldbaum
fa80299bdf __torch_function__ overrides for torch.functional and torch.nn.functional (#32799)
Summary:
This adds `__torch_function__` support for all functions in `torch.functional` and `torch.nn.functional`.

The changes to C++ code and codegen scripts are to facilitate adding `__torch_function__` support for the native functions in `torch._C._nn`. Note that I moved the `handle_torch_function` C++ function to a header that both `python_torch_functions.cpp` and `python_nn_functions.cpp` include. The changes to `python_nn_functions.cpp` mirror the changes I made to `python_torch_functions.cpp` when `__torch_function__` support was first added in https://github.com/pytorch/pytorch/issues/27064. Due to the somewhat different way the `torch._C` and `torch._C._nn` namespaces are initialized I needed to create a new static reference to the `torch._C._nn` namespace (`THPNNVariableFunctions`). I'm not sure if that is the best way to do this. In principle I could import these namespaces in each kernel and avoid the global variable but that would have a runtime cost.

I added `__torch_function__` support to the Python functions in `torch.nn.functional` following the approach in https://github.com/pytorch/pytorch/issues/32194.

I re-enabled the test that checks if all functions in the `torch` namespace are explicitly tested for `__torch_function__` support. I also generalized the check to work for `torch.functional` and `torch.nn.functional` as well. This test was explicitly disabled in https://github.com/pytorch/pytorch/issues/30730 and I'm happy to disable it again if you think that's appropriate. I figured now was as good a time as any to try to re-enable it.

Finally I adjusted the existing torch API tests to suppress deprecation warnings and add keyword arguments used by some of the code in `torch.nn.functional` that were missed when I originally added the tests in https://github.com/pytorch/pytorch/issues/27064.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32799

Differential Revision: D19956809

Pulled By: ezyang

fbshipit-source-id: 40d34e0109cc4b9f3ef62f409d2d35a1d84e3d22
2020-02-21 08:38:37 -08:00
Mike Ruberry
aa3c871739 Adds TestViewOps, updates documentation (#32512)
Summary:
Understanding which ops return views and which return tensors with new storage is a common user issue, and an issue for developers connecting accelerators to PyTorch, too. This generic test suite verifies that ops which should return views do (and a few ops that shouldn't don't).  The documentation has also been updated for .t(), permute(), unfold(), and select() to clarify they return views.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32512

Differential Revision: D19659454

Pulled By: mruberry

fbshipit-source-id: b4334be9b698253a979e1bb8746fdb3ca24aa4e3
2020-02-04 11:10:34 -08:00
Nathan Goldbaum
bab87e4b60 reimplement __torch_function__ overrides for torch.functional using inline logic (#32194)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/30831.

This improves the performance of operators in the `torch.functional` namespace that are overridable by `__torch_function__` implementations when supplied with `Tensor` operands.

Running the split benchmark in various configurations produces the following timings:

<details>
<summary>Expand for timings on <code>master</code> </summary>

```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 3.340

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cuda
# Input: M: 8, N: 8, parts: 2, device: cuda
Forward Execution Time (us) : 3.333

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 3.366

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cuda
# Input: M: 256, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 3.385

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 3.468

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cuda
# Input: M: 512, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 3.416
```
</details>

<details>
<summary>Expand for timings with this pull request applied</summary>

```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 2.261

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cuda
# Input: M: 8, N: 8, parts: 2, device: cuda
Forward Execution Time (us) : 2.223

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 2.237

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cuda
# Input: M: 256, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 2.218

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 2.259

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cuda
# Input: M: 512, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 2.234
```

</details>

<details>
<summary>Expand for timings on <code>master</code> with <code>__torch_function__</code> dispatch disabled </summary>

```
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 2.180

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cuda
# Input: M: 8, N: 8, parts: 2, device: cuda
Forward Execution Time (us) : 2.172

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 2.171

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cuda
# Input: M: 256, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 2.146

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 2.175

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cuda
# Input: M: 512, N: 512, parts: 2, device: cuda
Forward Execution Time (us) : 2.152
```

</details>

So at least on the machine I'm testing on, this brings the overhead down to less than 100 ns. For comparison, the overhead for `__array_function__` in NumPy is about 850 ns on the same machine.

<details>
<summary>Expand for timings for NumPy <code>__array_function__</code> dispatch </summary>

```
In [1]: import numpy as np

In [2]: %timeit np.mean([1])
8.89 µs ± 17.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [3]: %timeit np.mean._implementation([1])
8.04 µs ± 28.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```

See [the implementation in NumPy](https://github.com/numpy/numpy/blob/master/numpy/core/overrides.py#L195) for why this measures `__array_function__` overhead.

</details>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/32194

Differential Revision: D19410396

Pulled By: ezyang

fbshipit-source-id: ada788a4399c81cd7eb2d548aa04a2459e96634a
2020-01-16 07:10:38 -08:00
Tongzhou Wang
b6f43afaca Fix tensordot allowing negative dims (#31954)
Summary:
fixes https://github.com/pytorch/pytorch/issues/31926
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31954

Differential Revision: D19331847

Pulled By: zou3519

fbshipit-source-id: e30dd9517917c056a52be7d16f23247fe28f4e28
2020-01-10 07:42:04 -08:00
TH3CHARLie
1296e2d55e C++ API parity: isinf (#31099)
Summary:
fixes https://github.com/pytorch/pytorch/issues/31021, port the legacy binding method of `isinf` to C++ therefore support JIT
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31099

Differential Revision: D19314733

Pulled By: yf225

fbshipit-source-id: 5725c51d19c33b4fddd0fc9e7034078580bd534e
2020-01-09 13:16:13 -08:00
Karl Ostmo
227d1a43a4 Revert D18838848: disable __torch_function__ overides for operators in torch.functional
Test Plan: revert-hammer

Differential Revision:
D18838848

Original commit changeset: 22b8015d7b2f

fbshipit-source-id: fdaeffcd112990ed379782cf7216d3f1beeb2cb1
2020-01-07 15:03:15 -08:00
Nathan Goldbaum
ca72df06ae disable __torch_function__ overides for operators in torch.functional (#30839)
Summary:
For now I'm just removing the decorators from all of the currently overridable functions in `torch.functional`. This means they are no longer overridable, however this should fix the benchmark regressions reported in https://github.com/pytorch/pytorch/issues/30831. Moving forward we'll be looking at reducing the overhead of the python-level override mechanism and failing that, re-implementing all of these operators in C++.

cc hl475
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30839

Differential Revision: D18838848

Pulled By: ezyang

fbshipit-source-id: 22b8015d7b2f7a947f1ebc9632c998e081b48ad8
2020-01-07 12:27:28 -08:00
Richard Zou
9047d4df45 Remove all remaining usages of BUILD_NAMEDTENSOR (#31116)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31116

Changelist:
- remove BUILD_NAMEDTENSOR macro
- remove torch._C._BUILD_NAMEDTENSOR
- remove all python behavior that relies on torch._C._BUILD_NAMEDTENSOR

Future:
- In the next diff, I will remove all usages of
ATen/core/EnableNamedTensor.h since that header doesn't do anything
anymore
- After that, we'll be done with the BUILD_NAMEDTENSOR removal.

Test Plan: - run CI

Differential Revision: D18934951

Pulled By: zou3519

fbshipit-source-id: 0a0df0f1f0470d0a01c495579333a2835aac9f5d
2019-12-12 09:53:03 -08:00
Michael Suo
62b10721fb Actually make flake8 do something (#30892)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30892

Fixes all outstanding lints and actually installs a properly configured
flake8

Test Plan: Imported from OSS

Differential Revision: D18862825

Pulled By: suo

fbshipit-source-id: 08e9083338a7309272e17bb803feaa42e348aa85
2019-12-06 17:50:50 -08:00
Nathan Goldbaum
9d3402e4cb Add the __torch_function__ API override mechanism (#30730)
Summary:
This is a re-do of https://github.com/pytorch/pytorch/issues/27064, which was reverted (b8792c0438). This was landed at the same time as other work that added new operators to the `torch` namespace so the check for whether the `torch` namespace is exhaustively checked for overridability was triggering test failures.

I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30730

Differential Revision: D18813270

Pulled By: ezyang

fbshipit-source-id: 70477c4656dca8fea6e7bc59259555041fcfbf68
2019-12-04 13:19:07 -08:00
Edward Yang
b8792c0438 Revert D18645954: add __torch_function__ API override mechanism
Test Plan: revert-hammer

Differential Revision:
D18645954

Original commit changeset: 54b5e4344d7a

fbshipit-source-id: 4a7aebb483e6b001130d6f384ccc53c5a808ab13
2019-12-04 07:41:47 -08:00
Prasun Anand
d12786b24f add __torch_function__ API override mechanism (#27064)
Summary:
Closes https://github.com/pytorch/pytorch/issues/24015 (see description of that issue for more details).

For a toy example, see the `DiagonalTensor` and `SubDiagonalTensor` class in test/test_overrides.py.

This PR currently contains:

* tests for `__torch_function__` behavior
* modification to `gen_python_functions` and `parse` function signatures and dispatched to correct overloaded argument.

This feature is inspired by and analogous to NumPy's `__array_function__` protocol ([see NumPy Enhancement Proposal 18](https://numpy.org/neps/nep-0018-array-function-protocol.html#trying-array-function-methods-until-the-right-one-works)).

### Benchmarks:
See Nathan's comment below: https://github.com/pytorch/pytorch/pull/27064#issuecomment-554601189
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27064

Differential Revision: D18645954

Pulled By: ezyang

fbshipit-source-id: 54b5e4344d7afdbcf996bb57191b0bdadc7b1767
2019-12-04 05:56:46 -08:00
Pavel Belevich
cc81769e10 C++ API parity: isfinite
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/30083

Test Plan: Imported from OSS

Differential Revision: D18594723

Pulled By: pbelevich

fbshipit-source-id: 5970e0aa6ef8994e9c4a741784fd053383aaceb7
2019-11-19 20:00:05 -08:00
Will Feng
3bd0f476d4 Revert D18233037: C++ API parity: isfinite
Test Plan: revert-hammer

Differential Revision:
D18233037

Original commit changeset: c76b9467bbc1

fbshipit-source-id: 97d2cfa9de767a8c3a0ca919f9d768e959fa484e
2019-11-18 20:26:19 -08:00
Pavel Belevich
8df5e10ee9 C++ API parity: isfinite
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28918

Test Plan: Imported from OSS

Differential Revision: D18233037

Pulled By: pbelevich

fbshipit-source-id: c76b9467bbc1fbb2c9bf49855895c98438b36c12
2019-11-18 19:06:57 -08:00
Vitaly Fedyunin
bf61405ed6 explicitly provide memory format when calling to *_like operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/29387

Test Plan: Imported from OSS

Differential Revision: D18429729

Pulled By: VitalyFedyunin

fbshipit-source-id: c71264ed5d64ed7e5d8ea907413b6b8e7b67769a
2019-11-11 17:57:34 -08:00
Xiang Gao
5f03ad9698 Add note to docs of torch.unique (#29165)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/19151
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29165

Differential Revision: D18319890

Pulled By: soumith

fbshipit-source-id: 162afaecd5371446bec2a1769e0a8848ecffb002
2019-11-07 22:03:15 -08:00
Igor Fedan
75309b45f3 explicitly provide memory format when calling to clone() at Indexing.cpp
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28660

Test Plan: Imported from OSS

Differential Revision: D18333346

Pulled By: ifedan

fbshipit-source-id: 06590205d883a5096388a4ae318389244130972d
2019-11-07 05:38:32 -08:00
Pearu Peterson
fd4f22e4ea Generalized LU factorization (#28608)
Summary:
This PR implements support for generalized LU factorization that is required for various algorithms such as PCA (see issue https://github.com/pytorch/pytorch/issues/8049).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/28608

Differential Revision: D18326449

Pulled By: ezyang

fbshipit-source-id: d4011d75710e06e87ddbf5ad9afae42ba3330548
2019-11-05 12:27:40 -08:00
Igor Fedan
12dde7f58a cdist performance improvement for euclidean distance (#25799)
Summary:
jacobrgardner https://github.com/pytorch/pytorch/issues/15253#issuecomment-491467128 preposed a way to speedup euclidean distance calculation. This PR is implementation of this solution for normal and batch version.

Also simonepri provided performance metrics https://github.com/pytorch/pytorch/issues/15253#issuecomment-502363581
![image](https://user-images.githubusercontent.com/12058312/64460756-44a24580-d0c9-11e9-9f7f-a5942f4c832d.png)

Current implementation has speedup comparing to jacobrgardner approach
![image](https://user-images.githubusercontent.com/12058312/64461495-5553bb00-d0cb-11e9-87e6-302b8cc7e12b.png)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25799

Differential Revision: D17964982

Pulled By: ifedan

fbshipit-source-id: bf7bd0dbfca51fd39e667da55139347480f30a2f
2019-10-17 14:56:54 -07:00
Dominik1123
5797f5dd27 Update 'einsum' docstring to conform to PEP-484 (#27563)
Summary:
[PEP-484](https://www.python.org/dev/peps/pep-0484/#arbitrary-argument-lists-and-default-argument-values) specifies that arbitrary argument lists, here `*operands`, should be annotated with the type of the single arguments, i.e. not indicating that the whole thing is wrapped into a `list` (which is a Python internal anyway). The previous docstring caused problems with type checkers for IDEs such as PyCharm ([see here](https://youtrack.jetbrains.com/issue/PY-38035)).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27563

Differential Revision: D17904748

Pulled By: soumith

fbshipit-source-id: 0a7fcbbb12e388e6fc40d48bf533652a96024757
2019-10-15 14:35:24 -07:00
Iurii Zdebskyi
293e35a87c Fixed Error message for tensor.align_to (#27221)
Summary:
Fixing this [issue1](https://github.com/pytorch/pytorch/issues/27074) and [issue2](https://github.com/pytorch/pytorch/issues/27073)
Tested via unit tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/27221

Differential Revision: D17716235

Pulled By: izdeby

fbshipit-source-id: c7bafd16b469c91924ebc3dba77ca56424d4c93c
2019-10-02 14:19:40 -07:00
vishwakftw
15b506068b Remove deprecated torch.gels (#26480)
Summary:
Changelog:
- Remove `torch.gels` which was deprecated in v1.2.0
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26480

Test Plan: - No tests were changed and all callsites for `torch.gels` where modified to `torch.lstsq` when `torch.lstsq` was introduced

Differential Revision: D17527207

Pulled By: zou3519

fbshipit-source-id: 28e2fa3a3bf30eb6b9029bb5aab198c4d570a950
2019-09-23 07:15:39 -07:00
Richard Zou
7030f2c623 Implement tensor.align_to(names), torch.align_tensors(*tensors) (#23804)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23804

`output = tensor.align_to(names)` returns a view of `tensor` such that
`output.names = names`. Dimensions with the same names in `tensor` and
`output` have the same sizes; dimensions with new names have size 1.

The following must be true for this operation to succeed:
1) tensor.names must be a subsequence (not necessarily contiguous) of `names`
2) Aligning tensor.names to names must not change the absolute position from the
   right of any unnamed dimension.

In practice, these constraints mean that aligning cannot transpose
names.

Some examples:
- Tensor[C].align_to(C) -> Tensor[C]
- Tensor[N].align_to([N, C]) -> Tensor[N, C]
- Tensor[H, W].align_to([N, H, W, C]) -> Tensor[N, H, W, C]
- Tensor[None].align_to([N, None]) -> Tensor[N, None]
- Tensor[N].align_to([N, None None]) -> Tensor[N, None, None]

Examples of error cases:
- Tensor[W, H].align_to([N, H, W, C]) -> Error (not a subsequence)
- Tensor[None, H].align_to([None, H, W]) -> Error (would change the
absolute position from the right of a None dimension)

`torch.align_tensors(*tensors)` aligns the named dimensions of each
tensor according to the alignment rules so that they can be used in an
operation. More concretely, it aligns each tensor to the
longest names among the names of the tensors in `tensors`.

This allows users to emulate "broadcasting by names", which is one of
the things named tensors tries to enable. Here is an example:

```
imgs: Tensor[N, C, H, W]
scale: Tensor[N]

// Doesn't work because we do broadcasting by alignment by default
imgs * scale

// Does work
imgs, scale = torch.align_tensors(imgs, scale)
imas * scale
```

Future:
- Consider allowing broadcasting by names by default.

Test Plan:
- The diff looks pretty large but more than half of it is testing.
- new tests [namedtensor ci]

Differential Revision: D16657927

Pulled By: zou3519

fbshipit-source-id: e2f958bf5146c8ee3b694aba57d21b08e928a4e6
2019-08-14 09:40:27 -07:00
Iurii Zdebskyi
865c7eea48 Changed tensor comparison return type from uint8 to bool (#21113)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21113
ghimport-source-id: 9c4ba63457a72bfc41894387e0b01be3fd9a9baf

Test Plan: Imported from OSS

Differential Revision: D15552204

Pulled By: izdeby

fbshipit-source-id: a608213668649d058e22b510d7755cb99e7d0037
2019-08-01 07:54:53 -07:00
vishwakftw
b3a9a7a9b9 Rename gels to lstsq (#23460)
Summary:
Changelog:
- Rename `gels` to `lstsq`
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lstsq` under the name `gels` and add a deprecation warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/23460

Test Plan: - All tests should pass to confirm that the patch is correct

Differential Revision: D16547834

Pulled By: colesbury

fbshipit-source-id: b3bdb8f4c5d14c7716c3d9528e40324cc544e496
2019-07-30 09:56:04 -07:00
vishwakftw
6dfecc7e01 Remove deprecated linear algebra functions (and methods) (#22841)
Summary:
Changelog:
- Removed the following linear algebra functions in PyTorch in favor of the renamed operations
  - `btrifact` (use `lu` instead)
  - `btrifact_with_info` (use `lu` with `get_infos=True` instead)
  - `btrisolve` (use `lu_solve` instead)
  - `btriunpack` (use `lu_unpack` instead)
  - `gesv` (use `solve` instead)
  - `pstrf` (use `cholesky` instead)
  - `potrf` (use `cholesky` instead)
  - `potri` (use `cholesky_inverse` instead)
  - `potrs` (use `cholesky_solve` instead)
  - `trtrs` (use `triangular_solve` instead)

- Removed dead code after the removal of `pstrf`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/22841

Test Plan:
- All existing tests should pass to verify that the removal is clean

Closes https://github.com/pytorch/pytorch/issues/22832

Differential Revision: D16346184

Pulled By: zou3519

fbshipit-source-id: f748d16ed7609c028de6adcbc28684d5a1af0678
2019-07-19 11:43:06 -07:00
vishwakftw
c9ba3f699d Bag of documentation fixes (#21846)
Summary:
Thanks henon for raising the issues.

Fixes https://github.com/pytorch/pytorch/issues/21830
Fixes https://github.com/pytorch/pytorch/issues/21831
Fixes https://github.com/pytorch/pytorch/issues/21832
Fixes https://github.com/pytorch/pytorch/issues/21827
Fixes https://github.com/pytorch/pytorch/issues/21822
Fixes https://github.com/pytorch/pytorch/issues/21820
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21846

Differential Revision: D15847389

Pulled By: soumith

fbshipit-source-id: 421cc48af646a2618af731697de7d4de83d3eabe
2019-06-16 19:35:27 -07:00
Stefan Krah
8b9b215dc5 Add a 'dim' argument to nuclear norm (#21022)
Summary:
Addresses #18275.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21022

Differential Revision: D15743515

Pulled By: ezyang

fbshipit-source-id: e4aaea0bd7f863a2abad45c4322d6a9fb02a88e3
2019-06-10 15:18:34 -07:00
Charles Lovering
8ae7b1c486 Update functional.py doc (#21510)
Summary:
- Fixes a typo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21510

Differential Revision: D15731277

Pulled By: ezyang

fbshipit-source-id: c3f8e110f5c61e797b857477b495168ea8d63cd5
2019-06-09 15:28:09 -07:00
Jason Lian
6874c4058d Add type annotation to stft (#21302)
Summary:
We want to be able to call stft from a torchscript which requires that stft have a type annotation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/21302

Differential Revision: D15607973

Pulled By: cpuhrsch

fbshipit-source-id: c4a5c09cdaafe7e81cf487a3ad216d1b03464a21
2019-06-05 10:06:48 -07:00
Hong Xu
ef1fdc27a3 Raise TypeError when the argument to isinf and isfinite is not a tensor (#20817)
Summary:
Currently when the argument to isinf and isfinite is not tensor, a ValueError is raised. This, however, should be a TypeError, because the error is a type mismatch.

In the error message, "str(tensor)" is replaced by "repr(tensor)" because, when an error occurs, a printable representation of the object is likely more useful than the "informal" string version of the object.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20817

Differential Revision: D15495624

Pulled By: ezyang

fbshipit-source-id: 514198dcd723a7031818e50a87e187b22d51af73
2019-05-24 09:18:15 -07:00
sebftw
62957ab0a1 Tiny spelling mistake fix. (#20425)
Summary:
"then the output would also has k tensors" -> "then the output would also have k tensors"
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20425

Differential Revision: D15320152

Pulled By: zou3519

fbshipit-source-id: b04e2ccd29c6a3e33ad1040d0ea975a01a7bd9b5
2019-05-13 11:18:53 -07:00
vishwakftw
c30224ad21 Rename potri to cholesky_inverse (#19498)
Summary:
Changelog:
- Rename `potri` to `cholesky_inverse` to remain consistent with names of `cholesky` methods (`cholesky`, `cholesky_solve`)
- Fix all callsites
- Rename all tests
- Create a tentative alias for `cholesky_inverse` under the name `potri` and add a deprecation warning to not promote usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19498

Differential Revision: D15029901

Pulled By: ezyang

fbshipit-source-id: 2074286dc93d8744cdc9a45d54644fe57df3a57a
2019-04-22 08:18:39 -07:00
Xiang Gao
e1750754c8 Step 4: add support for unique with dim=None (#18651)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18651
ghimport-source-id: e11988130a3f9a73529de0b0d08b4ec25fbc639c

Differential Revision: D15000463

Pulled By: VitalyFedyunin

fbshipit-source-id: 9e258e473dea6a3fc2307da2119b887ba3f7934a
2019-04-18 18:28:07 -07:00
Xiang Gao
df67969e6b Step 3: Add support for return_counts to torch.unique for dim not None (#18650)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18650
ghimport-source-id: 75759c95e6c48e27c172b919097dbc40c6bfb5e6

Differential Revision: D14892319

Pulled By: VitalyFedyunin

fbshipit-source-id: ec5d1b80fc879d273ac5a534434fd648468dda1e
2019-04-16 14:06:45 -07:00
Xiang Gao
3f7ddd269c Step 2: Rename _unique_dim2_temporary_will_remove_soon to unique_dim (#18649)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18649
ghimport-source-id: 3411d240a6af5fe299a889667964730184e30645

Differential Revision: D14888292

Pulled By: VitalyFedyunin

fbshipit-source-id: 80da83c264598f74ab8decb165da4a1ce2b352bb
2019-04-12 12:41:20 -07:00
Xiang Gao
ea2405c7dc Add torch.unique_consecutive (#19060)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/19045

Please review: VitalyFedyunin ngimel

This is independent on the #18649 series. This will cause merge conflicts in #18649 series, but please merge this first, and I will resolve the merge conflicts there.

The new feature is exposed in `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`. But not at `torch.unique` yet. I will take care of the API after #18649 series get merged completely.

Benchmark on a tensor of shape `torch.Size([15320, 2])`:

```python
print(torch.__version__)
print()
a = tensor.sort().values.to('cpu')
print('cpu, sorted_input=False:')
%timeit torch._unique2_temporary_will_remove_soon(a)
%timeit torch._unique2_temporary_will_remove_soon(a, return_inverse=True)
%timeit torch._unique2_temporary_will_remove_soon(a, return_counts=True)
%timeit torch._unique2_temporary_will_remove_soon(a, return_inverse=True, return_counts=True)
print()
print('cpu, sorted_input=True:')
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True)
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_inverse=True)
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_counts=True)
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_inverse=True, return_counts=True)
print()
a = a.to('cuda')
print('cuda, sorted_input=False:')
%timeit torch._unique2_temporary_will_remove_soon(a); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, return_inverse=True, return_counts=True); torch.cuda.synchronize()
print()
print('cuda, sorted_input=True:')
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted_input=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```

```
1.1.0a0+2addccc

cpu, sorted_input=False:
340 µs ± 5.88 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
717 µs ± 14.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
52.3 ms ± 2.75 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
52.3 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

cpu, sorted_input=True:
32.8 µs ± 285 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
49.9 µs ± 557 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
51.6 µs ± 1.08 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
78 µs ± 782 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

cuda, sorted_input=False:
213 µs ± 1.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
291 µs ± 3.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
250 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
321 µs ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

cuda, sorted_input=True:
45.6 µs ± 2.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
110 µs ± 2.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
82 µs ± 857 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
143 µs ± 409 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```

```python
print(torch.__version__)
print()
a1, a2 = tensor.unbind(1)
indices = (a1 * tensor.max() + a2).sort().indices
a = tensor.index_select(0, indices).to('cpu')
print('cpu, sorted_input=False:')
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_inverse=True)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_counts=True)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_inverse=True, return_counts=True)
print()
print('cpu, sorted_input=True:')
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_inverse=True)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_counts=True)
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_inverse=True, return_counts=True)
print()
a = a.to('cuda')
print('cuda, sorted_input=False:')
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, return_inverse=True, return_counts=True); torch.cuda.synchronize()
print()
print('cuda, sorted_input=True:')
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted_input=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```

```
cpu, sorted_input=False:
55.4 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
55.8 ms ± 616 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
55.2 ms ± 402 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
55.1 ms ± 725 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

cpu, sorted_input=True:
54.7 ms ± 585 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
55.2 ms ± 1.23 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
54.5 ms ± 865 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
54.9 ms ± 577 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

cuda, sorted_input=False:
171 µs ± 783 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
220 µs ± 1.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
203 µs ± 2.95 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
251 µs ± 2.83 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

cuda, sorted_input=True:
59.6 µs ± 757 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
113 µs ± 431 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
93.2 µs ± 2.13 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
147 µs ± 2.81 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
The CPU implementation of `unique_dim` is super slow, see https://github.com/pytorch/pytorch/issues/18987, but this PR will not worry about this issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19060

Differential Revision: D14866909

Pulled By: ezyang

fbshipit-source-id: d20012cec68c37b05cf770a6f4d6524f910b950f
2019-04-10 07:36:08 -07:00
Vishwak Srinivasan
487388d8ad Rename btrisolve to lu_solve (#18726)
Summary:
Changelog:
- Rename `btrisolve` to `lu_solve` to remain consistent with names of solve methods (`cholesky_solve`, `triangular_solve`, `solve`)
- Fix all callsites
- Rename all tests
- Create a tentative alias for `lu_solve` under the name `btrisolve` and add a deprecation warning to not promote usage
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18726

Differential Revision: D14726237

Pulled By: zou3519

fbshipit-source-id: bf25f6c79062183a4153015e0ec7ebab2c8b986b
2019-04-09 15:21:24 -07:00
Vishwak Srinivasan
e73be58ff7 Rename btriunpack to lu_unpack (#18529)
Summary:
Changelog:
- Renames `btriunpack` to `lu_unpack` to remain consistent with the `lu` function interface.
- Rename all relevant tests, fix callsites
- Create a tentative alias for `lu_unpack` under the name `btriunpack` and add a deprecation warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18529

Differential Revision: D14683161

Pulled By: soumith

fbshipit-source-id: 994287eaa15c50fd74c2f1c7646edfc61e8099b1
2019-03-29 13:01:30 -07:00