Commit Graph

419 Commits

Author SHA1 Message Date
Erik Brinkman
91089a7e17 Add GPU implementation of pdist (#11102)
Summary:
Add the GPU kernel version.

The parallelism I went with performs poorly when there are a large number of vectors, but they're all short, as I don't allocate the thread pool to wrap in that case.

Test Plan
---------
```
python -m unittest test_torch.TestTorch.test_pdist_{empty,scipy} test_nn.TestNN.test_pdist{,_zeros,_empty_row,_empty_col,_cpu_gradgrad_unimplemented,_cuda_gradgrad_unimplemented} test_jit.TestJitGenerated.test_nn_pdist
```

Current performance numbers are a little underwhelming; I'm in the process of debugging.

size | torch | torch cuda | scipy
-----|-------|------------|------
16 x 16 | 9.13 µs ± 3.55 µs | 9.86 µs ± 81.5 ns | 15.8 µs ± 1.2 µs
16 x 1024 | 15 µs ± 224 ns | 9.48 µs ± 88.7 ns | 88.7 µs ± 8.83 µs
1024 x 16 | 852 µs ± 6.03 µs | 7.84 ms ± 6.22 µs | 4.7 ms ± 166 µs
1024 x 1024 | 34.1 ms ± 803 µs | 11.5 ms ± 6.24 µs | 273 ms ± 6.7 ms
2048 x 2048 | 261 ms ± 3.5 ms | 77.5 ms ± 41.5 µs | 2.5 s ± 97.6 ms
4096 x 4096 | 2.37 s ± 154 ms | 636 ms ± 2.97 µs | 25.9 s ± 394 ms
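For readers unfamiliar with the condensed output layout `pdist` produces, a pure-Python reference might look like the following (illustrative only, not the CUDA kernel):

```python
import math

def pdist_ref(vectors):
    """Reference for the condensed pairwise-distance layout torch.pdist
    uses: for n vectors it returns n*(n-1)//2 Euclidean distances,
    ordered as (0,1), (0,2), ..., (0,n-1), (1,2), ..."""
    out = []
    n = len(vectors)
    for i in range(n):
        for j in range(i + 1, n):
            out.append(math.sqrt(sum((a - b) ** 2
                                     for a, b in zip(vectors[i], vectors[j]))))
    return out

# Three 2-d points: distances (0,1)=5, (0,2)=4, (1,2)=3
print(pdist_ref([[0.0, 0.0], [3.0, 4.0], [0.0, 4.0]]))  # [5.0, 4.0, 3.0]
```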
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11102

Differential Revision: D9697305

Pulled By: erikbrinkman

fbshipit-source-id: 2b4f4b816c02b3715a85d8db3f4e77479d19bb99
2018-09-07 09:09:46 -07:00
Edward Yang
49231ab0a8 Reimplement storage slicing. (#11314)
Summary:
In #9466 I got rid of storage views and eliminated all places where
they were used... OR SO I THOUGHT.  In actuality, under certain
conditions (specifically, if you trained a CUDA multiprocessing model
shared over CUDA IPC and then serialized your parameters), you could
also serialize storage slices to the saved model format.  In #9466,
I "fixed" the case when you loaded the legacy model format (really,
just unshared the storages--not strictly kosher but if you aren't
updating the parameters, shouldn't matter), but NOT the modern model format, so
such models would fail.

So, I could have applied the legacy model format fix too, but
hyperfraise remarked that he had applied a fix that was effectively
the same as unsharing the storages, but it had caused his model to
behave differently.  So I looked into it again, and realized that
using a custom deleter, I could simulate the same behavior as old
storage slices.  So back they come.

In principle, I could also reimplement storage views entirely using
our allocators, but I'm not going to do that unless someone really
really wants it.

Fixes #10120.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11314

Reviewed By: ailzhang

Differential Revision: D9671966

Pulled By: ezyang

fbshipit-source-id: fd863783d03b6a6421d6b9ae21ce2f0e44a0dcce
2018-09-06 16:11:59 -07:00
Wei Yang
425ea6b31e fix doc for functional.dropout* (#10417)
Summary:
- fixes #4177
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10417

Differential Revision: D9542876

Pulled By: weiyangfb

fbshipit-source-id: 480ed973d1fe0364f4acb5cd596c2031895b82df
2018-09-05 17:26:00 -07:00
Thomas Viehmann
267e1ec112 Accept more numpy scalars as doubles (#9659)
Summary:
Allows multiplication of e.g. numpy.float32 with tensors.

This came up with #9468

If you want this, I'll add tests after the other patch is done (the two would conflict, so I prefer to wait).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9659

Differential Revision: D8948078

Pulled By: weiyangfb

fbshipit-source-id: c7dcc57b63e2f100df837f70e1299395692f1a1b
2018-09-05 10:25:55 -07:00
Thomas Viehmann
d4060d2d0e Implement torch.tensordot (#10025)
Summary:
Fixes: #8988
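For context, `torch.tensordot` contracts specified axes of two tensors, as in `numpy.tensordot`. A hedged pure-Python sketch of the simplest `dims=1` case on 2-d inputs (not the ATen implementation, which reduces to a reshaped matrix product):

```python
def tensordot1(a, b):
    """Contract the last axis of 2-d `a` with the first axis of 2-d `b`
    (the dims=1 case of tensordot, i.e. a plain matrix product).
    Pure-Python sketch over nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]
```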
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10025

Reviewed By: ezyang

Differential Revision: D9540967

Pulled By: yf225

fbshipit-source-id: 6ba2a7777162983977db884b693e6f4543b31aeb
2018-09-04 21:10:07 -07:00
Christian Puhrsch
313e89d8db Fix dimension collapsing (#11226)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/11206
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11226

Differential Revision: D9646638

Pulled By: cpuhrsch

fbshipit-source-id: 104f367f75a4478bb7580324ea3661de71b2c8b0
2018-09-04 17:27:52 -07:00
Tongzhou Wang
7e2136c2b5 remove allclose from test_doc skipped list
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11187

Differential Revision: D9628349

Pulled By: SsnL

fbshipit-source-id: 0ff94666542ca049a6d82091bd9fc79ec1699ac6
2018-09-03 09:39:56 -07:00
iotamudelta
33c7cc13ca improve docker packages, fix bugs, enable tests, enable FFT (#10893)
Summary:
* improve docker packages (install OpenBLAS to have at-compile-time LAPACK functionality w/ optimizations for both Intel and AMD CPUs)
* integrate rocFFT (i.e., enable Fourier functionality)
* fix bugs in ROCm caused by wrong warp size
* enable more test sets, skip the tests that don't work on ROCm yet
* don't disable asserts any longer in hipification
* small improvements
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10893

Differential Revision: D9615053

Pulled By: ezyang

fbshipit-source-id: 864b4d27bf089421f7dfd8065e5017f9ea2f7b3b
2018-09-02 08:54:42 -07:00
Tongzhou Wang
1350f76b62 Fix max and min with inf on CUDA (#11091)
Summary:
Fixes #10237 #11084

cc vishwakftw
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11091

Differential Revision: D9582859

Pulled By: SsnL

fbshipit-source-id: 3991c0a2af65ba82fa815b82f9e6b2107912fd10
2018-09-01 23:09:23 -07:00
Erik Brinkman
611a608517 Add ATen pdist CPU kernel (#10782)
Summary:
Also add single grad whitelist to the jit test
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10782

Reviewed By: ezyang

Differential Revision: D9583378

Pulled By: erikbrinkman

fbshipit-source-id: 069e5ae68ea7f3524dec39cf1d5fe9cd53941944
2018-08-30 11:55:27 -07:00
pbialecki
2cc98d8df7 Adds dim argument to torch.unique (#10423)
Summary:
Initial version of `unique` supporting a `dim` argument.

As discussed in [this issue](https://github.com/pytorch/pytorch/issues/9997) I added the `dim` argument to `torch.unique` with the same behavior like [numpy](https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.unique.html).

Since the implementation is based on `std/thrust::unique`, the tensor always needs to be sorted. The `sorted` argument in `torch.unique` therefore has no effect, just as in the CUDA version of the plain `torch.unique`.

To check the performance and equal behavior between `torch.unique` and `np.unique`, I've used [this gist](https://gist.github.com/ptrblck/ac0dc862f4e1766f0e1036c252cdb105).

Currently we achieve the following timings for an input of `x = torch.randint(2, (1000, 1000))`:
(The values are averages of the times over both dimensions)

| Device | PyTorch (return_inverse=False) | Numpy (return_inverse=False) | PyTorch (return_inverse=True) | Numpy (return_inverse=True) |
| --- | --- | --- | --- | --- |
| CPU | ~0.007331s | ~0.022452s | ~0.011139s | ~0.044800s |
| GPU | ~0.006154s | - | ~0.105373s | - |

Many thanks to colesbury for the awesome mentoring and the valuable advice on the general implementation and performance issues!
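To illustrate the semantics benchmarked above, here is a pure-Python sketch of `unique` along `dim=0` for a 2-d input (illustrative only; the real kernels use `std/thrust::unique` on the sorted data):

```python
def unique_rows(rows, return_inverse=False):
    """Sketch of unique along dim=0: each row is treated as one element.
    Output rows come back sorted, mirroring the sort-based approach."""
    uniq = sorted(set(map(tuple, rows)))
    if not return_inverse:
        return [list(r) for r in uniq]
    index = {r: i for i, r in enumerate(uniq)}
    # inverse[i] is the position of input row i in the unique output
    inverse = [index[tuple(r)] for r in rows]
    return [list(r) for r in uniq], inverse
```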
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10423

Differential Revision: D9517289

Pulled By: soumith

fbshipit-source-id: a4754f805223589c2847c98b8e4e39d8c3ddb7b5
2018-08-29 16:26:09 -07:00
Tongzhou Wang
e9eed8edb4 Add doc for Tensor.digamma_? (#11008)
Summary:
follow up for #10967

zou3519 vishwakftw
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11008

Differential Revision: D9559889

Pulled By: SsnL

fbshipit-source-id: a05d8fbad92a54bcdb93de6e62a7f94180da1d99
2018-08-29 14:11:16 -07:00
Christian Puhrsch
ec519e8a4a Reduce number of elements within test_abs
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10997

Differential Revision: D9556861

Pulled By: cpuhrsch

fbshipit-source-id: 986ef275e94fcffcc04a5c1103b8b7bfb4ae3ba5
2018-08-29 12:55:54 -07:00
Ailing Zhang
a9469c9c8a Fill eigenvector with zeros if not required (#10645)
Summary:
Fix #10345, which only happens in CUDA case.

* Instead of returning some random buffer, we fill it with zeros.

* update torch.symeig doc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10645

Reviewed By: soumith

Differential Revision: D9395762

Pulled By: ailzhang

fbshipit-source-id: 0f3ed9bb6a919a9c1a4b8eb45188f65a68bfa9ba
2018-08-29 10:55:22 -07:00
Wei Yang
f1df85d799 bug-fix in normal_( ) (#10846)
Summary:
- fixes #10642
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10846

Differential Revision: D9495014

Pulled By: weiyangfb

fbshipit-source-id: 35a9fc349f9f0c21a24141f29c62853ab6a68dae
2018-08-24 11:26:18 -07:00
Will Feng
b14f2e899c Preserve sparse tensor shape and dim invariants, and add scalar tensor support (#9279)
Summary:
When 0-sized dimension support is added, we expect an empty sparse tensor to be a 1-dimensional tensor of size `[0]`, with `sparseDims == 1` and `denseDims == 0`. Also, we expect the following invariants to be preserved at all times:

```
_sparseDims + _denseDims = len(shape)
_indices.shape: dimensionality: 2,  shape: (_sparseDims, nnz)
_values.shape:  dimensionality: 1 + _denseDims.  shape: (nnz, shape[_sparseDims:])
```

This PR fixes various places where the invariants are not strictly enforced when 0-sized dimension support is enabled.

Tested and `test_sparse.py` passes locally on both CPU and CUDA with the `USE_TH_SIZE_ZERO_DIM` flag.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9279

Differential Revision: D8936683

Pulled By: yf225

fbshipit-source-id: 12f5cd7f52233d3b26af6edc20b4cdee045bcb5e
2018-08-23 10:10:24 -07:00
Vishwak Srinivasan
5fb9b31ed5 Add matrix_rank (#10338)
Summary:
- Similar functionality as NumPy
- Added doc string
- Added tests

Differential Revision: D9240850

Pulled By: SsnL

fbshipit-source-id: 1d04cfadb076e99e03bdf699bc41b8fac06831bf
2018-08-22 09:58:38 -07:00
Vishwak Srinivasan
8013dac43d Fix bincount for empty input (#9757)
Summary:
Added tests too. Fixes #9756 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9757

Reviewed By: Yangqing

Differential Revision: D9348485

Pulled By: soumith

fbshipit-source-id: e13afadf8dbea20ee6ee595383c522dcbaf8796a
2018-08-15 20:55:59 -07:00
Thomas Viehmann
151e7de893 varargs for einsum (#10067)
Summary:
Implemented via a wrapper, thank you Richard for the suggestion!

Fixes: #9929
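A hedged sketch of what such a varargs wrapper can look like (`_einsum_impl` below is a hypothetical stand-in for the underlying list-taking op, not the real internal name):

```python
def _einsum_impl(equation, operands):
    """Hypothetical stand-in for the underlying op that takes a list."""
    return equation, len(operands)

def einsum(equation, *operands):
    # Repack the varargs call einsum(eq, a, b) into the list form
    # einsum(eq, [a, b]), while still accepting the old list-style call.
    if len(operands) == 1 and isinstance(operands[0], (list, tuple)):
        operands = tuple(operands[0])
    return _einsum_impl(equation, list(operands))
```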
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10067

Differential Revision: D9083388

Pulled By: soumith

fbshipit-source-id: 9ab21cd35278b01962e11d3e70781829bf4a36da
2018-08-15 15:13:25 -07:00
Tongzhou Wang
d043f83019 Add tests for Tensor.* nn.* F.* docs (#10311)
Summary:
Test only for existence for now. I had to skip a lot of them, so there's a FIXME in the test.

Also, I'm not testing torch.* because of a namespace issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10311

Differential Revision: D9196341

Pulled By: SsnL

fbshipit-source-id: 9c2ca1ffe660bc1cc664474993f8a21198525ccc
2018-08-14 11:39:46 -07:00
Vishwak Srinivasan
7d16e87f14 Fix byte ordering issue in from_numpy (#9508)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/3671 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9508

Differential Revision: D9307186

Pulled By: soumith

fbshipit-source-id: 39dcaa6fd2d330d7085802acd6f63c19270164fa
2018-08-13 21:39:16 -07:00
pbialecki
c6fc3ab557 fixes printing non-contiguous tensors
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10405

Differential Revision: D9302794

Pulled By: soumith

fbshipit-source-id: e4a7db8d33400a5a050d05fd1679de8bc3cbcf30
2018-08-13 16:26:20 -07:00
iotamudelta
75651d5b58 improve use of ROCm libraries, enable more tests, small fixes (#10406)
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406

Reviewed By: Jorghi12

Differential Revision: D9277093

Pulled By: ezyang

fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
2018-08-13 11:39:43 -07:00
Christian Puhrsch
0b8a0125ab Fixes torch.log after torch.expand giving incorrect results (#10269)
Summary:
fixes #10241
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10269

Differential Revision: D9272472

Pulled By: cpuhrsch

fbshipit-source-id: cd1afbb4386a0d0956ee21b24f0d529755b986ca
2018-08-10 13:39:38 -07:00
Gregory Chanan
209af45614 Back out "[pytorch][PR] Fix bincount for empty input"
Summary: Original commit changeset: 6c4c66c23679

Reviewed By: SsnL

Differential Revision: D9253403

fbshipit-source-id: bf5ee669ed095c06ff58a2871f7350e879261076
2018-08-09 14:25:33 -07:00
Vishwak Srinivasan
b43beec070 Fix bincount for empty input (#9757)
Summary:
Added tests too. Fixes #9756 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9757

Differential Revision: D8966879

Pulled By: soumith

fbshipit-source-id: 9f08a9d5d5d037db16319141d7a227a5efa23869
2018-08-09 12:40:45 -07:00
Thomas Viehmann
6e49f933ad Check that result is on CPU for CPU unary ops kernels (#10358)
Summary:
Fixes: #10270
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10358

Differential Revision: D9233066

Pulled By: soumith

fbshipit-source-id: 39b7524fe55ddb899fb27e2c0ef504ce54dbad35
2018-08-08 21:11:53 -07:00
Roy Li
fe68879832 Fix dir(torch) for python 3.7 (#10271)
Summary:
fixes #10160.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10271

Differential Revision: D9188031

Pulled By: li-roy

fbshipit-source-id: a3620553a8ba2b7391acdf78dbe58afcdb6c5f7f
2018-08-07 09:57:51 -07:00
iotamudelta
a38b572de3 enable unit tests and other changes (#10266)
Summary:
This PR for the ROCm target does the following:
* enable some unit tests on ROCm
* fix a missing static_cast that breaks BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc
* improve the pyhipify script by introducing kernel scope to some transpilations, plus other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failure of the elementwise kernel by removing non-working ROCm specialization
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10266

Differential Revision: D9184178

Pulled By: ezyang

fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297
2018-08-06 14:54:01 -07:00
Owen Anderson
7a377b9a53 Add torch.argsort mirroring similar functionality in numpy. (#9600)
Summary:
Per issue #9542
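A pure-Python sketch of the semantics being mirrored (matching `numpy.argsort`: return the indices that would sort the input; illustrative only, not the ATen code):

```python
def argsort(values, descending=False):
    """Indices that would sort `values`, like numpy.argsort."""
    return sorted(range(len(values)), key=values.__getitem__,
                  reverse=descending)

print(argsort([3.0, 1.0, 2.0]))  # [1, 2, 0]
```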
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9600

Differential Revision: D8952338

Pulled By: resistor

fbshipit-source-id: c3f69d62858ad9458ec5ae563e3ff24b1c9283a7
2018-08-03 11:45:47 -07:00
Richard Zou
6b338c8026 Implement torch.broadcast_tensors (#10075)
Summary:
This exposes expand_outplace to python. Fixes #8076. Fixes #10041.

I didn't name it torch.broadcast because numpy.broadcast does something
slightly different (it returns an object with the correct shape
information).
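The expansion follows the standard right-aligned broadcasting rule; a sketch of the shape computation (illustrative pure Python, not the `expand_outplace` code):

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Standard broadcasting rule: align shapes from the right;
    each pair of sizes must match, or one of them must be 1."""
    result = []
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b),
                            fillvalue=1):
        if a != b and a != 1 and b != 1:
            raise ValueError(f"incompatible sizes {a} and {b}")
        result.append(max(a, b))
    return tuple(reversed(result))
```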
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10075

Differential Revision: D9125816

Pulled By: zou3519

fbshipit-source-id: ebe17c8bb54a73ec84b8f76ce14aff3e9c56f4d1
2018-08-01 19:18:34 -07:00
Wei Yang
6f6a1f2d63 fix test_load_error_msg failure (Network is unreachable) (#10021)
Summary:
- fixes the `test_load_error_msg` failure (Network is unreachable)
- removed use of urlopen in `test_load_error_msg`

cc soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10021

Differential Revision: D9068108

Pulled By: weiyangfb

fbshipit-source-id: a9484d4a913508d54731b6a1eef3cddff66604f2
2018-08-01 00:24:01 -07:00
Gregory Chanan
34c7c56c73 Re-enable empty n-dimensional empty tensor and fix parallel CPU on empty tensors (#10077)
Summary:
This is a combination of https://github.com/pytorch/pytorch/pull/9947 (this was reverted) and https://github.com/pytorch/pytorch/pull/10076.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10077

Differential Revision: D9087491

Pulled By: gchanan

fbshipit-source-id: 9fe9905628000f2ff3e47df32533cd7d1f25a354
2018-07-31 16:43:45 -07:00
Gregory Chanan
6fb9acfc16 Revert empty n-dim and ATen in C2 integration builds
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/10064

Differential Revision: D9082082

Pulled By: gchanan

fbshipit-source-id: ae49470f5b4c89b13beb55fd825de1ba05b6a4fa
2018-07-31 07:25:56 -07:00
Thomas Viehmann
6c7fb1582f Introduce __array_priority__ on torch.Tensor (#9651)
Summary:
This causes numpy to yield to the torch functions,
e.g. instead of numpy array/scalar __mul__ converting the tensor to
an array, it will now arrange for the Tensor __rmul__ to be called.

Fixes case 2 of #9468
It also makes cases 3 and 4 equivalent but does not fix them.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9651

Differential Revision: D8948079

Pulled By: ezyang

fbshipit-source-id: bd42c04e96783da0bd340f37f4ac3559e9bbf8db
2018-07-30 14:39:43 -07:00
vishwakftw
ea3c36b822 NumPy Scalar to PyTorch Scalar (#9225)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/4985 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9225

Differential Revision: D8769317

Pulled By: ezyang

fbshipit-source-id: eeaeaf0749c9dc9e372634da68b4bd23e6e3ad28
2018-07-30 14:39:40 -07:00
Thomas Viehmann
faa96c1c47 Deal with spaces in einsum equation string (#9994)
Summary:
Fixes #9930
Thank you, vadimkantorov for the report.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9994

Differential Revision: D9042876

Pulled By: ezyang

fbshipit-source-id: 3bbd1aaaf1b432be40a7652b6a746d80934a216b
2018-07-30 12:57:56 -07:00
Gregory Chanan
ce5f0d40b6 Enable n-dimensional empty tensors. (#9947)
Summary:
These could use some autograd tests, which are coming in a later PR, but using them in autograd is probably pretty rare.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9947

Reviewed By: ezyang

Differential Revision: D9032778

Pulled By: gchanan

fbshipit-source-id: fa5a6509d3bac31ea4fae25143e82de62daabfbd
2018-07-30 12:33:17 -07:00
Sam Gross
829d763c69 Implement add, sub, mul, div using TensorIterator (#8919)
Summary:
```
This adds TensorIterator, a helper class for computing element-wise
operations that's intended to replace the CPU and CUDA apply utils
functions.

CPU kernels are implemented as functions that operate on strided 1-d
tensors compared to CPUApplyUtils which operated individual elements. This
allows the kernels to handle vectorization, while TensorIterator handles
parallelization and non-coalesced dimensions.

GPU kernels continue to operate on elements, but the number of
specializations is reduced. The contiguous case remains the same. The
non-contiguous case uses a single (reduced) shape for all operands and
the fast integer division from THCIntegerDivider. To avoid extra
specializations for indexing with 64-bits, large operations are split
into smaller operations that can be indexed with 32-bits.

Major semantic changes:

 - No more s_add, s_mul, s_div, or s_sub. Broadcasting is handled by
   TensorIterator. The autograd engine performs the reduction assuming
   standard broadcasting if the gradient shape does not match the
   expected shape. Functions that do not use standard broadcasting rules
   should either continue to trace the expand calls or handle the
   reduction in their derivative formula.

 - Use ONNX v7, which supports broadcasting ops.

Performance impact:

 - Small increased fixed overhead (~0.5 us)
 - Larger overhead for wrapped numbers (~2.5 us)
 - No significant change for ops on contiguous tensors
 - Much faster worst-case performance for non-contiguous GPU tensors
 - Faster CPU bias addition (~2x)
 - Faster GPU bias addition (~30% faster)

Future work:

 - Decrease overhead, especially for wrapping numbers in Tensors
 - Handle general inter-type operations
 - Extend to unary ops and reductions
 - Use buffering for compute-bound operations on non-contiguous tensors
   (pull in from CPUApplyUtils)
```
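As a rough illustration of the "CPU kernels operate on strided 1-d tensors" point above, a kernel in that style might look like this (pure-Python sketch with lists as flat storage and strides in elements; the real kernels are C++ and vectorized, and TensorIterator supplies the loop bounds and strides):

```python
def strided_add_kernel(a, b, out, n, stride_a, stride_b, stride_out):
    """One 1-d strided loop of an element-wise add: the iterator has
    already collapsed shapes and parallelized; the kernel just walks
    each operand with its own stride."""
    for i in range(n):
        out[i * stride_out] = a[i * stride_a] + b[i * stride_b]

a, b, out = [1, 2, 3, 4], [10, 20, 30, 40], [0, 0, 0, 0]
strided_add_kernel(a, b, out, n=2, stride_a=2, stride_b=2, stride_out=1)
print(out)  # [11, 33, 0, 0] -- every other input element, packed output
```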
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8919

Differential Revision: D8677600

Pulled By: colesbury

fbshipit-source-id: 61bc9cc2a36931dfd00eb7153501003fe0584afd
2018-07-27 14:43:24 -07:00
Gregory Chanan
c0bacc6284 Guard test_lapack_empty with has_magma. (#9936)
Summary:
CUDA lapack functions generally don't work unless has_magma is true.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9936

Differential Revision: D9028579

Pulled By: gchanan

fbshipit-source-id: 9b77e3b05253fd49bcabf604d0924ffa0e116055
2018-07-27 10:09:00 -07:00
Wei Yang
302adb7cc8 added torch.rot90() to ATen (#8628)
Summary:
1. fixes #6271
2. implemented torch.rot90() following [numpy.rot90()](6a58e25703/numpy/lib/function_base.py (L54-L138))
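A pure-Python sketch of the k=1 rotation semantics being matched (illustrative only; `numpy.rot90`'s default rotates counter-clockwise in the plane of the first two axes):

```python
def rot90(matrix):
    """One counter-clockwise 90-degree rotation of a 2-d nested list,
    matching numpy.rot90 with k=1: element (i, j) moves to
    (n_cols - 1 - j, i)."""
    return [list(row) for row in zip(*matrix)][::-1]

print(rot90([[1, 2], [3, 4]]))  # [[2, 4], [1, 3]]
```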
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8628

Reviewed By: ezyang

Differential Revision: D8987860

Pulled By: weiyangfb

fbshipit-source-id: 8dac3b2a1f6d3288672977aba8b547706ce97fe9
2018-07-25 15:11:44 -07:00
Gregory Chanan
be163f50a3 Avoid divide-by-zero when bartlett_window size is 0.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9788

Differential Revision: D8980951

Pulled By: gchanan

fbshipit-source-id: 429b341ac687afe4f1429bb141ef070bf315519c
2018-07-25 10:40:39 -07:00
bhushan
ea67a2bd11 Allows negative index to tensor.narrow (Fixes: #9546)
Summary:
Fixes #9546
Test cases added
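A sketch of the index normalization this adds (hypothetical helper name, illustrative only): a negative `start` counts from the end of the dimension, like Python slicing, so `x.narrow(0, -2, 2)` on a size-5 dimension behaves like `start=3`.

```python
def narrow_start(start, dim_size):
    """Normalize a possibly-negative narrow() start index."""
    if start < 0:
        start += dim_size
    if not 0 <= start <= dim_size:
        raise IndexError("start out of range")
    return start
```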

Reviewed By: ezyang

Differential Revision: D8974842

Pulled By: zou3519

fbshipit-source-id: a7707406c2a21e8e14f9c2a8ad4d64c8b08156df
2018-07-25 09:25:45 -07:00
Edward Yang
0262fd0f91 Delete Tensor::typeString() (#9764)
Summary:
The primary use-site of typeString was checked_cast_tensor.
I did a little more than I needed in this patch, to set
the stage for actually deleting the tensor type.

Specifically, I modified checked_cast_tensor to explicitly
take Backend and ScalarType, the idea being that once we
remove the tensor subclasses, we will delete the T template
parameter.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9764

Differential Revision: D8969196

Pulled By: ezyang

fbshipit-source-id: 9de92b974b2c28f12ddad13429917515810f24c6
2018-07-24 22:26:15 -07:00
Thomas Viehmann
7050d83dd7 Make logsumexp_out inplace (#9755)
Summary:
Fixes: #9754

Maybe this could also make its way into 0.4.1; it is a severe debugging headache if you hit this...
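For context, a pure-Python reference for the value `logsumexp` computes (the PR changes only how the `out` variant writes its result, not the math), using the usual max-shift for numerical stability:

```python
import math

def logsumexp(xs):
    """log(sum(exp(x) for x in xs)), computed stably by factoring
    out the maximum before exponentiating."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))
```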
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9755

Reviewed By: ezyang

Differential Revision: D8967178

Pulled By: zou3519

fbshipit-source-id: 151ed24e3a15a0c67014e411ac808fb893929a42
2018-07-24 12:40:48 -07:00
Vishwak Srinivasan
360c1bbd5b Add multivariate log-gamma (mvlgamma) (#9451)
Summary:
1. Add tests in test_cuda, test_torch
2. Add doc strings

Closes https://github.com/pytorch/pytorch/issues/9378 .
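The quantity computed is the log of the multivariate gamma function; a pure-Python reference sketch of the standard definition (illustrative only, not the ATen kernel):

```python
import math

def mvlgamma(x, p):
    """log of the p-variate gamma function:
    log G_p(x) = p(p-1)/4 * log(pi) + sum_{j=1..p} lgamma(x + (1-j)/2).
    For p=1 this reduces to the ordinary lgamma."""
    return (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(x + (1 - j) / 2.0) for j in range(1, p + 1))
```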

Differential Revision: D8859746

Pulled By: ezyang

fbshipit-source-id: 939c309d90940a7aa08f53004c9e7b3b1c9cf54e
2018-07-24 12:10:10 -07:00
Gregory Chanan
6ab5e697b9 Small fixups for enabling zero size dims. (#9724)
Summary:
1) Properly test cpu for alpha/beta addmm cases.
2) Unsqueeze on empty no longer throws an exception.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9724

Reviewed By: ezyang

Differential Revision: D8958513

Pulled By: gchanan

fbshipit-source-id: 6ce2ec4a47201f9b225b8c52354144ace43e9e09
2018-07-24 11:11:39 -07:00
Gregory Chanan
9d6521c3a0 Support n-dimensional empty tensors in CUDA non-reduction dimension functions (#9658)
Summary:

This also unifies the error checking between scatter/scatterAdd on CUDA.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9658

Differential Revision: D8941527

Pulled By: gchanan

fbshipit-source-id: 750bbac568f607985088211887c4167b67be11ea
2018-07-23 08:40:12 -07:00
Gregory Chanan
3efdece9da Support n-dimensional empty tensors in take/put.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9635

Differential Revision: D8935119

Pulled By: gchanan

fbshipit-source-id: 5035583e7322b1a1720d961945dd0eefb4cb28ef
2018-07-20 15:40:49 -07:00
Gregory Chanan
bae156a481 Support (some) CUDA Lapack on n-dimensional empty tensors.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/9631

Reviewed By: ezyang

Differential Revision: D8933202

Pulled By: gchanan

fbshipit-source-id: 1ade4ca439bf26aa921df1da83a827d860f8f48f
2018-07-20 11:40:25 -07:00