Commit Graph

262 Commits

Author SHA1 Message Date
Thomas Viehmann
7cbe63da86 improve handling of precision issue in torch.multinomial (solves #4858) (#5774)
* improve handling of precision issue in torch.multinomial (solves #4858)

* add test

* review feedback - eliminate size check. Thanks!
2018-03-17 10:26:22 -04:00
Tongzhou Wang
940a0ab67b Add logdet and slogdet (#5393)
* 1. Add logdet and slogdet on the ATen side
2. Previously, det could return a result with an incorrect sign for symmetric
   matrices. This was caused by a wrong assumption I had about SVD (that U=V^T
   when the input is symmetric). This fixes it.
3. Moreover, after fixing 2, QR is now always needed for the det forward, so I moved
   SVD to the backward call. Since this is a specific variant of SVD, it is named
   _svd_with_positive_UV_det, with its derivative.yaml entry being svd_backward.
4. Updated/added backward functions for det, logdet and slogdet, which use
   _svd_with_positive_UV_det and svd_backward internally.
5. Optimized svd_backward:
   a. Avoid unnecessary kernels when only sigma has a gradient (this is the usual
      case, and is also true for the *det backward functions).
   b. Fix SVD double backward by avoiding a nan.

* 1. Add/update grad checks for det, logdet, and slogdet.
2. Fix an incorrect check for dim_args_idx in test_autograd.py
3. Add option to only test a subset of output values, specified by
   test_output_indices, for cases like slogdet where only the
   second output is differentiable.
4. Add better doc for the test generating list.

* Add/improve output tests for det, logdet and slogdet
Add a scaling to random matrices so closeness checks are more robust

* Remove unnecessary Variable wrappers in some test files

* Add logdet slogdet docs

* Improve an err msg in THTensorLapack.c

* add inverse-based backward for invertible matrices
use SVD only for the non-invertible case, so the special variant is no longer needed

* use LU rather than QR
2018-03-16 09:23:00 -04:00
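
A minimal usage sketch for the logdet/slogdet functions added in the commit above, assuming a build that already includes it; the variable names are illustrative only:

    import torch

    A = torch.randn(4, 4)
    A = A.mm(A.t()) + 1e-3 * torch.eye(4)    # symmetric positive definite, so det(A) > 0

    sign, logabsdet = torch.slogdet(A)       # stable even when det(A) would under- or overflow
    print(sign)                              # 1 for a positive determinant
    print((torch.logdet(A) - logabsdet).abs())        # logdet agrees with slogdet here
    print((torch.det(A).log() - logabsdet).abs())     # and with log(det) for this well-conditioned matrix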
Richard Zou
74043b69c2 Alias torch.diagonal, torch.diagflat (#5622)
* Alias torch.diagonal, torch.diagflat

* Address comments; Add sanity tests for torch.diagonal and torch.diagflat
2018-03-09 23:46:42 -05:00
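
For reference, a hedged one-line view of each alias (both mirror their NumPy counterparts):

    x = torch.randn(3, 3)
    d = torch.diagonal(x)       # main diagonal of a matrix, like numpy.diagonal
    m = torch.diagflat(d)       # matrix with d placed on its diagonal, like numpy.diagflat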
Richard Zou
8ab101ccee Implement pow() for integer types (#5526)
* CPU int-types pow()

* CUDA int-type pow()

* Cleanup + fix deleted line

* Tests for integer-types pow

* Fix build

* Fix windows tests

* Make _test_int_pow static
2018-03-08 22:33:32 -05:00
Richard Zou
461e3e3ae0 Allow indexing tensors with both CPU and CUDA tensors (#5583)
* Allow indexing tensors with both CPU and CUDA tensors

* Remove stray import
2018-03-07 10:24:12 -05:00
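
A small sketch of the new behavior from the commit above, assuming a CUDA-capable build:

    if torch.cuda.is_available():
        x = torch.randn(5).cuda()
        idx = torch.LongTensor([0, 2, 4])   # index tensor deliberately left on the CPU
        print(x[idx])                       # mixing a CPU index with a CUDA tensor now works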
Will Feng
9235277dba Re-enable some CUDA tests on Windows (#5446)
This PR enables the following tests on Windows again:

CUDA HalfTensor tests in test_torch.py and test_nn.py
test_Conv2d_deterministic_cudnn in test_nn.py
test_*Tensor_qr_big in test_cuda.py

The issues are no longer reproducible, possibly because of an upgrade to the display driver.

* Reenable CUDA HalfTensor tests on Windows

* Reenable test_Conv2d_deterministic_cudnn on Windows

* Reenable test_*Tensor_qr_big on Windows
2018-03-01 12:21:17 -05:00
Sam Gross
509aed6ca3
More Variable/Tensor clean-ups (#5464) 2018-02-28 16:46:47 -05:00
gchanan
94938be367
Support dtypes in legacy new constructors. (#5343)
* Support dtypes in legacy new constructors.

* Add comment about why we don't have dtype for sparse (indices, values).

* separate legacy tensor ctor vs new (new includes dtypes).

* Use TypeError.
2018-02-28 12:52:11 -05:00
Sam Gross
30ec06c140
Merge Variable and Tensor classes (#5225)
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.

To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean-up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.

There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:

 https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
2018-02-23 18:03:31 -05:00
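
A minimal check of the merged behavior described above (a hedged sketch; after this commit, factory functions return Variables directly):

    import torch
    from torch.autograd import Variable

    t = torch.randn(2, 2)             # factory now produces a Variable-backed Tensor
    print(isinstance(t, Variable))    # expected: True after the merge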
Ailing
3ef2e484bf Add fp16 testcases in test_cuda (#5122) 2018-02-21 14:35:29 +01:00
Richard Zou
70e71391d2 Fix THCTensor_(max) and THCTensor_(min) inits (#5265)
Their cuda kernels should be initialized with (min_value, 0) and
(max_value, 0), respectively, where the second number is a default index
value. However, they were being initialized with (max, 1) and (min, 1)
instead, probably a remnant from the lua torch days.

This caused bugs in torch.max() and torch.min() when the input is at the
extreme values, and the max value (or min value) occurs at index 0. For example,

  import torch
  x = torch.ByteTensor([[0]])
  x.cuda().max(dim=0)  # returns (0, 1) but the expected result is (0, 0)
2018-02-15 14:41:19 -08:00
Sam Gross
85e22b5475
Reverts force_gpu_half changes from #3660 (#5000)
The test_cuda.py setup purports to test half tensors, but actually just
re-tests FloatTensors because the keys in type_map were str instead of
type. Testing HalfTensors properly is more complicated: it requires
precision changes and excluding some unimplemented methods.

We should fully test half CUDA tensors. This change just deletes the
duplicate tests of FloatTensor.
2018-02-07 15:33:17 -05:00
Tongzhou Wang
47ee86776e Fix CPU torch.multinomial with noncontiguous prob tensor (#5093)
* fix CPU torch.multinomial not working on a noncontiguous probability distribution

* address comments

* change some tabs to spaces in THStorage.c
2018-02-06 22:11:43 -05:00
Peter Goldsborough
86fd5fd524 Replace async with non_blocking for Python 3.7 (#4999)
* Replace async with non_blocking for Python 3.7 upgrade

* Remove trailing whitespace

* Give _cuda and _type kwargs and accept async for compatibility

* Rename async to non_blocking in all C++ code

* Add entries for async in python_variable_methods

* Friendlier backward compatibility for cuda and type
2018-02-02 09:23:51 -05:00
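
A hedged sketch of the renamed keyword; `async` becomes a reserved word in Python 3.7, hence the change:

    if torch.cuda.is_available():
        x = torch.randn(1024).pin_memory()
        y = x.cuda(non_blocking=True)   # previously x.cuda(async=True); the old spelling is kept for compatibility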
albanD
6c197c2f15 fix triu and tril for zero-strided inputs on gpu (#4962) 2018-01-31 14:38:49 -05:00
Will Feng
82fed06535 disable qr_big cuda test on Windows (#4747) 2018-01-23 21:29:32 -05:00
Richard Zou
c7a2e318ed Restore cuda variable.bernoulli() (#4787) 2018-01-23 21:12:47 -05:00
Adam Paszke
1061d7970d Move broadcast and broadcast_coalesced to C++ 2018-01-18 11:16:45 +01:00
Tongzhou Wang
5918243b0c Methods for checking CUDA memory usage (#4511)
* gpu mem allocated

* add test

* addressed some of @apaszke 's comments

* cache stats

* add more comments about test
2018-01-09 11:47:48 -05:00
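
A hedged sketch of the memory-introspection helpers this commit introduces, using the names they carry in later releases (memory_allocated / max_memory_allocated):

    if torch.cuda.is_available():
        before = torch.cuda.memory_allocated()
        x = torch.randn(1024, 1024).cuda()
        print(torch.cuda.memory_allocated() - before)   # bytes currently held by live tensors
        print(torch.cuda.max_memory_allocated())        # peak allocation since the process started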
Sam Gross
b8fd57a0cc
Fix handling of empty indices in CUDA Tensor.put_ (#4486)
Fixes #4386
2018-01-05 12:58:27 -05:00
Will Feng
c6adee0807 disable CUDA HalfTensor tests in test_cuda for Windows (#4482) 2018-01-04 22:58:13 +01:00
Fritz Obermeyer
35abc4efa2 Add low-precision digamma() and polygamma() functions (#4399) 2018-01-02 11:53:23 +01:00
Vishwak Srinivasan
e519ef5337 Adding torch.expm1() and its inplace function (#4350) 2017-12-28 18:56:03 +09:00
Sam Gross
1632ab2979
Fix default device for Variable.new() (#4307)
Variable.new() should default to the device of "self" if no device is
specified. Previously, we were using the current device. This now
matches Tensor.new().
2017-12-21 18:35:35 -05:00
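
A sketch of the fixed semantics, assuming a machine with at least two GPUs:

    if torch.cuda.device_count() > 1:
        x = torch.randn(2, 2).cuda(1)
        y = x.new(3, 3)             # allocated on device 1 (the device of x), not the current device
        print(y.get_device())       # expected: 1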
Tongzhou Wang
d8b2e5d091 Add python only default init expression; Implement stft, hann/hamming/bartlett window. (#4095)
* implement stft

* addressed comments; implemented window functions; added support for python only default initialization
2017-12-18 12:28:23 -05:00
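
A hedged usage sketch of the window helpers added above; torch.stft's exact signature has shifted across releases, so only the window functions are shown:

    w_hann = torch.hann_window(400)
    w_hamming = torch.hamming_window(400)
    w_bartlett = torch.bartlett_window(400)
    print(w_hann.size(), w_hamming.size(), w_bartlett.size())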
Tongzhou Wang
e0d5d1b7c9 view in certain noncontig case (#4062) 2017-12-18 02:08:17 -05:00
Richard Zou
9394e65b44 Add proper shape checking to torch.cat (#4087)
* Fix catArray in THTensor

Asserts that the inputs have the same size except in the
cat dimension or are empty (or a mix of both).

* Fix catArray for THCTensor

* Document torch.cat shape checks

* Fix types
2017-12-18 02:05:58 -05:00
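
A hedged illustration of the shape rule the commit documents: all inputs must agree in every dimension except the cat dimension (or be empty):

    a = torch.randn(2, 3)
    b = torch.randn(4, 3)
    ok = torch.cat([a, b], dim=0)                   # sizes agree everywhere except dim 0 -> allowed
    try:
        torch.cat([a, torch.randn(2, 5)], dim=0)    # mismatch outside the cat dim -> rejected
    except RuntimeError as e:
        print(e)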
Sam Gross
bec0349280 Implement Variable.cuda and Variable.type using ATen (#4139)
* Implement Variable.cuda using ATen

This adds an optional async flag to Tensor::copy_, which attempts to do
a non-blocking copy if one of the tensors is in pinned memory and the
other is a CUDA tensor.

* Perform cross-device copy in CopyBackwards

Also call torch.cuda._lazy_init() from Variable.cuda()

* Implement Variable.type via ATen

* Changes from review:

 - remove copy_out
 - remove unnecessary include
 - fix default device for .cuda()

* Combine if statements in dispatch_type
2017-12-18 01:54:35 -05:00
Richard Zou
dac5e6568d Better error messages for blas ops with cuda.LongTensor (#4160)
* Better error messages for blas ops with cuda.LongTensor

Fixes #4157

Test plan

Try matrix multiplying with cuda.LongTensors

>>> import torch
>>> x = torch.randn(4, 4).long().cuda()
>>> y = torch.randn(4, 4).long().cuda()
>>> x.mm(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: addmm for CUDA tensors only supports floating-point types. Try converting the tensors with .float() at /private/home/rzou/pytorch/pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:381
2017-12-14 11:28:59 -05:00
Sam Gross
aeb7a3668d
Implement Variable.new (#4080) 2017-12-11 15:45:43 -05:00
Tongzhou Wang
c681b03d37 Add determinant function on variable; Add backward on svd (#3816)
* determinant on variable

* svd bwd
2017-12-01 13:22:46 -05:00
Adam Paszke
6ae0d477ea Fix cuBLAS arguments for fp16 dot (#3660)
* Fix cuBLAS arguments for fp16 dot

* Enable FloatTensor <-> CUDA HalfTensor checks in test_cuda.py
2017-11-29 07:16:34 -08:00
Richard Zou
ec389f5128 Fix cuda symeig (#3566)
* Fix cuda symeig

* Add symeig test

* Better check for magma
2017-11-08 20:20:14 -05:00
Richard Zou
00d2befba1 THTensor_varOuterDim numeric stability (#3533) 2017-11-07 13:47:20 -05:00
Richard Zou
3d06a1e075 Make THCTensor_varInnermostDim numerically stable using Welford's algorithm (#3425)
* Use Welford's algorithm when reducing along inner dimension for THCTensor's variance fn

* Use accreals in THCTensor's varInnermostDim

* Skip cuda tests if no cuda

* Variance testing
2017-11-06 16:00:29 -05:00
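
For reference, a minimal Python sketch of Welford's algorithm, the numerically stable running-variance update that this commit applies inside the CUDA reduction (the helper name is illustrative):

    def welford_variance(values, unbiased=True):
        mean, m2, n = 0.0, 0.0, 0
        for x in values:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)    # the second delta uses the updated mean
        return m2 / (n - 1 if unbiased else n)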
SsnL
8fd171a6fd add test_index to test_cuda 2017-11-06 14:21:31 -05:00
Sam Gross
7c0b16c140 Add torch.take and Tensor.put_ (#3263)
* Add torch.take and Tensor.put_

These are similar to numpy.take and numpy.put. The take function allows
you to linearly index into a tensor without viewing it as a 1D tensor
first. The output has the same shape as the indices. The put function
copies values into a tensor, also using linear indices.
2017-11-01 06:04:44 -04:00
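
A hedged sketch of the linear-indexing semantics described above:

    x = torch.Tensor([[1, 2, 3],
                      [4, 5, 6]])
    idx = torch.LongTensor([0, 4])
    print(torch.take(x, idx))               # picks elements 0 and 4 in row-major order: [1, 5]
    x.put_(idx, torch.Tensor([10, 20]))     # writes the values back at the same linear positions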
SsnL
91a8d3325e test sparse dp, broadcast_coalesced, reduce_add_coalesced 2017-10-28 18:52:35 -04:00
Ozan Çağlayan
e43a63a968 tensor: Ensure that the tensor is contiguous before pinning (#3266) (#3273)
* tensor: Ensure that the tensor is contiguous before pinning (#3266)

pin_memory() was producing an out-of-order tensor when the given
tensor was transposed, i.e. in column-major order.
This commit fixes this by calling contiguous() before pinning.

* test: add contiguous test for pin_memory (#3266)
2017-10-25 13:17:54 +02:00
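
A hedged check of the fix: pinning a transposed (hence non-contiguous) tensor should now preserve element order:

    x = torch.randn(3, 4).t()          # column-major, non-contiguous view
    p = x.pin_memory()                 # contiguous() is now applied before pinning
    print(p.is_pinned(), torch.equal(p, x))    # expected: True True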
SsnL
634c8315a4 isContiguous problems (#3148)
* with the size=1 case, a single-point check is impossible; replace it with isContiguousRange

* fix stride in desc; fix undef scope

* add test for this case for cudnn

* assertTrue
2017-10-20 10:20:33 -04:00
Edward Z. Yang
2dcaa40425 Add get_rng_state_all and set_rng_state_all.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-09-30 16:21:04 -04:00
IraKorshunova
2b9765ad02 Erf and erfinv (#2799) 2017-09-20 21:23:45 -04:00
Francisco Massa
1da87118cc Optimize pow for different exponents and add tests 2017-09-10 13:51:05 -04:00
Anton Osokin
0d34a6451a fixing the bug with squeezing a singleton dimension in torch.min and torch.max 2017-08-16 17:51:48 -04:00
Francisco Massa
b797ee04fc Add CUDA version of eye 2017-08-16 17:25:52 -04:00
Gregory Chanan
b3db52fe36 Support __neg__, .neg(), and neg_() for Long, Int, Short tensor types. 2017-08-15 02:51:25 -04:00
Christian Sarofeen
ac76ab5fca Increase tol. for float tensor qr big test.
test_FloatTensor_qr_big test is still a bit flaky on K80. Increasing tolerance to improve reliability as tests are moved around and results change for this test.
2017-07-27 14:23:06 -04:00
ngimel
3c275fe7a0 Increase flaky test tolerance (#2185) 2017-07-22 11:37:34 -04:00
Sam Gross
71ce3448d9 Fix torch.inverse when magma is not available
Fixes #2156
2017-07-21 15:57:43 -04:00
Francisco Massa
82143487b3 Add CUDA support for arange
Also enables CUDA for range
2017-07-19 15:48:20 -04:00
Trevor Killeen
a45ad7cfba Advanced Indexing Part 1 -- Purely Integer Array Indexing 2017-06-22 17:21:50 -04:00
Gregory Chanan
5b81746767 Simplify python warning settings and cleanup tests. 2017-06-11 05:37:59 -04:00
Gregory Chanan
69287250d1 Add a broadcast parameter to copy_, use it in the library in cases where there is non-broadcasting calls exposed by the tests. 2017-06-11 05:37:59 -04:00
Gregory Chanan
5af46cb352 Add broadcasting support for matmul. 2017-06-11 05:37:59 -04:00
Gregory Chanan
a36f95fe26 Add broadcast support for fused-matmul broadcasting. Functions are: addmm, addbmm, addr, addmv, baddbmm. 2017-06-11 05:37:59 -04:00
Gregory Chanan
85d838a028 Testing over the following: 1) CPU tensor out-of-place functions 2) CPU tensor in-place functions 3) GPU tensor out-of-place functions 4) GPU tensor in-place functions 5) torch. functions 6) Fallback semantics (use pointwise nElem matching rather than broadcasting) 2017-06-11 05:37:59 -04:00
Edward Z. Yang
ba690d5607 Add support for NVTX functions. (#1748) 2017-06-10 18:26:58 +02:00
Alykhan Tejani
5f1a16a018 Torch manual seed to seed cuda devices (#1762) 2017-06-10 12:37:21 +02:00
Adam Paszke
7b578dd68e Add scatterAdd 2017-05-25 16:49:48 -04:00
Alexander Matyasko
33b3968660 add larger tests for qr 2017-05-08 16:58:54 -07:00
Trevor Killeen
f273377d19 add device asserts in scatter/gather kernels 2017-05-03 11:12:26 -04:00
Soumith Chintala
77035d151e make topk test unique 2017-04-28 07:30:25 -04:00
Adam Paszke
01a35dcace Fix coalesced CUDA collectives for nonhomogeneous lists 2017-04-11 14:48:54 -07:00
Rudy Bunel
b16a352a3b Fix remainder and cremainder for integer types 2017-04-07 17:17:44 -07:00
albanD
f0c7124420 Allow support for negative dimension argument for all functions 2017-04-06 16:37:00 -07:00
Adam Paszke
91c4ba7980 Add torch.arange and deprecate torch.range 2017-04-03 10:38:58 -04:00
Brandon Amos
bb353ccc17 Add batch triangular factorization and solves, add IntegerTensor to cwrap (#903) 2017-03-23 15:06:00 -04:00
Sam Gross
e50a1f19b3 Use streams in scatter to overlap copy with compute 2017-03-14 22:46:07 +01:00
soumith
7ad948ffa9 fix tests to not sys.exit(), also fix fatal error on THC initialization 2017-03-01 17:37:04 -05:00
Sam Gross
b190f1b5bc Add another pinned memory test.
Checks that pinned memory freed on a different GPU from which it was
allocated isn't re-used too soon.
2017-03-01 12:22:31 +01:00
Luke Yeager
61bd5a0643 [Lint] Address F811 2017-02-27 19:33:00 -05:00
Adam Paszke
4c474a9939 Improve prodall CUDA test 2017-02-20 23:28:31 -08:00
Adam Paszke
a1534cc37d Fix auto-gpu in cat 2017-02-14 21:28:50 +01:00
Sam Gross
712686ce91 Add cat, contiguous, squeeze, and unsqueeze to THPP
Use unsqueeze and view from TH/THC
2017-02-11 17:49:31 +01:00
Luke Yeager
e7c1e6a8e3 [pep8] Fix most lint automatically with autopep8
Here's the command I used to invoke autopep8 (in parallel!):

    git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i

Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.

Also configures flake8 to match pep8's behavior.

Also configures TravisCI to check the whole project for lint.
2017-01-28 01:15:51 +01:00
Adam Paszke
a1fa995044 Fixes and improvements (#593)
* Fix error in ELU backward

* Add --seed flag for tests

* Add test for BatchNorm eval

* Fix autograd.backward docs

* Support cc flags in cuDNN search

* Fix IndexSelect backward formula
2017-01-25 22:21:49 -05:00
Sam Gross
d951d5b1cd Fix tensor.cuda(0) when on non-zero device. (#472) 2017-01-18 01:08:37 -05:00
Adam Paszke
f91bb96071 Remove cmin, cmax and cinv 2017-01-16 19:07:37 -05:00
soumith
b07358b329 renaming test to avoid dot in test name 2016-12-27 13:34:09 -08:00
soumith
2aea8077f9 renaming test to avoid dot in test name 2016-12-27 13:17:04 -08:00
Soumith Chintala
f45d75ed22 make the CUDA-aware tests back off if CUDA is not available 2016-12-24 15:36:00 -05:00
soumith
93ed476e7d adding LAPACK double bindings, adding fmod and remainder 2016-12-22 17:36:47 -08:00
Adam Paszke
59b9eeff49 Expose gather and equals for CUDA tensors 2016-12-19 20:35:08 -05:00
Sam Gross
20fffc8bb7 Fix torch.is_tensor for half tensors (#322)
Fixes #311
2016-12-19 15:27:47 +01:00
Sam Gross
0d7d29fa57 Enable caching allocator for CUDA pinned memory (#275)
Also add binding for CUDA "sleep" kernel
2016-12-02 01:33:56 -05:00
Adam Paszke
88d9fdec2e Add torch.cuda.set_device 2016-12-01 23:14:41 +01:00
Sam Gross
6322cf3234 Allow device=None in Tensor constructor
Setting device=None is the same as not specifying the device (use the
current active device).
2016-12-01 20:09:19 +01:00
Soumith Chintala
103e70ccc5 adding cuda types for tensor methods (#194) 2016-11-02 10:25:58 -04:00
Sam Gross
f2d7e94948 Use torch.Size for Tensor sizes and tuple for strides
See issue #20

The torch.Size class is a tuple subclass which distinguishes sizes from
other tuples so that torch.Tensor(size) is interpreted as size instead
of data.
2016-10-28 19:37:09 +02:00
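
A short illustration of why a distinct Size type matters for the constructor (hedged sketch):

    s = torch.Size([2, 3])
    t = torch.Tensor(s)          # a Size is treated as a shape -> uninitialized 2x3 tensor
    u = torch.Tensor([2, 3])     # a plain list is treated as data -> a tensor holding 2 and 3
    print(t.size(), u.size())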
Adam Paszke
19f2f1a9d3 Buffer values when constructing a CUDA tensor from a sequence 2016-10-24 22:30:11 +02:00
Sam Gross
79ead42ade Add CUDA Stream and Event API (#133) 2016-10-18 12:15:57 -04:00
Sam Gross
ee14cf9438 Add support for pinned memory: (#127)
torch.Storage/Tensor.pin_memory()
torch.Storage/Tensor.is_pinned()
2016-10-15 18:38:26 -04:00
Soumith Chintala
3d6ebde756 qr and ormqr tests and bugfix 2016-10-14 03:10:16 -04:00
Adam Paszke
0c9670ddf0 Allow remapping storages at load time and serialize data in little endian order 2016-10-04 12:54:55 -07:00
Adam Paszke
3f7ab95890 Finish implementation of prng related functions 2016-09-29 11:33:25 -07:00
Adam Paszke
3eac7164f4 Add data parallel functions to nn 2016-09-27 15:45:45 -07:00
Adam Paszke
1ed488da4f Make custom precision of CUDA tests work in inplace mode as well 2016-09-25 12:26:00 -07:00
Adam Paszke
5030d76acf Reduce precision of CUDA blas tests 2016-09-23 21:10:28 -07:00
Adam Paszke
a489884da4 Reduce precision of addmm CUDA test 2016-09-23 17:52:08 -07:00
Adam Paszke
06ab3f962f Refactor _C extension to export some utilities 2016-09-21 08:36:54 -07:00
Adam Paszke
8fdec15a55 Codemod to remove camel case method naming 2016-09-20 08:40:28 -07:00
Adam Paszke
da5bb373e6 Type conversions now use auto gpu 2016-09-15 18:48:27 -07:00
soumith
19ec206bad reducing tolerance in cumprod unit test 2016-09-14 15:53:14 -07:00
Adam Paszke
a0fb1ab86e Reduce precision for addmm and rsqrt CUDA tests 2016-09-14 11:08:53 -04:00
Adam Paszke
75579fcabd Fix Log autograd test 2016-08-23 10:42:36 -07:00
Adam Paszke
686e8d32e2 Add torch.save and torch.load 2016-08-23 07:51:55 -07:00
Adam Paszke
9fff8e7392 Fixes for changes in libs 2016-08-12 22:02:57 -07:00
Adam Paszke
1e905eb4d5 copy -> copy_ 2016-08-12 09:26:33 -07:00
Adam Paszke
12bed8dc0d Add CUDA device selection 2016-08-12 07:46:46 -07:00
Adam Paszke
fa6e5c5bff Update tests and fix CosineEmbeddingCriterion 2016-08-11 13:10:54 -07:00
Adam Paszke
ff00cdd728 Add cunn tests 2016-08-11 08:56:30 -07:00
Adam Paszke
1a57979f41 Add cutorch tests 2016-08-11 06:43:41 -07:00