Commit Graph

122 Commits

Author SHA1 Message Date
Du Phan
c345212c86 Support gpu triangle solve (#6648)
* add cuda trtrs

* remove queue

* add test trtrs
2018-04-17 14:33:39 +02:00
Richard Zou
6c0f74089f
More precise digamma (#6517)
* More precise digamma

Fixes #6190.

This is a rebase of #3955 with some tweaks for better performance around
poles. The code is ported over from cephes with permission.

By itself, the cephes code returns inf for the poles.

For better performance around the poles with float32, one intermediate
step is always computed with double precision, regardless of dtype.
This step computes `PI / tan(PI * input)`. This is necessary because small (1e-6)
rounding errors in the inputs to tan have strong effects on the output
(i.e., the derivative of tan is very large at some points).

* Replace usages of finite-differences digamma with newly implemented digamma

* Better behavior near and at poles

* ScalarConvert -> scalar_cast for readability
2018-04-13 11:49:09 -04:00
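As an illustration (not the cephes port in the commit), the double-precision reflection step described above can be sketched in Python; `reflection_term` is a hypothetical helper name:

```python
import math

def reflection_term(x: float) -> float:
    # Sketch of the step the commit computes in double precision regardless
    # of dtype: pi / tan(pi * x), used in the digamma reflection formula.
    # Tiny rounding errors in tan's argument are amplified near the poles,
    # because tan's derivative is very large there.
    return math.pi / math.tan(math.pi * x)
```

For example, at x = 0.5 the term is nearly zero (tan(pi/2) is astronomically large in float arithmetic), while at x = 0.25 it is close to pi.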
gchanan
749d51414a
Separate cuda-ness from dtype. (#6470)
* Separate cuda-ness from dtype.

There are no longer torch.cuda.int64, etc.; only torch.int64, which corresponds to at::ScalarType.
At the Python arg parser level, the corresponding ATen type is selected from the combination of (ScalarType, Layout, Device).

There is also currently unused code in here for supporting ScalarType in native_functions; this will be used for specifying aggregate types
for reduction functions.

* Fix test_autograd.

* Add defaults to randint_like.

* Track is_cuda in py tensor types.

* Fix test_sparse.

* Fix multiprocessing.

* Fix rnn.

* Fix test_nn.

* Fix flake8.
2018-04-12 14:05:44 -04:00
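A hypothetical sketch (not code from the commit) of the selection rule it describes — deriving the backend type from (dtype, layout, device) instead of from a device-qualified dtype like torch.cuda.int64; `backend_for` is an invented name:

```python
def backend_for(dtype: str, layout: str, device: str) -> str:
    # The concrete type is a function of all three properties, so the
    # dtype itself no longer carries cuda-ness.
    backend = {"cpu": "CPU", "cuda": "CUDA"}[device]
    if layout == "sparse":
        backend = "Sparse" + backend
    return f"{backend}:{dtype}"
```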
Tongzhou Wang
37d5c58f4b Skip all TestTorch tests in test_cuda.py (#6489) 2018-04-10 20:31:05 -04:00
albanD
bb097e2a50 [pytorch] Fix signed random_ (#6463)
* Fix cpu signed random

* fix gpu signed tensor

* add test for signed random_

* cleaner tests

* fix lint
2018-04-10 13:07:04 -04:00
Vishwak Srinivasan
0aa35780bf [ready] Implement log2 and log10 in PyTorch (#6272)
* Implemented log2 and log10

* Re-add incorrectly removed files

* Fix minor bugs

* Fix log1p docs

* Add a try-except for python2 math module in log2 test

* Revert changes made to aten/doc/*

* Fix docstring errors

* Fix windows build
2018-04-05 14:28:37 -04:00
Tongzhou Wang
ecd5de0f36 [fft][2 of 3] Forward for fft methods (#5856)
* implement fft ifft rfft irfft

* add tests for fft ifft rfft irfft
2018-03-28 18:44:29 -04:00
gchanan
6ae0576e1c
Remove dtypes from legacy tensor.new(...) (#6081)
This is in preparation for splitting out sparsity (layout) from dtypes; these are complex to maintain,
and tensor.new(...) is a legacy API in any case.
2018-03-28 18:37:21 -04:00
Richard Zou
9923701a0d Fix crash when cat-ing empty cuda tensors (#5971)
Fixes #5739. The CUDA path for `torch.cat` was missing a check for the
case where all input tensors are empty.
2018-03-23 22:22:39 -04:00
Vedanuj Goswami
08b1324ec2 Fix integer overflow in remainder operator (#5906)
* Fix integer overflow in remainder

* Fix remainder operator in CUDA

* Add tests for remainder integer overflow

* Add has_different_sign static function
2018-03-20 22:05:34 -04:00
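An illustrative Python sketch of the semantics this commit protects (not the C/CUDA code itself): torch.remainder returns a result whose sign follows the divisor, and in C the raw `a % b` can overflow for cases like INT_MIN % -1. Python ints do not overflow, so this only demonstrates the sign adjustment via the `has_different_sign` check the commit adds:

```python
import math

def has_different_sign(a: int, b: int) -> bool:
    return (a < 0) != (b < 0)

def remainder(a: int, b: int) -> int:
    # C-style truncated remainder first, then adjust so the result's sign
    # matches the divisor (torch.remainder semantics).
    r = int(math.fmod(a, b))
    if r != 0 and has_different_sign(r, b):
        r += b
    return r
```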
Thomas Viehmann
7cbe63da86 improve handling of precision issue in torch.multinomial (solves #4858) (#5774)
* improve handling of precision issue in torch.multinomial (solves #4858)

* add test

* review feedback - eliminate size check. Thanks!
2018-03-17 10:26:22 -04:00
Tongzhou Wang
940a0ab67b Add logdet and slogdet (#5393)
* 1. Add logdet and slogdet on the ATen side
2. Previously, det could return a result with an incorrect sign for symmetric
   matrices. This was caused by a wrong assumption about SVD (that U=V^T when
   the input is symmetric). This fixes it.
3. Moreover, after fixing 2 now QR is always needed for det forward. So I moved
   SVD to backward call. Since this is a specific variant of SVD, it is named as
   _svd_with_positive_UV_det, with derivative.yaml entry being svd_backward.
4. Updated/added backward functions for det, logdet and slogdet, which uses
   _svd_with_positive_UV_det and svd_backward inside.
5. Optimized svd_backward:
   a. Avoid unnecessary kernels when only sigma has gradient (this is the usual
      case, and also true with *det backward functions).
   b. Fix SVD double backward by avoiding a nan.

* 1. Add/update grad checks for det, logdet, and slogdet.
2. Fix an incorrect check for dim_args_idx in test_autograd.py
3. Add option to only test a subset of output values, specified by
   test_output_indices, for cases like slogdet where only the
   second output is differentiable.
4. Add better doc for the test generating list.

* Add/improve output tests for det, logdet and slogdet
Add a scaling to random matrices so closeness checks are more robust

* Remove unnecessaery Variable wrappers in some test files

* Add logdet slogdet docs

* Improve an err msg in THTensorLapack.c

* add inverse-based backward for invertible matrices
use SVD only for the non-invertible case, so the special variant is no longer needed

* use LU rather than QR
2018-03-16 09:23:00 -04:00
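To illustrate why slogdet exists (this is a generic sketch, not the LAPACK-backed implementation the commit adds): it returns (sign, log|det|), which stays finite where det itself would under- or overflow. A minimal pure-Python version via Gaussian elimination with partial pivoting:

```python
import math

def slogdet(m):
    # Returns (sign, log|det|). Each pivot contributes its sign and the log
    # of its magnitude; a row swap flips the sign; a zero pivot means the
    # matrix is singular, giving (0.0, -inf).
    a = [row[:] for row in m]
    n = len(a)
    sign, logabs = 1.0, 0.0
    for j in range(n):
        p = max(range(j, n), key=lambda i: abs(a[i][j]))  # partial pivot
        if a[p][j] == 0:
            return 0.0, float("-inf")
        if p != j:
            a[j], a[p] = a[p], a[j]
            sign = -sign
        pivot = a[j][j]
        if pivot < 0:
            sign = -sign
        logabs += math.log(abs(pivot))
        for i in range(j + 1, n):
            f = a[i][j] / pivot
            for k in range(j, n):
                a[i][k] -= f * a[j][k]
    return sign, logabs
```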
Richard Zou
74043b69c2 Alias torch.diagonal, torch.diagflat (#5622)
* Alias torch.diagonal, torch.diagflat

* Address comments; Add sanity tests for torch.diagonal and torch.diagflat
2018-03-09 23:46:42 -05:00
Richard Zou
8ab101ccee Implement pow() for integer types (#5526)
* CPU int-types pow()

* CUDA int-type pow()

* Cleanup + fix deleted line

* Tests for integer-types pow

* Fix build

* Fix windows tests

* Make _test_int_pow static
2018-03-08 22:33:32 -05:00
Richard Zou
461e3e3ae0 Allow indexing tensors with both CPU and CUDA tensors (#5583)
* Allow indexing tensors with both CPU and CUDA tensors

* Remove stray import
2018-03-07 10:24:12 -05:00
Will Feng
9235277dba Re-enable some CUDA tests on Windows (#5446)
This PR enables the following tests on Windows again:

CUDA HalfTensor tests in test_torch.py and test_nn.py
test_Conv2d_deterministic_cudnn in test_nn.py
test_*Tensor_qr_big in test_cuda.py

The issues are no longer reproducible, possibly because of an upgrade to the display driver.

* Reenable CUDA HalfTensor tests on Windows

* Reenable test_Conv2d_deterministic_cudnn on Windows

* Reenable test_*Tensor_qr_big on Windows
2018-03-01 12:21:17 -05:00
Sam Gross
509aed6ca3
More Variable/Tensor clean-ups (#5464) 2018-02-28 16:46:47 -05:00
gchanan
94938be367
Support dtypes in legacy new constructors. (#5343)
* Support dtypes in legacy new constructors.

* Add comment about why we don't have dtype for sparse (indices, values).

* separate legacy tensor ctor vs new (new includes dtypes).

* Use TypeError.
2018-02-28 12:52:11 -05:00
Sam Gross
30ec06c140
Merge Variable and Tensor classes (#5225)
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.

To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean-up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.

There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:

 https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
2018-02-23 18:03:31 -05:00
Ailing
3ef2e484bf Add fp16 testcases in test_cuda (#5122) 2018-02-21 14:35:29 +01:00
Richard Zou
70e71391d2 Fix THCTensor_(max) and THCTensor_(min) inits (#5265)
Their cuda kernels should be initialized with (min_value, 0) and
(max_value, 0), respectively, where the second number is a default index
value. However, they were being initialized with (max, 1) and (min, 1)
instead, probably a remnant from the lua torch days.

This caused bugs in torch.max() and torch.min() when the input is at the
extreme values, and the max value (or min value) occurs at index 0. For example,

  import torch
  x = torch.ByteTensor([[0]])
  x.cuda().max(dim=0)  # returns (0, 1) but the expected result is (0, 0)
2018-02-15 14:41:19 -08:00
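A small Python model of the bug described above (an illustration, not the CUDA kernel): with the correct identity (min_value, 0) the reduction can never report index 1 for a single-element input, but with the buggy (min_value, 1) identity an input sitting exactly at the extreme value at index 0 never wins the comparison, so the stale index 1 leaks out. `reduce_max` and `init_idx` are invented names; 0 stands in for ByteTensor's minimum value:

```python
def reduce_max(values, init_idx):
    # Identity value for a ByteTensor max reduction is 0 (the type minimum);
    # the second slot should start at index 0, not 1.
    best_val, best_idx = 0, init_idx
    for i, v in enumerate(values):
        if v > best_val:  # an input equal to the identity never wins
            best_val, best_idx = v, i
    return best_val, best_idx
```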
Sam Gross
85e22b5475
Reverts force_gpu_half changes from #3660 (#5000)
The test_cuda.py setup purports to test half tensors, but actually just
re-tests FloatTensors because the keys in type_map were str instead of
type. Testing HalfTensors is more complicated, requiring changes to
precision and requires excluding some unimplemented methods.

We should fully test half CUDA tensors. This change just deletes the
duplicate tests of FloatTensor.
2018-02-07 15:33:17 -05:00
Tongzhou Wang
47ee86776e Fix CPU torch.multinomial with noncontiguous prob tensor (#5093)
* fix CPU torch.multinomial not working on noncontiguous probability distributions

* address comments

* change some tabs to spaces in THStorage.c
2018-02-06 22:11:43 -05:00
Peter Goldsborough
86fd5fd524 Replace async with non_blocking for Python 3.7 (#4999)
* Replace async with non_blocking for Python 3.7 upgrade

* Remove trailing whitespace

* Give _cuda and _type kwargs and accept async for compatibility

* Rename async to non_blocking in all C++ code

* Add entries for async in python_variable_methods

* Friendlier backward compatibility for cuda and type
2018-02-02 09:23:51 -05:00
albanD
6c197c2f15 fix triu and tril for zero-strided inputs on gpu (#4962) 2018-01-31 14:38:49 -05:00
Will Feng
82fed06535 disable qr_big cuda test on Windows (#4747) 2018-01-23 21:29:32 -05:00
Richard Zou
c7a2e318ed Restore cuda variable.bernoulli() (#4787) 2018-01-23 21:12:47 -05:00
Adam Paszke
1061d7970d Move broadcast and broadcast_coalesced to C++ 2018-01-18 11:16:45 +01:00
Tongzhou Wang
5918243b0c Methods for checking CUDA memory usage (#4511)
* gpu mem allocated

* add test

* addressed some of @apaszke 's comments

* cache stats

* add more comments about test
2018-01-09 11:47:48 -05:00
Sam Gross
b8fd57a0cc
Fix handling of empty indices in CUDA Tensor.put_ (#4486)
Fixes #4386
2018-01-05 12:58:27 -05:00
Will Feng
c6adee0807 disable CUDA HalfTensor tests in test_cuda for Windows (#4482) 2018-01-04 22:58:13 +01:00
Fritz Obermeyer
35abc4efa2 Add low-precision digamma() and polygamma() functions (#4399) 2018-01-02 11:53:23 +01:00
Vishwak Srinivasan
e519ef5337 Adding torch.expm1() and its inplace function (#4350) 2017-12-28 18:56:03 +09:00
Sam Gross
1632ab2979
Fix default device for Variable.new() (#4307)
Variable.new() should default to the device of "self" if no device is
specified. Previously, we were using the current device. This now
matches Tensor.new().
2017-12-21 18:35:35 -05:00
Tongzhou Wang
d8b2e5d091 Add python only default init expression; Implement stft, hann/hamming/bartlett window. (#4095)
* implement stft

* addressed comments; implemented window functions; added support for python only default initialization
2017-12-18 12:28:23 -05:00
Tongzhou Wang
e0d5d1b7c9 view in certain noncontig case (#4062) 2017-12-18 02:08:17 -05:00
Richard Zou
9394e65b44 Add proper shape checking to torch.cat (#4087)
* Fix catArray in THTensor

Asserts that the inputs have the same size except in the
cat dimension or are empty (or a mix of both).

* Fix catArray for THCTensor

* Document torch.cat shape checks

* Fix types
2017-12-18 02:05:58 -05:00
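The shape rule this commit enforces can be sketched in Python (an illustration of the documented check, not the THTensor code; `check_cat_shapes` is an invented name): every input must either be the legacy "empty" 0-element 1-D tensor or match the other non-empty inputs in every dimension except the cat dimension:

```python
def check_cat_shapes(shapes, dim):
    # Returns True iff the shapes are valid inputs for cat along `dim`.
    ref = None
    for s in shapes:
        if s == (0,):  # legacy empty tensor: always allowed, contributes nothing
            continue
        if ref is None:
            ref = s
        elif len(s) != len(ref) or any(
            a != b for i, (a, b) in enumerate(zip(s, ref)) if i != dim
        ):
            return False
    return True
```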
Sam Gross
bec0349280 Implement Variable.cuda and Variable.type using ATen (#4139)
* Implement Variable.cuda using ATen

This adds an optional async flag to Tensor::copy_, which attempts to do
a non-blocking copy if one of the tensors is in pinned memory and
the other is a CUDA tensor.

* Perform cross-device copy in CopyBackwards

Also call torch.cuda._lazy_init() from Variable.cuda()

* Implement Variable.type via ATen

* Changes from review:

 - remove copy_out
 - remove unnecessary include
 - fix default device for .cuda()

* Combine if statements in dispatch_type
2017-12-18 01:54:35 -05:00
Richard Zou
dac5e6568d Better error messages for blas ops with cuda.LongTensor (#4160)
* Better error messages for blas ops with cuda.LongTensor

Fixes #4157

Test plan

Try matrix multiplying with cuda.LongTensors

>>> import torch
>>> x = torch.randn(4, 4).long().cuda()
>>> y = torch.randn(4, 4).long().cuda()
>>> x.mm(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: addmm for CUDA tensors only supports floating-point types. Try converting the tensors with .float() at /private/home/rzou/pytorch/pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:381
2017-12-14 11:28:59 -05:00
Sam Gross
aeb7a3668d
Implement Variable.new (#4080) 2017-12-11 15:45:43 -05:00
Tongzhou Wang
c681b03d37 Add determinant function on variable; Add backward on svd (#3816)
* determinant on variable

* svd bwd
2017-12-01 13:22:46 -05:00
Adam Paszke
6ae0d477ea Fix cuBLAS arguments for fp16 dot (#3660)
* Fix cuBLAS arguments for fp16 dot

* Enable FloatTensor <-> CUDA HalfTensor checks in test_cuda.py
2017-11-29 07:16:34 -08:00
Richard Zou
ec389f5128 Fix cuda symeig (#3566)
* Fix cuda symeig

* Add symeig test

* Better check for magma
2017-11-08 20:20:14 -05:00
Richard Zou
00d2befba1 THTensor_varOuterDim numeric stability (#3533) 2017-11-07 13:47:20 -05:00
Richard Zou
3d06a1e075 Make THCTensor_varInnermostDim numerically stable using Welford's algorithm (#3425)
* Use Welford's algorithm when reducing along inner dimension for THCTensor's variance fn

* Use accreals in THCTensor's varInnermostDim

* Skip cuda tests if no cuda

* Variance testing
2017-11-06 16:00:29 -05:00
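Welford's algorithm, which this commit applies to the inner-dimension variance reduction, can be sketched in plain Python (a generic single-pass version, not the THCTensor kernel): it updates a running mean and sum of squared deviations, avoiding the catastrophic cancellation of the naive E[x^2] - E[x]^2 formula:

```python
def welford_variance(xs):
    # Single-pass, numerically stable sample variance (Welford's algorithm).
    mean, m2, n = 0.0, 0.0, 0
    for x in xs:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # uses both old and new mean
    return m2 / (n - 1) if n > 1 else 0.0
```

The stability shows up with a large common offset: data like [1e9 + 1, 1e9 + 2, 1e9 + 3] still yields a variance of 1, where the naive formula loses all precision.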
SsnL
8fd171a6fd add test_index to test_cuda 2017-11-06 14:21:31 -05:00
Sam Gross
7c0b16c140 Add torch.take and Tensor.put_ (#3263)
* Add torch.take and Tensor.put_

These are similar to numpy.take and numpy.put. The take function allows
you to linearly index into a tensor without viewing it as a 1D tensor
first. The output has the same shape as the indices. The put function
copies value into a tensor also using linear indices.
2017-11-01 06:04:44 -04:00
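The linear-indexing semantics described above can be modeled on plain Python lists (an illustration of the behavior, not the implementation; `take` and `put_` here operate on a 2-D list of lists): take reads the tensor as if flattened to 1-D and returns output shaped like the indices, while put_ writes values at linear indices in place:

```python
def take(tensor_2d, indices):
    # torch.take-like: linear indices into the flattened data.
    flat = [x for row in tensor_2d for x in row]
    return [flat[i] for i in indices]

def put_(tensor_2d, indices, values):
    # Tensor.put_-like: in-place writes at linear indices.
    cols = len(tensor_2d[0])
    for i, v in zip(indices, values):
        tensor_2d[i // cols][i % cols] = v
    return tensor_2d
```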
SsnL
91a8d3325e test sparse dp, broadcast_coalesced, reduce_add_coalesced 2017-10-28 18:52:35 -04:00
Ozan Çağlayan
e43a63a968 tensor: Ensure that the tensor is contiguous before pinning (#3266) (#3273)
* tensor: Ensure that the tensor is contiguous before pinning (#3266)

pin_memory() was producing an out-of-order tensor when the given
tensor was transposed, i.e. in column-major order.
This commit fixes that by calling contiguous() before pinning.

* test: add contiguous test for pin_memory (#3266)
2017-10-25 13:17:54 +02:00
SsnL
634c8315a4 isContiguous problems (#3148)
* with the size=1 case, a single-point check is impossible; replaced with isContiguousRange

* fix stride in desc; fix undef scope

* add test for this case for cudnn

* assertTrue
2017-10-20 10:20:33 -04:00