Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a
Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**
This was requested by someone at Facebook; this lint is turned
on for Facebook by default. "Sure, why not."
I had to noqa a number of imports in __init__. Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it. Left for future work.
Be careful! flake8-2 and flake8-3 behave differently with
respect to import resolution for `# type:` comments: flake8-3 will
report such an import as unused; flake8-2 will not. For now, I just
noqa'd all these sites.
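For illustration, a minimal sketch (not from the patch) of the kind of site that trips flake8-3:
```python
from typing import List  # noqa: F401 (only referenced from the type comment below)

def first(xs):
    # type: (List[int]) -> int
    return xs[0]
```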
All the changes were done by hand.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Differential Revision: D14687478
fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
Summary:
Changelog:
- Renames `trtrs` to `triangular_solve` to remain consistent with `cholesky_solve` and `solve`.
- Rename all tests, fix callsites
- Create a tentative alias for `triangular_solve` under the name `trtrs`, and add a deprecation warning to not promote usage.
- Move `isnan` to _torch_docs.py
- Remove unnecessary imports
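A hedged usage sketch of the rename (illustrative, not taken from the patch):
```python
import torch

A = torch.randn(3, 3).tril() + 3 * torch.eye(3)   # well-conditioned lower-triangular matrix
b = torch.randn(3, 2)

X, _ = torch.triangular_solve(b, A, upper=False)  # new name
X_old, _ = torch.trtrs(b, A, upper=False)         # tentative alias, emits a deprecation warning
print(torch.allclose(X, X_old))                   # True
```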
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18213
Differential Revision: D14566902
Pulled By: ezyang
fbshipit-source-id: 544f57c29477df391bacd5de700bed1add456d3f
Summary:
- Remove single batch TH/THC implementations
- Remove `_batch_trtrs_lower` from `multivariate_normal`
- Add tests for batched behavior
- Modify trtrs_backward to accommodate the batched case
- Modify docs
In a future PR, this will be renamed to `triangular_solve`.
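A hedged sketch of the batched behavior added here (using the pre-rename `trtrs` entry point):
```python
import torch

# A batch of four independent lower-triangular systems.
A = torch.randn(4, 3, 3).tril() + 3 * torch.eye(3)
b = torch.randn(4, 3, 2)

X, _ = torch.trtrs(b, A, upper=False)  # solves A[i] @ X[i] = b[i] for each i
print(X.shape)                         # torch.Size([4, 3, 2])
```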
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18025
Differential Revision: D14523004
Pulled By: ifedan
fbshipit-source-id: 11c6a967d107f969b60e5a5c73ce6bb8099ebbe1
Summary:
This PR allows `gather` to optionally return sparse gradients, as requested in #16329. It also allows the autograd engine to accumulate sparse gradients in place when it is safe to do so.
I've commented out the size.size() check in `SparseTensor.cpp` that also caused #17152; it does not seem to me that the check serves a useful purpose, but please correct me if I'm wrong and a better fix is required.
Motivating example:
For this commonly used label smoothing loss function
```py
def label_smoothing_opt(x, target):
    padding_idx = 0
    smoothing = 0.1
    logprobs = torch.nn.functional.log_softmax(x, dim=-1, dtype=torch.float32)
    pad_mask = (target == padding_idx)
    ll_loss = logprobs.gather(dim=-1, index=target.unsqueeze(1), sparse=True).squeeze(1)
    smooth_loss = logprobs.mean(dim=-1)
    loss = (smoothing - 1.0) * ll_loss - smoothing * smooth_loss
    loss.masked_fill_(pad_mask, 0)
    return loss.sum()
```
backward goes from 12.6 ms with dense gather gradients to 7.3 ms with sparse gradients, for 9K tokens x 30K vocab. That is a single-digit-percent end-to-end improvement, plus a reduction in peak memory required.
Shout-out to core devs: adding python-exposed functions with keyword arguments through native_functions.yaml is very easy now!
cc gchanan apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17182
Differential Revision: D14158431
Pulled By: gchanan
fbshipit-source-id: c8b654611534198025daaf7a634482b3151fbade
Summary:
`del Tensor.grad` sets the PyObject to nullptr, while `Tensor.grad = None` sets the PyObject to Py_None. Both cases are now handled.
Fixes #16471
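A hedged repro of the two code paths:
```python
import torch

t = torch.randn(2, requires_grad=True)
t.sum().backward()

t.grad = None  # stores Py_None in the grad slot
print(t.grad)  # None

t.sum().backward()
del t.grad     # clears the slot back to a C-level nullptr
print(t.grad)  # should also read back as None instead of crashing
```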
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16525
Differential Revision: D14130800
Pulled By: soumith
fbshipit-source-id: ed85c38305bba94d5047311cb58e4e4cedd09832
Summary:
Fixes #16577.
This greatly improves the memory efficiency of certain ops like Dropout2d. Previously, they were implemented as `input * mask` where mask never requires_grad, but we didn't use that knowledge in forward, and (in the case of an in-place dropout) kept input.clone() for the backward, where it would simply get ignored (see the sketch after the operator list below).
This patch tries to address this situation by emitting some guards for stores like this, but only if they are as simple as checking whether a single value requires_grad.
Interestingly, the same optimizations apply to methods like bmm, baddbmm, etc., but _not to mm nor addmm_, because of how their derivatives are defined. Apparently they unnecessarily use `mat1` to compute the derivative of `mat1` just to improve the error message in case `mat1` was sparse. I'd like to apply this optimization to that case too, but I don't want to lose the nicer error message, so if anyone has ideas for solutions, please let me know...
Full list of operators affected by this patch:
* _nnpack_spatial_convolution
* addbmm
* addcdiv
* addcmul
* addmv
* addr
* baddbmm
* bmm
* cross
* div
* dot
* fmod
* ger
* index_add_
* mul
* mv
* scatter_add_
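A hedged illustration of the pattern these guards optimize (not code from the patch):
```python
import torch

x = torch.randn(1000, 1000, requires_grad=True)
mask = (torch.rand_like(x) > 0.5).float()  # the mask never requires grad

y = x * mask
# d(y)/d(x) = mask and d(y)/d(mask) = x, so `x` only needs to be saved
# to compute mask's gradient. Since mask.requires_grad is False, the
# emitted guard lets the engine skip saving (or cloning) `x` entirely.
y.sum().backward()
```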
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16583
Differential Revision: D13900881
Pulled By: gchanan
fbshipit-source-id: dd0aeb2ab58c4b6aa95b37b46d3255b3e014291c
Summary:
This is the first round of enabling unit tests that pass on ROCm 2.1 in my testing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16871
Differential Revision: D13997662
Pulled By: bddppq
fbshipit-source-id: d909a3f7dd5fc8f85f126bf0613751c8e4ef949f
Summary:
The real fix for https://github.com/pytorch/pytorch/issues/15605.
This is sort of BC breaking because now
```py
In [1]: import torch
In [2]: a = torch.randn(3, 3, requires_grad=True)
In [3]: a.slogdet()
Out[3]: (tensor(1.), tensor(0.1356, grad_fn=<SlogdetBackward>))
In [4]: a.slogdet()[0].requires_grad
Out[4]: False
```
while before this patch `a.slogdet()[0]` required grad with `grad_fn=<SlogdetBackward>`. But any attempt to backprop through this value would hit the error in #15605, so I don't think this is a problem.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16337
Differential Revision: D13832644
Pulled By: soumith
fbshipit-source-id: f96c477e99edcbdbd966888e5c5ea7fd058429a8
Summary:
Partially fixes: https://github.com/pytorch/pytorch/issues/394
Implementation detail:
Codegen is modified to generate codes that looks like below:
```C++
static PyObject * THPVariable_svd(PyObject* self_, PyObject* args, PyObject* kwargs)
{
  HANDLE_TH_ERRORS
  static PythonArgParser parser({
    "svd(Tensor input, bool some=True, bool compute_uv=True, *, TensorList[3] out=None)",
  }, /*traceable=*/true);
  ParsedArgs<6> parsed_args;
  auto r = parser.parse(args, kwargs, parsed_args);
  static PyStructSequence_Field fields0[] = {
    {"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc0 = {
    "torch.return_types.svd_out", nullptr,
    fields0, 3
  };
  static PyTypeObject type0;
  static bool namedtuple_type_initialized0 = false;
  if (!namedtuple_type_initialized0) {
    PyStructSequence_InitType(&type0, &desc0);
    namedtuple_type_initialized0 = true;
  }
  static PyStructSequence_Field fields1[] = {
    {"U", ""}, {"S", ""}, {"V", ""}, {nullptr}
  };
  static PyStructSequence_Desc desc1 = {
    "torch.return_types.svd", nullptr,
    fields1, 3
  };
  static PyTypeObject type1;
  static bool namedtuple_type_initialized1 = false;
  if (!namedtuple_type_initialized1) {
    PyStructSequence_InitType(&type1, &desc1);
    namedtuple_type_initialized1 = true;
  }
  if (r.idx == 0) {
    if (r.isNone(3)) {
      return wrap(&type1, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2)));
    } else {
      auto results = r.tensorlist_n<3>(3);
      return wrap(&type0, dispatch_svd(r.tensor(0), r.toBool(1), r.toBool(2), results[0], results[1], results[2]));
    }
  }
  Py_RETURN_NONE;
  END_HANDLE_TH_ERRORS
}
```
Types are defined as static members of the `THPVariable_${op_name}` functions, and are initialized the first time the function is called.
When parsing function prototypes in `native_functions.yaml`, the parser will record the specified name as `field_name` when it sees things like `-> (Tensor t1, ...)`. These field names become the field names of the namedtuple, and the namedtuple class is named `torch.return_types.${op_name}`.
On some Python 2 builds, `PyStructSequence` is not a subtype of tuple, so we have to create some functions to check whether an object is a tuple or a namedtuple, for compatibility.
Operators in `native_functions.yaml` are changed such that only `max` and `svd` generate namedtuple returns. Tests are added for these two operators to check that the return value works as expected, and docs for these two ops are updated to explicitly mention that the return value is a namedtuple. More ops will be added in later PRs.
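A hedged usage sketch of the new return type:
```python
import torch

x = torch.randn(4, 4)
res = torch.svd(x)

# `res` is an instance of torch.return_types.svd; fields can be read
# by name or by position, and it still unpacks like a plain tuple.
print(res.U.shape, res.S.shape, res.V.shape)
u, s, v = res
```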
There is an issue with the Windows build where the linker is unable to resolve `PyStructSequence_UnnamedField`; a workaround is added to deal with this case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15429
Differential Revision: D13709678
Pulled By: ezyang
fbshipit-source-id: 23a511c9436977098afc49374e9a748b6e30bccf
Summary:
1) Reverts https://github.com/pytorch/pytorch/pull/12302, which added support for batched pdist. Except I kept the (non-batched) test improvements that came with that PR, because they are nice to have. Motivation: https://github.com/pytorch/pytorch/issues/15511
2) For the non-batched pdist, improved the existing kernel by forcing fp64 math and properly checking CUDA launch errors.
3) Added a 'large tensor' test that, at least on my machine, fails on the batched pdist implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15901
Reviewed By: ezyang
Differential Revision: D13616730
Pulled By: gchanan
fbshipit-source-id: 620d3f9b9acd492dc131bad9d2ff618d69fc2954
Summary:
Thank you, freesouls, for the repro!
This strictly fixes the bug in gradients for varying-length inputs discussed in the middle-to-bottom of the bug report. I'll send a separate feature patch for the inf losses -> NaN grads issue.
Fixes: #14401
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15798
Differential Revision: D13605739
Pulled By: soumith
fbshipit-source-id: 167ff42399c7e4cdfbd88d59bac5d25b57c0363f
Summary:
Allow the comparison function used in ReadyQueue to handle the empty FunctionTasks created by the reentrant autograd.
Fixes #11732
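For context, a hedged sketch (not from the patch) of the kind of reentrant backward that exercises this code path:
```python
import torch

class Reentrant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        x, = ctx.saved_tensors
        # Re-enter the autograd engine from inside a backward call;
        # this is what creates the extra tasks on the ReadyQueue.
        with torch.enable_grad():
            y = x.detach().requires_grad_()
            (y * y).sum().backward()
        return grad

out = Reentrant.apply(torch.randn(3, requires_grad=True))
out.sum().backward()
```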
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15791
Differential Revision: D13598006
Pulled By: soumith
fbshipit-source-id: 0bfdf28a735fbfe44f0fdbaf8b74a6198e6a1984
Summary:
Fixes #15223.
This fixes an autograd bug where backprop either fails or produces
gradients of incorrect sizes when tensors with zero-sized dimensions are
involved.
Previously, we were reducing along dimensions that had size greater than 1
when summing to a size in autograd. This is incorrect because we should also reduce
along dimensions with size 0 to produce a tensor of size 1 in that
dimension that then gets viewed to the correct shape.
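A hedged sketch of the reduction rule being fixed (illustrative):
```python
import torch

# Forward broadcasts (1, 4) against (0, 4) to give a (0, 4) result, so in
# backward the (0, 4) grad must be summed back to size (1, 4). That means
# reducing over dim 0 even though its size is 0, yielding zeros of shape
# (1, 4) rather than a wrongly-sized (or failing) reduction.
grad = torch.zeros(0, 4)
print(grad.sum(dim=0, keepdim=True).shape)  # torch.Size([1, 4])
```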
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15796
Differential Revision: D13593199
Pulled By: zou3519
fbshipit-source-id: 2e2acac34943a9b7fabadc10c9efd4f66db298fd
Summary:
- allow gradcheck to take a sparse tensor as input
- sparse output is not allowed yet in gradcheck
- add backward for `to_dense()` to get around sparse outputs
- call gradcheck from test_sparse, so that we can use `_gen_sparse()` and also easily cover coalesced / uncoalesced test cases (a usage sketch follows)
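A hedged usage sketch of the pieces described above (the `check_sparse_nnz` flag name is an assumption; consult the PR for the exact signature):
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, 3, dtype=torch.double).to_sparse().requires_grad_()

# Sparse outputs are not supported yet, so densify before checking:
print(gradcheck(lambda t: t.to_dense(), (x,), check_sparse_nnz=True))
```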
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14596
Differential Revision: D13271904
Pulled By: weiyangfb
fbshipit-source-id: 5317484104404fd38058884c86e987546011dd86
Summary:
Implements batching for the Cholesky decomposition.
Performance could be improved with dedicated batched `tril` and `triu` ops; their absence also impedes the autograd implementation.
Changes made:
- batching code
- tests in `test_torch.py`, `test_cuda.py` and `test_autograd.py`.
- doc string modification
- autograd modification
- removal of `_batch_potrf` in `MultivariateNormal`.
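A hedged usage sketch of the batched decomposition:
```python
import torch

# One symmetric positive definite matrix per batch element.
a = torch.randn(4, 3, 3)
spd = a @ a.transpose(-1, -2) + 1e-3 * torch.eye(3)

L = torch.cholesky(spd)                       # batched: L has shape (4, 3, 3)
recon = L @ L.transpose(-1, -2)
print(torch.allclose(recon, spd, atol=1e-4))  # True
```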
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14017
Differential Revision: D13087945
Pulled By: ezyang
fbshipit-source-id: 2386db887140295475ffc247742d5e9562a42f6e
Summary:
and write derivatives in terms of native functions.
This is the same as https://github.com/pytorch/pytorch/pull/13648 but has a fix for the canonicalize op jit pass to propagate shape information.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13796
Reviewed By: ezyang
Differential Revision: D13012281
Pulled By: gchanan
fbshipit-source-id: 88d0d91e72b5967c51ff865350fcbdd7ffed92ef
Summary:
- fix weights-contiguous requirement for THCUNN convolutions
- add tests that the conv backward pass works for non-contiguous weights
- fix RNN tests / error messages to be consistent and pass
- relax weight-grad precision for fp16 for a particular test
- fix regression of CMAKE_PREFIX_PATH not passing through
- add missing skipIfNoLapack annotations where needed
Differential Revision: D12918456
Pulled By: soumith
fbshipit-source-id: 8642d36bffcc6f2957800d6afa1e10bef2a91d05
Summary:
Now gradcheck properly accepts a single Tensor as input. It was almost supported already, but not completely.
Should fix the confusion from #13540
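A hedged sketch of what now works:
```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)
# A bare Tensor is now accepted as `inputs`; no 1-tuple is needed.
print(gradcheck(torch.sin, x))  # True
```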
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13543
Differential Revision: D12918526
Pulled By: soumith
fbshipit-source-id: a5bad69af0aea48c146f58df2482cabf91e24a01
Summary:
This PR renames `potrf`, the function responsible for the Cholesky decomposition of positive definite matrices, to `cholesky`, matching NumPy and TF.
Billing of changes:
- make potrf a cname for cholesky in Declarations.cwrap
- modify the function names in ATen/core
- modify the function names in the Python frontend
- issue warnings when potrf is called, to notify users of the change
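A hedged usage sketch of the rename:
```python
import torch

a = torch.randn(3, 3)
spd = a @ a.t() + 1e-3 * torch.eye(3)

u = torch.cholesky(spd)  # new name, matching NumPy and TF
# torch.potrf(spd) still works for now, but warns about the rename.
```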
Reviewed By: soumith
Differential Revision: D10528361
Pulled By: zou3519
fbshipit-source-id: 19d9bcf8ffb38def698ae5acf30743884dda0d88
Summary:
This adds support for reductions like sum() and mul() to TensorIterator.
Performance is similar to existing optimized code for CPU, and generally
better than existing code for CUDA kernels.
The templatized CUDA kernel requires fewer instantiations than the
existing THCReduce/THCReduceAll code. For example, sum() previously
generated 43 CUDA kernels, while it now requires only one (larger)
CUDA kernel. I suspect this should reduce code-size and
compilation time, but I haven't measured it.
Below are timings for sum() on [CPU](https://ark.intel.com/products/81908/Intel-Xeon-Processor-E5-2680-v3-30M-Cache-2_50-GHz) (12 threads and 1 thread) and CUDA with various tensor sizes.
CPU
| Reduction (dim) | Master | PR | Master (1 thread) | PR (1 thread) |
|----------------------|---------|---------|-------------------|---------------|
| 1024x1024 (all) | 22 us | 34 us | 136 us | 147 us |
| 1024x1024 (0) | 30 us | 28 us | 160 us | 160 us |
| 1024x1024 (1) | 25 us | 25 us | 171 us | 146 us |
| 1024x10x1024 (all) | 542 us | 550 us | 4.14 ms | 3.11 ms |
| 1024x10x1024 (0) | 658 us | 690 us | 6.80 ms | 5.93 ms |
| 1024x10x1024 (1) | 761 us | 757 us | 3.34 ms | 3.52 ms |
| 1024x10x1024 (2) | 538 us | 545 us | 3.73 ms | 3.04 ms |
| 1024x1024x1024 (all) | 72 ms | 71 ms | 364 ms | 357 ms |
| 1024x1024x1024 (0) | 94 ms | 90 ms | 935 ms | 927 ms |
| 1024x1024x1024 (1) | 80 ms | 86 ms | 881 ms | 688 ms |
| 1024x1024x1024 (2) | 71 ms | 71 ms | 456 ms | 354 ms |
CUDA
| Reduction (dim) | M40 base | M40 PR | P100 base | P100 PR |
|----------------------|----------|---------|-----------|-----------|
| 1024x10x1024 (all) | 238 us | 182 us | 136 us | 97 us |
| 1024x10x1024 (0) | 166 us | 179 us | 105 us | 84 us |
| 1024x10x1024 (1) | 181 us | 182 us | 89 us | 91 us |
| 1024x10x1024 (2) | 180 us | 168 us | 88 us | 79 us |
| 1024x1024x1024 (all) | 17.5 ms | 16.4 ms | 8.23 ms | 7.48 ms |
| 1024x1024x1024 (0) | 27.2 ms | 28.6 ms | 7.63 ms | 7.38 ms |
| 1024x1024x1024 (1) | 16.5 ms | 16.3 ms | 7.66 ms | 7.40 ms |
| 1024x1024x1024 (2) | 17.8 ms | 16.4 ms | 8.37 ms | 7.31 ms |
Timings were generated with this script:
https://gist.github.com/colesbury/d3238b266d8a9872fe6f68f77619b379
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11908
Differential Revision: D10071760
Pulled By: colesbury
fbshipit-source-id: 40e37a0e6803f1628b94cc5a52a10dfbb601f3d6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12777
Enables JIT tests in FBCode. Changes pybind11 code to avoid mixing py::args with positionally matched arguments, because old versions of pybind11 leak memory in this case.
Reviewed By: jamesr66a
Differential Revision: D10419708
fbshipit-source-id: 74bc466001b5d363132d1af32e96841b38601827
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794
common.py is used as a base module by almost all tests in test/. The
name of this file is so common that it can easily conflict with other dependencies
if they happen to have another common.py in the base module. Rename the file to
avoid conflicts.
Reviewed By: orionr
Differential Revision: D10438204
fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380
Summary:
I found a bug in norm() and fixed it (and added tests to make sure it stays fixed).
Here is how to reproduce it:
```python
import torch
x = torch.FloatTensor([[10, 12, 13], [4, 0, 12]])
print(torch.norm(x, -40, dim=0, keepdim=True)) #output is tensor([[ 4.0000, 0.0000, 11.9853]])
print(torch.norm(x, float('-inf'), dim=0, keepdim=True)) #output is tensor([[1., 1., 1.]]) which is wrong!
from numpy.linalg import norm as np_norm
x = x.numpy()
print(np_norm(x, ord=-40, axis=0)) #output is array([[4., 0., 11.985261]])
print(np_norm(x, ord=float('-inf'), axis=0)) #output is array([[4., 0., 12.0]])
```
It's related to [#6817](https://github.com/pytorch/pytorch/issues/6817) and [#6969](https://github.com/pytorch/pytorch/pull/6969).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12722
Differential Revision: D10427687
Pulled By: soumith
fbshipit-source-id: 936a7491d1e2625410513ee9c39f8c910e8e6803
Summary:
- This was one of the few functions left out from the list of functions in
NumPy's `linalg` module
- `multi_mm` is particularly useful for DL research, e.g. for quick analysis of
deep linear networks (see the usage sketch below)
- Added tests and a doc string
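A hedged usage sketch; `multi_mm` above is a working name, and the exposed identifier is assumed here to be `torch.chain_matmul`:
```python
import torch

a = torch.randn(3, 10)
b = torch.randn(10, 100)
c = torch.randn(100, 2)

# Multiplies the whole chain, choosing the cost-optimal association order.
out = torch.chain_matmul(a, b, c)
print(out.shape)  # torch.Size([3, 2])
```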
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12380
Differential Revision: D10357136
Pulled By: SsnL
fbshipit-source-id: 52b44fa18d6409bdeb76cbbb164fe4e88224458e
Summary:
* Enable more tests that relied on CPU LAPACK at compile time.
* Enable min/max tests in test_cuda (ROCm 236).
bddppq ezyang
Tests ran as part of the ROCm CI here: https://github.com/ROCmSoftwarePlatform/pytorch/pull/255
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12486
Differential Revision: D10262534
Pulled By: ezyang
fbshipit-source-id: 167a06fc8232af006f4b33dcc625815fd4b06d6b
Summary:
There are two parts:
- Optional tensors cannot be dispatch tensors because dispatch
tensors cannot be optional.
- While the kernel dealt with undefined grad_outs, the logistics
around it did not fully accommodate grad_hy being undefined.
Fixes: #11800
Thank you, mttk, for the reproduction!
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11872
Differential Revision: D9978527
Pulled By: apaszke
fbshipit-source-id: e622c288d2eac93bd8388e141fb773f2588e2b8f
Summary:
Currently, gradient assignment for the half type fails with a misleading RuntimeError:
```
RuntimeError: torch.cuda.sparse.HalfTensor is not enabled.
```
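A hedged repro (assumes a CUDA device):
```python
import torch

p = torch.randn(3, dtype=torch.half, device='cuda', requires_grad=True)
# This assignment used to trip the misleading sparse-HalfTensor error:
p.grad = torch.zeros_like(p)
```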
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11781
Differential Revision: D9931884
Pulled By: soumith
fbshipit-source-id: 03e946c3833d1339a99585c9aa2dbb670f8bf459
Summary:
vishwakftw Your patch needed some updates because the default native function dispatches changed from `[function, method]` to `[function]`. The CI ran before that change happened, so it still shows green, but the internal test caught it.
I made some changes when rebasing and updating, so I didn't just force-push to your branch. Let's see if this passes CI and the internal test. If it does, let me know whether you want me to force-push to your branch or use this PR instead.
Note to reviewers: patch was already approved at #10068 .
cc yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11421
Differential Revision: D9733407
Pulled By: SsnL
fbshipit-source-id: cf2ed293bb9942dcc5158934ff4def2f63252599
Summary:
Currently the gradient is copied into .grad if .grad is None. This PR aims to remove the copy when it is not absolutely needed.
It is generally an improvement in speed and memory usage, and here is a case where it can help a lot:
Normally, people do optimizer.zero_grad() every minibatch before backward. That translates into a memset, and later a pointwise add.
When there is some large weight in the network, one optimization people can always do is set parameter.grad to None instead of calling zero_grad. This removes the memset and changes the pointwise add to a memcpy (a sketch of the pattern follows).
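A hedged, illustrative sketch of the zero_grad-vs-None pattern (not the attached benchmark script; model and sizes are made up):
```python
import torch

model = torch.nn.Embedding(30000, 512).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):
    idx = torch.randint(0, 30000, (8192,), device='cuda')
    loss = model(idx).sum()
    # Instead of opt.zero_grad() (a memset, plus a pointwise add during
    # backward), drop the grads; with this PR, backward can then take
    # ownership of the freshly computed gradient buffer without a copy.
    for p in model.parameters():
        p.grad = None
    loss.backward()
    opt.step()
```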
Here are the results of running the benchmark script (attached below) on a V100 GPU: 100 iterations of forward/backward/zero_grad on a single 1-billion-word-benchmark-sized embedding.
`Zero grad: 2.123847723007202`
`None grad: 1.3342866897583008`
With the backend change of this PR, the unnecessary memcpy is removed, and a further speed-up is achieved.
`Zero grad: 2.124978542327881`
`None grad: 0.4396955966949463`
[benchmark.txt](https://github.com/pytorch/pytorch/files/2341800/benchmark.txt)
Some details on the code change:
- .detach() is used because we need to get rid of new_grad being a view, without copying data. This should be safe in first-order-only mode.
- The data needs to be contiguous; otherwise `grad_variable.data() += new_grad.data();` below will fail.
- Only the last variable that holds a reference to the temp gradient will grab its buffer.
ngimel, mcarilli and mruberry helped on finalizing this PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11165
Differential Revision: D9728874
Pulled By: soumith
fbshipit-source-id: b8fb822a2dff6e812bbddd215d8e384534b2fd78
Summary:
Currently on PyTorch AMD, memory accesses on the TensorInfo struct contained in the Operators passed into the kernelPointwiseApply kernel lead to hangs on the HCC runtime. Permuting the argument order so that the operator comes first alleviates this issue, and the kernel hangs disappear.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10829
Reviewed By: ezyang
Differential Revision: D9492561
Pulled By: Jorghi12
fbshipit-source-id: d0f0e2ab7180e55846db909f2744b8c8b110205e
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use strided_batched gemm interface also from the batched internal interface
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406
Reviewed By: Jorghi12
Differential Revision: D9277093
Pulled By: ezyang
fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2