Use `c10::checked_convert` to detect overflows during tensor construction from scalars, avoiding `TypeError: 'float' object cannot be interpreted as an integer` when creating an integer tensor from floating-point values. Also modify a sparse_csr test that violated this rule.
Fixes #69319
Tested in #81233
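A minimal sketch of the resulting behavior (the exact call site and error message may differ):
```python
import torch

# Constructing an integer tensor from an out-of-range scalar should now
# raise instead of silently wrapping around.
try:
    torch.full((2,), 2**31, dtype=torch.int32)
except RuntimeError as e:
    print(e)  # value cannot be converted without overflow
```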
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81372
Approved by: https://github.com/ezyang, https://github.com/ngimel
As per title. Previously this was done via converting to COO.
A better approach could be using `dense.out_`, but it is not yet allowed for `sparse_csc`.
And are we fine with implementing very critical operations like `add` via transpositions?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79635
Approved by: https://github.com/cpuhrsch
Adds
- to_sparse_csc for strided input
- to_sparse_csc for COO input
- CSC to strided
- CSC to CSR
- CSC to CSC
Uses SciPy as a reference
Follow-up work: change transpose to return CSC when passed CSR, and handle the resulting ripples through our matmul operations.
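A minimal usage sketch of the conversions added here (values illustrative):
```python
import torch

dense = torch.tensor([[0., 1.], [2., 0.]])

csc = dense.to_sparse_csc()                       # strided -> CSC
csc_from_coo = dense.to_sparse().to_sparse_csc()  # COO -> CSC
back = csc.to_dense()                             # CSC -> strided
csr = csc.to_sparse_csr()                         # CSC -> CSR
same = csc.to_sparse_csc()                        # CSC -> CSC
```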
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77521
Approved by: https://github.com/pearu, https://github.com/anjali411
This PR adds a for-loop around cuSPARSE calls to support batched inputs;
the cuSPARSE function itself doesn't support batched inputs yet.
`mat1` and `mat2` must have the same batch shape. It's allowed to pass
`self` as a single matrix when `mat1` and `mat2` are batched.
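In Python terms, the batching is roughly equivalent to the sketch below; the real implementation is in C++, and `single_op` is a hypothetical stand-in for the non-batched cuSPARSE-backed kernel:
```python
import torch

def batched_via_loop(self_mat, mat1, mat2, single_op):
    # mat1, mat2: (B, m, k) and (B, k, n); self_mat may be a single matrix
    # shared across the whole batch.
    assert mat1.shape[0] == mat2.shape[0]
    results = [single_op(self_mat, a, b) for a, b in zip(mat1, mat2)]
    return torch.stack(results)
```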
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77243
Approved by: https://github.com/cpuhrsch
`torch.sparse.sampled_addmm` was incorrect for noncontiguous inputs on CUDA.
Unfortunately, the tests overlooked this: noncontiguous inputs were not
exercised properly because 1x5 and 5x1 shapes were used.
The block sparse triangular solver on CUDA could return incorrect results if
there's a zero on the diagonal of the sparse matrix. Now it returns NaN.
Tests also revealed that the unitriangular=True flag is not working
correctly on CPU in some cases. That part needs more investigation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76590
Approved by: https://github.com/cpuhrsch
This PR implements `torch.select` for CSR tensors. Currently, it's not possible to select rows or columns of a batched CSR tensor. The non-batched case works fine by converting to COO and calling select. Initially, I implemented raw manipulations of indices, but converting to COO is only slightly slower and more readable.
This PR also enables indexing into a batched CSR tensor with `[x, y, z]`. Assignment is disabled.
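A minimal sketch of the new surface (values illustrative; the return layout of select follows the internal COO path):
```python
import torch

crow = torch.tensor([0, 2, 3])
col = torch.tensor([0, 1, 1])
val = torch.tensor([1., 2., 3.])
csr = torch.sparse_csr_tensor(crow, col, val, size=(2, 2))

row = csr.select(0, 1)  # row selection, implemented via a COO conversion
```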
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76228
Approved by: https://github.com/cpuhrsch
This pull request enables accumulating gradients for the CSR tensor.
Functions that work and are tested:
- tensor.abs()
- tensor.neg()
- tensor.conj_physical()
- torch.addmm
`torch.mm` also works, but tests will be added later.
In addition, this PR makes accessing strides, storage, and contiguity info on a CSR tensor throw an error.
`tensor.to_sparse_csr().to_sparse_csr()` was failing and is now fixed.
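A minimal sketch exercising one of the tested paths, `torch.addmm` with a CSR operand (shapes illustrative):
```python
import torch

a_csr = torch.randn(3, 3).to_sparse_csr()
b = torch.randn(3, 3, requires_grad=True)
c = torch.randn(3, 3, requires_grad=True)

out = torch.addmm(c, a_csr, b)  # sparse @ dense with autograd
out.sum().backward()
print(b.grad.shape, c.grad.shape)  # gradients flow to the dense inputs
```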
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75435
Approved by: https://github.com/cpuhrsch
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73642
Reland of https://github.com/pytorch/pytorch/pull/73471, which was reverted due to lack of `to_sparse(sparse_dim)` support.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D34580353
Pulled By: cpuhrsch
fbshipit-source-id: a8a4ea381daeb80d8365fe931af9f55a7e789ea1
(cherry picked from commit 5a3cf8110980e5a10dbb687e87e67d5524ebf2f5)
Summary:
This PR introduces the `cuSolverSP` backend for `linalg.solve` with sparse CSR input matrices. The motivation comes from the issue: https://github.com/pytorch/pytorch/issues/69538.
`cuSolver` provides the [`cusolverSp<t>csrlsvluHost`](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvlu) API; a few things to note:
1. As mentioned in the documentation: `only CPU (Host) path is provided.` The profiling below confirms that no GPU kernels are launched for the solve itself.
2. Since only the `host` path is provided, the CPU path uses `csrlsvluHost` (but requires PyTorch to be installed/built with CUDA support).
3. The documentation mentions that reordering can improve performance, but it isn't clear by how much. Among the available reordering options, we stick to `reorder = 0` as the default choice.
`cuSolver` has [`csrlsvqr`](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvqr) function which provides a `device` path to solve the linear system. This function is used for the CUDA path in this PR.
**Gist:**
For CPU Path: we call [`csrlsvluHost` function of cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvlu).
For CUDA Path: we call [`csrlsvqr` function of cuSolver](https://docs.nvidia.com/cuda/cusolver/index.html#cusolver-lt-t-gt-csrlsvqr).
**Profiling:** (on a sparse input tensor of size 1000 x 1000, with a vector of length 1000), for the `csrlsvlu` function (to show there is no GPU optimization)
```
==3999651== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 100.00% 2.1440us 1 2.1440us 2.1440us 2.1440us [CUDA memcpy HtoD]
API calls: 99.72% 1.07199s 9 119.11ms 500ns 1.07164s cudaFree
0.11% 1.2182ms 398 3.0600us 140ns 137.94us cuDeviceGetAttribute
0.06% 674.45us 4 168.61us 165.50us 173.64us cuDeviceTotalMem
0.03% 357.07us 4 89.268us 2.7800us 201.89us cudaMalloc
0.03% 309.29us 1 309.29us 309.29us 309.29us cudaGetDeviceProperties
0.01% 160.47us 332 483ns 350ns 3.3300us cudaFuncSetAttribute
0.01% 115.12us 4 28.780us 26.290us 33.410us cuDeviceGetName
0.00% 28.591us 5 5.7180us 440ns 16.921us cudaGetDevice
0.00% 22.061us 4 5.5150us 871ns 18.690us cudaDeviceSynchronize
0.00% 20.370us 18 1.1310us 410ns 6.9900us cudaEventDestroy
0.00% 16.390us 1 16.390us 16.390us 16.390us cudaMemcpy
0.00% 11.540us 2 5.7700us 1.4900us 10.050us cuDeviceGetPCIBusId
0.00% 10.510us 18 583ns 430ns 1.6200us cudaEventCreateWithFlags
0.00% 7.9100us 21 376ns 290ns 700ns cudaDeviceGetAttribute
0.00% 1.4300us 6 238ns 150ns 590ns cuDeviceGet
0.00% 1.2200us 4 305ns 190ns 500ns cuDeviceGetCount
0.00% 900ns 1 900ns 900ns 900ns cuInit
0.00% 860ns 4 215ns 180ns 260ns cuDeviceGetUuid
0.00% 240ns 1 240ns 240ns 240ns cuDriverGetVersion
0.00% 230ns 1 230ns 230ns 230ns cudaGetDeviceCount
```
Script:
```python
import torch
def solve(x, other, out):
    torch.linalg.solve(x, other, out=out)

if __name__ == "__main__":
    dense_inp = torch.randn((1000, 1000), dtype=torch.float64)
    # Set 50% of the values to 0 randomly
    dense_inp = torch.nn.functional.dropout(dense_inp, p=0.5)
    sparse_inp = dense_inp.to_sparse_csr()
    other = torch.randint(100, (1000,), dtype=torch.float64)
    out = torch.randint(1, (1000,), dtype=torch.float64)
    solve(sparse_inp, other, out)
```
The following error is raised when the function is used with a PyTorch build that lacks CUDA support:
```python
/home/krshrimali/pytorch/torch/autograd/profiler.py:151: UserWarning: CUDA is not available, disabling CUDA profiling
warn("CUDA is not available, disabling CUDA profiling")
Traceback (most recent call last):
File "/home/krshrimali/pytorch/test_sp.py", line 17, in <module>
solve(x, other, out)
File "/home/krshrimali/pytorch/test_sp.py", line 5, in solve
torch.linalg.solve(x, other, out=out)
RuntimeError: PyTorch was not built with CUDA support. Please use PyTorch built CUDA support
```
**Performance Comparison** (vs SciPy's [`scipy.sparse.linalg.spsolve`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.spsolve.html)):
Time taken by `scipy.sparse.linalg.spsolve` : 0.595 seconds
On CPU: Time taken by `torch.linalg.solve` : 4.565 seconds
On CUDA: Time taken by `torch.linalg.solve`: 1.838 seconds
The inputs are of dimensions: (17281, 17281) and (17281, 1), and were taken from https://math.nist.gov/MatrixMarket/extreme.html.
Thanks to IvanYashchuk for helping me with the PR, and guiding me through it.
cc: IvanYashchuk pearu nikitaved cpuhrsch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71399
Reviewed By: VitalyFedyunin
Differential Revision: D33767740
Pulled By: cpuhrsch
fbshipit-source-id: a945f065210cd719096eb8d7cdbf8e8937c2fce9
(cherry picked from commit f4f35c17da414e1ca6c6d91402933521857aa1ea)
Summary:
When PyTorch is not built with MKL, or on Windows, there's a native implementation of `torch.addmm` for tensors on CPU. There was a bug where the `beta` value was ignored, causing new tests to fail (see https://github.com/pytorch/pytorch/pull/71949#issuecomment-1024639741).
In addition, I also enabled complex number support for this code path.
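A minimal sketch of the fixed semantics on this code path (values illustrative):
```python
import torch

a = torch.eye(2).to_sparse_csr()
b = torch.ones(2, 2)
c = torch.full((2, 2), 10.0)

# out = beta * c + alpha * (a @ b) = 0.5 * 10 + 1.0 * 1 = 6.0 everywhere;
# before the fix, beta was ignored on the native (non-MKL) CPU path.
out = torch.addmm(c, a, b, beta=0.5, alpha=1.0)
```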
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72430
Reviewed By: davidberard98
Differential Revision: D34045670
Pulled By: cpuhrsch
fbshipit-source-id: b2b63f22ba3eea895a31c5c2925b0fb1555d2c6f
(cherry picked from commit ac0a2080bb)
Summary:
The rest of the tests in the CUDA test suite are skipped after GPU context corruption is encountered.
For tests decorated with `expectedFailure` this creates the false impression that the entire test suite is passing.
Remedy this by suppressing the exception and reporting an unexpected success if `should_stop_early` is true.
Also, print warnings both when this happens (to make attribution easier) and when the condition is detected.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72016
Test Plan:
`python test_ops.py -v -k test_fn_fwgrad_bwgrad_gradient`
Before the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... expected failure
----------------------------------------------------------------------
Ran 3 tests in 0.585s
OK (expected failures=1)
```
After the change:
```
test_fn_fwgrad_bwgrad_gradient_cpu_complex128 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cpu_float64 (__main__.TestGradientsCPU) ... ok
test_fn_fwgrad_bwgrad_gradient_cuda_complex128 (__main__.TestGradientsCUDA) ... /home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
warn("Suppressed expected failure that resulted in fatal error")
unexpected success
----------------------------------------------------------------------
Ran 3 tests in 0.595s
FAILED (unexpected successes=1)
```
And `stderr` from XML file contains requested info:
```
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py:1670: UserWarning: TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
warn(f"TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failed with {rte}")
/home/conda/miniconda3/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py:382: UserWarning: Suppressed expected failure that resulted in fatal error
warn("Suppressed expected failure that resulted in fatal error")
```
Fixes https://github.com/pytorch/pytorch/issues/71973
Reviewed By: janeyx99, ngimel
Differential Revision: D33854287
Pulled By: malfet
fbshipit-source-id: dd0f5a4d2fcd21ebb7ee50ce4ec4914405a812d0
(cherry picked from commit 0c0baf3931)
Summary:
Since there is no rule in PyTorch (Sparse CSR) for filling zeros, it was decided to support only ops that preserve the 0->0 correspondence. This PR adds a test to ensure that rule is not broken.
`sample_inputs_unary` may or may not generate a zero in the sample input, so this separate test is needed to validate the rule and the Sparse CSR support.
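A minimal sketch of the property being checked (hypothetical test body, not the actual OpInfo machinery):
```python
import torch

def preserves_zero(op):
    # Ops supported for Sparse CSR must map 0 -> 0.
    return op(torch.zeros(())).item() == 0

assert preserves_zero(torch.abs)
assert preserves_zero(torch.neg)
assert not preserves_zero(torch.cos)  # cos(0) == 1, so it breaks the rule
```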
cc nikitaved pearu cpuhrsch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70302
Reviewed By: albanD
Differential Revision: D33922501
Pulled By: cpuhrsch
fbshipit-source-id: 10f67a220b95a8e75205345a33744ad536fdcf53
(cherry picked from commit ade9bf7818)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68711
This PR adds the possibility to multiply a single CSR matrix by a batch of dense matrices.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: davidberard98
Differential Revision: D33773319
Pulled By: cpuhrsch
fbshipit-source-id: 1623ce9affbc4fdc6d6130a95c5a42022858b62b
(cherry picked from commit 628c8e366d)
Summary:
This PR enables `test_block_triangular` tests on the CPU.
These tests revealed a problem with how the nnz == 0 case was handled. Now we return a tensor filled with NaNs on both CUDA and CPU.
cc nikitaved pearu cpuhrsch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71304
Reviewed By: davidberard98
Differential Revision: D33600482
Pulled By: cpuhrsch
fbshipit-source-id: d09cb619f8b6e54b9f07eb16765ad1c183c42487
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68083
This PR adds support for `torch.randn_like(sparse_csr_tensor)`.
It creates a new sparse CSR tensor with the same indices but new values drawn from a normal distribution.
In addition, `.normal_()` and `torch.empty_like` were implemented, because `randn_like` is a composite of these two functions.
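A minimal sketch of the composite (values illustrative):
```python
import torch

csr = torch.tensor([[0., 1.], [2., 0.]]).to_sparse_csr()

r = torch.randn_like(csr)             # same sparsity pattern, normal values
r2 = torch.empty_like(csr).normal_()  # the equivalent decomposition
```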
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D33511280
Pulled By: cpuhrsch
fbshipit-source-id: 6129083e8bc6cc5af2e0191294bd5e4e864f6c0e
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68710
This PR adds support for block sparse (BSR) matrices in functions that
use the Inspector-Executor MKL Sparse API. At the moment of this PR these are:
* torch.addmm
* torch.addmv
* torch.triangular_solve (once https://github.com/pytorch/pytorch/pull/62180 is merged)
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: ZolotukhinM
Differential Revision: D33179486
Pulled By: cpuhrsch
fbshipit-source-id: e1dec0dccdbfed8b280be16b8c11fc9e770d50ae
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68709
This PR adds support for triangular solver with a block CSR matrix.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D33066067
Pulled By: cpuhrsch
fbshipit-source-id: 9eaf1839071e9526be8d8c6d47732b24200f3557
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68007
This PR adds a new function to the sparse module.
`sampled_addmm` computes α*(A @ B) * spy(C) + β*C, where C is a sparse CSR matrix and A, B are dense (strided) matrices.
This function is currently restricted to single 2D matrices; it doesn't support batched input.
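A minimal usage sketch (CUDA-only at the time of this PR; values illustrative):
```python
import torch

C = torch.eye(3).to_sparse_csr().cuda()  # sparsity pattern to sample at
A = torch.randn(3, 3, device="cuda")
B = torch.randn(3, 3, device="cuda")

# out = alpha * (A @ B) * spy(C) + beta * C, returned as sparse CSR
out = torch.sparse.sampled_addmm(C, A, B, beta=1.0, alpha=1.0)
```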
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: mrshenli
Differential Revision: D32435799
Pulled By: cpuhrsch
fbshipit-source-id: b1ffac795080aef3fa05eaeeded03402bc097392
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68707
This PR adds a path for block CSR matrices to `torch.addmm`. The cuSPARSE interface is restricted to 32-bit indices and square blocks.
My plan is to first make everything work and tests pass using an unsafe constructor, keeping it all private, then discuss and implement constructors with block information separately, unlocking the functions for wider use. Documentation will come with the update to constructors.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D32650366
Pulled By: cpuhrsch
fbshipit-source-id: 430a9627901781ee3d2e2496097b71ec17727d98
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68707
This PR adds a path for block CSR matrices to `torch.addmm`. The cuSPARSE interface is restricted to 32-bit indices and square blocks.
My plan is to first make everything work and tests pass using an unsafe constructor, keeping it all private, then discuss and implement constructors with block information separately, unlocking the functions for wider use. Documentation will come with the update to constructors.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D32633806
Pulled By: cpuhrsch
fbshipit-source-id: b98db0bd655cce651a5da457e78fca08619a5066
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62180
This PR adds CPU dispatch for `triangular_solve` with sparse CSR matrix.
The implementation uses the MKL Sparse library. If it's not available, a runtime error is thrown.
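A minimal usage sketch (requires an MKL build, per the note above; values illustrative):
```python
import torch

A = torch.tensor([[1., 0.], [2., 3.]]).to_sparse_csr()  # lower triangular
b = torch.ones(2, 1)

# Returns (solution, cloned_coefficient); the second output is not
# meaningful for sparse inputs.
x, _ = torch.triangular_solve(b, A, upper=False)
```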
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D32581395
Pulled By: cpuhrsch
fbshipit-source-id: 41c7133a0d2754ef60b5a7f1d14aa0bf7680a844
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61536
This PR adds CPU dispatch for `addmv_out` with Sparse CSR matrix.
The implementation uses the MKL Sparse library. If it's not available, a
runtime error is thrown.
Since structured_delegate is used, we only need to implement the out variant; the in-place and normal variants are autogenerated.
The MKL descriptor of sparse matrices is implemented in `at::mkl::sparse::MklSparseCsrDescriptor`.
MKL Sparse doesn't allow switching the indices type at runtime; it's
predetermined at build time. Only the 32-bit version of MKL was tested
locally, but I expect the 64-bit version to work correctly as well.
When the indices type of a PyTorch CSR tensor doesn't match MKL's, the
indices tensor is converted to an MKL-compatible type (`int` vs `int64_t`).
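A minimal usage sketch of the new CPU path (requires an MKL build; values illustrative):
```python
import torch

A = torch.eye(3).to_sparse_csr()
x = torch.arange(3, dtype=torch.float32)
y = torch.zeros(3)

# out = beta * y + alpha * (A @ x)
res = torch.addmv(y, A, x)
```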
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D32141787
Pulled By: malfet
fbshipit-source-id: b818a0b186aa227982221c3862a594266a58a2a6
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66401
This PR fixes the case when the result and input tensors have different
strides.
cuSPARSE from CUDA 11.3.1 has a bug: it doesn't use the correct strides to
write the result. This is "fixed" on the PyTorch side by copying the input
tensor to a tensor with the same strides as the result tensor.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: davidberard98
Differential Revision: D32177966
Pulled By: cpuhrsch
fbshipit-source-id: 118437409df147f04dce02763aff9bfd33f87c63
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63948
This PR adds `torch.add(a, b, alpha=None, out=out)` variant with `a, b,
out` all being sparse CSR tensors.
The underlying cuSPARSE function works only with 32-bit indices, and in
the current implementation the result tensor has 32-bit indices. Input
tensors can have either 64-bit or 32-bit indices.
Fixes https://github.com/pytorch/pytorch/issues/59060
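A minimal usage sketch (CUDA per the cuSPARSE note above; values illustrative):
```python
import torch

a = torch.eye(2).to_sparse_csr().cuda()
b = (2 * torch.eye(2)).to_sparse_csr().cuda()

out = torch.add(a, b, alpha=2.0)  # a + 2.0 * b, result is sparse CSR
```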
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D31909731
Pulled By: cpuhrsch
fbshipit-source-id: 656f523e3947fec56b2f93c474fb6fd49f0360ca
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61858
This PR adds `triangular_solve_out_sparse_csr_cuda`. The operation is
used to compute the solution to a linear system where the coefficient
matrix is triangular.
Structured kernels are used, and the meta function needed some changes to
support the sparse CSR layout. With sparse matrix input the `cloned_coefficient`
tensor is a 0-sized tensor.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D31948435
Pulled By: cpuhrsch
fbshipit-source-id: 7775fece83ca705a26d75f82aead10b956b14bfd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63511
This PR adds the `torch.addmm(c, a, b)` variant with `c, a, b` all being CSR tensors.
The underlying cuSPARSE function works only with 32-bit indices, and in
the current implementation the result tensor has 32-bit indices. Input
tensors can have either 64-bit or 32-bit indices.
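A minimal usage sketch (values illustrative):
```python
import torch

c = torch.eye(2).to_sparse_csr().cuda()
a = torch.eye(2).to_sparse_csr().cuda()
b = torch.eye(2).to_sparse_csr().cuda()

out = torch.addmm(c, a, b)  # beta * c + alpha * (a @ b), all operands CSR
```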
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D31809838
Pulled By: cpuhrsch
fbshipit-source-id: 97005dba27d8adcae445eb756bcbd7271061e9b5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63510
Sparse CSR matrix resizing behavior:
- If we _increase the number of rows_, the number of specified elements in the matrix remains the same -> the sizes of col_indices and values don't change; the size of crow_indices becomes `rows+1`.
- If we _decrease the number of rows_, the number of specified elements becomes `min(nnz, rows*cols)` -> we need to resize `crow_indices` to `rows+1` and set its last element to `min(nnz, rows*cols)`, and decrease the sizes of col_indices and values to `min(nnz, rows*cols)`.
- If we _increase the number of columns_, the number of specified elements and the number of rows remain the same -> no need to resize anything, just set the new sizes.
- We _cannot decrease the number of columns_ because it would require recomputing `crow_indices`.
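A sketch of the behavior, assuming an in-place `resize_` entry point (hypothetical example):
```python
import torch

csr = torch.tensor([[1., 0.], [0., 2.]]).to_sparse_csr()

# Growing rows keeps nnz; crow_indices grows to rows + 1.
csr.resize_(4, 2)

# Growing columns only updates the sizes.
csr.resize_(4, 5)
```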
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D31796680
Pulled By: cpuhrsch
fbshipit-source-id: 7d8a9701ce06d30a1841f94bba0a057cacea9401
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63509
The primary use of `torch.empty` is to reserve memory for a tensor and set its type, device, and size information. The same is done here for SparseCSR.
`crow_indices` is initialized as an empty tensor of size `num_rows + 1`. `col_indices` and `values` are initialized as empty tensors of size 0.
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D31770359
Pulled By: cpuhrsch
fbshipit-source-id: c83f2a2e0d7514ba24780add1086e1bccf541dd9
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66485
The errors for incorrectly sized inputs should match the dense variants
of functions.
Moved addmm_out_sparse_csr_dense_cuda from SparseCsrTensorMath.cu and
removed unnecessary device check.
cc nikitaved pearu cpuhrsch IvanYashchuk
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D31764036
Pulled By: cpuhrsch
fbshipit-source-id: 76900fe9e4a49474695a01f34bad41cb3422321c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61407
This PR adds `addmv_out_sparse_csr_cuda`. The operation is used to
compute matrix-vector multiplication. Since structured_delegate is used,
we only need to implement the out variant; the in-place and normal
variants are autogenerated.
Working on this PR revealed that float16 (and probably bfloat16) inputs
do not work correctly in cuSPARSE, therefore for this case `addmm` is
used with squeezes and unsqueezes.
cc nikitaved pearu cpuhrsch IvanYashchuk ngimel
Test Plan: Imported from OSS
Reviewed By: malfet
Differential Revision: D31584499
Pulled By: ngimel
fbshipit-source-id: 4c507791471ada88969116b88eeaaba7a7536431
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60838
Rewrote the `addmm_out_sparse_csr_dense_cuda` implementation using the new cuSPARSE descriptors.
`addmm` now works without conversions with both 32-bit and 64-bit indices.
The dense tensors can have a row- or column-major layout. If the dense tensors are a contiguous slice of a larger tensor, their storage is used directly without temporary copies.
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D30643191
Pulled By: cpuhrsch
fbshipit-source-id: 5555f5b59b288daa3a3987d322a93dada63b46c8
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63554
Following https://github.com/pytorch/pytorch/pull/61840#issuecomment-884087809, this deprecates all the dtype getters publicly exposed in the `torch.testing` namespace. The reason for this is twofold:
1. If someone is not familiar with the C++ dispatch macros PyTorch uses, the names are misleading. For example, `torch.testing.floating_types()` will only give you `float32` and `float64`, skipping `float16` and `bfloat16`.
2. The dtype getters provide very minimal functionality that can be easily emulated by downstream libraries.
We thought about [providing a replacement](https://gist.github.com/pmeier/3dfd2e105842ad0de4505068a1a0270a), but ultimately decided against it. The major problem is BC: if we keep it, either the namespace gets messy again after a new dtype is added, or we need to somehow version the return values of the getters.
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D30662206
Pulled By: mruberry
fbshipit-source-id: a2bdb10ab02ae665df1b5b76e8afa9af043bbf56
Summary:
The `ops` decorator provides a way to parameterize a test across a given list of ops. This would be useful for modules as well (e.g. a `modules` decorator), but the mechanism by which this is accomplished is specific to ops. In the details, the `ops` decorator tags a test function with the metadata needed (list of ops, `dtypes`) and the actual tests are generated according to this metadata during the call to `instantiate_device_type_tests()`.
This PR makes this mechanism more generic, allowing for test parameterization across arbitrary dimensions. This makes a `modules` decorator (or any similar type of decorator) straightforward to implement without changes to the device-specific test instantiation logic.
One caveat is that, since this is implemented where the old `ops` decorator was (within `instantiate_device_type_tests()`), this only works for tests instantiated using the device-specific instantiation logic. Longer term, even device-specific test instantiation could be treated as an optional parameterization across device types, but this PR takes a low-risk approach for now. In practice, this just means that a `device` kwarg is required for all test signatures used with the mechanism.
The `ops` decorator has been refactored to use the generic mechanism and works the same as before, with one difference: when `OpDTypes.none` is specified, the test signature no longer needs an unused `dtype` kwarg. This is a nice bonus that demonstrates the added flexibility of a generic parameterization mechanism. The refactored form also has the bonus that all op-specific test generation logic is contained within the `ops` decorator class, improving readability.
Behind the scenes, the generic mechanism is a base decorator class (`_TestParameterizer`) from which `ops` derives. The core functionality is in the `_parameterize_test()` method, which takes in a test function and returns a generator that produces parameterized tests, including names and parameter kwargs to pass to them. Using the `ops` decorator results in a set of op-specific tests from a given generic test.
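A minimal sketch of the mechanism as described (illustrative signatures; the real class lives in the device-type test framework):
```python
class _TestParameterizer:
    def _parameterize_test(self, test, generic_cls, device_cls):
        # Yield (parameterized_test, name_suffix, param_kwargs) triples;
        # subclasses such as `ops` define the actual parameter space.
        raise NotImplementedError

class over_values(_TestParameterizer):
    # Hypothetical subclass parameterizing a test over a list of values.
    def __init__(self, values):
        self.values = values

    def _parameterize_test(self, test, generic_cls, device_cls):
        for v in self.values:
            def parameterized(self_, device, value=v, _test=test):
                return _test(self_, device, value)
            yield parameterized, f"_{v}", {"value": v}
```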
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60233
Reviewed By: iramazanli
Differential Revision: D29494995
Pulled By: jbschlosser
fbshipit-source-id: a14446488c106094fafcaa75ccf8e9e3faf33bfc
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60656
This PR uses `torch.testing.get_all_dtypes()` for the dtype parametrization
of tests in `test_sparse_csr.py`. It adds the bool, half, bfloat16, and
complex dtypes, which were previously excluded from the tests.
`torch.complex32` is omitted due to lack of coverage and lack of a
specialized `AT_DISPATCH...`.
The process of adding more dtypes to tests revealed that `.to_dense()`
doesn't work for all dtypes.
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D29408058
Pulled By: cpuhrsch
fbshipit-source-id: 319b6f51b9786d6957d508f51657657a6d00267a
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58768
Fixes gh-58757
This PR has a fix for the CPU version of the addmm op. Just for context: before this PR, only CSR @ vector was supported. I found a minor bug in `addmm_out_sparse_csr_dense_cpu` for the non-MKL code path, which is fixed in this PR.
Moreover, I discovered a limitation in the current MKL implementation: it only works well (acceptable tolerance for output error) with square matrices. I looked into this issue and found that it could be a limitation of the MKL API.
I used this [gist code](https://gist.github.com/aocsa/0606e833cd16a8bfb7d37a5fbb3a5b14), based on [this benchmark](https://github.com/baidu-research/DeepBench/blob/master/code/intel/spmm/spmm_bench.cpp), to test this behavior.
As you can see, the output error (last column) is acceptable when the matrices are square, but not when they are non-square. I reported the issue here: https://github.com/pytorch/pytorch/issues/58770
Looking forward to your comments.
Test Plan: Imported from OSS
Reviewed By: zou3519
Differential Revision: D28629563
Pulled By: malfet
fbshipit-source-id: 5ee00ae667336e0d9301e5117057213f472cbc86
Summary:
Fixes https://github.com/pytorch/pytorch/issues/58632.
Added several skips related to test asserts and MKL. Will address them in a separate PR.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58666
Reviewed By: seemethere, janeyx99
Differential Revision: D28607966
Pulled By: walterddr
fbshipit-source-id: 066d4afce2672e4026334528233e69f68da04965
Summary:
A `NULL` return from `PyObject_GetAttrString` should never be ignored without handling the exception, as the behavior of subsequent Python C API calls is undefined until `PyErr_Fetch` or `PyErr_Clear` is called.
This accidentally led to the `list` type being incorrectly identified as `Tensor`.
Fixes https://github.com/pytorch/pytorch/issues/58520
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58631
Reviewed By: albanD
Differential Revision: D28559454
Pulled By: malfet
fbshipit-source-id: 46f044b5f0f94264779a6108474d04a8ba851c53