pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
Peter Bell	99f2000a99	Migrate nonzero from TH to ATen (CPU) (#59149 ) Summary: Resubmit of https://github.com/pytorch/pytorch/issues/58811, Closes gh-24745 The existing PR (gh-50655) has been stalled because `TensorIterator` doesn't guarantee iteration order in the same way that `TH_TENSOR_APPLY` does. For contiguous test cases this isn't an issue; but it breaks down for example with channels last format. I resolve this by adding a new `TensorIteratorConfig` parameter, `enforce_linear_iteration`, which disables dimension reordering. I've also added a test case for non-contiguous tensors to verify this works. This PR also significantly improves performance by adding multithreading support to the algorithm. As part of this, I wrote a custom `count_nonzero` that gives per-thread counts which is necessary to write the outputs in the right location. \| Shape \| Before \| After (1 thread) \| After (8 threads) \| \|:----------:\|--------:\|-----------------:\|------------------:\| \| 256,128,32 \| 2610 us \| 2150 us \| 551 us \| \| 128,128,32 \| 1250 us \| 1020 us \| 197 us \| \| 64,128,32 \| 581 us \| 495 us \| 99 us \| \| 32,128,32 \| 292 us \| 255 us \| 83 us \| \| 16,128,32 \| 147 us \| 126 us \| 75 us \| \| 8,128,32 \| 75 us \| 65 us \| 65 us \| \| 4,128,32 \| 39 us \| 33 us \| 33 us \| \| 2,128,32 \| 20 us \| 18 us \| 18 us \| \| 1,128,32 \| 11 us \| 9 us \| 9 us \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/59149 Reviewed By: mruberry Differential Revision: D28817466 Pulled By: ngimel fbshipit-source-id: f08f6c003c339368fd53dabd28e9ada9e59de732	2021-06-02 12:26:29 -07:00
Natalia Gimelshein	657b75d155	Revert D28700259: [pytorch][PR] Migrate nonzero from TH to ATen (CPU) Test Plan: revert-hammer Differential Revision: D28700259 (`95b1bc1009`) Original commit changeset: 9b279ca7c36d fbshipit-source-id: 267afe63376be598d24c862e02e3b4b3ea75f77c	2021-05-27 20:07:30 -07:00
Peter Bell	95b1bc1009	Migrate nonzero from TH to ATen (CPU) (#58811 ) Summary: Closes gh-24745 The existing PR (gh-50655) has been stalled because `TensorIterator` doesn't guarantee iteration order in the same way that `TH_TENSOR_APPLY` does. For contiguous test cases this isn't an issue; but it breaks down for example with channels last format. I resolve this by adding a new `TensorIteratorConfig` parameter, `enforce_linear_iteration`, which disables dimension reordering. I've also added a test case for non-contiguous tensors to verify this works. This PR also significantly improves performance by adding multithreading support to the algorithm. As part of this, I wrote a custom `count_nonzero` that gives per-thread counts which is necessary to write the outputs in the right location. \| Shape \| Before \| After (1 thread) \| After (8 threads) \| \|:----------:\|--------:\|-----------------:\|------------------:\| \| 256,128,32 \| 2610 us \| 2220 us \| 496 us \| \| 128,128,32 \| 1250 us \| 976 us \| 175 us \| \| 64,128,32 \| 581 us \| 486 us \| 88 us \| \| 32,128,32 \| 292 us \| 245 us \| 80 us \| \| 16,128,32 \| 147 us \| 120 us \| 71 us \| \| 8,128,32 \| 75 us \| 61 us \| 61 us \| \| 4,128,32 \| 39 us \| 32 us \| 32 us \| \| 2,128,32 \| 20 us \| 17 us \| 17 us \| \| 1,128,32 \| 11 us \| 9 us \| 9 us \| Pull Request resolved: https://github.com/pytorch/pytorch/pull/58811 Reviewed By: anjali411 Differential Revision: D28700259 Pulled By: ngimel fbshipit-source-id: 9b279ca7c36d8e348b7e5e4be0dd159e05aee159	2021-05-27 10:06:54 -07:00
Pearu Peterson	be4ba29d49	Detect overflow in numel of sparse COO tensor (#57492 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/57416 Pull Request resolved: https://github.com/pytorch/pytorch/pull/57492 Reviewed By: albanD Differential Revision: D28273649 Pulled By: mruberry fbshipit-source-id: 08ba50509556df1981d7ede025d84a836d2e8e5e	2021-05-25 22:16:21 -07:00
Alexander	6f2c0cccdd	New: sparse complex: add linear algebra, addmm (#57129 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57129 Test Plan: Imported from OSS Reviewed By: janeyx99, astaff Differential Revision: D28112701 Pulled By: ezyang fbshipit-source-id: 1b253453dc19e908fb18d0b1a83738243e0a8d59	2021-05-07 05:37:48 -07:00
Alexander	a911c4fc1c	New: Initial support for sparse complex tensors constructors for CPU/CUDA (#57125 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57125 I'm opening this PR, solving the last issued reported before merging PR #54153 https://github.com/pytorch/pytorch/pull/54153#issuecomment-827997616, Solves gh-50690 Test Plan: Imported from OSS Reviewed By: astaff Differential Revision: D28112702 Pulled By: ezyang fbshipit-source-id: 915681954edb14b7c19c3ffe641af2d2e6649576	2021-05-07 05:36:41 -07:00
Peter Bell	a5288a0244	Sparse support for division rounding_mode argument (#51989 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/51989 Test Plan: Imported from OSS Reviewed By: astaff Differential Revision: D28118114 Pulled By: mruberry fbshipit-source-id: 2a76ee55c3845552e57e93d54628ce3c2fab3399	2021-05-01 17:37:25 -07:00
Mike Ruberry	7bcce2acb9	Revert D27765618: Initial support for sparse complex tensors constructors for CPU/CUDA Test Plan: revert-hammer Differential Revision: D27765618 (`daef60c3b7`) Original commit changeset: a9cdd31d5c7a fbshipit-source-id: f700d5db7ff8930b9158460b5a77f68a35e212a4	2021-04-27 15:48:51 -07:00
Alexander	0d41122e61	Eliminate global usage of torch.set_default_dtype in sparse test (#56393 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/56393 Fixes for gh-56369 Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27913266 Pulled By: mruberry fbshipit-source-id: 2c590d3a2188aae251184f08c1a6a2c4c570d150	2021-04-27 15:23:14 -07:00
Alexander	daef60c3b7	Initial support for sparse complex tensors constructors for CPU/CUDA (#54153 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54153 Currently, sparse tensors only support real floating point tensors. Complex support is added in this PR for CPU/CUDA. - [x] add complex support (torch.cfloat and torch.cdouble) to torch.sparse_coo_tensor constructors - [x] add complex support to coalesce function - [x] add complex support to to_dense function - [x] add complex support to to_sparse function - [x] add complex support to sparse_add function - [x] add unit tests Note: This PR contains only complex support for torch.sparse_coo_tensor fordward function and the related ops used with this function (coalesce, to_dense, to_sparse, and sparse_add). The following PRs in ghstack should cover other sparse operations to have a more complex sparse support, specifically related with the use of specific APIs for accelerated linear algebra. Note: Before using ghstack the original PR was #50984 Test Plan: Imported from OSS Reviewed By: H-Huang Differential Revision: D27765618 Pulled By: ezyang fbshipit-source-id: a9cdd31d5c7a7dafd790f6cc148f3df26e884c89	2021-04-27 14:39:13 -07:00
sorenrasmussenai	f27513e951	Fix bug in torch.sparse.addmm on CUDA when beta != 0 or 1 (#56160 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/55917, which caused `torch.sparse.addmm` to fail on CUDA whenever `beta` was different from 0 or 1 Pull Request resolved: https://github.com/pytorch/pytorch/pull/56160 Reviewed By: ejguan Differential Revision: D27825108 Pulled By: ngimel fbshipit-source-id: 2ade5ea38c5322768dc4dffb40c65fcbb17ec201	2021-04-26 02:57:41 -07:00
Alexander	6ee333cdb5	modernize test_sparse (#54572 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54572 Adding device generic tests to `test_sparse`. Follow-up PR: #54153 I think is ready to review. Looking forward your comments cc mruberry. Thanks Test Plan: Imported from OSS Reviewed By: ngimel Differential Revision: D27562663 Pulled By: mruberry fbshipit-source-id: c48973e707f779b529bc7f61b75103194b428987	2021-04-09 12:19:29 -07:00
Alban Desmaison	b91d48877d	Reland Fix reference cycle in sparse coalesce graph (#55404 ) Summary: Reland of https://github.com/pytorch/pytorch/pull/52874 Pull Request resolved: https://github.com/pytorch/pytorch/pull/55404 Reviewed By: bdhirsh Differential Revision: D27600438 Pulled By: albanD fbshipit-source-id: f5c286638b324ad59be65657a016028af5e2b303	2021-04-07 12:02:42 -07:00
Brian Hirsh	ec80981d28	Revert D27246997: [pytorch][PR] Fix reference cycle in sparse coalesce graph Test Plan: revert-hammer Differential Revision: D27246997 (`815bfad28c`) Original commit changeset: 0fe6c1104350 fbshipit-source-id: 4d345718589a642d3c65474b266342285205ccdf	2021-04-06 11:45:27 -07:00
Peter Bell	815bfad28c	Fix reference cycle in sparse coalesce graph (#52874 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/52253 In the issue reproducer we can replace `torch.sparse.sum(S)` with `S.coalesce()` and get the same memory leak. The reason is that calling `coalesce()` on an already coalesced tensor returns `self`. With autograd, the result gets it's `grad_fn` set to a node that contains a reference to the input tensor, creating a reference cycle. Cloning the tensor fixes this, so `coalesce` always returns a new tensor. As an aside, `torch.sparse.sum(S)` doesn't need to coalesce. The result should be the same either way. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52874 Reviewed By: bdhirsh Differential Revision: D27246997 Pulled By: albanD fbshipit-source-id: 0fe6c11043501a7874a50982afd42964f47470d3	2021-04-06 08:32:19 -07:00
Heitor Schueroff	6d87b3667f	Added support for TensorList inputs in OpInfo (#54922 ) Summary: Stack: * https://github.com/pytorch/pytorch/issues/54954 Fixed OpInfo jit tests failing for TensorList inputs * __#54922 Added support for TensorList inputs in OpInfo__ Updated OpInfo to accept either a `Tensor` or `TensorList` as `sample.input` and added workarounds to make this work with gradcheck. Note: JIT testing support for TensorList inputs will be added in a follow up PR. Fixes https://github.com/pytorch/pytorch/issues/51996 Pull Request resolved: https://github.com/pytorch/pytorch/pull/54922 Reviewed By: H-Huang Differential Revision: D27448952 Pulled By: heitorschueroff fbshipit-source-id: 3f24a56f6180eb2d044dcfc89ba59fce8acfe278	2021-03-31 04:42:10 -07:00
Edward Yang	e0aebe241d	Refactor tensor_new.cpp to use TensorOptions instead of DispatchKey (#54034 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/54034 Fixes #53544 I had to touch a bunch of lines but the refactoring was fairly mechanical. Here's how it works. The basic concept behind this PR is that tensor_new.cpp was previously abusing DispatchKey when it actually meant TensorOptions. The provided DispatchKey argument to most of the constructor functions typically comes from torch::tensors::get_default_dispatch_key(); it doesn't really make sense for people to set the default dispatch key, but this got grandfathered in due to the old API set_default_tensor_type (where the "Type" concept got refactored into "DispatchKey" concept over time). See also #53124. But the upshot is that, semantically, what we refer to as the default dispatch key really is more like torch.set_default_tensor_type(torch.Tensor) versus torch.set_default_tensor_type(torch.cuda.Tensor): clearly the user wants to do something about construction of the tensor, and TensorOptions captures that exactly. So, how exactly to translate from one to the other? - Sources (things that used to PRODUCE DispatchKey) - Most top level functions take a DispatchKey as their argument. I use the new function dispatchKeyToTensorOptions to convert it into a TensorOptions - typeIdWithDefault now produces a TensorOptions (probably could do with a rename, though I didn't) - Sinks (things that used to CONSUME DispatchKey) - Previously, the function options() was typically used to convert the DispatchKey into a TensorOptions. Now its replacement build_options just takes a TensorOptions and sets some extra fields on it. Irritatingly, I can't just replace `build_options(options, scalar_type, device)` with `options.dtype(scalar_type).device(device)` because the semantics are slightly different: if device is nullopt, we should preserve the usage of the device specified in options (what options.device() does is overwrite the device unconditionally; e.g., if device is nullopt, unset device from options) - The other major sink for DispatchKey was `internal_new_from_data`, but it turns out it only really extracts the device type from the dispatch key. Now it just pulls out the device from TensorOptions. - To actually do the translation of DispatchKey to TensorOptions, I introduce new functions dispatchKeyToLayout (replicating layout_from_backend--there are still a few uses of this function so I couldn't delete it) and dispatchKeyToDeviceType (replacing computeDeviceType) - In all internal functions, whenever DispatchKey is taken as an argument, I instead take TensorOptions as an argument, and pass it along. - Anywhere `legacyExtractDispatchKey(other.key_set())` equality was previously used, I now do `other.options().type_equal()`, which is the intended BC for doing "backend to backend" comparisons - There are a few places in the sparse constructors where we allocated a tensor for values, and then read out the dispatch key from the result to allocate the keys. As best as I can tell, this is totally equivalent to just passing in the options to both values and indices (the only difference is dtype, which is captured via a separate argument) This refactor doesn't really go far enough: for example, there are now functions that take both TensorOptions and ScalarType, when really the TensorOptions can capture this all. I kept it solely just s/DispatchKey/TensorOptions/ to reduce the number of possible bugs; also, a lot of this will be mooted by a proper fix to #53124. Even with this limited refactor, the payoff is sweet. I can delete: - backendToCPU - backendToXPU - backendToCUDA - backendToHIP - backendToBackendOfDeviceType The reason I can do this is because I can simply overwrite layout in TensorOptions to do the conversion, rather than having to type out each backend case explicitly. Signed-off-by: Edward Z. Yang <ezyang@fb.com> Test Plan: Imported from OSS Reviewed By: bhosmer Differential Revision: D27109509 Pulled By: ezyang fbshipit-source-id: 91d16cfbc390127770362ac04fb43f7e070077e9	2021-03-19 09:08:32 -07:00
mattip	54a2498919	Modify tests to use assertWarnsOnceRegex instead of maybeWarnsRegex (#52387 ) Summary: Related to https://github.com/pytorch/pytorch/issues/50006 Follow on for https://github.com/pytorch/pytorch/issues/48560 to ensure TORCH_WARN_ONCE warnings are caught. Most of this is straight-forward find-and-replace, but I did find one place where the TORCH_WARN_ONCE warning was not wrapped into a python warning. Pull Request resolved: https://github.com/pytorch/pytorch/pull/52387 Reviewed By: albanD Differential Revision: D26773387 Pulled By: mruberry fbshipit-source-id: 5be7efbc8ab4a32ec8437c9c45f3b6c3c328f5dd	2021-03-08 03:32:14 -08:00
Sam Estep	8c798e0622	Forbid trailing whitespace (#53406 ) Summary: Context: https://github.com/pytorch/pytorch/pull/53299#discussion_r587882857 These are the only hand-written parts of this diff: - the addition to `.github/workflows/lint.yml` - the file endings changed in these four files (to appease FB-internal land-blocking lints): - `GLOSSARY.md` - `aten/src/ATen/core/op_registration/README.md` - `scripts/README.md` - `torch/csrc/jit/codegen/fuser/README.md` The rest was generated by running this command (on macOS): ``` git grep -I -l ' $' -- . ':(exclude)/contrib/' ':(exclude)third_party' \| xargs gsed -i 's/ *$//' ``` I looked over the auto-generated changes and didn't see anything that looked problematic. Pull Request resolved: https://github.com/pytorch/pytorch/pull/53406 Test Plan: This run (after adding the lint but before removing existing trailing spaces) failed: - https://github.com/pytorch/pytorch/runs/2043032377 This run (on the tip of this PR) succeeded: - https://github.com/pytorch/pytorch/runs/2043296348 Reviewed By: walterddr, seemethere Differential Revision: D26856620 Pulled By: samestep fbshipit-source-id: 3f0de7f7c2e4b0f1c089eac9b5085a58dd7e0d97	2021-03-05 17:22:55 -08:00
Rong Rong (AI Infra)	b52e2e6045	[BE] _get_torch_cuda_version should return tuple (#52409 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/52409 Reviewed By: jbschlosser, glaringlee Differential Revision: D26513924 Pulled By: walterddr fbshipit-source-id: ee18ef357c326c5ad344d80c59821cc2b8814734	2021-02-18 09:28:38 -08:00
Mike Ruberry	594a66d778	Warn about floor_divide performing incorrect rounding (#50281 ) (#50281 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50281 Pull Request resolved: https://github.com/pytorch/pytorch/pull/51745 Test Plan: Imported from OSS Reviewed By: ngimel Pulled By: mruberry Differential Revision: D26257855 fbshipit-source-id: e5d497cf07b0c746838ed081c5d0e82fb4cb701b	2021-02-10 03:13:34 -08:00
Jeffrey Wan	c0966914bc	Internal gradcheck wrapper in testing._internal that sets certain flags to True (#51133 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/49409 There are many call sites where, gradcheck/gradgradcheck is now being implicitly invoked with `check_batched_grad` as True, but they were previously False. Cases fall into two basic categories: 1) the call site was previously using `torch.autograd.gradcheck` but is now changed to use the globally imported function instead 3) the call site was already using globally imported function, but does not explicitly pass `check_batched_grad` flag Only in the _assertGradAndGradgradChecks cases, which are infrequent, I assumed that the the author is aware that omitting the flag means not applying check_batched_grad=True. (but maybe that is not the case?) Overall this PR in its current state assumes that unless the author explicitly specified `check_batched_grad=False`, they were just probably not aware of this flag and did not mean to have this flag as False. So far exceptions to the above (as discovered by CI) include: - Mkldnn (opaque tensors do not have strides) https://app.circleci.com/pipelines/github/pytorch/pytorch/264416/workflows/e4d87886-6247-4305-8526-2696130aa9a4/jobs/10401882/tests - all cases in test_sparse (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407103) - all cases in test_overrides (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407236) - test_autograd (test_LSTM_grad_and_gradgrad) - (https://app.circleci.com/pipelines/github/pytorch/pytorch/264553/workflows/3c1cbe30-830d-4acd-b240-38d833dccd9b/jobs/10407235) - test_data_parallel (test_data_parallel_buffers_requiring_grad) - SIGSEGV (https://app.circleci.com/pipelines/github/pytorch/pytorch/264820/workflows/14d89503-040d-4e3d-9f7b-0bc04833589b/jobs/10422697) - test_nn (https://app.circleci.com/pipelines/github/pytorch/pytorch/264919/workflows/df79e3ed-8a31-4a8e-b584-858ee99686ff/jobs/10427315) Possible TODO is to prevent new tests from invoking external gradcheck. Pull Request resolved: https://github.com/pytorch/pytorch/pull/51133 Reviewed By: ezyang Differential Revision: D26147919 Pulled By: soulitzer fbshipit-source-id: dff883b50f337510a89f391ea2fd87de2d531432	2021-01-29 09:13:37 -08:00
Kyle Chen	d5e5c5455a	[ROCm] re-enable test_sparse.py tests (#50557 ) Summary: Signed-off-by: Kyle Chen <kylechen@amd.com> cc: jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/50557 Reviewed By: mruberry Differential Revision: D25941432 Pulled By: ngimel fbshipit-source-id: 534fc8a91a48fa8b3b397e63423cd8347b41bbe2	2021-01-18 23:36:39 -08:00
Nathan Howell	c517e15d79	Add support for converting sparse bool tensors to dense (#50019 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/49977 Pull Request resolved: https://github.com/pytorch/pytorch/pull/50019 Reviewed By: smessmer Differential Revision: D25782045 Pulled By: ezyang fbshipit-source-id: a8389cbecb7e79099292a423a6fd8ac28631905b	2021-01-06 07:38:14 -08:00
mattip	f96ce3305c	prohibit assignment to a sparse tensor (#50040 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/48225 by prohibiting assignment to a sparse Tensor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/50040 Reviewed By: mrshenli Differential Revision: D25757125 Pulled By: zou3519 fbshipit-source-id: 3db6f48932eb10bf6ca5e97a6091afcabb60e478	2021-01-04 14:38:35 -08:00
Himangshu	9552cc65d4	Creation of test framework for Sparse Operators (#48488 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48488 Reviewed By: ngimel Differential Revision: D25696487 Pulled By: mruberry fbshipit-source-id: dc4f57c6628f62b74dd321f3f6b0fff86f25b040	2020-12-23 15:42:26 -08:00
Alexander	44ce0b8883	Sparse-sparse matrix multiplication (CPU/CUDA) (#39526 ) Summary: This PR implements matrix multiplication support for 2-d sparse tensors using the COO sparse format. The current implementation of `torch.sparse.mm` support this configuration, `torch.sparse.mm(sparse_matrix1, sparse_matrix2.to_dense())`, but this could spend a lot of memory when sparse_matrix2's shape is large. This implementation extends `torch.sparse.mm` function to support `torch.sparse.mm(sparse_matrix1, sparse_matrix2)` Resolves #[20988](https://github.com/pytorch/pytorch/issues/20988) for CPU/CUDA. - [x] sparse matmul - [x] CPU/CUDA C++ implementation - [x] unittests - [x] update torch.sparse.mm documentation - [x] autograd support The CPU sparse-sparse matmul was implemented taking as a reference this work "Sparse Matrix Multiplication Package (SMMP)". The GPU sparse-sparse matmul is based on cuSparse, there is specific code for CUSPARSE when CUSPARSE_VERSION >= 11 and old version of CUSPARSE. Both CPU/CUDA rely on the sparse-sparse matmul algorithm using the CSR indices format as it is one of the fastest algorithm. Here it is the latest benchmark (script is here) results for torch.sparse.mm (CUDA) and torch.sparse.mm (CPU) and scipy, values are float32 scalars: size \| density \| sparse.mm(CUDA) \| sparse.mm(CPU) \| scipy_coo_matmul -- \| -- \| -- \| -- \| -- (32, 10000) \| 0.01 \| 822.7 \| 79.4 \| 704.1 (32, 10000) \| 0.05 \| 1741.1 \| 402.6 \| 1155.3 (32, 10000) \| 0.1 \| 2956.8 \| 840.8 \| 1885.4 (32, 10000) \| 0.25 \| 6417.7 \| 2832.3 \| 4665.2 (512, 10000) \| 0.01 \| 1010.2 \| 3941.3 \| 26937.7 (512, 10000) \| 0.05 \| 2216.2 \| 26903.8 \| 57343.7 (512, 10000) \| 0.1 \| 4868.4 \| 87773.7 \| 117477.0 (512, 10000) \| 0.25 \| 16639.3 \| 608105.0 \| 624290.4 (1024, 10000) \| 0.01 \| 1224.8 \| 13088.1 \| 110379.2 (1024, 10000) \| 0.05 \| 3897.5 \| 94783.9 \| 236541.8 (1024, 10000) \| 0.1 \| 10559.1 \| 405312.5 \| 525483.4 (1024, 10000) \| 0.25 \| 57456.3 \| 2424337.5 \| 2729318.7 A new backward algorithm was implemented using only `sparse @ sparse` and `sparse_mask` operations. Here is some benchmarking: ``` [------------------------- sparse.mm-backward -------------------------] \| sparse.backward \| dense.backward ----------------------------------------------------------------------- (32, 10000) \| 0.01 \| 13.5 \| 2.4 (32, 10000) \| 0.05 \| 52.3 \| 2.4 (512, 10000) \| 0.01 \| 1016.8 \| 491.5 (512, 10000) \| 0.05 \| 1604.3 \| 492.3 (1024, 10000) \| 0.01 \| 2384.1 \| 1963.7 (1024, 10000) \| 0.05 \| 3965.8 \| 1951.9 ``` I added new benchmark tests. Now I am using a real dataset used in recent studies [1, 2] with different sparsity levels. ``` [---------------------------------- matmul ---------------------------------] \| 0.5 \| 0.7 \| 0.8 \| 0.9 \| 0.95 \| 0.98 1 threads: ------------------------------------------------------------------ (cpu) torch \| 5.4 \| 5.4 \| 5.2 \| 5.3 \| 5.3 \| 5.4 torch.sparse \| 122.2 \| 51.9 \| 27.5 \| 11.4 \| 4.9 \| 1.8 scipy \| 150.1 \| 87.4 \| 69.2 \| 56.8 \| 38.4 \| 17.1 (cuda) torch \| 1.3 \| 1.1 \| 1.1 \| 1.1 \| 1.1 \| 1.1 torch.sparse \| 20.0 \| 8.4 \| 5.1 \| 2.5 \| 1.5 \| 1.1 [----------------------------------- backward -----------------------------------] \| 0.5 \| 0.7 \| 0.8 \| 0.9 \| 0.95 \| 0.98 1 threads: ----------------------------------------------------------------------- (cpu) torch \| 17.7 \| 17.9 \| 17.7 \| 17.7 \| 17.6 \| 17.9 torch.sparse \| 672.9 \| 432.6 \| 327.5 \| 230.8 \| 176.7 \| 116.7 (cuda) torch \| 3.8 \| 3.6 \| 3.5 \| 3.5 \| 3.6 \| 3.5 torch.sparse \| 68.8 \| 46.2 \| 35.6 \| 24.2 \| 17.8 \| 11.9 Times are in milliseconds (ms). ``` In summary, I can say that the new `sparse @ sparse` backward algorithm is better as it is more about saving space than performance. Moreover, it is better than other options tested before. ## References 1. Trevor Gale, Matei Zaharia, Cliff Young, Erich Elsen. Sparse GPU Kernels for Deep Learning. Proceedings of the International Conference for High Performance Computing, 2020. [https://github.com/google-research/google-research/tree/master/sgk](https://github.com/google-research/google-research/tree/master/sgk) 2. Trevor Gale, Erich Elsen, Sara Hooker. The State of Sparsity in Deep Neural Networks. [https://github.com/google-research/google-research/tree/master/state_of_sparsity](https://github.com/google-research/google-research/tree/master/state_of_sparsity) Pull Request resolved: https://github.com/pytorch/pytorch/pull/39526 Reviewed By: mruberry Differential Revision: D25661239 Pulled By: ngimel fbshipit-source-id: b515ecd66d25f347d637e159d51aa45fb43b6938	2020-12-21 11:53:55 -08:00
Xiang Gao	87636c07bb	CUDA BF16 sparse (#48807 ) Summary: Fixes #{issue number} Pull Request resolved: https://github.com/pytorch/pytorch/pull/48807 Reviewed By: mruberry Differential Revision: D25526752 Pulled By: ngimel fbshipit-source-id: 9ff8e637486cfd67d46daf0c05142bbe611e08ec	2020-12-14 09:55:52 -08:00
kshitij12345	25ab39acd0	[numpy] `torch.asin` : promote integer inputs to float (#48461 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48461 Reviewed By: ngimel Differential Revision: D25192319 Pulled By: mruberry fbshipit-source-id: fd5dffeca9cd98b86782bfa6a9ab367e425ee934	2020-11-27 15:26:58 -08:00
kshitij12345	e9efd8df1b	[numpy] `torch.log1p` : promote integer inputs to float (#48002 ) Summary: Reference https://github.com/pytorch/pytorch/issues/42515 Pull Request resolved: https://github.com/pytorch/pytorch/pull/48002 Reviewed By: ngimel Differential Revision: D25148911 Pulled By: mruberry fbshipit-source-id: 902d0ddf699debd6edd1b3d55f5c73932ca45e83	2020-11-24 22:01:07 -08:00
Natalia Gimelshein	4a2fb34042	check sparse sizes (#47148 ) Summary: checks sizes of sparse tensors when comparing them in assertEqual. Removes additional checks in safeCoalesce, safeCoalesce should not be a test for `.coalesce()` function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/47148 Reviewed By: mruberry Differential Revision: D24823127 Pulled By: ngimel fbshipit-source-id: 9303a6ff74aa3c9d9207803d05c0be2325fe392a	2020-11-09 10:33:24 -08:00
vfdev-5	dc7cd97402	Fixes bug in sspaddmm (#45113 ) (#45963 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/45113 Description: - Fixed bug in sspaddmm by calling contiguous on indices. - Added tests We have to make indices contiguous as we use `indices.data_ptr` in `_to_csr` which assumes row-contiguous storage: `be45c3401a/aten/src/ATen/native/sparse/SparseTensorMath.cpp (L1087-L1090)` > Part 1 of fixing this is probably to document sspaddmm. Part 2 may be to rewrite it using other ops. (https://github.com/pytorch/pytorch/issues/45113#issuecomment-700166809) - Docs will be written here: https://github.com/pytorch/pytorch/pull/45400 Pull Request resolved: https://github.com/pytorch/pytorch/pull/45963 Reviewed By: malfet Differential Revision: D24335599 Pulled By: ngimel fbshipit-source-id: 8278c73a1b4cccc5e22c6f3818dd222588c46b45	2020-10-15 16:50:16 -07:00
Alexander	29dc3c5ec8	Sparse softmax support (CUDA) (#42307 ) Summary: This PR implements softmax support for sparse tensors. Resolves gh-23651 for CUDA. - [x] sparse softmax - [x] CUDA C++ implementation - [x] unittests - [x] update softmax documentation - [x] autograd support - [x] sparse log_softmax - [x] CUDA C++ implementation - [x] unittests - [x] update log_softmax documentation - [x] autograd support Here are some benchmark (script is [here](https://gist.github.com/aocsa/fbc1827b3e49901512a33ba96092cbc1)) results for `torch.sparse.softmax and torch.softmax`, using CPU and GPU, values are float64 scalars, timing repeat is 1000: \| size \| density \| sparse CUDA \| sparse CPU \| \|--------------\|---------\|-------------\|------------\| \| (32, 10000) \| 0.01 \| 380.2 \| 687.5 \| \| (32, 10000) \| 0.05 \| 404.3 \| 2357.9 \| \| (32, 10000) \| 0.1 \| 405.9 \| 3677.2 \| \| (512, 10000) \| 0.01 \| 438.0 \| 5443.4 \| \| (512, 10000) \| 0.05 \| 888.1 \| 24485.0 \| \| (512, 10000) \| 0.1 \| 1921.3 \| 45340.5 \| \| size \| density \| dense CUDA \| dense CPU \| \|--------------\|---------\|-------------\|------------\| \| (32, 10000) \| 0.01 \| 23.6 \| 1943.2 \| \| (32, 10000) \| 0.05 \| 23.6 \| 1954.0 \| \| (32, 10000) \| 0.1 \| 23.5 \| 1950.0 \| \| (512, 10000) \| 0.01 \| 639.3 \| 39797.9 \| \| (512, 10000) \| 0.05 \| 640.3 \| 39374.4 \| \| (512, 10000) \| 0.1 \| 639.6 \| 39192.3 \| Times are in microseconds (us). Quick note: I updated the performance test again. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42307 Reviewed By: ngimel Differential Revision: D23774427 Pulled By: mruberry fbshipit-source-id: bfabf726075b39dde544c10249f27ae1871f82c7	2020-09-24 00:07:30 -07:00
vfdev-5	c947ab0bb9	Added sparse support for asin and neg functions, updated log1p (#44028 ) Summary: Description: - [x] added C++ code for sparse `asin` and `neg` ops similarly to `log1p` op - [x] added tests - [x] coalesced input CPU/CUDA - [x] uncoalesced input CPU/CUDA - [x] added tests for `negative` and `arcsin` Backprop will be addressed in another PR. Pull Request resolved: https://github.com/pytorch/pytorch/pull/44028 Reviewed By: agolynski Differential Revision: D23793027 Pulled By: mruberry fbshipit-source-id: 5fd642808da8e528cf6acd608ca0dcd720c4ccc3	2020-09-22 02:04:38 -07:00
Xiao Wang	d75c402755	Add cusolver to build, rewrite MAGMA inverse with cusolver (#42403 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/42265 This PR adds cusolver to the pytorch build, and enables the use of cusolver/cublas library functions on GPU `torch.inverse` on certain tensor shapes. Specifically, when * the tensor is two dimensional (single batch), or * has >2 dimensions (multiple batches) and `batch_size <= 2`, or * magma is not linked, cusolver/cublas will be used. In other conditions, the current implementation of MAGMA will still be used. `8c0949ae45/aten/src/ATen/native/cuda/BatchLinearAlgebra.cu (L742-L752)` The reason for this is that for tensors with large batch_size, `cublasXgetrfBatched` and `cublasXgetriBatched` doesn't perform very well. For `batch_size > 1`, we launch cusolver functions in multiple streams. This lets cusolver functions run in parallel, and can greatly increase the performance. When `batch_size > 2`, the parallel launched cusolver functions are slightly slower than the current magma implementation, so we still use the current magma impl. On CUDA 9.2, there were some numerical issues detected, so cusolver impl will not be used. The cusolver impl will also not be used on platforms other than Nvidia CUDA. `060769feaf/aten/src/ATen/native/cuda/BatchLinearAlgebraLib.h (L10-L13)` Note that there is a new heuristic used before cusolver/cublas calls here: `8c0949ae45/aten/src/ATen/native/cuda/MiscUtils.h (L113-L121)` where `use_loop_launch = true` means launch single batch cusolver functions in parallel, and `use_loop_launch = false` means use cublas_X_batched functions. When magma is enabled (only `batch_size <= 2` will be dispatched to cusolver/cublas), the heuristic will always return `true` and the cusolver calls are faster than small batch_size magma calls. When magma is disabled, this adds the functionality of `torch.inverse`, which was disabled before for all shapes (though large batch_size cublas performance may not be as well as magma). Checklist: - [X] Add benchmark, cpu, gpu-before (magma), gpu-after (cusolver) - [X] Rewrite single inverse (ndim == 2) with cusolver - [X] Rewrite batched inverse (ndim > 2) with cublas - [X] Add cusolver to build - [x] Clean up functions related to `USE_MAGMA` define guard - [x] Workaround for non-cuda platform - [x] Workaround for cuda 9.2 - [x] Add zero size check - [x] Add tests Next step: If cusolver doesn't cause any problem in pytorch build, and there are no major performance regressions reported after this PR being merged, I will start porting other cusolver/cublas functions for linear algebra to improve the performance. <details> <summary> benchmark 73499c6 </summary> benchmark code: https://github.com/xwang233/code-snippet/blob/master/torch.inverse/inverse-cusolver.ipynb shape meaning: * `[] 2 torch.float32 -> torch.randn(2, 2, dtype=torch.float32)` * `[2] 4 torch.float32 -> torch.randn(2, 4, 4, dtype=torch.float32)` \| shape \| cpu_time (ms) \| gpu_time_before (magma) (ms) \| gpu_time_after (ms) \| \| --- \| --- \| --- \| --- \| \| [] 2 torch.float32 \| 0.095 \| 7.534 \| 0.129 \| \| [] 4 torch.float32 \| 0.009 \| 7.522 \| 0.129 \| \| [] 8 torch.float32 \| 0.011 \| 7.647 \| 0.138 \| \| [] 16 torch.float32 \| 0.075 \| 7.582 \| 0.135 \| \| [] 32 torch.float32 \| 0.073 \| 7.573 \| 0.191 \| \| [] 64 torch.float32 \| 0.134 \| 7.694 \| 0.288 \| \| [] 128 torch.float32 \| 0.398 \| 8.073 \| 0.491 \| \| [] 256 torch.float32 \| 1.054 \| 11.860 \| 1.074 \| \| [] 512 torch.float32 \| 5.218 \| 14.130 \| 2.582 \| \| [] 1024 torch.float32 \| 19.010 \| 18.780 \| 6.936 \| \| [1] 2 torch.float32 \| 0.009 \| 0.113 \| 0.128 *regressed \| \| [1] 4 torch.float32 \| 0.009 \| 0.113 \| 0.131 regressed \| \| [1] 8 torch.float32 \| 0.011 \| 0.116 \| 0.129 regressed \| \| [1] 16 torch.float32 \| 0.015 \| 0.122 \| 0.135 regressed \| \| [1] 32 torch.float32 \| 0.032 \| 0.177 \| 0.178 regressed \| \| [1] 64 torch.float32 \| 0.070 \| 0.420 \| 0.281 \| \| [1] 128 torch.float32 \| 0.328 \| 0.816 \| 0.490 \| \| [1] 256 torch.float32 \| 1.125 \| 1.690 \| 1.084 \| \| [1] 512 torch.float32 \| 4.344 \| 4.305 \| 2.576 \| \| [1] 1024 torch.float32 \| 16.510 \| 16.340 \| 6.928 \| \| [2] 2 torch.float32 \| 0.009 \| 0.113 \| 0.186 regressed \| \| [2] 4 torch.float32 \| 0.011 \| 0.115 \| 0.184 regressed \| \| [2] 8 torch.float32 \| 0.012 \| 0.114 \| 0.184 regressed \| \| [2] 16 torch.float32 \| 0.019 \| 0.119 \| 0.173 regressed \| \| [2] 32 torch.float32 \| 0.050 \| 0.170 \| 0.240 regressed \| \| [2] 64 torch.float32 \| 0.120 \| 0.429 \| 0.375 \| \| [2] 128 torch.float32 \| 0.576 \| 0.830 \| 0.675 \| \| [2] 256 torch.float32 \| 2.021 \| 1.748 \| 1.451 \| \| [2] 512 torch.float32 \| 9.070 \| 4.749 \| 3.539 \| \| [2] 1024 torch.float32 \| 33.655 \| 18.240 \| 12.220 \| \| [4] 2 torch.float32 \| 0.009 \| 0.112 \| 0.318 regressed \| \| [4] 4 torch.float32 \| 0.010 \| 0.115 \| 0.319 regressed \| \| [4] 8 torch.float32 \| 0.013 \| 0.115 \| 0.320 regressed \| \| [4] 16 torch.float32 \| 0.027 \| 0.120 \| 0.331 regressed \| \| [4] 32 torch.float32 \| 0.085 \| 0.173 \| 0.385 regressed \| \| [4] 64 torch.float32 \| 0.221 \| 0.431 \| 0.646 regressed \| \| [4] 128 torch.float32 \| 1.102 \| 0.834 \| 1.055 regressed \| \| [4] 256 torch.float32 \| 4.042 \| 1.811 \| 2.054 regressed \| \| [4] 512 torch.float32 \| 18.390 \| 4.884 \| 5.087 regressed \| \| [4] 1024 torch.float32 \| 69.025 \| 19.840 \| 20.000 *regressed \| </details> Pull Request resolved: https://github.com/pytorch/pytorch/pull/42403 Reviewed By: ailzhang, mruberry Differential Revision: D23717984 Pulled By: ngimel fbshipit-source-id: 54cbd9ea72a97989cff4127089938e8a8e29a72b	2020-09-18 20:43:29 -07:00
vfdev	24df3b7373	torch.empty_like and torch.zeros_like raise error if any memory format is provided with sparse input (#43699 ) (#44058 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/43699 - Changed the order of `TORCH_CHECK` and `if (options.layout() == kSparse && self.is_sparse())` inside `empty_like` method. - [x] Added tests EDIT: More details on that and why we can not take zeros_like approach. Python code : ```python res = torch.zeros_like(input_coalesced, memory_format=torch.preserve_format) ``` is routed to ```c++ // TensorFactories.cpp Tensor zeros_like( const Tensor& self, const TensorOptions& options, c10::optional<c10::MemoryFormat> optional_memory_format) { if (options.layout() == kSparse && self.is_sparse()) { auto res = at::empty({0}, options); // to be resized res.sparse_resize_and_clear_( self.sizes(), self.sparse_dim(), self.dense_dim()); return res; } auto result = at::empty_like(self, options, optional_memory_format); return result.zero_(); } ``` and passed to `if (options.layout() == kSparse && self.is_sparse())` When we call in Python ```python res = torch.empty_like(input_coalesced, memory_format=torch.preserve_format) ``` it is routed to ```c++ Tensor empty_like( const Tensor& self, const TensorOptions& options_, c10::optional<c10::MemoryFormat> optional_memory_format) { TORCH_CHECK( !(options_.has_memory_format() && optional_memory_format.has_value()), "Cannot set memory_format both in TensorOptions and explicit argument; please delete " "the redundant setter."); TensorOptions options = self.options() .merge_in(options_) .merge_in(TensorOptions().memory_format(optional_memory_format)); TORCH_CHECK( !(options.layout() != kStrided && optional_memory_format.has_value()), "memory format option is only supported by strided tensors"); if (options.layout() == kSparse && self.is_sparse()) { auto result = at::empty({0}, options); // to be resized result.sparse_resize_and_clear_( self.sizes(), self.sparse_dim(), self.dense_dim()); return result; } ``` cc pearu Pull Request resolved: https://github.com/pytorch/pytorch/pull/44058 Reviewed By: albanD Differential Revision: D23672494 Pulled By: mruberry fbshipit-source-id: af232274dd2b516dd6e875fc986e3090fa285658	2020-09-17 10:25:31 -07:00
Mike Ruberry	686e281bcf	Updates div to perform true division (#42907 ) Summary: This PR: - updates div to perform true division - makes torch.true_divide an alias of torch.div This follows on work in previous PyTorch releases that first deprecated div performing "integer" or "floor" division, then prevented it by throwing a runtime error. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42907 Reviewed By: ngimel Differential Revision: D23622114 Pulled By: mruberry fbshipit-source-id: 414c7e3c1a662a6c3c731ad99cc942507d843927	2020-09-14 15:50:38 -07:00
vfdev	9f88bcb5a2	Minor typo fix (#42731 ) Summary: Just fixed a typo in test/test_sparse.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/42731 Reviewed By: ezyang Differential Revision: D22999930 Pulled By: mrshenli fbshipit-source-id: 1b5b21d7cb274bd172fb541b2761f727ba06302c	2020-08-07 11:17:51 -07:00
Nikita Shulga	aa4e91a6dc	Fix `TestSparse.test_bmm_windows_error` when CUDA is not available (#42626 ) Summary: Refactor comnon pattern of (torch.cuda.version and [int(x) for x in torch.cuda.version.split(".")] >= [a, b]) into `_get_torch_cuda_version()` function Pull Request resolved: https://github.com/pytorch/pytorch/pull/42626 Reviewed By: seemethere Differential Revision: D22956149 Pulled By: malfet fbshipit-source-id: 897c55965e53b477cd20f69e8da15d90489035de	2020-08-05 16:07:35 -07:00
peter	b08347fd7b	Add CUDA 11 builds for Windows CI (#42420 ) Summary: Stacked on https://github.com/pytorch/pytorch/pull/42410. Pull Request resolved: https://github.com/pytorch/pytorch/pull/42420 Reviewed By: seemethere Differential Revision: D22917230 Pulled By: malfet fbshipit-source-id: 6ad394f7f8c430c587e0b0d9c5a5e7b7bcd85bfe	2020-08-05 09:40:33 -07:00
Kurt Mohler	206db5c127	Improve `torch.norm` functionality, errors, and tests (#41956 ) Summary: BC-Breaking Note: BC breaking changes in the case where keepdim=True. Before this change, when calling `torch.norm` with keepdim=True and p='fro' or p=number, leaving all other optional arguments as their default values, the keepdim argument would be ignored. Also, any time `torch.norm` was called with p='nuc', the result would have one fewer dimension than the input, and the dimensions could be out of order depending on which dimensions were being reduced. After the change, for each of these cases, the result has the same number and order of dimensions as the input. PR Summary: * Fix keepdim behavior * Throw descriptive errors for unsupported sparse norm args * Increase unit test coverage for these cases and for complex inputs These changes were taken from part of PR https://github.com/pytorch/pytorch/issues/40924. That PR is not going to be merged because it overrides `torch.norm`'s interface, which we want to avoid. But these improvements are still useful. Issue https://github.com/pytorch/pytorch/issues/24802 Pull Request resolved: https://github.com/pytorch/pytorch/pull/41956 Reviewed By: albanD Differential Revision: D22837455 Pulled By: mruberry fbshipit-source-id: 509ecabfa63b93737996f48a58c7188b005b7217	2020-08-01 01:55:12 -07:00
Mike Ruberry	12cd083fd7	Updates torch.tensor, torch.as_tensor, and sparse ctors to use the device of inputs tensors they're given, by default (#41984 ) Summary: BC-Breaking Note This PR changes the behavior of the torch.tensor, torch.as_tensor, and sparse constructors. When given a tensor as input and a device is not explicitly specified, these constructors now always infer their device from the tensor. Historically, if the optional dtype kwarg was provided then these constructors would not infer their device from tensor inputs. Additionally, for the sparse ctor a runtime error is now thrown if the indices and values tensors are on different devices and the device kwarg is not specified. PR Summary This PR's functional change is a single line: ``` auto device = device_opt.has_value() ? device_opt : (type_inference ? var.device() : at::Device(computeDeviceType(dispatch_key))); ``` => ``` auto device = device_opt.has_value() ? device_opt : var.device(); ``` in `internal_new_from_data`. This line entangled whether the function was performing type inference with whether it inferred its device from an input tensor, and in practice meant that ``` t = torch.tensor((1, 2, 3), device='cuda') torch.tensor(t, dtype=torch.float64) ``` would return a tensor on the CPU, not the default CUDA device, while ``` t = torch.tensor((1, 2, 3), device='cuda') torch.tensor(t) ``` would return a tensor on the device of `t`! This behavior is niche and odd, but came up while aocsa was fixing https://github.com/pytorch/pytorch/issues/40648. An additional side affect of this change is that the indices and values tensors given to a sparse constructor must be on the same device, or the sparse ctor must specify the dtype kwarg. The tests in test_sparse.py have been updated to reflect this behavior. Pull Request resolved: https://github.com/pytorch/pytorch/pull/41984 Reviewed By: ngimel Differential Revision: D22721426 Pulled By: mruberry fbshipit-source-id: 909645124837fcdf3d339d7db539367209eccd48	2020-07-25 02:49:45 -07:00
Mike Ruberry	13120bf677	Updates assertEqual to require atol and rtol, removes positional atol (#38872 ) Summary: This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument. In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872 Differential Revision: D21740237 Pulled By: mruberry fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042	2020-05-27 06:31:07 -07:00
Rohan Varma	63e545e0fe	Revert D21717199: [pytorch][PR] Updates assertEqual to require atol and rtol, removes positional atol Test Plan: revert-hammer Differential Revision: D21717199 Original commit changeset: 9feb856f94ee fbshipit-source-id: bfde9c39a5ce99f0ca6183a7dde703c65b7c8259	2020-05-26 18:23:59 -07:00
Mike Ruberry	6ddca30b2d	Updates assertEqual to require atol and rtol, removes positional atol (#38872 ) Summary: This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument. In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too. Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872 Differential Revision: D21717199 Pulled By: mruberry fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a	2020-05-26 08:30:23 -07:00
Pearu Peterson	48c0331e01	Sparse softmax support (CPU) (#36305 ) Summary: This PR implements softmax support for sparse tensors. The sparse softmax is related to dense softmax when the values of unspecified sparse tensor entries are taken to be `-inf` that will have the effect of "zero entries ignored". This relation is used for testing the correctness of results here. Resolves https://github.com/pytorch/pytorch/issues/23651 for CPU. - [x] sparse softmax - [x] CPU C++ implementation - [x] unittests - [x] update softmax documentation - [x] autograd support - [x] sparse log_softmax - [x] CPU C++ implementation - [x] unittests - [x] update log_softmax documentation - [x] autograd support Pull Request resolved: https://github.com/pytorch/pytorch/pull/36305 Differential Revision: D21566540 Pulled By: ezyang fbshipit-source-id: a632ea69c38622f960721482e442efeb8d0a54fc	2020-05-14 08:08:40 -07:00
Hong Xu	336e1ec592	Clean up error handling in is_nonzero and where in TensorCompare.cpp (#38150 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/38150 Differential Revision: D21539736 Pulled By: ezyang fbshipit-source-id: e390c12f5948192a552d66dcd1bb89b2cb45f170	2020-05-13 20:19:40 -07:00
ashishfarmer	bcdff7eb67	Fix for tests on ROCm (#37616 ) Summary: This pull request fixes and re-enables two of the tests disabled in https://github.com/pytorch/pytorch/issues/37427 1. `test_sparse_add_out_bfloat16` in test_sparse.py fixed to use updated `atol` argument instead of `prec` for `assertEqual` 2. The conversion of `flt_min` to `int64` is divergent on HIP compared to numpy. The change removes that conversion from the `test_float_to_int_conversion_finite` test case in test_torch.py cc: ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37616 Differential Revision: D21379876 Pulled By: ezyang fbshipit-source-id: 2bfb41d67874383a01330c5d540ee516b3b07dcc	2020-05-04 07:16:54 -07:00
Peter Bell	675b3fc834	Prevent unbounded growth of sparse tensor in add operation (#36030 ) Summary: Fixes https://github.com/pytorch/pytorch/issues/34964 Sparse cuda add was implemented by just concatenating the indices and values for the tensor. If called repeatedly in a tight loop this will let `nnz` grow unbounded. In the worst case of `x.add_(x)` it grows exponentially. Pull Request resolved: https://github.com/pytorch/pytorch/pull/36030 Differential Revision: D20873504 Pulled By: zou3519 fbshipit-source-id: d90ed8dda0c89571fb89e358757b5dde299513df	2020-05-01 12:05:15 -07:00
ashishfarmer	bbd2350c99	Disable tests failing on test2 in ROCm CI (#37427 ) Summary: This pull request disables the unit tests that were observed to be failing once `test2` was enabled. These tests will be one by one looked at and fixed at the earliest, but until then disabling them to unblock `test2` The pull request also disables fftPlanDestroy for rocFFT to avoid double-freeing FFT handles cc: ezyang jeffdaily Pull Request resolved: https://github.com/pytorch/pytorch/pull/37427 Differential Revision: D21302909 Pulled By: ezyang fbshipit-source-id: ecadda3778e65b7f4f97e24b932b96b9ce928616	2020-04-29 09:56:28 -07:00

1 2 3 4

197 Commits