Commit Graph

219 Commits

Author SHA1 Message Date
Mikayla Gawarecki
841c65f499 Unprivate _index_reduce and add documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997

Approved by: https://github.com/cpuhrsch
2022-05-13 19:48:38 +00:00
jiayisun
97deda4f28 add BFloat16 support for logcumsumexp on CPU (#72694)
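
A minimal, illustrative sketch of the new dtype support (assuming the standard `torch.logcumsumexp` API; not part of the commit itself):

```python
import torch

# BFloat16 input on CPU; this dtype was previously unsupported for logcumsumexp on CPU.
x = torch.randn(8, dtype=torch.bfloat16)
out = torch.logcumsumexp(x, dim=0)  # cumulative log-sum-exp along dim 0
print(out.dtype)  # torch.bfloat16
```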

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72694
Approved by: https://github.com/VitalyFedyunin, https://github.com/frank-wei
2022-05-12 17:10:28 +00:00
Ivan Yashchuk
545d90f032 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` so that it
works with CSR sparse matrices. Notably, `torch.sparse.sampled_addmm` is
used in the backward function.
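
A rough sketch of what this enables (illustrative only; assumes CSR tensors can carry `requires_grad`, as enabled by an earlier PR in this log):

```python
import torch

a = torch.eye(3, dtype=torch.float64).to_sparse_csr().requires_grad_(True)  # sparse CSR operand
b = torch.randn(3, 3, dtype=torch.float64)
c = torch.randn(3, 3, dtype=torch.float64)

out = torch.sparse.addmm(c, a, b)   # c + a @ b with a sparse CSR matrix
out.sum().backward()                # backward goes through torch.sparse.sampled_addmm
print(a.grad.shape)                 # gradient w.r.t. the sparse operand
```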

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-11 18:57:40 +00:00
PyTorch MergeBot
f94abd59f7 Revert "Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm"
This reverts commit 721a8ca697.

Reverted https://github.com/pytorch/pytorch/pull/76591 on behalf of https://github.com/janeyx99
2022-05-10 13:21:46 +00:00
Ivan Yashchuk
721a8ca697 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` so that it
works with CSR sparse matrices. Notably, `torch.sparse.sampled_addmm` is
used in the backward function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-10 08:44:55 +00:00
PyTorch MergeBot
4ebc4890dd Revert "Add linalg.lu_solve"
This reverts commit fc5b4a5a33.

Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet
2022-05-09 19:12:30 +00:00
Mikayla Gawarecki
465e0ae266 Bugfix scatter_reduce backward formulas
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76523

Approved by: https://github.com/albanD
2022-05-05 20:22:39 +00:00
lezcano
fc5b4a5a33 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with `trans=True`. We work around
it by solving the system via two triangular systems.

We also update the heuristics for this function, as they were fairly
outdated. We found that cuSOLVER is king, so luckily we do not need to
rely on the buggy MAGMA backend for this function.

We added tests exercising this function left and right, as well as tests
for the different backends. We also enabled the tests on AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657
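
A hedged usage sketch (assuming the public `torch.linalg.lu_factor` / `torch.linalg.lu_solve` APIs):

```python
import torch

A = torch.randn(3, 3, dtype=torch.float64)
B = torch.randn(3, 2, dtype=torch.float64)

LU, pivots = torch.linalg.lu_factor(A)     # factor A once
X = torch.linalg.lu_solve(LU, pivots, B)   # reuse the factorization to solve A X = B
print(torch.allclose(A @ X, B))
```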

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935

Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-05-05 19:02:13 +00:00
Nikita Vedeneev
33fabe9a2e functional.max_unpool: OpInfo tests + simpler backward + forward ad + fwad over backward ad
Resolves https://github.com/pytorch/pytorch/issues/67657, https://github.com/pytorch/pytorch/issues/67658, https://github.com/pytorch/pytorch/issues/67660.

These are not necessarily bugs, because we cannot produce arbitrary samples coming from `max_pool` to gradcheck's eternal satisfaction.

This PR also replaces low-level, complicated backward kernels with much simpler, high-level, and well-tested counterparts. The replacement is also faster (before: a parallel for loop; after: memory-layout-optimized TensorIterator parallelization coming from `gather`).
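
An illustrative round-trip through the affected ops (assuming the standard `torch.nn.functional` API):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 4, 4, requires_grad=True)
pooled, indices = F.max_pool2d(x, kernel_size=2, return_indices=True)
unpooled = F.max_unpool2d(pooled, indices, kernel_size=2)  # backward now routes through gather-based kernels
unpooled.sum().backward()
print(x.grad.shape)  # torch.Size([1, 1, 4, 4])
```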

cc @albanD @mruberry @jbschlosser @walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68625
Approved by: https://github.com/albanD
2022-05-05 10:13:51 +00:00
lezcano
7cb7cd5802 Add linalg.lu
This PR modifies `lu_unpack` by:
- Using less memory when unpacking `L` and `U`
- Fusing the subtraction by `-1` with `unpack_pivots_stub`
- Defining tensors of the correct types to avoid copies
- Porting `lu_unpack` to a structured kernel so that its `_out` version
does not incur extra copies

Then we implement `linalg.lu` as a structured kernel, as we want to
compute its derivative manually. We do so because composing the
derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient.

This new function and `lu_unpack` come with everything they can come with:
forward and backward AD, decent docs, correctness tests, OpInfos, complex support,
support for meta tensors, and support for vmap and vmap over the gradients.

I really hope we don't continue adding more features.

This PR also avoids saving some of the tensors that were previously
saved unnecessarily for the backward in `lu_factor_ex_backward` and
`lu_backward`, and makes some other general improvements here and there
to the forward and backward AD formulae of other related functions.
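
A small usage sketch (assuming `torch.linalg.lu` returns the `(P, L, U)` triple described here):

```python
import torch

A = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
P, L, U = torch.linalg.lu(A)         # partial-pivoting LU factorization
print(torch.allclose(P @ L @ U, A))  # reconstruction check
(L.sum() + U.sum()).backward()       # exercises the manually implemented derivative
print(A.grad.shape)
```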

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833

Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry
2022-05-05 09:17:05 +00:00
lezcano
1a4eea57be Improve derivative of QR decomposition
We derive and implement a more concise rule for the forward and backward
derivatives of the QR decomposition. While doing this we:
- Fix the composite compliance of `linalg.qr` and make it support batches
- Improve the performance and simplify the implementation of both forward and backward
- Avoid saving the input matrix for the backward computation.
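
An illustrative sketch exercising the differentiated op (standard `torch.linalg.qr` API assumed):

```python
import torch

A = torch.randn(2, 4, 3, dtype=torch.float64, requires_grad=True)  # batched, tall input
Q, R = torch.linalg.qr(A, mode='reduced')
(Q.sum() + R.sum()).backward()   # exercises the new backward rule
print(A.grad.shape)
```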

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76115

Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-05-05 09:14:57 +00:00
Richard Zou
71ae190b87 [composite compliance] Fix a bunch of fft backwards
Replaced `at::zeros(..., grad.options()).slice().copy_(grad)`
with `grad.new_zeros(..., grad.options()).slice().copy_(grad)`

Test Plan:
- run tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76573

Approved by: https://github.com/ngimel, https://github.com/albanD
2022-05-03 00:07:30 +00:00
Mikayla Gawarecki
676a4a3969 Prototype _index_reduce (CPU-only)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75981

Approved by: https://github.com/cpuhrsch
2022-04-27 23:01:00 +00:00
Richard Zou
9cb2871f31 Fix forward-mode AD formula for binary_cross_entropy_with_logits
The problem was that `grad_input` and `grad_target` may be ZeroTensors,
which are immutable. This PR changes it so that operations on `grad_input`
and `grad_target` in `binary_cross_entropy_with_logits_jvp` are no longer
in-place.
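
A hedged sketch of the failing case this fixes (assuming the `torch.autograd.forward_ad` API; here only the logits carry a tangent, so the target's tangent is a ZeroTensor):

```python
import torch
import torch.autograd.forward_ad as fwAD
import torch.nn.functional as F

logits = torch.randn(4)
target = torch.rand(4)
tangent = torch.randn(4)   # tangent only for logits

with fwAD.dual_level():
    dual_logits = fwAD.make_dual(logits, tangent)
    loss = F.binary_cross_entropy_with_logits(dual_logits, target)
    value, jvp = fwAD.unpack_dual(loss)
print(value, jvp)
```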

Test Plan:
- run existing tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76322
Approved by: https://github.com/soulitzer
2022-04-25 22:30:57 +00:00
lezcano
441aea4127 Update Cholesky's forward and backward derivative
This PR:
- Formally derives a new rule for Cholesky (write-up to come)
- Implements it without using in-place operations in the forward or backward.
- Does not instantiate inverses explicitly, but rather solves two triangular systems of equations (2 triangular solves vs. 1 triangular solve and 2 matmuls should be comparable in cost, but the former should be more stable).
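
A short, illustrative sketch exercising the updated rule (standard `torch.linalg.cholesky` API assumed):

```python
import torch

A = torch.randn(3, 3, dtype=torch.float64)
A = A @ A.T + 3 * torch.eye(3, dtype=torch.float64)   # symmetric positive definite
A.requires_grad_(True)

L = torch.linalg.cholesky(A)
L.sum().backward()            # exercises the updated backward rule
print(A.grad.shape)
```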

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76032

Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-04-22 00:45:38 +00:00
Nikita Shulga
f6c275f55d Remove -Wno-unused-variable from utils.cmake (take 2) (#75538)
Summary:
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it was added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Fix violations in 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when the local variable is likely used to extend an object's lifetime until the end of the block.

This caused a preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538

Reviewed By: anjali411

Differential Revision: D35747333

Pulled By: malfet

fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
2022-04-20 17:41:59 +00:00
Ivan Yashchuk
bba4780232 Enable autograd wrt sparse CSR tensors
This pull request enables accumulating gradients for the CSR tensor.
Functions that work and are tested:
- tensor.abs()
- tensor.neg()
- tensor.conj_physical()
- torch.addmm

`torch.mm` also works, but tests will be added later.

In addition, this PR makes accessing strides, storage, and contiguity info on a CSR tensor throw an error.

`tensor.to_sparse_csr().to_sparse_csr()` was failing and is now fixed.
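
A rough sketch of the new behaviour (illustrative only; uses `torch.addmm`, one of the ops listed above):

```python
import torch

dense = torch.tensor([[1., 0.], [0., 2.]])
csr = dense.to_sparse_csr().requires_grad_(True)   # CSR tensors can now require grad
b = torch.randn(2, 2)
c = torch.randn(2, 2)

out = torch.addmm(c, csr, b)    # dense output of c + csr @ b
out.sum().backward()            # gradients accumulate for the CSR tensor
print(csr.grad.shape)
```
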
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75435
Approved by: https://github.com/cpuhrsch
2022-04-19 18:42:45 +00:00
PyTorch MergeBot
5c56b2286b Revert "Remove -Wno-unused-variable from utils.cmake"
This reverts commit 018cbe1f5c.

Reverted https://github.com/pytorch/pytorch/pull/75538 on behalf of https://github.com/seemethere
2022-04-19 17:19:09 +00:00
Nikita Shulga
018cbe1f5c Remove -Wno-unused-variable from utils.cmake
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims it was added for consistency with the top-level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Fix violations in 50+ files that were added in the interim, either by removing unused variables or by decorating the code with `C10_UNUSED` when the local variable is likely used to extend an object's lifetime until the end of the block.

This caused a preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Approved by: https://github.com/cpuhrsch
2022-04-19 15:26:55 +00:00
Peter Bell
cc56fac213 Fix complex to real casting warning in _to_copy backward
Fixes #75781

A Real->Complex cast should result in a gradient with no imaginary
component, so discarding the imaginary component is expected.
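
An illustrative reproduction of the scenario (standard `torch` API; the backward cast from the complex grad to a real grad should no longer warn):

```python
import torch

x = torch.randn(3, requires_grad=True)          # real leaf
w = torch.tensor([1 + 1j, 2 + 0j, 0 + 3j])      # complex weights
y = x.to(torch.complex64)                       # Real -> Complex cast (_to_copy)
(y * w).real.sum().backward()                   # grad is cast back to real, discarding the imaginary part
print(x.grad)
```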

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75805

Approved by: https://github.com/albanD
2022-04-19 14:04:13 +00:00
soulitzer
8721abc429 Add forward AD support for norm, dist, F.pairwise_dist, F.normalize
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74205

Approved by: https://github.com/albanD
2022-04-13 15:03:20 +00:00
soulitzer
76614b3a33 Test linalg vector norm subgradient
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75103

Approved by: https://github.com/albanD
2022-04-12 20:54:30 +00:00
anjali411
91d134093e Add fastpath for stack and cat JVP computation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75590

Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-04-11 18:10:09 +00:00
soulitzer
b10d151745 Ensure convolution_backward respects output_mask
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75298

Approved by: https://github.com/albanD
2022-04-08 19:27:41 +00:00
Mikayla Gawarecki
e9a8e6f74a Add include_self flag to scatter_reduce
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74607

Approved by: https://github.com/cpuhrsch
2022-04-05 16:31:39 +00:00
Nikita Vedeneev
5b142ce5ce cholesky_inverse: complex autograd, forward AD and correct tests.
As per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75033
Approved by: https://github.com/soulitzer
2022-04-01 20:31:03 +00:00
Mikayla Gawarecki
2bfa018462 [BC-breaking] Use ScatterGatherKernel for scatter_reduce (CPU-only) (#74226)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74226

Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)`

- Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py`
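
A hedged usage sketch of the updated signature (reduce options follow the current API; the exact set at this commit may differ):

```python
import torch

out = torch.zeros(3)
index = torch.tensor([0, 1, 0, 2])
src = torch.tensor([1., 2., 3., 4.])

# Updated signature: Tensor.scatter_reduce_(dim, index, src, reduce)
out.scatter_reduce_(0, index, src, reduce="sum")
print(out)   # tensor([4., 2., 4.]) with the default include_self behaviour
```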

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D35222842

Pulled By: mikaylagawarecki

fbshipit-source-id: 84930add2ad30baf872c495251373313cb7428bd
(cherry picked from commit 1b45139482e22eb0dc8b6aec2a7b25a4b58e31df)
2022-04-01 05:57:45 +00:00
Kurt Mohler
5375b2e994 Resolve int[]? arguments to new OptionalIntArrayRef class
This PR uses the `OptionalArrayRef` template class that was drafted in #64084.

Fixes #44409
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70864
Approved by: https://github.com/ezyang
2022-03-26 01:45:50 +00:00
soulitzer
a4c81b13f3 Add forward AD support for clamp when bounds are tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74042

Approved by: https://github.com/albanD
2022-03-24 14:31:40 +00:00
soulitzer
de73f9a558 Add forward AD support for logsumexp, log_softmax, softmax, nll_loss, and cross_entropy (#73741)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73741

There are probably more perf improvements that can be made, for example reusing more quantities from the forward pass or doing more things in-place, but in the spirit of improving coverage this is probably OK for now.

Note: I didn't do anything with `half_to_float`, but CUDA (locally) hasn't complained yet
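
A minimal sketch of the new forward-AD coverage (assuming the `torch.autograd.forward_ad` dual-number API):

```python
import torch
import torch.autograd.forward_ad as fwAD
import torch.nn.functional as F

x = torch.randn(2, 5)
t = torch.randn(2, 5)          # tangent (directional perturbation)

with fwAD.dual_level():
    dual = fwAD.make_dual(x, t)
    out = F.log_softmax(dual, dim=-1)      # forward AD now flows through log_softmax
    primal, tangent_out = fwAD.unpack_dual(out)
print(tangent_out.shape)
```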

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34690141

Pulled By: soulitzer

fbshipit-source-id: fe934e191fee2c8e956d7a5f4b553923adf1b33f
(cherry picked from commit ae49aff7f7c8496e04a3ce7667d8f068ca0a52ec)
2022-03-08 00:46:27 +00:00
soulitzer
e6afa4f771 batch_norm_jvp: improve error message when running_{mean,var} have forward grad defined (#73655)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73655

Fixes: https://github.com/pytorch/pytorch/issues/73541

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34586758

Pulled By: soulitzer

fbshipit-source-id: 689dba3ac159e50b596381c27e23ef1fd8122a40
(cherry picked from commit 81ea860fbe3c217b0100730f4b74e8d5f9bf1b61)
2022-03-02 21:31:29 +00:00
Xiao Wang
89b4cfb49f Disable TF32 in some linalg functions (#73460)
Summary:
Disable TF32 in some linalg functions

See also https://github.com/pytorch/pytorch/issues/67948, #50453, and https://github.com/pytorch/pytorch/issues/44240

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73460

Reviewed By: albanD

Differential Revision: D34493487

Pulled By: ngimel

fbshipit-source-id: 958cd968ea09df3b5a4d2b4a26aaf0dfddc53981
(cherry picked from commit cd75ec645b86c4b4a66c35696ce891d006f3833b)
2022-02-28 23:28:52 +00:00
Ansley Ussery
e4214929c5 Port amax to structured kernel (#72124)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72124

Reviewed By: bdhirsh

Differential Revision: D34215708

Pulled By: ansley

fbshipit-source-id: fee887e331cb8bd9fab3d9d958ff13ac8d07be27
(cherry picked from commit 94dbb5b7e7)
2022-02-16 06:33:09 +00:00
Ryan Spring
4f8b986e28 Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
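
For reference, a short usage sketch of the new flag through the functional API (assuming the string-valued `approximate` argument shown above):

```python
import torch
import torch.nn.functional as F

x = torch.randn(4)
y_exact = F.gelu(x)                        # default: exact (erf-based) GELU
y_tanh = F.gelu(x, approximate='tanh')     # tanh approximation added by this PR
```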

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: VitalyFedyunin

Differential Revision: D33894937

Pulled By: jbschlosser

fbshipit-source-id: b65e8fb6ea66168af8f34f45ed50e92737a33851
(cherry picked from commit 6e986f91a9)
2022-02-14 03:40:32 +00:00
lezcano
bf09ece782 Make svd / svdvals fully functorch compatible (#72181)
Summary:
This should (hopefully) make all the CI from `functorch` go green (including jvp's!) after replacing `VARIADIC_BDIMS_BOXED(_svd_helper);` with `VARIADIC_BDIMS_BOXED(_linalg_svd);` and removing all the skips and xfails associated with `linalg.svdvals`.

Locally, there's just one test that started failing because of this, and that is `test_vmapjvpall_norm_nuc_cpu_float32`. I have no idea what's going on here, but it's a jvp product, so not a regression, and it might very well be caused by the jvp of another operation within `norm_nuc`, as this is a composite operation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72181

Reviewed By: ngimel

Differential Revision: D33952744

Pulled By: zou3519

fbshipit-source-id: 2a2510d97eed4a0bfc25615264ddd36e38856efe
(cherry picked from commit 5805fa107c)
2022-02-03 03:21:22 +00:00
Nikita Shulga
74c44ba9d6 Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33850228 (23d03025dc)

Original commit changeset: 3cc33fb298e4

Original Phabricator Diff: D33850228 (23d03025dc)

fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692
(cherry picked from commit c9efb58223)
2022-01-31 17:44:19 +00:00
Ryan Spring
23d03025dc Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: cpuhrsch

Differential Revision: D33850228

Pulled By: jbschlosser

fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33
(cherry picked from commit 3a53b3e94f)
2022-01-31 17:07:45 +00:00
Joel Schlosser
cb823d9f07 Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33744717 (f499ab9cef)

Original commit changeset: d64532a562ed

Original Phabricator Diff: D33744717 (f499ab9cef)

fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93
(cherry picked from commit e9fb2d1db1)
2022-01-28 18:35:01 +00:00
Ryan Spring
f499ab9cef Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: mikaylagawarecki

Differential Revision: D33744717

Pulled By: jbschlosser

fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187
(cherry picked from commit 4713dd9cca)
2022-01-28 16:59:09 +00:00
kshitij12345
de44a50f14 index_backward: use out-of-place index_put if any input is subclass (#71779)
Summary:
Reference: https://github.com/pytorch/functorch/issues/393

Context :

The derivative of `__getitem__`/`index` is
f5a71ec2d6/tools/autograd/derivatives.yaml (L733-L734)

where `index_backward` is defined as
f5a71ec2d6/torch/csrc/autograd/FunctionsManual.cpp (L3892-L3894)

The problem arises when `grad` is not a BatchedTensor but one of the other inputs is. In that case, `grad.new_zeros` returns an unbatched tensor, and the call to the in-place `_index_put_impl_` errors because it expects `zeros_like_self` to be batched.

To avoid this, we dispatch to the out-of-place `index_put` if any of the input tensors is subclassed; otherwise we dispatch to the in-place `_index_put_impl_`.
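
A hypothetical Python rendition of this dispatch logic (the real implementation is C++ in `FunctionsManual.cpp`; the subclass check below is illustrative only):

```python
import torch

def index_backward_sketch(zeros_like_self, indices, grad):
    # Use the out-of-place index_put when any involved tensor is a subclass
    # (e.g. a BatchedTensor); otherwise keep the in-place fast path.
    tensors = [zeros_like_self, grad] + [i for i in indices if isinstance(i, torch.Tensor)]
    if any(type(t) is not torch.Tensor for t in tensors):
        return zeros_like_self.index_put(indices, grad, accumulate=True)   # out-of-place
    return zeros_like_self.index_put_(indices, grad, accumulate=True)      # in-place
```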

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71779

Reviewed By: albanD

Differential Revision: D33790596

Pulled By: zou3519

fbshipit-source-id: 9d6d81b758740cab7b3db9b905f1e8053f82b835
(cherry picked from commit ba0407a86e)
2022-01-28 16:19:34 +00:00
soulitzer
51ae9ccba4 Fix forward AD for cudnn batch norm (#71901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71901

We didn't catch this initially because CuDNN is not being tested on CI.

The following tests fail on master (if we build with CuDNN), but pass with this PR:
- `test_forward_mode_AD_nn_functional_batch_norm_cuda_float64`
- `test_forward_mode_AD_nn_functional_instance_norm_cuda_float64`

I don't think it is documented anywhere, but from the tests passing now I'm going to guess `result1` and `result2` return `mean` and `invstd` respectively. Previously, I thought mean and variance were returned because the variables were named `saved_mean` and `saved_var`.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33818652

Pulled By: soulitzer

fbshipit-source-id: ecee760f5aec620dc70f57de4fb3573c8f2f5f31
(cherry picked from commit 73fd3e021c)
2022-01-27 23:55:37 +00:00
lezcano
8ff1a8fdca Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of `svd_backward`.
More importantly, I also added a lax check in the complex case making sure the loss
function depends only on the subspaces spanned by the pairs of singular
vectors, and not on their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
(cherry picked from commit 391319ed8f)
2022-01-27 18:38:30 +00:00
lezcano
84f1685397 Rewrite svd and linalg.svd as structured kernels (#69827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827

In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).

After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.

This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.

The memory for the auxiliary `U` and `V` in the cases above, needed for some
cuSOLVER routines, is allocated via raw allocators rather than through fully
fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.

Now `linalg.svdvals` works as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...

This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.

This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.

I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751983

Pulled By: mruberry

fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567)
2022-01-27 18:38:30 +00:00
Mikayla Gawarecki
09c417ae65 Add new reduce options and autograd support for scatter_reduce (#71788)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71788

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33778525

Pulled By: cpuhrsch

fbshipit-source-id: 47b8544e29df3075bc6ede894c59499a7ffec876
(cherry picked from commit ddcddac726)
2022-01-27 17:38:50 +00:00
soulitzer
25e84fa4e5 Add forward AD formulas for some losses (#71026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71026

...and fmod

Testing:
- L1Loss: new module tests (linear in the real case only)
- SmoothL1Loss: new module tests
- MSELoss: tested - OpInfo + new module tests
- huberloss: tested - OpInfo + new module tests
- multi-margin-loss: new module tests
- kl-div: OpInfo + new module tests
- fmod: OpInfo
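
A brief, illustrative sketch of the added coverage (assuming the `torch.autograd.forward_ad` API, shown here with `mse_loss`):

```python
import torch
import torch.autograd.forward_ad as fwAD
import torch.nn.functional as F

x = torch.randn(5)
t = torch.randn(5)          # tangent for the input
target = torch.randn(5)

with fwAD.dual_level():
    dual = fwAD.make_dual(x, t)
    loss = F.mse_loss(dual, target)     # forward AD through one of the covered losses
    _, jvp = fwAD.unpack_dual(loss)
print(jvp)
```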

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33485661

Pulled By: soulitzer

fbshipit-source-id: 542ef5148183b9f574d06b2e2e345d0d889537b7
(cherry picked from commit 60765438e8)
2022-01-26 16:31:26 +00:00
lezcano
97585ae1e7 Simplify forward / backward AD for linalg.eigh and add checks (#70528)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70528

This PR adds checks for the backward of `linalg.eigh`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also makes its implementation parallel that of the (fwd/bwd) derivative of
`torch.linalg.eig`, and it makes most OpInfo tests pass.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530149

Pulled By: albanD

fbshipit-source-id: 1f368b8d450d4e9e8ae74d3881c78513c27eb956
2022-01-12 08:35:52 -08:00
lezcano
061be8d600 Correct forward AD for linalg.eig and add checks (#70527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70527

This PR adds checks for the backward of `linalg.eig`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also modifies the function so that it does not save the input matrix,
as it's not necessary.

It also corrects the forward AD formula. Now all
the tests pass for `linalg.eig` and `linalg.eigvals`.

It also updates the docs to reflect better what's going on here.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530148

Pulled By: albanD

fbshipit-source-id: 984521a04f81ecb28ac1c4402b0243c63dd6959d
2022-01-12 08:30:55 -08:00
soulitzer
78994d13c0 Add forward AD formulas for {batch,layer,group}_norm (#70355)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70355

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405362

Pulled By: soulitzer

fbshipit-source-id: 55a92e88a04e7b15a0a223025d66c14f7db2a190
2022-01-10 13:52:16 -08:00
soulitzer
3051aabd0e Add forward AD formulas for convolution and some others (#69956)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69956

Test Plan: Imported from OSS

Reviewed By: albanD, bdhirsh

Differential Revision: D33235974

Pulled By: soulitzer

fbshipit-source-id: ea60d687edc5d62d92f3fd3cb6640421d32c908c
2022-01-06 08:39:51 -08:00
Amir Khojaste
748790588c Upgrading the loop to use irange (#70326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326

See D24145988 for context: it allows loops such as `for(int i = 0; i < 10; i++)` to be expressed as `for(const auto i : c10::irange(10))`. This is nice because it auto-types the loops and adds const-safety to the iteration variable.

Test Plan: buck run //caffe2/torch/fb/sparsenn:test

Reviewed By: r-barnes

Differential Revision: D33243400

fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
2022-01-06 07:06:53 -08:00