pytorch

mirror of https://github.com/zebrajr/pytorch.git synced 2025-12-07 00:21:07 +01:00

Author	SHA1	Message	Date
lezcano	16f30b494c	Make l1_loss composite Fixing the forward AD for `sgn` in the next PR of this stack uncovered a number of issues with the derivatives of `l1_loss`. Upon inspection, `l1_loss` was just implemented as a composite function, but it was not differentiable. This PR makes it a fully differentiable function. As a side note, `l1_loss_out` was incorrect in a number of ways. Even more, it is not exposed to the public as `F.l1_loss` does not accept an `out=` parameter. As such it is not even tested. I wonder how useful is to have `out=` variants for loss functions if we don't expose them at all. Even more, I wonder how useful is to have `_out` variants for loss functions, given that their most normal use case is to return just a real number cc jbschlosser Pull Request resolved: https://github.com/pytorch/pytorch/pull/79804 Approved by: https://github.com/zou3519, https://github.com/malfet	2022-06-20 19:10:54 +00:00
PyTorch MergeBot	d4a9438786	Revert "Make l1_loss composite" This reverts commit `61a5c779bf`. Reverted https://github.com/pytorch/pytorch/pull/78257 on behalf of https://github.com/malfet due to This breaks executorch	2022-06-17 18:14:21 +00:00
Kshiteej K	04b98df87a	[fix] composite compliance: eig, eigh, symeig (#79698 ) Ref: https://github.com/pytorch/pytorch/issues/69991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698 Approved by: https://github.com/Lezcano, https://github.com/albanD	2022-06-17 14:13:04 +00:00
PyTorch MergeBot	ee6ebfc06b	Revert "Enable `dim=None` for `torch.sum` (#75845 )" This reverts commit `e79a51f7db`. Reverted https://github.com/pytorch/pytorch/pull/75845 on behalf of https://github.com/malfet due to Breaks MacOS builds, see `e79a51f7db`	2022-06-16 22:01:41 +00:00
Kurt Mohler	e79a51f7db	Enable `dim=None` for `torch.sum` (#75845 ) Part of #29137 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75845 Approved by: https://github.com/ezyang	2022-06-16 20:17:07 +00:00
lezcano	61a5c779bf	Make l1_loss composite Fixing the forward AD for `sgn` in the next PR of this stack uncovered a number of issues with the derivatives of `l1_loss`. Upon inspection, `l1_loss` was just implemented as a composite function, but it was not differentiable. This PR makes it a fully differentiable function. As a side note, `l1_loss_out` was incorrect in a number of ways. Even more, it is not exposed to the public as `F.l1_loss` does not accept an `out=` parameter. As such it is not even tested. I wonder how useful is to have `out=` variants for loss functions if we don't expose them at all. Even more, I wonder how useful is to have `_out` variants for loss functions, given that their most normal use case is to return just a real number cc jbschlosser Pull Request resolved: https://github.com/pytorch/pytorch/pull/78257 Approved by: https://github.com/jbschlosser	2022-06-16 00:03:22 +00:00
Michael Suo	30fb2c4aba	[lint] autoformat test/cpp and torch/csrc Let's have some fun. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828 Approved by: https://github.com/ezyang	2022-06-11 21:11:16 +00:00
lezcano	54949a5abc	Simplify and optimize linalg.solve This PR heavily simplifies the code of `linalg.solve`. At the same time, this implementation saves quite a few copies of the input data in some cases (e.g. A is contiguous) We also implement it in such a way that the derivative goes from computing two LU decompositions and two LU solves to no LU decompositions and one LU solves. It also avoids a number of unnecessary copies the derivative was unnecessarily performing (at least the copy of two matrices). On top of this, we add a `left` kw-only arg that allows the user to solve `XA = B` rather concisely. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046 Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry	2022-06-11 04:06:40 +00:00
PyTorch MergeBot	3556457dd2	Revert "`kl_div`: fix for grads wrt `target`, double backward, forward-over-reverse AD support. (#79007 )" This reverts commit `72ad222cff`. Reverted https://github.com/pytorch/pytorch/pull/79007 on behalf of https://github.com/janeyx99 due to Broke test_fn_fwgrad_bwgrad_nn_functional_kl_div_cpu_float64 on trunk https://hud.pytorch.org/minihud?name_filter=pull%20/%20linux-xenial-py3.7-clang7-asan%20/%20test%20(default,%202,%205,%20linux.2xlarge)	2022-06-09 13:07:03 +00:00
Nikita Vedeneev	72ad222cff	`kl_div`: fix for grads wrt `target`, double backward, forward-over-reverse AD support. (#79007 ) Fixes https://github.com/pytorch/pytorch/issues/78867, fixes https://github.com/pytorch/pytorch/issues/65466. Adds forward-over-reverse AD support. Pull Request resolved: https://github.com/pytorch/pytorch/pull/79007 Approved by: https://github.com/soulitzer, https://github.com/jbschlosser	2022-06-09 09:06:52 +00:00
lezcano	c7d6cec078	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system solving two triangular systems. We also update the heuristics for this function, as they were fairly updated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634 Approved by: https://github.com/malfet	2022-06-07 22:28:28 +00:00
Nikita Vedeneev	a4509f5b72	More forward-over-reverse implementations. (#78740 ) Umbrella issue: https://github.com/pytorch/pytorch/issues/75432. This one implements forward-over-reverse for: * mse_loss * l1_loss * smooth_l1_loss * softplus * hardswish (also adds double backward support) * prelu Pull Request resolved: https://github.com/pytorch/pytorch/pull/78740 Approved by: https://github.com/soulitzer	2022-06-03 15:44:06 +00:00
Brian Hirsh	5cc258ec9e	make block_diag composite compliant Pull Request resolved: https://github.com/pytorch/pytorch/pull/77716 Approved by: https://github.com/zou3519	2022-05-26 16:15:42 +00:00
Nikita Vedeneev	3924d56fae	`BCE` loss: forward-over-reverse AD support (#77852 ) Umbrella issue: https://github.com/pytorch/pytorch/issues/75432 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77852 Approved by: https://github.com/soulitzer	2022-05-26 14:36:52 +00:00
Brian Hirsh	07e4533403	reland of as_strided support for functionalization; introduce as_strided_scatter This reverts commit `a95f1edd85`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199 Approved by: https://github.com/ezyang	2022-05-24 22:40:44 +00:00
PyTorch MergeBot	a95f1edd85	Revert "as_strided support for functionalization; introduce as_strided_scatter" This reverts commit `3a921f2d26`. Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo due to This broke rocm tests on master `3a921f2d26`. rocm tests are no longer run on PRs, you should add a `ciflow/trunk` label if you want to run them	2022-05-24 20:19:12 +00:00
Brian Hirsh	3a921f2d26	as_strided support for functionalization; introduce as_strided_scatter Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128 Approved by: https://github.com/ezyang	2022-05-24 18:20:31 +00:00
lezcano	0c8c39fa71	Fix derivatives of norm(p=inf) Following up on https://github.com/pytorch/pytorch/pull/51099#discussion_r583323915, we fix these derivatives, as they were incorrect until now. As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be. Fixes https://github.com/pytorch/pytorch/issues/67517 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78105 Approved by: https://github.com/ngimel	2022-05-24 17:16:16 +00:00
lezcano	e0295f55b5	Fix derivatives for linalg.vector_norm(..., dtype=) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551 Approved by: https://github.com/albanD	2022-05-19 21:17:18 +00:00
PyTorch MergeBot	7a4e3f329f	Revert "Fix derivatives for linalg.vector_norm(..., dtype=)" This reverts commit `13d8fb93bb`. Reverted https://github.com/pytorch/pytorch/pull/76551 on behalf of https://github.com/seemethere due to Reverting the entire stack, errors originated from * https://github.com/pytorch/pytorch/pull/76547 Failed internal builds due to ([Link for Meta Employees](https://www.internalfb.com/diff/D36494019?selected_signal=c2FuZGNhc3RsZV93b3JrZmxvd19ydW46MTgwMTQzOTg1MTUzNTQ3NzQ%3D&selected_signal_verification_phase=1&dst_version_fbid=1211273672948052)): ``` aten/src/ATen/native/LinearAlgebra.cpp:2496:9: error: unused type alias 'Int' [-Werror,-Wunused-local-typedef] using Int = IntArrayRef::value_type; ^ 1 error generated. Command failed with exit code 1. ```	2022-05-19 21:04:23 +00:00
Nikita Vedeneev	7945fa6ce2	`BCE` loss: forward ad support (#77755 ) As per title + BCE with logits gets a simpler implementation. Relevant for https://github.com/pytorch/pytorch/issues/71117 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77755 Approved by: https://github.com/soulitzer	2022-05-19 13:13:58 +00:00
lezcano	13d8fb93bb	Fix derivatives for linalg.vector_norm(..., dtype=) As per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551 Approved by: https://github.com/mruberry	2022-05-18 11:46:50 +00:00
Nikita Vedeneev	a760dc2687	`binary_cross_entropy`: double backwart wrt target (#77416 ) As per title. An effort to make `binary_cross_entropy` all around differentiable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77416 Approved by: https://github.com/soulitzer	2022-05-18 10:29:27 +00:00
lezcano	369d9f4137	A few forward AD formulas It includes all-time favourites like: - `put` - `nn.functional.embedding` - `prelu` - `nn.functional.bilinear` - `nn.functional.rrelu` - `nn.functional.logsigmoid` Pull Request resolved: https://github.com/pytorch/pytorch/pull/77421 Approved by: https://github.com/soulitzer	2022-05-17 15:55:51 +00:00
Mikayla Gawarecki	7ba4e124e6	Bugfix gradient formula for index_reduce('prod') + separate out sample_inputs for index_reduce Pull Request resolved: https://github.com/pytorch/pytorch/pull/77382 Approved by: https://github.com/cpuhrsch	2022-05-16 18:43:57 +00:00
Mikayla Gawarecki	841c65f499	Unprivate _index_reduce and add documentation Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997 Approved by: https://github.com/cpuhrsch	2022-05-13 19:48:38 +00:00
jiayisun	97deda4f28	add BFloat16 support for logcumsumexp on CPU (#72694 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/72694 Approved by: https://github.com/VitalyFedyunin, https://github.com/frank-wei	2022-05-12 17:10:28 +00:00
Ivan Yashchuk	545d90f032	Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm This PR updates the derivative rule for `torch.sparse.addmm` to be working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is used in the backward function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591 Approved by: https://github.com/cpuhrsch	2022-05-11 18:57:40 +00:00
PyTorch MergeBot	f94abd59f7	Revert "Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm" This reverts commit `721a8ca697`. Reverted https://github.com/pytorch/pytorch/pull/76591 on behalf of https://github.com/janeyx99	2022-05-10 13:21:46 +00:00
Ivan Yashchuk	721a8ca697	Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm This PR updates the derivative rule for `torch.sparse.addmm` to be working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is used in the backward function. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591 Approved by: https://github.com/cpuhrsch	2022-05-10 08:44:55 +00:00
PyTorch MergeBot	4ebc4890dd	Revert "Add linalg.lu_solve" This reverts commit `fc5b4a5a33`. Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet	2022-05-09 19:12:30 +00:00
Mikayla Gawarecki	465e0ae266	Bugfix scatter_reduce backward formulas Pull Request resolved: https://github.com/pytorch/pytorch/pull/76523 Approved by: https://github.com/albanD	2022-05-05 20:22:39 +00:00
lezcano	fc5b4a5a33	Add linalg.lu_solve This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA when calling the batched MAGMA backend with trans=True. We work around that by solving the system solving two triangular systems. We also update the heuristics for this function, as they were fairly updated. We found that cuSolver is king, so luckily we do not need to rely on the buggy backend from magma for this function. We added tests testing this function left and right. We also added tests for the different backends. We also activated the tests for AMD, as those should work as well. Fixes https://github.com/pytorch/pytorch/issues/61657 Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935 Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry	2022-05-05 19:02:13 +00:00
Nikita Vedeneev	33fabe9a2e	`functional.max_unpool`: OpInfo tests + simpler backward + forward ad + fwad over backward ad Resolves https://github.com/pytorch/pytorch/issues/67657, https://github.com/pytorch/pytorch/issues/67658, https://github.com/pytorch/pytorch/issues/67660. These are not necessarily bugs because we cannot produce arbitrary samples coming from `max_pool` to the gradcheck's eternal satisfaction. This PR also replaces low-level complicated backward kernels with much simpler high-level and well-tested counterparts. The replacement is also faster (before: parallel for loop, after: memory layout optimized TensorIterator's parallelization coming from `gather`). cc @albanD @mruberry @jbschlosser @walterddr Pull Request resolved: https://github.com/pytorch/pytorch/pull/68625 Approved by: https://github.com/albanD	2022-05-05 10:13:51 +00:00
lezcano	7cb7cd5802	Add linalg.lu This PR modifies `lu_unpack` by: - Using less memory when unpacking `L` and `U` - Fuse the subtraction by `-1` with `unpack_pivots_stub` - Define tensors of the correct types to avoid copies - Port `lu_unpack` to be a strucutred kernel so that its `_out` version does not incur on extra copies Then we implement `linalg.lu` as a structured kernel, as we want to compute its derivative manually. We do so because composing the derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient. This new function and `lu_unpack` comes with all the things it can come: forward and backward ad, decent docs, correctness tests, OpInfo, complex support, support for metatensors and support for vmap and vmap over the gradients. I really hope we don't continue adding more features. This PR also avoids saving some of the tensors that were previously saved unnecessarily for the backward in `lu_factor_ex_backward` and `lu_backward` and does some other general improvements here and there to the forward and backward AD formulae of other related functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833 Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry	2022-05-05 09:17:05 +00:00
lezcano	1a4eea57be	Improve derivative of QR decomposition We derive and implement a more concise rule for the forward and backward derivatives of the QR decomposition. While doing this we: - Fix the composite compliance of `linalg.qr` and we make it support batches - Improve the performance and simplify the implementation of both foward and backward - Avoid saving the input matrix for the backward computation. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76115 Approved by: https://github.com/nikitaved, https://github.com/albanD	2022-05-05 09:14:57 +00:00
Richard Zou	71ae190b87	[composite compliance] Fix a bunch of fft backwards Replaced `at::zeros(..., grad.options()).slice().copy_(grad))` with `grad.new_zeros(..., grad.options()).slice().copy_(grad))` Test Plan: - run tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/76573 Approved by: https://github.com/ngimel, https://github.com/albanD	2022-05-03 00:07:30 +00:00
Mikayla Gawarecki	676a4a3969	Prototype _index_reduce (CPU-only) Pull Request resolved: https://github.com/pytorch/pytorch/pull/75981 Approved by: https://github.com/cpuhrsch	2022-04-27 23:01:00 +00:00
Richard Zou	9cb2871f31	Fix forward-mode AD formula for binary_cross_entropy_with_logits The problem was that `grad_input` and `grad_target` may be ZeroTensors, which are immutable. This PR changes it so that operations on grad_input and grad_target in `binary_cross_entropy_with_logits_jvp` are no longer in-place. Test Plan: - run existing tests Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/76322 Approved by: https://github.com/soulitzer	2022-04-25 22:30:57 +00:00
lezcano	441aea4127	Update Choesky's forward and backward derivative This PR: - Derives formally a new rule for Cholesky (write-up to come) - Implements it without using in-place operations in the forward or backward. - Does not instantiate inverses explicitly, but rather it solves two triangular systems of equations (2 triang vs 1 triang and 2 matmuls should be comparable, but the first one should be more stable). Pull Request resolved: https://github.com/pytorch/pytorch/pull/76032 Approved by: https://github.com/nikitaved, https://github.com/albanD	2022-04-22 00:45:38 +00:00
Nikita Shulga	f6c275f55d	Remove `-Wno-unused-variable` from `utils.cmake` (take 2) (#75538 ) Summary: [Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there. Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block. Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538 Reviewed By: anjali411 Differential Revision: D35747333 Pulled By: malfet fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626 (cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)	2022-04-20 17:41:59 +00:00
Ivan Yashchuk	bba4780232	Enable autograd wrt sparse CSR tensors This pull request enables accumulating gradients for the CSR tensor. Functions that work and are tested: - tensor.abs() - tensor.neg() - tensor.conj_physical() - torch.addmm `torch.mm` also works, but tests will be added later. In addition, this PR adds throwing an error when trying to access strides, storage, and contiguity info on a CSR tensor. `tensor.to_sparse_csr().to_sparse_csr()` was failing and now fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/75435 Approved by: https://github.com/cpuhrsch	2022-04-19 18:42:45 +00:00
PyTorch MergeBot	5c56b2286b	Revert "Remove `-Wno-unused-variable` from utils.cmake" This reverts commit `018cbe1f5c`. Reverted https://github.com/pytorch/pytorch/pull/75538 on behalf of https://github.com/seemethere	2022-04-19 17:19:09 +00:00
Nikita Shulga	018cbe1f5c	Remove `-Wno-unused-variable` from utils.cmake [Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there. Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block. Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787 Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538 Approved by: https://github.com/cpuhrsch	2022-04-19 15:26:55 +00:00
Peter Bell	cc56fac213	Fix complex to real casting warning in _to_copy backward Fixes #75781 A Real->Complex cast should result in a gradient with no imaginary component, so discarding the imaginary component is expected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/75805 Approved by: https://github.com/albanD	2022-04-19 14:04:13 +00:00
soulitzer	8721abc429	Add forward AD support for norm, dist, F.pairwise_dist, F.normalize Pull Request resolved: https://github.com/pytorch/pytorch/pull/74205 Approved by: https://github.com/albanD	2022-04-13 15:03:20 +00:00
soulitzer	76614b3a33	Test linalg vector norm subgradient Pull Request resolved: https://github.com/pytorch/pytorch/pull/75103 Approved by: https://github.com/albanD	2022-04-12 20:54:30 +00:00
anjali411	91d134093e	Add fastpath for stack and cat JVP computation Pull Request resolved: https://github.com/pytorch/pytorch/pull/75590 Approved by: https://github.com/albanD, https://github.com/soulitzer	2022-04-11 18:10:09 +00:00
soulitzer	b10d151745	Ensure convolution_backward respects output_mask Pull Request resolved: https://github.com/pytorch/pytorch/pull/75298 Approved by: https://github.com/albanD	2022-04-08 19:27:41 +00:00
Mikayla Gawarecki	e9a8e6f74a	Add include_self flag to scatter_reduce Pull Request resolved: https://github.com/pytorch/pytorch/pull/74607 Approved by: https://github.com/cpuhrsch	2022-04-05 16:31:39 +00:00

1 2 3 4 5

244 Commits