Commit Graph

261 Commits

Author SHA1 Message Date
Edward Z. Yang
84c8a9f88e Use slow but safe formula for prod_backward (#81617)
prod performs a sync to test for zeros as the formula is substantially
simpler if there are no zeros, but this doesn't work for meta tensors.
The double backwards formula works great in all cases though!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81617
Approved by: https://github.com/soulitzer
2022-07-18 18:45:32 +00:00
PyTorch MergeBot
4963adcc8d Revert "[composite compliance] matrix_exp (#81225)"
This reverts commit 367c695237.

Reverted https://github.com/pytorch/pytorch/pull/81225 on behalf of https://github.com/clee2000 due to broke functorch https://github.com/pytorch/pytorch/runs/7345901504?check_suite_focus=true
2022-07-14 19:53:51 +00:00
kshitij12345
367c695237 [composite compliance] matrix_exp (#81225)
Ref: #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81225
Approved by: https://github.com/zou3519
2022-07-14 18:19:11 +00:00
lezcano
b5b9db9f84 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-13 20:07:36 +00:00
Kurt Mohler
23bdb570cf Reland: Enable dim=None for torch.sum (#79881)
Part of #29137

Reland of #75845
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79881
Approved by: https://github.com/albanD, https://github.com/kulinseth
2022-07-09 00:54:42 +00:00
PyTorch MergeBot
f2c8557521 Revert "Make kl_div a composite function. (#80334)"
This reverts commit 828c787ea9.

Reverted https://github.com/pytorch/pytorch/pull/80334 on behalf of https://github.com/ezyang due to doesn't work with xla
2022-07-06 17:51:06 +00:00
lezcano
828c787ea9 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-04 19:33:43 +00:00
lezcano
37a5819665 Make slogdet, linalg.sloget and logdet support metatensors (#79742)
This PR also adds complex support for logdet, and makes all these
functions support out= and be composite depending on one function. We
also extend the support of `logdet` to complex numbers and improve the
docs of all these functions.

We also use `linalg_lu_factor_ex` in these functions, so we remove the
synchronisation present before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79742
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
2022-07-01 16:09:21 +00:00
Hao Zhuang
0ca9888000 Correct the math of repeat_backward in the function comment (#80286)
Correct the math of repeat_backward in the function comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80286
Approved by: https://github.com/albanD
2022-06-28 16:22:46 +00:00
lezcano
42a2359612 Add forward AD for linalg.det and simplify its backward (#79487)
This PR is in preparation for implementing `logdet` and `slogdet` as
structured kernels + implementing them with more efficient derivatives

We implement forward AD for det. We also simplify the implementation of
the backward, and leave a note on how to implement it properly for
singular matrices. We leave thad for future work.

Note (by looking at the OpInfo) that the current implementation passes
the same tests as the one before. We skip the forward-over-backward in
the singular case, as that one was not working in the gradgrad case
either.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79487
Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-06-24 14:15:17 +00:00
lezcano
44ff6be35a Fix backward of binary_cross_entropy_with_logits
The previous PR in this stack uncovered an error in the forward over
backward for this function.

In this PR, we fix this error and we also fix the gradgrad
implementation (and make it more stable and faster using `logsigmoid`).
We also move the double backward for this function to `FunctoinsManual`
as there's no reason for it to be in `native_functions`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80083

Approved by: https://github.com/zou3519
2022-06-23 01:31:08 +00:00
lezcano
f54e7b4ad6 More forward AD formulas
This PR:
- Corrects the forward AD formula of `torch.sgn`.
  - The reason why we can't use `auto_element_wise` for this operations is rather subtle. I left a comment.
  - This, in turn, fixes a problem we had in forward-over-backward for `linalg.svd` and other spectral decompositions (and `norm`, `linalg.norm`, `linalg.matrix_norm`) that were using `torch.abs` (whose derivative is given by `torch.sgn`.
- Implement the formula for a number of missing operations `nansum`, `amax`, `amin`...
- Simplified a few formulas, most notably the forward AD for `div` and the derivative of `norm`, `linalg.norm` and `vector_norm` for `ord=+-inf`.
- Correct the formula for `mean`, `std_mean`, `var_mean` when `dim` is provided and equal to `()` (or `None`)
- A few minor improvements to `sum_backward`, `unsqueeze_multiple` and formulas depending on them
- Fix the derivatives of `std_mean` and `std_var` (complex support,
ASAN, forward AD...)

Fixes: https://github.com/pytorch/pytorch/issues/67539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80082

Approved by: https://github.com/zou3519
2022-06-23 01:31:08 +00:00
PyTorch MergeBot
e3d0a3ca88 Revert "More forward AD formulas"
This reverts commit 6b20ef6b91.

Reverted https://github.com/pytorch/pytorch/pull/77975 on behalf of https://github.com/janeyx99 due to I think this is the real culprit of the broken tests in 28a7ee8cec for the trunk-only slow test job
2022-06-22 19:30:02 +00:00
PyTorch MergeBot
942c371bbc Revert "Fix backward of binary_cross_entropy_with_logits"
This reverts commit 28a7ee8cec.

Reverted https://github.com/pytorch/pytorch/pull/79381 on behalf of https://github.com/janeyx99 due to Sorry, 28a7ee8cec this PR breaks trunk-only slow test job
2022-06-22 17:41:09 +00:00
lezcano
28a7ee8cec Fix backward of binary_cross_entropy_with_logits
The previous PR in this stack uncovered an error in the forward over
backward for this function.

In this PR, we fix this error and we also fix the gradgrad
implementation (and make it more stable and faster using `logsigmoid`).
We also move the double backward for this function to `FunctoinsManual`
as there's no reason for it to be in `native_functions`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79381

Approved by: https://github.com/soulitzer
2022-06-22 14:28:56 +00:00
lezcano
6b20ef6b91 More forward AD formulas
This PR:
- Corrects the forward AD formula of `torch.sgn`.
  - The reason why we can't use `auto_element_wise` for this operations is rather subtle. I left a comment.
  - This, in turn, fixes a problem we had in forward-over-backward for `linalg.svd` and other spectral decompositions (and `norm`, `linalg.norm`, `linalg.matrix_norm`) that were using `torch.abs` (whose derivative is given by `torch.sgn`.
- Implement the formula for a number of missing operations `nansum`, `amax`, `amin`...
- Simplified a few formulas, most notably the forward AD for `div` and the derivative of `norm`, `linalg.norm` and `vector_norm` for `ord=+-inf`.
- Correct the formula for `mean`, `std_mean`, `var_mean` when `dim` is provided and equal to `()` (or `None`)
- A few minor improvements to `sum_backward`, `unsqueeze_multiple` and formulas depending on them
- Fix the derivatives of `std_mean` and `std_var` (complex support,
ASAN, forward AD...)

Fixes: https://github.com/pytorch/pytorch/issues/67539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77975

Approved by: https://github.com/soulitzer
2022-06-22 14:28:56 +00:00
Driss Guessous
a098937c20 Add factory function derivatives (#79872)
Adding derivatives for factory functions, this issue is used for tracking: #79044

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79872
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2022-06-21 00:53:11 +00:00
lezcano
16f30b494c Make l1_loss composite
Fixing the forward AD for `sgn` in the next PR of this stack uncovered a
number of issues with the derivatives of `l1_loss`. Upon inspection,
`l1_loss` was just implemented as a composite function, but it was not
differentiable. This PR makes it a fully differentiable function.

As a side note, `l1_loss_out` was incorrect in a number of ways. Even
more, it is not exposed to the public as `F.l1_loss` does not accept an
`out=` parameter. As such it is not even tested. I wonder how useful is
to have `out=` variants for loss functions if we don't expose them at
all. Even more, I wonder how useful is to have `_out` variants  for loss
functions, given that their most normal use case is to return just a
real number cc jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79804

Approved by: https://github.com/zou3519, https://github.com/malfet
2022-06-20 19:10:54 +00:00
PyTorch MergeBot
d4a9438786 Revert "Make l1_loss composite"
This reverts commit 61a5c779bf.

Reverted https://github.com/pytorch/pytorch/pull/78257 on behalf of https://github.com/malfet due to This breaks executorch
2022-06-17 18:14:21 +00:00
Kshiteej K
04b98df87a [fix] composite compliance: eig, eigh, symeig (#79698)
Ref: https://github.com/pytorch/pytorch/issues/69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698
Approved by: https://github.com/Lezcano, https://github.com/albanD
2022-06-17 14:13:04 +00:00
PyTorch MergeBot
ee6ebfc06b Revert "Enable dim=None for torch.sum (#75845)"
This reverts commit e79a51f7db.

Reverted https://github.com/pytorch/pytorch/pull/75845 on behalf of https://github.com/malfet due to Breaks MacOS builds, see e79a51f7db
2022-06-16 22:01:41 +00:00
Kurt Mohler
e79a51f7db Enable dim=None for torch.sum (#75845)
Part of #29137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75845
Approved by: https://github.com/ezyang
2022-06-16 20:17:07 +00:00
lezcano
61a5c779bf Make l1_loss composite
Fixing the forward AD for `sgn` in the next PR of this stack uncovered a
number of issues with the derivatives of `l1_loss`. Upon inspection,
`l1_loss` was just implemented as a composite function, but it was not
differentiable. This PR makes it a fully differentiable function.

As a side note, `l1_loss_out` was incorrect in a number of ways. Even
more, it is not exposed to the public as `F.l1_loss` does not accept an
`out=` parameter. As such it is not even tested. I wonder how useful is
to have `out=` variants for loss functions if we don't expose them at
all. Even more, I wonder how useful is to have `_out` variants  for loss
functions, given that their most normal use case is to return just a
real number cc jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78257

Approved by: https://github.com/jbschlosser
2022-06-16 00:03:22 +00:00
Michael Suo
30fb2c4aba [lint] autoformat test/cpp and torch/csrc
Let's have some fun.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828

Approved by: https://github.com/ezyang
2022-06-11 21:11:16 +00:00
lezcano
54949a5abc Simplify and optimize linalg.solve
This PR heavily simplifies the code of `linalg.solve`. At the same time,
this implementation saves quite a few copies of the input data in some
cases (e.g. A is contiguous)

We also implement it in such a way that the derivative goes from
computing two LU decompositions and two LU solves to no LU
decompositions and one LU solves. It also avoids a number of unnecessary
copies the derivative was unnecessarily performing (at least the copy of
two matrices).

On top of this, we add a `left` kw-only arg that allows the user to
solve `XA = B` rather concisely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046

Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry
2022-06-11 04:06:40 +00:00
PyTorch MergeBot
3556457dd2 Revert "kl_div: fix for grads wrt target, double backward, forward-over-reverse AD support. (#79007)"
This reverts commit 72ad222cff.

Reverted https://github.com/pytorch/pytorch/pull/79007 on behalf of https://github.com/janeyx99 due to Broke test_fn_fwgrad_bwgrad_nn_functional_kl_div_cpu_float64 on trunk https://hud.pytorch.org/minihud?name_filter=pull%20/%20linux-xenial-py3.7-clang7-asan%20/%20test%20(default,%202,%205,%20linux.2xlarge)
2022-06-09 13:07:03 +00:00
Nikita Vedeneev
72ad222cff kl_div: fix for grads wrt target, double backward, forward-over-reverse AD support. (#79007)
Fixes https://github.com/pytorch/pytorch/issues/78867,
fixes https://github.com/pytorch/pytorch/issues/65466.
Adds forward-over-reverse AD support.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79007
Approved by: https://github.com/soulitzer, https://github.com/jbschlosser
2022-06-09 09:06:52 +00:00
lezcano
c7d6cec078 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634

Approved by: https://github.com/malfet
2022-06-07 22:28:28 +00:00
Nikita Vedeneev
a4509f5b72 More forward-over-reverse implementations. (#78740)
Umbrella issue: https://github.com/pytorch/pytorch/issues/75432.

This one implements forward-over-reverse for:

* mse_loss
* l1_loss
* smooth_l1_loss
* softplus
* hardswish (also adds double backward support)
* prelu

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78740
Approved by: https://github.com/soulitzer
2022-06-03 15:44:06 +00:00
Brian Hirsh
5cc258ec9e make block_diag composite compliant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77716

Approved by: https://github.com/zou3519
2022-05-26 16:15:42 +00:00
Nikita Vedeneev
3924d56fae BCE loss: forward-over-reverse AD support (#77852)
Umbrella issue: https://github.com/pytorch/pytorch/issues/75432

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77852
Approved by: https://github.com/soulitzer
2022-05-26 14:36:52 +00:00
Brian Hirsh
07e4533403 reland of as_strided support for functionalization; introduce as_strided_scatter
This reverts commit a95f1edd85.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199

Approved by: https://github.com/ezyang
2022-05-24 22:40:44 +00:00
PyTorch MergeBot
a95f1edd85 Revert "as_strided support for functionalization; introduce as_strided_scatter"
This reverts commit 3a921f2d26.

Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo due to This broke rocm tests on master 3a921f2d26. rocm tests are no longer run on PRs, you should add a `ciflow/trunk` label if you want to run them
2022-05-24 20:19:12 +00:00
Brian Hirsh
3a921f2d26 as_strided support for functionalization; introduce as_strided_scatter
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128

Approved by: https://github.com/ezyang
2022-05-24 18:20:31 +00:00
lezcano
0c8c39fa71 Fix derivatives of norm(p=inf)
Following up on https://github.com/pytorch/pytorch/pull/51099#discussion_r583323915, we fix these derivatives, as they were incorrect until now.

As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be.

Fixes https://github.com/pytorch/pytorch/issues/67517

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78105

Approved by: https://github.com/ngimel
2022-05-24 17:16:16 +00:00
lezcano
e0295f55b5 Fix derivatives for linalg.vector_norm(..., dtype=)
As per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551

Approved by: https://github.com/albanD
2022-05-19 21:17:18 +00:00
PyTorch MergeBot
7a4e3f329f Revert "Fix derivatives for linalg.vector_norm(..., dtype=)"
This reverts commit 13d8fb93bb.

Reverted https://github.com/pytorch/pytorch/pull/76551 on behalf of https://github.com/seemethere due to Reverting the entire stack, errors originated from
* https://github.com/pytorch/pytorch/pull/76547

Failed internal builds due to ([Link for Meta Employees](https://www.internalfb.com/diff/D36494019?selected_signal=c2FuZGNhc3RsZV93b3JrZmxvd19ydW46MTgwMTQzOTg1MTUzNTQ3NzQ%3D&selected_signal_verification_phase=1&dst_version_fbid=1211273672948052)):
```
aten/src/ATen/native/LinearAlgebra.cpp:2496:9: error: unused type alias 'Int' [-Werror,-Wunused-local-typedef]
  using Int = IntArrayRef::value_type;
        ^
1 error generated.
Command failed with exit code 1.
```
2022-05-19 21:04:23 +00:00
Nikita Vedeneev
7945fa6ce2 BCE loss: forward ad support (#77755)
As per title + BCE with logits gets a simpler implementation.
Relevant for https://github.com/pytorch/pytorch/issues/71117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77755
Approved by: https://github.com/soulitzer
2022-05-19 13:13:58 +00:00
lezcano
13d8fb93bb Fix derivatives for linalg.vector_norm(..., dtype=)
As per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551

Approved by: https://github.com/mruberry
2022-05-18 11:46:50 +00:00
Nikita Vedeneev
a760dc2687 binary_cross_entropy: double backwart wrt target (#77416)
As per title. An effort to make `binary_cross_entropy` all around differentiable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77416
Approved by: https://github.com/soulitzer
2022-05-18 10:29:27 +00:00
lezcano
369d9f4137 A few forward AD formulas
It includes all-time favourites like:
- `put`
- `nn.functional.embedding`
- `prelu`
- `nn.functional.bilinear`
- `nn.functional.rrelu`
- `nn.functional.logsigmoid`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77421

Approved by: https://github.com/soulitzer
2022-05-17 15:55:51 +00:00
Mikayla Gawarecki
7ba4e124e6 Bugfix gradient formula for index_reduce('prod') + separate out sample_inputs for index_reduce
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77382

Approved by: https://github.com/cpuhrsch
2022-05-16 18:43:57 +00:00
Mikayla Gawarecki
841c65f499 Unprivate _index_reduce and add documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997

Approved by: https://github.com/cpuhrsch
2022-05-13 19:48:38 +00:00
jiayisun
97deda4f28 add BFloat16 support for logcumsumexp on CPU (#72694)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72694
Approved by: https://github.com/VitalyFedyunin, https://github.com/frank-wei
2022-05-12 17:10:28 +00:00
Ivan Yashchuk
545d90f032 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` to be
working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is
used in the backward function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-11 18:57:40 +00:00
PyTorch MergeBot
f94abd59f7 Revert "Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm"
This reverts commit 721a8ca697.

Reverted https://github.com/pytorch/pytorch/pull/76591 on behalf of https://github.com/janeyx99
2022-05-10 13:21:46 +00:00
Ivan Yashchuk
721a8ca697 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` to be
working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is
used in the backward function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-10 08:44:55 +00:00
PyTorch MergeBot
4ebc4890dd Revert "Add linalg.lu_solve"
This reverts commit fc5b4a5a33.

Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet
2022-05-09 19:12:30 +00:00
Mikayla Gawarecki
465e0ae266 Bugfix scatter_reduce backward formulas
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76523

Approved by: https://github.com/albanD
2022-05-05 20:22:39 +00:00
lezcano
fc5b4a5a33 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935

Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-05-05 19:02:13 +00:00