Commit Graph

437 Commits

Author SHA1 Message Date
Edward Z. Yang
4c01c51266 Symintifying slice ops (#85196)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85196
Approved by: https://github.com/ezyang
2022-09-23 22:01:32 +00:00
Mikayla Gawarecki
77f1f98479 Re-introduce torch.Tensor.to_padded_tensor (#85293)
Differential Revision: [D39629004](https://our.internmc.facebook.com/intern/diff/D39629004)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85293
Approved by: https://github.com/cpuhrsch
2022-09-21 18:45:56 +00:00
Edward Z. Yang
3eb27229dd as_strided symbolic support (#85264)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D39662820](https://our.internmc.facebook.com/intern/diff/D39662820)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85264
Approved by: https://github.com/wconstab
2022-09-21 13:34:55 +00:00
Benoit Steiner
86d8c61c7c Revert D39583438: Multisect successfully blamed D39583438 for test or build failures (#85277)
Summary:
This diff is reverting D39583438
D39583438 has been identified to be causing the following test or build failures:
Tests affected:
- https://www.internalfb.com/intern/test/281475048999851/

Here's the Multisect link:
https://www.internalfb.com/intern/testinfra/multisect/1260522
Here are the tasks that are relevant to this breakage:
T124797105: 18 tests started failing for employee benoitsteiner in the last 2 weeks
We're generating a revert to back out the changes in this diff, please note the backout may land if someone accepts it.

Test Plan: NA

Reviewed By: benoitsteiner

Differential Revision: D39599694

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85277
Approved by: https://github.com/dagitses
2022-09-20 15:38:58 +00:00
kshitij12345
a4dca9822d [composite compliance] prod (#81969)
Ref: #69991

Also fixes #82644 (fix similar to #81617)

For CompositeCompliance, we can't use `item` to choose a special fast-path when Tensor is a Subclass. Instead we always dispatch to the slower but safer implementation.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81969
Approved by: https://github.com/zou3519
2022-09-20 08:03:36 +00:00
Thomas Viehmann
e41d758e26 Handle implicit real->complex casting for backward of stack (#84993)
Fixes: #75852

P.S.: Yay for the PyTorch foundation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84993
Approved by: https://github.com/soulitzer
2022-09-19 21:20:34 +00:00
lezcano
d710c95cc0 Implement forward AD for scatter_reduce (#85000)
I left the case `reduction="prod"` for future work as it's a bit of a pain.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85000
Approved by: https://github.com/soulitzer
2022-09-16 17:45:07 +00:00
Elias Ellison
54c9c4e73d Flip fake tensors on in aot autograd (#84968)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84968
Approved by: https://github.com/Chillee
2022-09-16 15:27:48 +00:00
Pearu Peterson
a225f3cfce torch.zero_ on a sparse compressed tensor resets nnz to 0 (#85030)
Fixes #84997 and #82683

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85030
Approved by: https://github.com/cpuhrsch
2022-09-15 18:42:38 +00:00
Richard Zou
3a107bc9be [functorch] fix vmapvjpvjp test for prelu (#84939)
Turns out this is just a composite compliance issue. Branching on if
something requires grad or not can lead to incorrect gradients if we
have a BatchedTensor wrapping a tensor that requires grad.

Test Plan:
- tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84939
Approved by: https://github.com/soulitzer
2022-09-15 00:36:30 +00:00
Mikayla Gawarecki
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in
`torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
Ivan Yashchuk
01c54ad6de Remove deprecated torch.eig (#70982)
The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`.

cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982
Approved by: https://github.com/Lezcano, https://github.com/malfet
2022-09-09 21:31:57 +00:00
nikitaved
3eb16509c7 optimize householder product backward to be more memory-efficient (#84627)
A follow-up on discussions in https://github.com/pytorch/pytorch/pull/84180.
Makes backward more memory efficient with the lesser number of kernel calls.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84627
Approved by: https://github.com/kshitij12345, https://github.com/zou3519
2022-09-07 15:29:47 +00:00
kshitij12345
07d398fb26 [composite compliance] linalg_householder_product (#84180)
Ref: #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84180
Approved by: https://github.com/zou3519
2022-09-07 09:33:37 +00:00
kshitij12345
65ea3d0621 [composite compliance] cov, corrcoef (#82954)
Ref: #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82954
Approved by: https://github.com/zou3519
2022-08-26 15:14:37 +00:00
Mario Lezcano
3e6e0a1d10 Support a stable double backward on linalg.det for real inputs (#80217)
The complex case still fails. I do not know why.

Fixes https://github.com/pytorch/pytorch/issues/62327
Fixes https://github.com/pytorch/pytorch/issues/53364
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80217
Approved by: https://github.com/nikitaved, https://github.com/albanD, https://github.com/malfet
2022-08-24 15:18:56 +00:00
Mario Lezcano
aad89bb771 Make the derivative of masked_fill more efficient (#83515)
There's no need to add all the zeros if we extract all the non-zero
elements.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83515
Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-08-18 13:00:12 +00:00
Kurt Mohler
be5b3df6cc Update std_mean/var_mean/nanmean/nansum signatures with int[1]? dim (#82912)
### Description
Change the type of the `dim` arg for `std_mean/var_mean/nanmean/nansum` to `int[1]?` in `native_functions.yaml`

### Issue
Part of #29137

### Testing

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82912
Approved by: https://github.com/albanD
2022-08-10 16:58:26 +00:00
kshitij12345
10e7a25488 [composite compliance] eig_backward (#82957)
Ref #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82957
Approved by: https://github.com/zou3519
2022-08-08 15:18:48 +00:00
Kurt Mohler
2bfae07a79 Enable dim=None for torch.mean (#81286)
Part of #79525

This will require coordination with XLA before merging, just like #79881
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81286
Approved by: https://github.com/albanD
2022-07-28 22:34:56 +00:00
Nikolay Korovaiko
d2c47d559c Revert "Revert "Enabling SymInt in autograd; take 3 (#81145)"" ; make sure is_intlist checks for symintnodes (#82189)
### Description
<!-- What did you change and why was it needed? -->

### Issue
<!-- Link to Issue ticket or RFP -->

### Testing
<!-- How did you test your change? -->

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82189
Approved by: https://github.com/ezyang
2022-07-26 20:47:11 +00:00
lezcano
11fe277b62 [PrimTorch] Add reference for torch.norm (#81765)
This ref does more things than `torch.norm`, and it fixes a few bugs
that `torch.norm` has. This implementation and the `torch.norm`
implementation come to terms in the next PR of this stack

We put this PR before, as otherwise `test_decomp.py` was failing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81765
Approved by: https://github.com/ngimel
2022-07-25 19:57:21 +00:00
Kshiteej K
db0e121b46 [composite compliance] put, take (#81094)
Reference: #69991

This PR makes `put` CompositeExplicit as it is implemented in terms of `put_` (for which we can't handle Composite Compliance at the implementation level).

Ref (put implementation)
478081c698/aten/src/ATen/native/TensorAdvancedIndexing.cpp (L619-L621)

Also, we update the `take` gradient formula to handle Tensor Subclass .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81094
Approved by: https://github.com/zou3519
2022-07-25 15:05:16 +00:00
kshitij12345
5880a66758 [composite compliance] matrix_exp (#81225)
Ref: #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81225
Approved by: https://github.com/zou3519
2022-07-25 11:11:29 +00:00
PyTorch MergeBot
c078476eb0 Revert "Enabling SymInt in autograd; take 3 (#81145)"
This reverts commit 032facd6e6.

Reverted https://github.com/pytorch/pytorch/pull/81145 on behalf of https://github.com/jeanschmidt due to breaking internal builds
2022-07-22 11:15:20 +00:00
Nikolay Korovaiko
032facd6e6 Enabling SymInt in autograd; take 3 (#81145)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/81145
Approved by: https://github.com/ezyang
2022-07-22 00:14:50 +00:00
Edward Z. Yang
84c8a9f88e Use slow but safe formula for prod_backward (#81617)
prod performs a sync to test for zeros as the formula is substantially
simpler if there are no zeros, but this doesn't work for meta tensors.
The double backwards formula works great in all cases though!

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81617
Approved by: https://github.com/soulitzer
2022-07-18 18:45:32 +00:00
PyTorch MergeBot
4963adcc8d Revert "[composite compliance] matrix_exp (#81225)"
This reverts commit 367c695237.

Reverted https://github.com/pytorch/pytorch/pull/81225 on behalf of https://github.com/clee2000 due to broke functorch https://github.com/pytorch/pytorch/runs/7345901504?check_suite_focus=true
2022-07-14 19:53:51 +00:00
kshitij12345
367c695237 [composite compliance] matrix_exp (#81225)
Ref: #69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81225
Approved by: https://github.com/zou3519
2022-07-14 18:19:11 +00:00
lezcano
b5b9db9f84 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-13 20:07:36 +00:00
Kurt Mohler
23bdb570cf Reland: Enable dim=None for torch.sum (#79881)
Part of #29137

Reland of #75845
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79881
Approved by: https://github.com/albanD, https://github.com/kulinseth
2022-07-09 00:54:42 +00:00
PyTorch MergeBot
f2c8557521 Revert "Make kl_div a composite function. (#80334)"
This reverts commit 828c787ea9.

Reverted https://github.com/pytorch/pytorch/pull/80334 on behalf of https://github.com/ezyang due to doesn't work with xla
2022-07-06 17:51:06 +00:00
lezcano
828c787ea9 Make kl_div a composite function. (#80334)
Benchmarks: https://github.com/pytorch/pytorch/pull/80334#issuecomment-1167229285

Fixes https://github.com/pytorch/pytorch/issues/80158
Fixes https://github.com/pytorch/pytorch/issues/78867
Fixes https://github.com/pytorch/pytorch/issues/69230

Supersedes https://github.com/pytorch/pytorch/pull/79007
Supersedes https://github.com/pytorch/pytorch/pull/69212
Supersedes https://github.com/pytorch/pytorch/pull/19659
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80334
Approved by: https://github.com/ezyang
2022-07-04 19:33:43 +00:00
lezcano
37a5819665 Make slogdet, linalg.sloget and logdet support metatensors (#79742)
This PR also adds complex support for logdet, and makes all these
functions support out= and be composite depending on one function. We
also extend the support of `logdet` to complex numbers and improve the
docs of all these functions.

We also use `linalg_lu_factor_ex` in these functions, so we remove the
synchronisation present before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79742
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD
2022-07-01 16:09:21 +00:00
Hao Zhuang
0ca9888000 Correct the math of repeat_backward in the function comment (#80286)
Correct the math of repeat_backward in the function comment.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80286
Approved by: https://github.com/albanD
2022-06-28 16:22:46 +00:00
lezcano
42a2359612 Add forward AD for linalg.det and simplify its backward (#79487)
This PR is in preparation for implementing `logdet` and `slogdet` as
structured kernels + implementing them with more efficient derivatives

We implement forward AD for det. We also simplify the implementation of
the backward, and leave a note on how to implement it properly for
singular matrices. We leave thad for future work.

Note (by looking at the OpInfo) that the current implementation passes
the same tests as the one before. We skip the forward-over-backward in
the singular case, as that one was not working in the gradgrad case
either.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79487
Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-06-24 14:15:17 +00:00
lezcano
44ff6be35a Fix backward of binary_cross_entropy_with_logits
The previous PR in this stack uncovered an error in the forward over
backward for this function.

In this PR, we fix this error and we also fix the gradgrad
implementation (and make it more stable and faster using `logsigmoid`).
We also move the double backward for this function to `FunctoinsManual`
as there's no reason for it to be in `native_functions`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80083

Approved by: https://github.com/zou3519
2022-06-23 01:31:08 +00:00
lezcano
f54e7b4ad6 More forward AD formulas
This PR:
- Corrects the forward AD formula of `torch.sgn`.
  - The reason why we can't use `auto_element_wise` for this operations is rather subtle. I left a comment.
  - This, in turn, fixes a problem we had in forward-over-backward for `linalg.svd` and other spectral decompositions (and `norm`, `linalg.norm`, `linalg.matrix_norm`) that were using `torch.abs` (whose derivative is given by `torch.sgn`.
- Implement the formula for a number of missing operations `nansum`, `amax`, `amin`...
- Simplified a few formulas, most notably the forward AD for `div` and the derivative of `norm`, `linalg.norm` and `vector_norm` for `ord=+-inf`.
- Correct the formula for `mean`, `std_mean`, `var_mean` when `dim` is provided and equal to `()` (or `None`)
- A few minor improvements to `sum_backward`, `unsqueeze_multiple` and formulas depending on them
- Fix the derivatives of `std_mean` and `std_var` (complex support,
ASAN, forward AD...)

Fixes: https://github.com/pytorch/pytorch/issues/67539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80082

Approved by: https://github.com/zou3519
2022-06-23 01:31:08 +00:00
PyTorch MergeBot
e3d0a3ca88 Revert "More forward AD formulas"
This reverts commit 6b20ef6b91.

Reverted https://github.com/pytorch/pytorch/pull/77975 on behalf of https://github.com/janeyx99 due to I think this is the real culprit of the broken tests in 28a7ee8cec for the trunk-only slow test job
2022-06-22 19:30:02 +00:00
PyTorch MergeBot
942c371bbc Revert "Fix backward of binary_cross_entropy_with_logits"
This reverts commit 28a7ee8cec.

Reverted https://github.com/pytorch/pytorch/pull/79381 on behalf of https://github.com/janeyx99 due to Sorry, 28a7ee8cec this PR breaks trunk-only slow test job
2022-06-22 17:41:09 +00:00
lezcano
28a7ee8cec Fix backward of binary_cross_entropy_with_logits
The previous PR in this stack uncovered an error in the forward over
backward for this function.

In this PR, we fix this error and we also fix the gradgrad
implementation (and make it more stable and faster using `logsigmoid`).
We also move the double backward for this function to `FunctoinsManual`
as there's no reason for it to be in `native_functions`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79381

Approved by: https://github.com/soulitzer
2022-06-22 14:28:56 +00:00
lezcano
6b20ef6b91 More forward AD formulas
This PR:
- Corrects the forward AD formula of `torch.sgn`.
  - The reason why we can't use `auto_element_wise` for this operations is rather subtle. I left a comment.
  - This, in turn, fixes a problem we had in forward-over-backward for `linalg.svd` and other spectral decompositions (and `norm`, `linalg.norm`, `linalg.matrix_norm`) that were using `torch.abs` (whose derivative is given by `torch.sgn`.
- Implement the formula for a number of missing operations `nansum`, `amax`, `amin`...
- Simplified a few formulas, most notably the forward AD for `div` and the derivative of `norm`, `linalg.norm` and `vector_norm` for `ord=+-inf`.
- Correct the formula for `mean`, `std_mean`, `var_mean` when `dim` is provided and equal to `()` (or `None`)
- A few minor improvements to `sum_backward`, `unsqueeze_multiple` and formulas depending on them
- Fix the derivatives of `std_mean` and `std_var` (complex support,
ASAN, forward AD...)

Fixes: https://github.com/pytorch/pytorch/issues/67539

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77975

Approved by: https://github.com/soulitzer
2022-06-22 14:28:56 +00:00
Driss Guessous
a098937c20 Add factory function derivatives (#79872)
Adding derivatives for factory functions, this issue is used for tracking: #79044

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79872
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2022-06-21 00:53:11 +00:00
lezcano
16f30b494c Make l1_loss composite
Fixing the forward AD for `sgn` in the next PR of this stack uncovered a
number of issues with the derivatives of `l1_loss`. Upon inspection,
`l1_loss` was just implemented as a composite function, but it was not
differentiable. This PR makes it a fully differentiable function.

As a side note, `l1_loss_out` was incorrect in a number of ways. Even
more, it is not exposed to the public as `F.l1_loss` does not accept an
`out=` parameter. As such it is not even tested. I wonder how useful is
to have `out=` variants for loss functions if we don't expose them at
all. Even more, I wonder how useful is to have `_out` variants  for loss
functions, given that their most normal use case is to return just a
real number cc jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79804

Approved by: https://github.com/zou3519, https://github.com/malfet
2022-06-20 19:10:54 +00:00
PyTorch MergeBot
d4a9438786 Revert "Make l1_loss composite"
This reverts commit 61a5c779bf.

Reverted https://github.com/pytorch/pytorch/pull/78257 on behalf of https://github.com/malfet due to This breaks executorch
2022-06-17 18:14:21 +00:00
Kshiteej K
04b98df87a [fix] composite compliance: eig, eigh, symeig (#79698)
Ref: https://github.com/pytorch/pytorch/issues/69991
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79698
Approved by: https://github.com/Lezcano, https://github.com/albanD
2022-06-17 14:13:04 +00:00
PyTorch MergeBot
ee6ebfc06b Revert "Enable dim=None for torch.sum (#75845)"
This reverts commit e79a51f7db.

Reverted https://github.com/pytorch/pytorch/pull/75845 on behalf of https://github.com/malfet due to Breaks MacOS builds, see e79a51f7db
2022-06-16 22:01:41 +00:00
Kurt Mohler
e79a51f7db Enable dim=None for torch.sum (#75845)
Part of #29137

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75845
Approved by: https://github.com/ezyang
2022-06-16 20:17:07 +00:00
lezcano
61a5c779bf Make l1_loss composite
Fixing the forward AD for `sgn` in the next PR of this stack uncovered a
number of issues with the derivatives of `l1_loss`. Upon inspection,
`l1_loss` was just implemented as a composite function, but it was not
differentiable. This PR makes it a fully differentiable function.

As a side note, `l1_loss_out` was incorrect in a number of ways. Even
more, it is not exposed to the public as `F.l1_loss` does not accept an
`out=` parameter. As such it is not even tested. I wonder how useful is
to have `out=` variants for loss functions if we don't expose them at
all. Even more, I wonder how useful is to have `_out` variants  for loss
functions, given that their most normal use case is to return just a
real number cc jbschlosser

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78257

Approved by: https://github.com/jbschlosser
2022-06-16 00:03:22 +00:00
Michael Suo
30fb2c4aba [lint] autoformat test/cpp and torch/csrc
Let's have some fun.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78828

Approved by: https://github.com/ezyang
2022-06-11 21:11:16 +00:00
lezcano
54949a5abc Simplify and optimize linalg.solve
This PR heavily simplifies the code of `linalg.solve`. At the same time,
this implementation saves quite a few copies of the input data in some
cases (e.g. A is contiguous)

We also implement it in such a way that the derivative goes from
computing two LU decompositions and two LU solves to no LU
decompositions and one LU solves. It also avoids a number of unnecessary
copies the derivative was unnecessarily performing (at least the copy of
two matrices).

On top of this, we add a `left` kw-only arg that allows the user to
solve `XA = B` rather concisely.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/74046

Approved by: https://github.com/nikitaved, https://github.com/IvanYashchuk, https://github.com/mruberry
2022-06-11 04:06:40 +00:00
PyTorch MergeBot
3556457dd2 Revert "kl_div: fix for grads wrt target, double backward, forward-over-reverse AD support. (#79007)"
This reverts commit 72ad222cff.

Reverted https://github.com/pytorch/pytorch/pull/79007 on behalf of https://github.com/janeyx99 due to Broke test_fn_fwgrad_bwgrad_nn_functional_kl_div_cpu_float64 on trunk https://hud.pytorch.org/minihud?name_filter=pull%20/%20linux-xenial-py3.7-clang7-asan%20/%20test%20(default,%202,%205,%20linux.2xlarge)
2022-06-09 13:07:03 +00:00
Nikita Vedeneev
72ad222cff kl_div: fix for grads wrt target, double backward, forward-over-reverse AD support. (#79007)
Fixes https://github.com/pytorch/pytorch/issues/78867,
fixes https://github.com/pytorch/pytorch/issues/65466.
Adds forward-over-reverse AD support.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/79007
Approved by: https://github.com/soulitzer, https://github.com/jbschlosser
2022-06-09 09:06:52 +00:00
lezcano
c7d6cec078 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77634

Approved by: https://github.com/malfet
2022-06-07 22:28:28 +00:00
Nikita Vedeneev
a4509f5b72 More forward-over-reverse implementations. (#78740)
Umbrella issue: https://github.com/pytorch/pytorch/issues/75432.

This one implements forward-over-reverse for:

* mse_loss
* l1_loss
* smooth_l1_loss
* softplus
* hardswish (also adds double backward support)
* prelu

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78740
Approved by: https://github.com/soulitzer
2022-06-03 15:44:06 +00:00
Brian Hirsh
5cc258ec9e make block_diag composite compliant
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77716

Approved by: https://github.com/zou3519
2022-05-26 16:15:42 +00:00
Nikita Vedeneev
3924d56fae BCE loss: forward-over-reverse AD support (#77852)
Umbrella issue: https://github.com/pytorch/pytorch/issues/75432

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77852
Approved by: https://github.com/soulitzer
2022-05-26 14:36:52 +00:00
Brian Hirsh
07e4533403 reland of as_strided support for functionalization; introduce as_strided_scatter
This reverts commit a95f1edd85.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78199

Approved by: https://github.com/ezyang
2022-05-24 22:40:44 +00:00
PyTorch MergeBot
a95f1edd85 Revert "as_strided support for functionalization; introduce as_strided_scatter"
This reverts commit 3a921f2d26.

Reverted https://github.com/pytorch/pytorch/pull/77128 on behalf of https://github.com/suo due to This broke rocm tests on master 3a921f2d26. rocm tests are no longer run on PRs, you should add a `ciflow/trunk` label if you want to run them
2022-05-24 20:19:12 +00:00
Brian Hirsh
3a921f2d26 as_strided support for functionalization; introduce as_strided_scatter
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77128

Approved by: https://github.com/ezyang
2022-05-24 18:20:31 +00:00
lezcano
0c8c39fa71 Fix derivatives of norm(p=inf)
Following up on https://github.com/pytorch/pytorch/pull/51099#discussion_r583323915, we fix these derivatives, as they were incorrect until now.

As described in the note, the better solution would be to use vectorised operations on the preprocessing operation when reducing on CPU. It's not clear how difficult that may be.

Fixes https://github.com/pytorch/pytorch/issues/67517

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78105

Approved by: https://github.com/ngimel
2022-05-24 17:16:16 +00:00
lezcano
e0295f55b5 Fix derivatives for linalg.vector_norm(..., dtype=)
As per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551

Approved by: https://github.com/albanD
2022-05-19 21:17:18 +00:00
PyTorch MergeBot
7a4e3f329f Revert "Fix derivatives for linalg.vector_norm(..., dtype=)"
This reverts commit 13d8fb93bb.

Reverted https://github.com/pytorch/pytorch/pull/76551 on behalf of https://github.com/seemethere due to Reverting the entire stack, errors originated from
* https://github.com/pytorch/pytorch/pull/76547

Failed internal builds due to ([Link for Meta Employees](https://www.internalfb.com/diff/D36494019?selected_signal=c2FuZGNhc3RsZV93b3JrZmxvd19ydW46MTgwMTQzOTg1MTUzNTQ3NzQ%3D&selected_signal_verification_phase=1&dst_version_fbid=1211273672948052)):
```
aten/src/ATen/native/LinearAlgebra.cpp:2496:9: error: unused type alias 'Int' [-Werror,-Wunused-local-typedef]
  using Int = IntArrayRef::value_type;
        ^
1 error generated.
Command failed with exit code 1.
```
2022-05-19 21:04:23 +00:00
Nikita Vedeneev
7945fa6ce2 BCE loss: forward ad support (#77755)
As per title + BCE with logits gets a simpler implementation.
Relevant for https://github.com/pytorch/pytorch/issues/71117

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77755
Approved by: https://github.com/soulitzer
2022-05-19 13:13:58 +00:00
lezcano
13d8fb93bb Fix derivatives for linalg.vector_norm(..., dtype=)
As per title

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76551

Approved by: https://github.com/mruberry
2022-05-18 11:46:50 +00:00
Nikita Vedeneev
a760dc2687 binary_cross_entropy: double backwart wrt target (#77416)
As per title. An effort to make `binary_cross_entropy` all around differentiable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77416
Approved by: https://github.com/soulitzer
2022-05-18 10:29:27 +00:00
lezcano
369d9f4137 A few forward AD formulas
It includes all-time favourites like:
- `put`
- `nn.functional.embedding`
- `prelu`
- `nn.functional.bilinear`
- `nn.functional.rrelu`
- `nn.functional.logsigmoid`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77421

Approved by: https://github.com/soulitzer
2022-05-17 15:55:51 +00:00
Mikayla Gawarecki
7ba4e124e6 Bugfix gradient formula for index_reduce('prod') + separate out sample_inputs for index_reduce
Pull Request resolved: https://github.com/pytorch/pytorch/pull/77382

Approved by: https://github.com/cpuhrsch
2022-05-16 18:43:57 +00:00
Mikayla Gawarecki
841c65f499 Unprivate _index_reduce and add documentation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76997

Approved by: https://github.com/cpuhrsch
2022-05-13 19:48:38 +00:00
jiayisun
97deda4f28 add BFloat16 support for logcumsumexp on CPU (#72694)
Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72694
Approved by: https://github.com/VitalyFedyunin, https://github.com/frank-wei
2022-05-12 17:10:28 +00:00
Ivan Yashchuk
545d90f032 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` to be
working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is
used in the backward function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-11 18:57:40 +00:00
PyTorch MergeBot
f94abd59f7 Revert "Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm"
This reverts commit 721a8ca697.

Reverted https://github.com/pytorch/pytorch/pull/76591 on behalf of https://github.com/janeyx99
2022-05-10 13:21:46 +00:00
Ivan Yashchuk
721a8ca697 Sparse CSR: enable autograd for torch.sparse.addmm and torch.sparse.mm
This PR updates the derivative rule for `torch.sparse.addmm` to be
working with CSR sparse matrix. Notably `torch.sparse.sampled_addmm` is
used in the backward function.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76591

Approved by: https://github.com/cpuhrsch
2022-05-10 08:44:55 +00:00
PyTorch MergeBot
4ebc4890dd Revert "Add linalg.lu_solve"
This reverts commit fc5b4a5a33.

Reverted https://github.com/pytorch/pytorch/pull/72935 on behalf of https://github.com/malfet
2022-05-09 19:12:30 +00:00
Mikayla Gawarecki
465e0ae266 Bugfix scatter_reduce backward formulas
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76523

Approved by: https://github.com/albanD
2022-05-05 20:22:39 +00:00
lezcano
fc5b4a5a33 Add linalg.lu_solve
This PR adds `linalg.lu_solve`. While doing so, I found a bug in MAGMA
when calling the batched MAGMA backend with trans=True. We work around
that by solving the system solving two triangular systems.

We also update the heuristics for this function, as they were fairly
updated. We found that cuSolver is king, so luckily we do not need to
rely on the buggy backend from magma for this function.

We added tests testing this function left and right. We also added tests
for the different backends. We also activated the tests for AMD, as
those should work as well.

Fixes https://github.com/pytorch/pytorch/issues/61657

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72935

Approved by: https://github.com/IvanYashchuk, https://github.com/mruberry
2022-05-05 19:02:13 +00:00
Nikita Vedeneev
33fabe9a2e functional.max_unpool: OpInfo tests + simpler backward + forward ad + fwad over backward ad
Resolves https://github.com/pytorch/pytorch/issues/67657, https://github.com/pytorch/pytorch/issues/67658, https://github.com/pytorch/pytorch/issues/67660.

These are not necessarily bugs because we cannot produce arbitrary samples coming from `max_pool` to the gradcheck's eternal satisfaction.

This PR also replaces low-level complicated backward kernels with much simpler high-level and well-tested counterparts. The replacement is also faster (before: parallel for loop, after: memory layout optimized TensorIterator's parallelization coming from `gather`).

cc @albanD @mruberry @jbschlosser @walterddr
Pull Request resolved: https://github.com/pytorch/pytorch/pull/68625
Approved by: https://github.com/albanD
2022-05-05 10:13:51 +00:00
lezcano
7cb7cd5802 Add linalg.lu
This PR modifies `lu_unpack` by:
- Using less memory when unpacking `L` and `U`
- Fuse the subtraction by `-1` with `unpack_pivots_stub`
- Define tensors of the correct types to avoid copies
- Port `lu_unpack` to be a strucutred kernel so that its `_out` version
does not incur on extra copies

Then we implement `linalg.lu` as a structured kernel, as we want to
compute its derivative manually. We do so because composing the
derivatives of `torch.lu_factor` and `torch.lu_unpack` would be less efficient.

This new function and `lu_unpack` comes with all the things it can come:
forward and backward ad, decent docs, correctness tests, OpInfo, complex support,
support for metatensors and support for vmap and vmap over the gradients.

I really hope we don't continue adding more features.

This PR also avoids saving some of the tensors that were previously
saved unnecessarily for the backward in `lu_factor_ex_backward` and
`lu_backward` and does some other general improvements here and there
to the forward and backward AD formulae of other related functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67833

Approved by: https://github.com/IvanYashchuk, https://github.com/nikitaved, https://github.com/mruberry
2022-05-05 09:17:05 +00:00
lezcano
1a4eea57be Improve derivative of QR decomposition
We derive and implement a more concise rule for the forward and backward
derivatives of the QR decomposition. While doing this we:
- Fix the composite compliance of `linalg.qr` and we make it support batches
- Improve the performance and simplify the implementation of both foward and backward
- Avoid saving the input matrix for the backward computation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76115

Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-05-05 09:14:57 +00:00
Richard Zou
71ae190b87 [composite compliance] Fix a bunch of fft backwards
Replaced `at::zeros(..., grad.options()).slice().copy_(grad))`
with `grad.new_zeros(..., grad.options()).slice().copy_(grad))`

Test Plan:
- run tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76573

Approved by: https://github.com/ngimel, https://github.com/albanD
2022-05-03 00:07:30 +00:00
Mikayla Gawarecki
676a4a3969 Prototype _index_reduce (CPU-only)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75981

Approved by: https://github.com/cpuhrsch
2022-04-27 23:01:00 +00:00
Richard Zou
9cb2871f31 Fix forward-mode AD formula for binary_cross_entropy_with_logits
The problem was that `grad_input` and `grad_target` may be ZeroTensors,
which are immutable. This PR changes it so that operations on grad_input
and grad_target in `binary_cross_entropy_with_logits_jvp` are no longer
in-place.

Test Plan:
- run existing tests

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76322
Approved by: https://github.com/soulitzer
2022-04-25 22:30:57 +00:00
lezcano
441aea4127 Update Choesky's forward and backward derivative
This PR:
- Derives formally a new rule for Cholesky (write-up to come)
- Implements it without using in-place operations in the forward or backward.
- Does not instantiate inverses explicitly, but rather it solves two triangular systems of equations (2 triang vs 1 triang and 2 matmuls should be comparable, but the first one should be more stable).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76032

Approved by: https://github.com/nikitaved, https://github.com/albanD
2022-04-22 00:45:38 +00:00
Nikita Shulga
f6c275f55d Remove -Wno-unused-variable from utils.cmake (take 2) (#75538)
Summary:
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with  top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block.

Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538

Reviewed By: anjali411

Differential Revision: D35747333

Pulled By: malfet

fbshipit-source-id: 3fc5828e44a4c05ba0e89e92613e6ebbdb260626
(cherry picked from commit c179fba21cfa2a0093fad50ccad5a22dd7cff52c)
2022-04-20 17:41:59 +00:00
Ivan Yashchuk
bba4780232 Enable autograd wrt sparse CSR tensors
This pull request enables accumulating gradients for the CSR tensor.
Functions that work and are tested:
- tensor.abs()
- tensor.neg()
- tensor.conj_physical()
- torch.addmm

`torch.mm` also works, but tests will be added later.

In addition, this PR adds throwing an error when trying to access strides, storage, and contiguity info on a CSR tensor.

`tensor.to_sparse_csr().to_sparse_csr()` was failing and now fixed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75435
Approved by: https://github.com/cpuhrsch
2022-04-19 18:42:45 +00:00
PyTorch MergeBot
5c56b2286b Revert "Remove -Wno-unused-variable from utils.cmake"
This reverts commit 018cbe1f5c.

Reverted https://github.com/pytorch/pytorch/pull/75538 on behalf of https://github.com/seemethere
2022-04-19 17:19:09 +00:00
Nikita Shulga
018cbe1f5c Remove -Wno-unused-variable from utils.cmake
[Comment](https://github.com/pytorch/pytorch/pull/62445/files#r680132022) claims, it got added for consistency with  top level CMakeLists.txt, but `-Wno-unused-variable` is not mentioned there.

Modify violations in 50+ files that were added in the interim by either removing unused variables, or decorating the code with `C10_UNUSED` if local variable is likely used to extend object lifetime until the end of the block.

Caused preventable revert in https://github.com/pytorch/pytorch/pull/72633#issuecomment-1092300787

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75538
Approved by: https://github.com/cpuhrsch
2022-04-19 15:26:55 +00:00
Peter Bell
cc56fac213 Fix complex to real casting warning in _to_copy backward
Fixes #75781

A Real->Complex cast should result in a gradient with no imaginary
component, so discarding the imaginary component is expected.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/75805

Approved by: https://github.com/albanD
2022-04-19 14:04:13 +00:00
soulitzer
8721abc429 Add forward AD support for norm, dist, F.pairwise_dist, F.normalize
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74205

Approved by: https://github.com/albanD
2022-04-13 15:03:20 +00:00
soulitzer
76614b3a33 Test linalg vector norm subgradient
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75103

Approved by: https://github.com/albanD
2022-04-12 20:54:30 +00:00
anjali411
91d134093e Add fastpath for stack and cat JVP computation
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75590

Approved by: https://github.com/albanD, https://github.com/soulitzer
2022-04-11 18:10:09 +00:00
soulitzer
b10d151745 Ensure convolution_backward respects output_mask
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75298

Approved by: https://github.com/albanD
2022-04-08 19:27:41 +00:00
Mikayla Gawarecki
e9a8e6f74a Add include_self flag to scatter_reduce
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74607

Approved by: https://github.com/cpuhrsch
2022-04-05 16:31:39 +00:00
Nikita Vedeneev
5b142ce5ce cholesky_inverse: complex autograd, forward AD and correct tests.
As per title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75033
Approved by: https://github.com/soulitzer
2022-04-01 20:31:03 +00:00
Mikayla Gawarecki
2bfa018462 [BC-breaking] Use ScatterGatherKernel for scatter_reduce (CPU-only) (#74226)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74226

Update signature of `scatter_reduce_` to match `scatter_/scatter_add_`

`Tensor.scatter_reduce_(int64 dim, Tensor index, Tensor src, str reduce)`

- Add new reduction options in ScatterGatherKernel.cpp and update `scatter_reduce` to call into the cpu kernel for `scatter.reduce`
- `scatter_reduce` now has the same shape constraints as `scatter_` and `scatter_add_`
- Migrate `test/test_torch.py:test_scatter_reduce` to `test/test_scatter_gather_ops.py`

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D35222842

Pulled By: mikaylagawarecki

fbshipit-source-id: 84930add2ad30baf872c495251373313cb7428bd
(cherry picked from commit 1b45139482e22eb0dc8b6aec2a7b25a4b58e31df)
2022-04-01 05:57:45 +00:00
Kurt Mohler
5375b2e994 Resolve int[]? arguments to new OptionalIntArrayRef class
This PR uses the `OptionalArrayRef` template class that was drafted in #64084.

Fixes #44409
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70864
Approved by: https://github.com/ezyang
2022-03-26 01:45:50 +00:00
soulitzer
a4c81b13f3 Add forward AD support for clamp when bounds are tensors
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74042

Approved by: https://github.com/albanD
2022-03-24 14:31:40 +00:00
soulitzer
de73f9a558 Add forward AD support for logsumexp, log_softmax, softmax, nll_loss, and cross_entropy (#73741)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73741

There are probably more perf improvements that can be made, for example reusing more quantities from forward, doing more things inplace, but in the spirit of improving coverage, this is probably OK for now.

Note: I didn't do anything with half_to_float, but CUDA (locally) hasn't complained yet

Test Plan: Imported from OSS

Reviewed By: ejguan

Differential Revision: D34690141

Pulled By: soulitzer

fbshipit-source-id: fe934e191fee2c8e956d7a5f4b553923adf1b33f
(cherry picked from commit ae49aff7f7c8496e04a3ce7667d8f068ca0a52ec)
2022-03-08 00:46:27 +00:00
soulitzer
e6afa4f771 batch_norm_jvp: improve error message when running_{mean,var} have forward grad defined (#73655)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73655

Fixes: https://github.com/pytorch/pytorch/issues/73541

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D34586758

Pulled By: soulitzer

fbshipit-source-id: 689dba3ac159e50b596381c27e23ef1fd8122a40
(cherry picked from commit 81ea860fbe3c217b0100730f4b74e8d5f9bf1b61)
2022-03-02 21:31:29 +00:00
Xiao Wang
89b4cfb49f Disable TF32 in some linalg functions (#73460)
Summary:
Disable TF32 in some linalg functions

See also https://github.com/pytorch/pytorch/issues/67948 #50453 https://github.com/pytorch/pytorch/issues/44240

Pull Request resolved: https://github.com/pytorch/pytorch/pull/73460

Reviewed By: albanD

Differential Revision: D34493487

Pulled By: ngimel

fbshipit-source-id: 958cd968ea09df3b5a4d2b4a26aaf0dfddc53981
(cherry picked from commit cd75ec645b86c4b4a66c35696ce891d006f3833b)
2022-02-28 23:28:52 +00:00
Ansley Ussery
e4214929c5 Port amax to structured kernel (#72124)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72124

Reviewed By: bdhirsh

Differential Revision: D34215708

Pulled By: ansley

fbshipit-source-id: fee887e331cb8bd9fab3d9d958ff13ac8d07be27
(cherry picked from commit 94dbb5b7e7)
2022-02-16 06:33:09 +00:00
Ryan Spring
4f8b986e28 Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: VitalyFedyunin

Differential Revision: D33894937

Pulled By: jbschlosser

fbshipit-source-id: b65e8fb6ea66168af8f34f45ed50e92737a33851
(cherry picked from commit 6e986f91a9)
2022-02-14 03:40:32 +00:00
lezcano
bf09ece782 Make svd / svdvals fully functorch compatible (#72181)
Summary:
This should (hopefully) make all the CI from `functorch` go green (including jvp's!) after changing `VARIADIC_BDIMS_BOXED(_svd_helper);` with `VARIADIC_BDIMS_BOXED(_linalg_svd);` and removing all the skip and xfails associated to `linalg.svdvals`.

Locally, there's just one test that started failing because of this, and that is `test_vmapjvpall_norm_nuc_cpu_float32`. I have no idea what's going on here, but it's a jvp product, so not a regression, and it might very well be caused by the jvp of other operation within `norm_nuc` as this is a composite operation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72181

Reviewed By: ngimel

Differential Revision: D33952744

Pulled By: zou3519

fbshipit-source-id: 2a2510d97eed4a0bfc25615264ddd36e38856efe
(cherry picked from commit 5805fa107c)
2022-02-03 03:21:22 +00:00
Nikita Shulga
74c44ba9d6 Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33850228 (23d03025dc)

Original commit changeset: 3cc33fb298e4

Original Phabricator Diff: D33850228 (23d03025dc)

fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692
(cherry picked from commit c9efb58223)
2022-01-31 17:44:19 +00:00
Ryan Spring
23d03025dc Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: cpuhrsch

Differential Revision: D33850228

Pulled By: jbschlosser

fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33
(cherry picked from commit 3a53b3e94f)
2022-01-31 17:07:45 +00:00
Joel Schlosser
cb823d9f07 Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33744717 (f499ab9cef)

Original commit changeset: d64532a562ed

Original Phabricator Diff: D33744717 (f499ab9cef)

fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93
(cherry picked from commit e9fb2d1db1)
2022-01-28 18:35:01 +00:00
Ryan Spring
f499ab9cef Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds approximate boolean flag to Gelu
3. Enables Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enable Tanh Gelu in NvFuser

```
def gelu(x, approximate : str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: mikaylagawarecki

Differential Revision: D33744717

Pulled By: jbschlosser

fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187
(cherry picked from commit 4713dd9cca)
2022-01-28 16:59:09 +00:00
kshitij12345
de44a50f14 index_backward: use out-of-place index_put if any input is subclass (#71779)
Summary:
Reference: https://github.com/pytorch/functorch/issues/393

Context :

The derivative of `__getitem__`/`index` is
f5a71ec2d6/tools/autograd/derivatives.yaml (L733-L734)

where `index_backward` is defined as
f5a71ec2d6/torch/csrc/autograd/FunctionsManual.cpp (L3892-L3894)

Problem arises when `grad` is not BatchedTensor but one of the other input is. In that case, `grad.new_zeros` returns an unbatched tensor and call to the inplace `_index_put_impl_` errors as it expects `zeros_like_self` to be Batched.

To avoid this, we dispatch to out-of-place `index_put` if any of the input tensor is subclassed otherwise we dispatch to the inplace `_index_put_impl_`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71779

Reviewed By: albanD

Differential Revision: D33790596

Pulled By: zou3519

fbshipit-source-id: 9d6d81b758740cab7b3db9b905f1e8053f82b835
(cherry picked from commit ba0407a86e)
2022-01-28 16:19:34 +00:00
soulitzer
51ae9ccba4 Fix forward AD for cudnn batch norm (#71901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71901

We didn't catch this initially because CuDNN is not being tested on CI.

The following tests fail on master (if we build with CuDNN), but pass with this PR:
- `test_forward_mode_AD_nn_functional_batch_norm_cuda_float64`
- `test_forward_mode_AD_nn_functional_instance_norm_cuda_float64`

I don't think it is documented anywhere, but from the tests passing now I'm going to guess `result1` and `result2` return `mean` and `invstd` respectively. Previously, I thought mean and variance were returned because the variables were named `saved_mean` and `saved_var`.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33818652

Pulled By: soulitzer

fbshipit-source-id: ecee760f5aec620dc70f57de4fb3573c8f2f5f31
(cherry picked from commit 73fd3e021c)
2022-01-27 23:55:37 +00:00
lezcano
8ff1a8fdca Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of svd_backward.
More importantly, I also added a lax check in the complex case making sure the loss
function just depends on the subspaces spanned by the pairs of singular
vectors, and not their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
(cherry picked from commit 391319ed8f)
2022-01-27 18:38:30 +00:00
lezcano
84f1685397 Rewrite svd and linalg.svd as structured kernels (#69827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827

In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).

After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.

This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.

The memory for auxiliary `U` and `V` in the cases above, needed for some
cuSOLVER routines is allocated raw allocators rather than through fully
fledged tensors, as it's just a blob of memory the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.

Now `linalg.svdvals` work as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype and the test seemed to
pass...

This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.

This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which were incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.

I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751983

Pulled By: mruberry

fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
(cherry picked from commit a1860bd567)
2022-01-27 18:38:30 +00:00
Mikayla Gawarecki
09c417ae65 Add new reduce options and autograd support for scatter_reduce (#71788)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71788

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33778525

Pulled By: cpuhrsch

fbshipit-source-id: 47b8544e29df3075bc6ede894c59499a7ffec876
(cherry picked from commit ddcddac726)
2022-01-27 17:38:50 +00:00
soulitzer
25e84fa4e5 Add forward AD formulas for some losses (#71026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71026

...and fmod

Testing:
- L1Loss: new module tests (linear in the real case only)
- SmoothL1Loss: new module tests
- MSELoss: tested - OpInfo + new module tests
- huberloss: tested - OpInfo + new module tests
- multi-margin-loss: new module tests
- kl-div: OpInfo + new module tests
- fmod: OpInfo

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33485661

Pulled By: soulitzer

fbshipit-source-id: 542ef5148183b9f574d06b2e2e345d0d889537b7
(cherry picked from commit 60765438e8)
2022-01-26 16:31:26 +00:00
lezcano
97585ae1e7 Simplify forward / backward AD for linalg.eigh and add checks (#70528)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70528

This PR adds checks for the backward of `linalg.eigh`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also makes its the implementation parallel that of the (fwd/bwd) derivative of
`torch.linalg.eig` and it makes most OpInfo tests pass.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530149

Pulled By: albanD

fbshipit-source-id: 1f368b8d450d4e9e8ae74d3881c78513c27eb956
2022-01-12 08:35:52 -08:00
lezcano
061be8d600 Correct forward AD for linalg.eig and add checks (#70527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70527

This PR adds checks for the backward of `linalg.eig`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also modifies the function so that it does not save the input matrix,
as it's not necessary.

It also corrects the forward AD formula for it to be correct. Now all
the tests pass for `linalg.eig` and `linalg.eigvals`.

It also updates the docs to reflect better what's going on here.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530148

Pulled By: albanD

fbshipit-source-id: 984521a04f81ecb28ac1c4402b0243c63dd6959d
2022-01-12 08:30:55 -08:00
soulitzer
78994d13c0 Add forward AD formulas for {batch,layer,group}_norm (#70355)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70355

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405362

Pulled By: soulitzer

fbshipit-source-id: 55a92e88a04e7b15a0a223025d66c14f7db2a190
2022-01-10 13:52:16 -08:00
soulitzer
3051aabd0e Add forward AD formulas for convolution and some others (#69956)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69956

Test Plan: Imported from OSS

Reviewed By: albanD, bdhirsh

Differential Revision: D33235974

Pulled By: soulitzer

fbshipit-source-id: ea60d687edc5d62d92f3fd3cb6640421d32c908c
2022-01-06 08:39:51 -08:00
Amir Khojaste
748790588c Upgrading the loop to use irange (#70326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326

See D24145988 for context: it allows loops such as for(int i=0;i<10;i++) to be expressed as for(const auto i : c10::irange(10)). This is nice because it auto-types the loops and adds const-safety to the iteration variable.

Test Plan: buck run //caffe2/torch/fb/sparsenn:test

Reviewed By: r-barnes

Differential Revision: D33243400

fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
2022-01-06 07:06:53 -08:00
lezcano
a35b4b49d2 Add linalg.lu_factor (#66933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933

This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.

This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.

We add a test and an OpInfo for the new function.

This PR also adds documentation for this new function in line of
the documentation in the rest of `torch.linalg`.

Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32834069

Pulled By: mruberry

fbshipit-source-id: 51ef12535fa91d292f419acf83b800b86ee9c7eb
2022-01-05 20:32:12 -08:00
Richard Zou
29f1ccc8f0 Fix some Composite Compliance problems with binary_cross_entropy backward (#70198)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70198

This PR fixes composite compliance problems with:
- binary_cross_entropy's backward formula
- binary_cross_entropy_with_logits's backward formula
- binary_cross_entropy's double backward formula

It does so by adding checks for areAnyTensorSubclassLike.

Test Plan:
- I tested everything with functorch.
- We are going to do https://github.com/pytorch/pytorch/issues/69530 in
the future so we have a way of testing this in core. I need the
binary_cross_entropy ones for something right now and didn't want to
wait until we come up with a solution for #69530.

Reviewed By: Chillee

Differential Revision: D33246995

Pulled By: zou3519

fbshipit-source-id: 310ed3196b937d01b189870b86a6c5f77f9258b4
2021-12-22 07:24:04 -08:00
Joel Schlosser
4d5dd00e61 Remove backward ops for cuDNN transposed convolution (#69902)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69902

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33093795

Pulled By: jbschlosser

fbshipit-source-id: 8b90150bd1996e48c0c888bdab4e95a849d10ef5
2021-12-15 17:48:25 -08:00
Joel Schlosser
3dc3651e0e Remove backward ops for cuDNN convolution (#69901)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69901

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33093796

Pulled By: jbschlosser

fbshipit-source-id: f5beab6f3078144b6c8e5c4c51d69823815a9f99
2021-12-15 17:46:49 -08:00
soulitzer
b399a4d7b9 Add some reduction forward AD formulas (#69661)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69661

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020601

Pulled By: soulitzer

fbshipit-source-id: 110da6dcd490e5c3849cace62a777aa1a2b6982e
2021-12-14 23:33:43 -08:00
Richard Zou
41e1ab0785 Introduce isTensorSubclassLike; add special cases to backwards formulas (#69534)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69534

Something is TensorSubclassLike if it is a Tensor subclass or if it has
the same problems as Tensor subclasses. Today that just includes Tensor
Subclasses and meta tensors but may include other things in the future.

Some of our backwards formulas are incompatible with TensorSubclassLike
objects. For example, calling .data_ptr() is a problem because many
TensorSubclassLike objects don't have storage. Another problem is
in-place operations: performing `regular_tensor.inplace_(tensor_subclass)`
is a problem.

This PR adds special cases to the backward formulas for torch.max and
torch.clamp to handle this. The backward formulas for torch.max and
torch.clamp are not dispatcher operations so they cannot be overridden
and we hesitate to make them dispatcher operations for FC/BC concerns
and performance overhead concerns.

Furthermore, the old concept of "is this inplace operation vmap
compatible?" can be subsumed by the general "is this inplace operation
tensor-subclass compatible" question, so I replaced all instances of
isInplaceVmapCompatible and replaced it with the isTensorSubclassLike
checks.

Test Plan
- I tested the changes using functorch.
- It's possible to write a test for these in core (one has to make
a custom tensor subclass and then send it through the operation and then
invoke autograd), but I wanted to push the work to doing some
generic testing for backward formulas
(https://github.com/pytorch/pytorch/issues/69530) instead of doing some
one-off things now.

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32967727

Pulled By: zou3519

fbshipit-source-id: 30fda1a7581da4c55179b7a3ca05069150bbe2dc
2021-12-09 15:03:22 -08:00
lezcano
cafcf599d0 Deprecate torch.triangular_solve (#63570)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63570

There is a use of `at::triangular_solve_out` in the file
`torch/csrc/jit/tensorexpr/external_functions.cpp` that I have not dared
to move to `at::linalg_solve_triangular_out`.

**Deprecation note:**

This PR deprecates the `torch.triangular_solve` function in favor of
`torch.linalg.solve_triangular`. An upgrade guide is added to the
documentation for `torch.triangular_solve`.

Note that it DOES NOT remove `torch.triangular_solve`, but
`torch.triangular_solve` will be removed in a future PyTorch release.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D32618035

Pulled By: anjali411

fbshipit-source-id: 0bfb48eeb6d96eff3e96e8a14818268cceb93c83
2021-12-02 13:24:55 -08:00
lezcano
f9e69af22e Modify LU_backward and lu_solve_backward to use linalg_solve_triangular (#63569)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63569

This PR also rewrites `lu_solve_backward` from scratch going from
solving 5 systems of equations to just 2.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D32618014

Pulled By: anjali411

fbshipit-source-id: 0e915bcf7045a4db43ffd076d807beac816c8538
2021-12-01 07:34:38 -08:00
Mike Ruberry
6ae34ea6f8 Revert D32521980: Add linalg.lu_factor
Test Plan: revert-hammer

Differential Revision:
D32521980 (b10929a14a)

Original commit changeset: 26a49ebd87f8

fbshipit-source-id: e1a6bb9c2ece9bd78190fe17e16a46e3358c5c82
2021-11-28 17:22:15 -08:00
lezcano
b10929a14a Add linalg.lu_factor (#66933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933

This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.

This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.

We add a test and an OpInfo for the new function.

This PR also adds documentation for this new function in line of
the documentation in the rest of `torch.linalg`.

Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32521980

Pulled By: mruberry

fbshipit-source-id: 26a49ebd87f8a41472f8cd4e9de4ddfb7f5581fb
2021-11-27 17:52:48 -08:00
lezcano
b46c89d950 Add linalg.solve_triangular (#63568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568

This PR adds the first solver with structure to `linalg`. This solver
has an API compatible with that of `linalg.solve` preparing these for a
possible future merge of the APIs. The new API:
- Just returns the solution, rather than the solution and a copy of `A`
- Removes the confusing `transpose` argument and replaces it by a
correct handling of conj and strides within the call
- Adds a `left=True` kwarg. This can be achieved via transposes of the
inputs and the result, but it's exposed for convenience.

This PR also implements a dataflow that minimises the number of copies
needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the
conjugate and neg bits.

This algorithm is implemented for `solve_triangular` (which, for this, is
the most complex of all the solvers due to the `upper` parameters).
Once more solvers are added, we will factor out this calling algorithm,
so that all of them can take advantage of it.

Given the complexity of this algorithm, we implement some thorough
testing. We also added tests for all the backends, which was not done
before.

We also add forward AD support for `linalg.solve_triangular` and improve the
docs of `linalg.solve_triangular`. We also fix a few issues with those of
`torch.triangular_solve`.

Resolves https://github.com/pytorch/pytorch/issues/54258
Resolves https://github.com/pytorch/pytorch/issues/56327
Resolves https://github.com/pytorch/pytorch/issues/45734

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32588230

Pulled By: mruberry

fbshipit-source-id: 69e484849deb9ad7bb992cc97905df29c8915910
2021-11-22 12:41:06 -08:00
soulitzer
7bb401a4c9 Add forward AD support for miscellanous operators (#67820)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67820

Original PR here: https://github.com/pytorch/pytorch/pull/67040

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32314423

Pulled By: soulitzer

fbshipit-source-id: ecd898dc903692cab084f6922a1d86986f957b1b
2021-11-19 14:31:06 -08:00
jiej
ca92111758 Add native_dropout (#63937)
Summary:
Adds native_dropout to have a reasonable target for torchscript in auto diff. native_dropout has scale and train as arguments in its signature, this makes native_dropout more consistent with other operators and removes conditionals in the autodiff definition.

cc gmagogsfm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63937

Reviewed By: mruberry

Differential Revision: D32477657

Pulled By: ngimel

fbshipit-source-id: d37b137a37acafa50990f60c77f5cea2818454e4
2021-11-18 19:41:10 -08:00
Jane Xu
9f4e004abd Revert D32283178: Add linalg.solve_triangular
Test Plan: revert-hammer

Differential Revision:
D32283178 (0706607abc)

Original commit changeset: deb672e6e52f

fbshipit-source-id: d2a3421292147426cc61c2f063b721acf9004755
2021-11-18 14:46:10 -08:00
lezcano
0706607abc Add linalg.solve_triangular (#63568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568

This PR adds the first solver with structure to `linalg`. This solver
has an API compatible with that of `linalg.solve` preparing these for a
possible future merge of the APIs. The new API:
- Just returns the solution, rather than the solution and a copy of `A`
- Removes the confusing `transpose` argument and replaces it by a
correct handling of conj and strides within the call
- Adds a `left=True` kwarg. This can be achieved via transposes of the
inputs and the result, but it's exposed for convenience.

This PR also implements a dataflow that minimises the number of copies
needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the
conjugate and neg bits.

This algorithm is implemented for `solve_triangular` (which, for this, is
the most complex of all the solvers due to the `upper` parameters).
Once more solvers are added, we will factor out this calling algorithm,
so that all of them can take advantage of it.

Given the complexity of this algorithm, we implement some thorough
testing. We also added tests for all the backends, which was not done
before.

We also add forward AD support for `linalg.solve_triangular` and improve the
docs of `linalg.solve_triangular`. We also fix a few issues with those of
`torch.triangular_solve`.

Resolves https://github.com/pytorch/pytorch/issues/54258
Resolves https://github.com/pytorch/pytorch/issues/56327
Resolves https://github.com/pytorch/pytorch/issues/45734

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: zou3519, JacobSzwejbka

Differential Revision: D32283178

Pulled By: mruberry

fbshipit-source-id: deb672e6e52f58b76536ab4158073927a35e43a8
2021-11-18 09:45:51 -08:00
Nikita Vedeneev
857fed1f42 torch.linalg.qr: forward AD support (#67268)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67268

Reviewed By: ngimel

Differential Revision: D31960517

Pulled By: albanD

fbshipit-source-id: bfd1028a8d352f550efb420f9ca609c09f4a7484
2021-11-18 08:11:54 -08:00
Matthias Reis
4c346bd073 Added forward derivatives for neg, diag, inverse, linalg_eig (#67837)
Summary:
Recreated due to CI failures as per comment https://github.com/pytorch/pytorch/pull/67339#issuecomment-959893293

===

See also discussion in https://github.com/pytorch/pytorch/issues/10223, starting from [this](https://github.com/pytorch/pytorch/issues/10223#issuecomment-949499666) comment

The formulas for the derivatives are taken from https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf.

As indicated, the method linalg_eig_jvp should be used instead of linalg_eig_jvp_eigenvalues and linalg_eig_jvp_eigenvectors in the future. Due to a codegen limitation, this is not yet possible.

CC albanD Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67837

Reviewed By: mrshenli

Differential Revision: D32403662

Pulled By: soulitzer

fbshipit-source-id: 529cb93f865ce4cc2e24fa6f672d4234e7abe2b1
2021-11-16 20:32:47 -08:00
Masaki Kozuki
c5e5264be2 Disable TF32 in pinv_jvp and pinv_backward (#67948)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67947

cc ptrblck xwang233 zasdfgbnm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67948

Reviewed By: H-Huang

Differential Revision: D32251934

Pulled By: ngimel

fbshipit-source-id: a2b1a118337b38db61350c9e49f1ba19030d70ec
2021-11-08 22:33:29 -08:00
Natalia Gimelshein
98be5216e2 Revert D32104006: [pytorch][PR] Added forward derivatives for neg, diag, inverse, linalg_eig
Test Plan: revert-hammer

Differential Revision:
D32104006 (88c61b8d06)

Original commit changeset: 1f6ace09ee3e

fbshipit-source-id: f9f950b4177e1fe29b9059f4b5dfb9c8c67f479a
2021-11-03 12:40:00 -07:00
Matthias Reis
88c61b8d06 Added forward derivatives for neg, diag, inverse, linalg_eig (#67339)
Summary:
See also discussion in https://github.com/pytorch/pytorch/issues/10223, starting from [this](https://github.com/pytorch/pytorch/issues/10223#issuecomment-949499666) comment

The formulas for the derivatives are taken from https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf.

As indicated, the method linalg_eig_jvp should be used instead of linalg_eig_jvp_eigenvalues and linalg_eig_jvp_eigenvectors in the future. Due to a codegen limitation, this is not yet possible.

CC albanD Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67339

Reviewed By: ejguan

Differential Revision: D32104006

Pulled By: albanD

fbshipit-source-id: 1f6ace09ee3e737b99520543b30550601809ceb5
2021-11-03 11:21:54 -07:00
Nikita Vedeneev
3c61700cf7 torch.linalg.householder_product: forward AD support (#67043)
Summary:
As per title.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67043

Reviewed By: VitalyFedyunin

Differential Revision: D31897617

Pulled By: albanD

fbshipit-source-id: ef135fe3d9e5b9b2a541c355017f07cdb1309979
2021-10-26 08:34:00 -07:00
lezcano
d3fc3c4ded Implement forward AD for linalg.matrix_exp (#62716)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62716

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D31823231

Pulled By: mruberry

fbshipit-source-id: 6d19b8988dce773b5716f0522d06febfe167fead
2021-10-21 23:55:36 -07:00
lezcano
0974215c4d Prefer mT and mH over transpose(-2, -1) and transpose(-2, -1).conj() (#64181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64181

This PR replaces all the calls to:
- `transpose(-2, -1)` or `transpose(-1, -2)` by `mT()` in C++ and `mT` in Python
- `conj().transpose(-2, -1)` or `transpose(-2, -1).conj()` or `conj().transpose(-1, -2)` or `transpose(-1, -2).conj()` by `mH()` in C++ and `mH` in Python.

It also simplifies two pieces of code, and fixes one bug where a pair
of parentheses were missing in the function `make_symmetric_matrices`.

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D31692896

Pulled By: anjali411

fbshipit-source-id: e9112c42343663d442dc5bd53ff2b492094b434a
2021-10-18 13:02:25 -07:00
Nikita Vedeneev
7fad47e522 torch.linalg.lstsq: forward/backward AD support (#65054)
Summary:
As per title.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65054

Reviewed By: zou3519

Differential Revision: D31729468

Pulled By: albanD

fbshipit-source-id: ab7df824bc80128e7f64f6444c7a4baa4786c161
2021-10-18 11:28:44 -07:00
Nikita Vedeneev
06c37876b8 torch.linalg.householder_product faster backward (#63880)
Summary:
This PR implements a much more efficient algorithm. This algorithm allows to achieve MASSIVE speed-ups, especially for batched and/or larger double-precision inputs.
Here are some benchmarks:

<details>

<summary>Testing script</summary>

```python
from IPython import get_ipython
import torch
import itertools

torch.manual_seed(13)
#torch.set_num_threads(1)

ipython = get_ipython()

cpu = torch.device('cpu')
cuda = torch.device('cuda')

def generate_input(shape, dtype=torch.double, device=cpu):
    eigvals = torch.rand(*shape[:-1], dtype=dtype, device=device)
    eigvecs = torch.rand(*shape, dtype=dtype, device=device)
    input = (eigvecs * eigvals.unsqueeze(-2)) @ eigvecs.inverse()
    input.requires_grad_(True)
    tau = torch.rand(*shape[:-1], dtype=dtype, device=device)
    tau.requires_grad_(True)
    return input, tau

def run_test(shape, device, dtype):
    print(f"shape: {shape}, device: {device}, dtype: {dtype}")
    a, tau = generate_input(shape, dtype=dtype, device=device)
    prod = torch.linalg.householder_product(a, tau)
    ones_prod = torch.ones_like(prod)

    command = "torch.autograd.backward((prod,), (ones_prod), retain_graph=True)"
    if device == cuda:
        command = command + "; torch.cuda.synchronize()"
    ipython.magic(f"timeit {command}")
    print()

dtypes = [torch.float, torch.double]
devices = [cpu, cuda]
#devices = [cuda]
sizes = [
    (10, 10),
    (1000, 10, 10),
    (100, 100),
    (1000, 100, 100),
    (1000, 1000),
    (10, 1000, 1000),
]

for device, dtype, size in itertools.product(devices, dtypes, sizes):
    run_test(size, device, dtype)

```

</details>

<details>

<summary>This PR, cuda float32</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float32
1.33 ms ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float32
1.52 ms ± 40.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (100, 100), device: cuda, dtype: torch.float32
10.8 ms ± 9.62 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float32
127 ms ± 8.45 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 1000), device: cuda, dtype: torch.float32
151 ms ± 127 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float32
981 ms ± 91.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

```

</details>

<details>

<summary>Master, cuda float32</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float32
1.64 ms ± 6.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float32
298 ms ± 463 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (100, 100), device: cuda, dtype: torch.float32
15.4 ms ± 41.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float32
5.36 s ± 711 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float32
1.64 s ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float32
15.7 s ± 43.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```

</details>

<details>

<summary>This PR, cuda float64</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float64
1.14 ms ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float64
2.22 ms ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cuda, dtype: torch.float64
10.6 ms ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float64
287 ms ± 84.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float64
236 ms ± 41.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float64
1.88 s ± 88.3 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cuda float64</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float64
1.58 ms ± 8.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float64
308 ms ± 213 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (100, 100), device: cuda, dtype: torch.float64
79 ms ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float64
54.2 s ± 1.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float64
31.5 s ± 698 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float64
4min 45s ± 2.48 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>This PR, cpu float32</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float32
476 µs ± 21.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float32
5.1 ms ± 100 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float32
4.38 ms ± 4.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float32
1.55 s ± 6.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float32
745 ms ± 407 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float32
5.44 s ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cpu float32</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float32
387 µs ± 645 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float32
12.3 ms ± 23.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float32
39.4 ms ± 80.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float32
29.1 s ± 44.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float32
9.42 s ± 14.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float32
1min 50s ± 282 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>This PR, cpu float64</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float64
381 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float64
6.19 ms ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float64
4.6 ms ± 3.26 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float64
2.59 s ± 8.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float64
1.07 s ± 5.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float64
14.4 s ± 13.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cpu float64</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float64
395 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float64
14.6 ms ± 9.76 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float64
45.5 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float64
33.1 s ± 69.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float64
19.3 s ± 80.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float64
3min 30s ± 1.29 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63880

Reviewed By: soulitzer

Differential Revision: D30639435

Pulled By: anjali411

fbshipit-source-id: 127789943ae56e2f1dd03e0fe76ef7b6db86bcf0
2021-10-15 09:54:30 -07:00
Peter Bell
5f45927d15 Autograd: Delay warnings until the end of backward execution (#66235)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50209

This adds a new warning handler that stores all warnings in a shared
queue, which can be "replayed" at a later time and, crucially, on
another thread. Then, I use this inside the autograd engine to ensure
that warnings are processed by the handler registered on the main
thread.

For testing, I also add an operator that always warns in the backward
pass and test that the warning is a normal Python warning.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66235

Reviewed By: ejguan

Differential Revision: D31505413

Pulled By: albanD

fbshipit-source-id: 1a7f60b038f55c20591c0748b9e86735b3fec2f9
2021-10-13 15:38:04 -07:00
Nikita Vedeneev
1b40daac74 pinv: forward/backward AD which is Frechet-defined in a rank-preserving neighborhood. (#66092)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/65911. Also enables complex support/tests for `linalg_pinv` in OpInfo.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66092

Reviewed By: ejguan

Differential Revision: D31503072

Pulled By: albanD

fbshipit-source-id: 52018e826826ae62beaad76becb5edf880be253f
2021-10-11 08:33:28 -07:00
Nikita Vedeneev
1d586e78c6 *_solve methods: implements forward AD (#65546)
Summary:
This PR adds forward AD for `*_solve` methods.
Additionally, `cholesky_solve` gets OpInfo + a bug fix when wrong leading dimensions could be passed to LAPACK,
and `lu_solve` gets forward AD with 2x`lu_solve` instead of 1x`lu_solve` + 2x`triangular_solve`.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65546

Reviewed By: dagitses

Differential Revision: D31431847

Pulled By: albanD

fbshipit-source-id: 0e343e0d9da3c3d2051fca215fad289d77275251
2021-10-06 16:04:22 -07:00
soulitzer
4cdfceddd2 [Reland] Avoid saving self for softmax and log_softmax (#66018)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/65242

The last attempt of the reland automatically rebased onto stable, which did not yet have the revert commit

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66018

Reviewed By: albanD

Differential Revision: D31348822

Pulled By: soulitzer

fbshipit-source-id: 881d701b404530c1352ac9245bd67264e1652b8a
2021-10-03 21:35:01 -07:00
Michael Suo
9ae63bd87c Revert D31238123: [pytorch][PR] Avoid saving self forsoftmax and log_softmax
Test Plan: revert-hammer

Differential Revision:
D31238123 (fb412bdd80)

Original commit changeset: afd319d3676d

fbshipit-source-id: b7980d653a4b8322a225f1dd08c2857ecbe5bc94
2021-09-30 11:34:14 -07:00
soulitzer
fb412bdd80 Avoid saving self forsoftmax and log_softmax (#65242)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64000
 - updates double backward formula to compute grad wrt output instead of self
 - ~~In some of the error messages, we still refer to the dtype of the input, even though we are now checking the dtype of the output~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65242

Reviewed By: albanD

Differential Revision: D31238123

Pulled By: soulitzer

fbshipit-source-id: afd319d3676d9ef8d81607e0e8c2a3e6d09f68e4
2021-09-29 18:16:12 -07:00
Mike Ruberry
0a0564a347 Revert D31206837: [pytorch][PR] *_solve methods: implements forward AD
Test Plan: revert-hammer

Differential Revision:
D31206837 (26e31f76b0)

Original commit changeset: 040beda97442

fbshipit-source-id: f28091327357af9f54f367eda6606240924b93ac
2021-09-28 23:31:16 -07:00
Nikita Vedeneev
26e31f76b0 *_solve methods: implements forward AD (#65546)
Summary:
This PR adds forward AD for `*_solve` methods.
Additionally, `cholesky_solve` gets OpInfo + a bug fix when wrong leading dimensions could be passed to LAPACK,
and `lu_solve` gets forward AD with 2x`lu_solve` instead of 1x`lu_solve` + 2x`triangular_solve`.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65546

Reviewed By: gchanan

Differential Revision: D31206837

Pulled By: albanD

fbshipit-source-id: 040beda97442e7a88a9df9abc7bb18313ce55bc3
2021-09-28 06:51:32 -07:00
Ivan Yashchuk
0aef44cb3d Add forward AD for torch.linalg.eigh (#62163)
Summary:
This PR adds forward mode differentiation for `torch.linalg.eigh` and a few other functions required for tests to pass.

For some reason running tests for `torch.linalg.eigvalsh` and complex `torch.linalg.eigh` hangs. These tests are skipped for now.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry heitorschueroff walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62163

Reviewed By: jbschlosser

Differential Revision: D30903988

Pulled By: albanD

fbshipit-source-id: d6a74adb9e6d2f4be8ac707848ecabf06d629823
2021-09-13 21:15:38 -07:00
Nikita Vedeneev
88fff22023 torch.lu: forward AD support (#64742)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64742

Reviewed By: H-Huang

Differential Revision: D30841227

Pulled By: albanD

fbshipit-source-id: dc4d043ab94358594adb110fbbbb60750c98262a
2021-09-10 07:19:11 -07:00
Nikita Vedeneev
dc53546655 torch.lu_solve: forward AD support (#64646)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64646

Reviewed By: VitalyFedyunin

Differential Revision: D30807898

Pulled By: albanD

fbshipit-source-id: 1f943c22357dd1b3662cfe0d2a26af68e3a2df4c
2021-09-09 08:58:00 -07:00
Ivan Yashchuk
dd8f6ac597 Add forward mode differentiation for torch.linalg.cholesky and transpose (#62159)
Summary:
This PR adds forward mode differentiation for `torch.linalg.cholesky`, `torch.linalg.cholesky_ex`, and `transpose` functions.
Complex tests for Cholesky fail because for some reason the gradcheck sends matrices full of zeros to `cholesky_jvp` function.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry heitorschueroff walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62159

Reviewed By: mrshenli

Differential Revision: D30776829

Pulled By: albanD

fbshipit-source-id: 32e5539ed6423eed8c18cce16271330ab0ea8d5e
2021-09-08 09:44:30 -07:00
soulitzer
92a154aa29 Move variabletype functions around (#63330)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63330

 - This is in preparation for templated/boxed autograd-not-implemented fallback
 - Make sure VariableTypeUtils does not depend on generated code
 - Lift `isFwGradDefined` into `autograd/functions/utils.cpp` so it's available to mobile builds
 - Removes `using namespace at` from VariableTypeUtils, previously we needed this for Templated version, but now its not strictly necessary but still a good change to avoid name conflicts if this header is included elsewhere in the future.

Test Plan: Imported from OSS

Reviewed By: heitorschueroff

Differential Revision: D30518573

Pulled By: soulitzer

fbshipit-source-id: a0fb904baafc9713de609fffec4b813f6cfcc000
2021-08-26 16:02:39 -07:00
Nikita Vedeneev
dbcfd7739f Make torch.lu differentiable for wide/tall inputs + jit (#61564)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61564

Reviewed By: astaff

Differential Revision: D30338136

Pulled By: mruberry

fbshipit-source-id: f01436fc90980544cdfa270feee16bb3dda21b93
2021-08-16 11:40:57 -07:00
Nikita Vedeneev
741accb11e Implements backward for torch.lu_solve (#61681)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/22620

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61681

Reviewed By: ngimel

Differential Revision: D30063116

Pulled By: mruberry

fbshipit-source-id: e095b0cadfb7c8b37a7ef91bae5b5dc170d8ef1c
2021-08-12 21:17:11 -07:00
jiej
ed0b8a3e83 LayerNorm Support in autodiff: (#50467)
Summary:
1. extend autodiff by adding entry for layer_norm in symbolic script, we now use native_layer_norm_backward
2. added backward function `layernorm_double_backward` for `native_layer_norm_backward`, preserves double backward support for LayerNorm in autodiff/ScriptModule
3. added python test to verify autodiff on layer_norm with various configuration of optional tensors; (verify the fix in https://github.com/pytorch/pytorch/issues/49430)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/50467

Reviewed By: eellison

Differential Revision: D30232864

Pulled By: jansel

fbshipit-source-id: b9c33075386aff96afff7415df9f94388bfb474a

Co-authored-by: Ryan Spring <rspring@nvidia.com>
Co-authored-by: Jie <jiej@nvidia.com>
2021-08-12 11:05:53 -07:00
Nikita Shulga
30214aef2d [BE] irangefy (#62928)
Summary:
Replace for loop with for `irange` loop. Also fix some unused variable warnings in range loop cases

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62928

Reviewed By: driazati

Differential Revision: D30171904

Pulled By: malfet

fbshipit-source-id: 1b437a0f7e3515f4a2e324f3450e93312f1933ae
2021-08-07 13:34:13 -07:00
Nikita Vedeneev
8e35df0bf3 det_backward: return svd path for double backward (so that all ci tests pass) (#62570)
Summary:
Potentially fixes https://github.com/pytorch/pytorch/issues/62327 and fixes https://github.com/pytorch/pytorch/issues/62328.
This PR replaces the double backward of det from eig to svd. The latter is slower but should be more stable.

CC anjali411

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62570

Reviewed By: pbelevich

Differential Revision: D30072876

Pulled By: anjali411

fbshipit-source-id: c91b507dbfd6a3ec47dc6d0b0dcfa5f8c8228c30
2021-08-04 13:43:51 -07:00
Nikita Vedeneev
d7ddae8e4f det_backward: correct, more robust and with complex support [clone] (#61905)
Summary:
Clone of https://github.com/pytorch/pytorch/pull/58195 to ease the import. Done by request from anjali411

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61905

Reviewed By: albanD

Differential Revision: D29937920

Pulled By: anjali411

fbshipit-source-id: 025892a8e6147790825b20458986730ad8c5bb0f
2021-07-27 10:08:26 -07:00
Ivan Yashchuk
3cd12448b4 Add forward mode differentiation for inverse and solve (#62160)
Summary:
This PR adds forward mode differentiation for `torch.linalg.inv`, `torch.linalg.inv_ex`, and `torch.linalg.solve` functions.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62160

Reviewed By: mruberry

Differential Revision: D29917213

Pulled By: albanD

fbshipit-source-id: b08bbc830f77f342cc7ca5b823d7ea4380f2aaa8
2021-07-27 07:51:22 -07:00
Nikita Shulga
a9b0a921d5 Disable avoid-non-const-global-variables lint check (#62008)
Summary:
As GoogleTest `TEST` macro is non-compliant with it as well as `DEFINE_DISPATCH`

All changes but the ones to `.clang-tidy` are generated using following script:
```
for i in `find . -type f -iname "*.c*" -or -iname "*.h"|xargs grep cppcoreguidelines-avoid-non-const-global-variables|cut -f1 -d:|sort|uniq`;  do sed -i "/\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-non-const-global-variables)/d" $i; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62008

Reviewed By: driazati, r-barnes

Differential Revision: D29838584

Pulled By: malfet

fbshipit-source-id: 1b2f8602c945bd4ce50a9bfdd204755556e31d13
2021-07-22 18:04:40 -07:00
Mike Ruberry
1ce3281a6d Revert D29361872: [pytorch][PR] det_backward: more robust and with complex support
Test Plan: revert-hammer

Differential Revision:
D29361872 (fce85480b9)

Original commit changeset: b1f0fec7e3ac

fbshipit-source-id: feffa74ad65b0b294e0a9b0ee72d245393421f70
2021-07-15 15:26:00 -07:00
Nikita Vedeneev
fce85480b9 det_backward: more robust and with complex support (#58195)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58195

Reviewed By: albanD

Differential Revision: D29361872

Pulled By: anjali411

fbshipit-source-id: b1f0fec7e3ac52acd1481bcc878cc0c1d07c1852
2021-07-15 11:04:42 -07:00
Anjali Chourdia
30e48bbeae Add neg bit (#56058)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56058

User facing changes:
1. Adds a negative bit and corresponding new API (`is_neg()`,`resolve_neg()`)
2. `tensor.conj().imag` now returns a floating point tensor with neg bit set to 1 instead of a tensor with no notion of negative bit. Note that imag is still a view and all the view properties still hold for imag.

Non user facing changes:
1. Added a new Negative dispatch key and a backend fallback to handle it
2. Updated copy kernel to handle negative bit
3. Merged conjugate and negative bit fallback kernel
4. fixed https://github.com/pytorch/pytorch/issues/60478 (caused due to https://github.com/pytorch/pytorch/pull/54987)

Testing:
1. Added a new OpInfo based test `test_neg_view` (verifies that out-of-place and in-place operations work correctly for all operations when the input is a neg view tensor by checking the result against an actually negated tensor, verifies that autograd returns the same output for both neg view and actually negated tensors as well as it works fine when grad_out is a neg view).
2. Added a new test class containing `test_conj_view`, `test_neg_view`.

Test Plan: Imported from OSS

Reviewed By: soulitzer

Differential Revision: D29636403

fbshipit-source-id: 12214c9dc4806c51850f4a72a109db9527c0ca63
2021-07-13 13:50:42 -07:00
albanD
056a8e0d5c Remove un-used parameter in _trilinear backward (#60673)
Summary:
This argument is only important for speed and memory usage. So it is ok to ignore it during the backward.
As discussed, we might want to change this to speed up backward in the future.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60673

Reviewed By: soulitzer

Differential Revision: D29370125

Pulled By: albanD

fbshipit-source-id: ad50b3ea530aeb194f5a51845523b517a50f2c71
2021-06-25 17:47:10 -07:00
lezcano
dfc8247d33 Faster cumsum and cumprod backwards (#60642)
Summary:
Piggybacking on https://github.com/pytorch/pytorch/pull/58747, now we can implement the backwards of `cumsum` and `cumprod` without tricks. This minimises the number of kernels that are launched in GPU, so we see a reasonable speed-up on GPU. We should also get a better stability for ill-conditioned inputs, as we do not perform any numerical tricks to get the result.

Note that the benchmarks test forward + backward, so the true speed-up on the backward should be even faster. Even more so in `cumsum`, as it requires less operations than the backward of `cumprod`.

<details>
<summary>
Test Script
</summary>

```python
from itertools import product

import torch
from torch.utils.benchmark import Compare, Timer

def get_timer(ndims, prod_dim, dim, num_threads, device):
    size = [500]*ndims
    size[dim] = prod_dim

    x = torch.rand(*size, device=device, requires_grad=True)
    # Make sure there are no zeros as the formula for the backward
    # that we are testing is for when the backward has no zeros
    with torch.no_grad():
        x.add_(1e-3)
    grad = torch.ones_like(x)

    timer = Timer(
        "torch.autograd.grad([x.cumprod(dim)], [x], grad_outputs=[grad])",
        globals={"x": x, "dim": dim, "grad": grad},
        label=f"Cumprod + Backwards {device}",
        description=f"dim: {dim}",
        sub_label=f"prod_dim: {prod_dim}",
        num_threads=num_threads,
    )

    return timer.blocked_autorange(min_run_time=5)

def get_params():
    ndims = 3
    dims = range(ndims)
    prod_dims = [10, 100, 500]
    for dim, prod_dim, device in product(dims, prod_dims, ("cpu", "cuda")):
        threads = (1, 2, 4) if device == "cpu" else (1,)
        for num_threads in threads:
            yield ndims, prod_dim, dim, num_threads, device

compare = Compare([get_timer(*params) for params in get_params()])
compare.trim_significant_figures()
compare.print()
```

</details>

<details>
<summary>
Benchmark PR
</summary>

```
[------------ Cumprod + Backwards cpu -------------]
                     |  dim: 0  |  dim: 1  |  dim: 2
1 threads: -----------------------------------------
      prod_dim: 10   |     11   |     14   |     12
      prod_dim: 100  |    260   |    270   |    260
      prod_dim: 500  |   1400   |   1550   |   1360
2 threads: -----------------------------------------
      prod_dim: 10   |      6   |      6   |      6
      prod_dim: 100  |    170   |    166   |    167
      prod_dim: 500  |    902   |    950   |    858
4 threads: -----------------------------------------
      prod_dim: 10   |      4   |      3   |      3
      prod_dim: 100  |    110   |    108   |    106
      prod_dim: 500  |    576   |    590   |    547

Times are in milliseconds (ms).

[------------ Cumprod + Backwards cuda ------------]
                     |  dim: 0  |  dim: 1  |  dim: 2
1 threads: -----------------------------------------
      prod_dim: 10   |    562   |    566   |   1075
      prod_dim: 100  |   5388   |   5394   |   6697
      prod_dim: 500  |  28170   |  27580   |  30740

Times are in microseconds (us).
```

</details>

<details>
<summary>
Benchmark master
</summary>

```
[------------ Cumprod + Backwards cpu -------------]
                     |  dim: 0  |  dim: 1  |  dim: 2
1 threads: -----------------------------------------
      prod_dim: 10   |     11   |     13   |     12
      prod_dim: 100  |    270   |    270   |    256
      prod_dim: 500  |   1500   |   1590   |   1300
2 threads: -----------------------------------------
      prod_dim: 10   |      6   |      6   |      6
      prod_dim: 100  |    170   |    170   |    164
      prod_dim: 500  |    911   |    940   |    840
4 threads: -----------------------------------------
      prod_dim: 10   |      4   |      4   |      4
      prod_dim: 100  |    111   |    109   |    105
      prod_dim: 500  |    570   |    590   |    536

Times are in milliseconds (ms).

[------------ Cumprod + Backwards cuda ------------]
                     |  dim: 0  |  dim: 1  |  dim: 2
1 threads: -----------------------------------------
      prod_dim: 10   |    616   |    597   |   1109
      prod_dim: 100  |   5976   |   5723   |   7017
      prod_dim: 500  |  31110   |  29160   |  32320

Times are in microseconds (us).
```

</details>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/60642

Reviewed By: ngimel

Differential Revision: D29366368

Pulled By: albanD

fbshipit-source-id: b0d692ce030352965c2f152e0f92fbb61fc5ebde
2021-06-25 12:44:12 -07:00
Richard Barnes
b162d95e46 Fix a number of lint perf and safety issues in torch (#59897)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/59897

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D29037012

fbshipit-source-id: 7c16286d5fc2b67964fb65f8374dfff4d1a7aefb
2021-06-15 13:14:51 -07:00
albanD
a524ee00ca Forward AD formulas batch 3 (#59711)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59711

This is the exact same PR as before.
This was reverted before the PR below was faulty.

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28995762

Pulled By: albanD

fbshipit-source-id: 65940ad93bced9b5f97106709d603d1cd7260812
2021-06-10 19:30:02 -07:00
Richard Barnes
e3d75b8475 irange for PyTorch sans jit (#59481)
Summary:
Switches most of the simple for loops outside of `jit` directories to use `c10::irange`.

Generated with D28874212.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59481

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D28909681

fbshipit-source-id: ec9ab1bd602933238d9d0f73d4d8d027b75d9d85
2021-06-09 14:46:11 -07:00
Ivan Yashchuk
90303157ab Enable complex dtypes for coo_sparse-coo_sparse matmul [CPU] (#59554)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59554

This PR enables complex numbers supports for matrix-matrix
multiplication of COO sparse matrices.

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D28968309

Pulled By: anjali411

fbshipit-source-id: 4fd471e76a5584366aabc86c08b4564667ee54ca
2021-06-08 19:34:41 -07:00
Jane Xu
14f4c8d333 Revert D28387762: Forward AD formulas batch 3
Test Plan: revert-hammer

Differential Revision:
D28387762 (58348bea06)

Original commit changeset: fc395c92af7e

fbshipit-source-id: 608d704ff5bc560714790a576eaf9ed7f1f44e13
2021-06-08 15:19:26 -07:00
Natalia Gimelshein
9d533ef3ac Renorm fix (#59615)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59584
albanD, soulitzer, `renorm` grad was completely busted. Fast gradcheck is definitely not doing its job.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59615

Reviewed By: jbschlosser

Differential Revision: D28964271

Pulled By: ngimel

fbshipit-source-id: b6878cd24db9189b64b67eb58bd2cd8956cda78a
2021-06-08 14:59:24 -07:00
Victor Quach
c268eefe96 Use TORCH_CHECK_NOT_IMPLEMENTED for AD not implemented (#59482)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59482

Fixes #53398

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D28933809

fbshipit-source-id: 53387ec9690fc235b0622b50800feced706ea1ee
2021-06-08 14:02:04 -07:00
albanD
58348bea06 Forward AD formulas batch 3 (#58094)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/58094

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28387762

Pulled By: albanD

fbshipit-source-id: fc395c92af7ebb5ebae95c40f6c76273047f4097
2021-06-08 13:00:21 -07:00
Nikita Vedeneev
a30b359590 fix double backward for binary_cross_entropy loss function when reduction=sum. (#59479)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/59477.

```python
In [1]: import torch

In [2]: x = torch.rand(3, 3, dtype=torch.double, requires_grad=True)

In [3]: y = torch.rand(3, 3, dtype=torch.double)

In [4]: torch.autograd.gradgradcheck(lambda x, y: torch.nn.functional.binary_cross_entropy(x, y, reduction='sum'), [x, y])
Out[4]: True

In [5]: torch.autograd.gradgradcheck(lambda x, y: torch.nn.functional.binary_cross_entropy(x, y, reduction='mean'), [x, y])
Out[5]: True

In [6]: torch.autograd.gradcheck(lambda x, y: torch.nn.functional.binary_cross_entropy(x, y, reduction='sum'), [x, y])
Out[6]: True

```

More comprehensive testing could be added in https://github.com/pytorch/pytorch/pull/59447 where explicit `gradcheck` and `gradgradcheck` tests are added.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59479

Reviewed By: ejguan

Differential Revision: D28934354

Pulled By: albanD

fbshipit-source-id: 12ce68e3c5c499b2531f7cdba3c22548d67e07e9
2021-06-07 14:14:08 -07:00
Nikita Vedeneev
c51abf8fca Make binary_cross_entropy differentiable wrt target (#59447)
Summary:
As per title. Resolves https://github.com/pytorch/pytorch/issues/56683.
`gradgradcheck` will fail once `target.requires_grad() == True` because of the limitations of the current double backward implementation.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59447

Reviewed By: agolynski

Differential Revision: D28910140

Pulled By: albanD

fbshipit-source-id: 20934880eb4d22bec34446a6d1be0a38ef95edc7
2021-06-07 09:20:17 -07:00
anjali411
3607478ecd Conjugate View (#54987)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54987

Based off of ezyang (https://github.com/pytorch/pytorch/pull/44799) and bdhirsh (https://github.com/pytorch/pytorch/pull/43702) 's prototype:

Here's a summary of the changes in this PR:
This PR adds a new dispatch key called Conjugate. This enables us to make conjugate operation a view and leverage the specialized library functions that fast path with the hermitian operation (conj + transpose).

1. Conjugate operation will now return a view with conj bit (1) for complex tensors and returns self for non-complex tensors as before. This also means `torch.view_as_real` will no longer be a view on conjugated complex tensors and is hence disabled. To fill the gap, we have added `torch.view_as_real_physical` which would return the real tensor agnostic of the conjugate bit on the input complex tensor. The information about conjugation on the old tensor can be obtained by calling `.is_conj()` on the new tensor.
2. NEW API:
    a) `.conj()` -- now returning a view.
    b) `.conj_physical()` -- does the physical conjugate operation. If the conj bit for input was set, you'd get `self.clone()`, else you'll get a new tensor with conjugated value in its memory.
    c) `.conj_physical_()`, and `out=` variant
    d) `.resolve_conj()`  -- materializes the conjugation. returns self if the conj bit is unset, else returns a new tensor with conjugated values and conj bit set to 0.
    e) `.resolve_conj_()` in-place version of (d)
    f) `view_as_real_physical` -- as described in (1), it's functionally same as `view_as_real`, just that it doesn't error out on conjugated tensors.
    g) `view_as_real` -- existing function, but now errors out on conjugated tensors.
3. Conjugate Fallback
    a) Vast majority of PyTorch functions would currently use this fallback when they are called on a conjugated tensor.
    b) This fallback is well equipped to handle the following cases:
        - functional operation e.g., `torch.sin(input)`
        - Mutable inputs and in-place operations e.g., `tensor.add_(2)`
        - out-of-place operation e.g., `torch.sin(input, out=out)`
        - Tensorlist input args
        - NOTE: Meta tensors don't work with conjugate fallback.
4. Autograd
    a) `resolve_conj()` is an identity function w.r.t. autograd
    b) Everything else works as expected.
5. Testing:
    a) All method_tests run with conjugate view tensors.
    b) OpInfo tests that run with conjugate views
        - test_variant_consistency_eager/jit
        - gradcheck, gradgradcheck
        - test_conj_views (that only run for `torch.cfloat` dtype)

NOTE: functions like `empty_like`, `zero_like`, `randn_like`, `clone` don't propagate the conjugate bit.

Follow up work:
1. conjugate view RFC
2. Add neg bit to re-enable view operation on conjugated tensors
3. Update linalg functions to call into specialized functions that fast path with the hermitian operation.

Test Plan: Imported from OSS

Reviewed By: VitalyFedyunin

Differential Revision: D28227315

Pulled By: anjali411

fbshipit-source-id: acab9402b9d6a970c6d512809b627a290c8def5f
2021-06-04 14:12:41 -07:00
Peter Bell
6408cbd918 Migrate renorm to ATen (CPU and CUDA) (#59250)
Summary:
Resubmit of https://github.com/pytorch/pytorch/issues/59108, closes https://github.com/pytorch/pytorch/issues/24754, closes https://github.com/pytorch/pytorch/issues/24616

This reuses `linalg_vector_norm` to calculate the norms. I just add a new kernel that turns  the norm into a normalization factor, then multiply the original tensor using a normal broadcasted `mul` operator. The result is less code, and better performance to boot.

#### Benchmarks (CPU):
|     Shape    | Dim |  Before | After (1 thread) | After (8 threads) |
|:------------:|:---:|--------:|-----------------:|------------------:|
| (10, 10, 10) | 0   | 11.6 us |           4.2 us |            4.2 us |
|              | 1   | 14.3 us |           5.2 us |            5.2 us |
|              | 2   | 12.7 us |           4.6 us |            4.6 us |
| (50, 50, 50) | 0   |  330 us |           120 us |           24.4 us |
|              | 1   |  350 us |           135 us |           28.2 us |
|              | 2   |  417 us |           130 us |           24.4 us |

#### Benchmarks (CUDA)
|     Shape    | Dim |  Before |   After |
|:------------:|:---:|--------:|--------:|
| (10, 10, 10) | 0   | 12.5 us | 12.1 us |
|              | 1   | 13.1 us | 12.2 us |
|              | 2   | 13.1 us | 11.8 us |
| (50, 50, 50) | 0   | 33.7 us | 11.6 us |
|              | 1   | 36.5 us | 15.8 us |
|              | 2   | 41.1 us |   15 us |

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59250

Reviewed By: mruberry

Differential Revision: D28820359

Pulled By: ngimel

fbshipit-source-id: 572486adabac8135d52a9b8700f9d145c2a4ed45
2021-06-03 11:43:27 -07:00
albanD
d095ec75a1 Forward AD formulas batch 2 (#57863)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57863

Test Plan: Imported from OSS

Reviewed By: zou3519

Differential Revision: D28387763

Pulled By: albanD

fbshipit-source-id: e1b60ab728bb05b9e3323ee0dc7e401aaf5b8817
2021-06-03 07:33:04 -07:00
Richard Barnes
3979cb0656 irange for size_t (#55320)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/55320

Test Plan: Sandcastle

Reviewed By: ngimel

Differential Revision: D27572577

fbshipit-source-id: 97710fd2bb1303006b05828a0d1343b0b59ccb03
2021-06-03 01:04:13 -07:00
kshitij12345
5c18994674 [special] Add i1 and i1e (#56352)
Summary:
Reference: https://github.com/pytorch/pytorch/issues/50345

* [x] Check Docs https://12721710-65600975-gh.circle-artifacts.com/0/docs/special.html
* [x] Investigate fp32 failure on CI?! (Fails on clang. Reproduced locally with clang-11)
* [ ] Kernel vs Composite?
* [x] Autograd for `i0e` for zero?

Pull Request resolved: https://github.com/pytorch/pytorch/pull/56352

Reviewed By: anjali411

Differential Revision: D28700888

Pulled By: mruberry

fbshipit-source-id: 91a3cbb94f5b8a3b063589ec38179848c11def83
2021-05-29 20:55:23 -07:00
Natalia Gimelshein
355b24438c make vector_norm backward call norm_backward (#59135)
Summary:
Per title. Remove duplicated code.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/59135

Reviewed By: mruberry

Differential Revision: D28775716

Pulled By: ngimel

fbshipit-source-id: 50dc77590db15976453fc41c3657a77198749849
2021-05-29 12:14:46 -07:00
Adnios
09a8f22bf9 Add mish activation function (#58648)
Summary:
See issus: https://github.com/pytorch/pytorch/issues/58375

Pull Request resolved: https://github.com/pytorch/pytorch/pull/58648

Reviewed By: gchanan

Differential Revision: D28625390

Pulled By: jbschlosser

fbshipit-source-id: 23ea2eb7d5b3dc89c6809ff6581b90ee742149f4
2021-05-25 10:36:21 -07:00
Kurt Mohler
fe8e5eb260 Change native functions to take c10::string_view args instead of std::string (#57680)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/53546

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57680

Reviewed By: malfet

Differential Revision: D28511799

Pulled By: ezyang

fbshipit-source-id: 43142f994d048b28b3279ccdb7a28cbaa3190973
2021-05-20 18:15:45 -07:00
lezcano
1f3807ce5d More stable and faster implementation of the gradient of torch.linalg.eigh (#55049)
Summary:
This PR:
- Renames symeig_backward to eigh_backward
- Improves the stability and speed of the gradient computation by doing `V(A + B)Vh` instead of `VAVh + VBVh`  when both the gradients of the eigenvectors and eigenvalues are defined.
- Updates the comments of the function to make them arguably clearer

Pull Request resolved: https://github.com/pytorch/pytorch/pull/55049

Reviewed By: ngimel

Differential Revision: D28396823

Pulled By: mruberry

fbshipit-source-id: a144482bfb1054e281b58ae1fe3cf1015bab505d
2021-05-13 17:17:35 -07:00
lezcano
9e156b01e5 linalg.eig backwards and linalg.eigvals (#57276)
Summary:
This PR adds backwards support for `eig` and `eigvals`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57276

Reviewed By: ngimel

Differential Revision: D28405056

Pulled By: mruberry

fbshipit-source-id: 27ef03f139f44d75f4d319b0f3e77e99eea9bb01
2021-05-13 09:42:13 -07:00
lezcano
db13119fc4 Deprecate symeig (#57732)
Summary:
This one had a tricky usage of `torch.symeig` that had to be replaced. I tested the replacement locally though.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57732

Reviewed By: bdhirsh

Differential Revision: D28328189

Pulled By: mruberry

fbshipit-source-id: 7f000fcbf2b029beabc76e5a89ff158b47977474
2021-05-12 02:21:35 -07:00
Nikita Vedeneev
c790fd2bf8 ATen lu_unpack. Required for making torch.lu_solve differentiable. (#46913)
Summary:
Backward methods for `torch.lu` and `torch.lu_solve` require the `torch.lu_unpack` method.
However, while `torch.lu` is a Python wrapper over a native function, so its gradient is implemented via `autograd.Function`,
`torch.lu_solve` is a native function, so it cannot access `torch.lu_unpack` as it is implemented in Python.

Hence this PR presents a native (ATen) `lu_unpack` version. It is also possible to update the gradients for `torch.lu` so that backward+JIT is supported (no JIT for `autograd.Function`) with this function.

~~The interface for this method is different from the original `torch.lu_unpack`, so it is decided to keep it hidden.~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46913

Reviewed By: albanD

Differential Revision: D28355725

Pulled By: mruberry

fbshipit-source-id: 281260f3b6e93c15b08b2ba66d5a221314b00e78
2021-05-11 22:53:21 -07:00
Ivan Yashchuk
aaca12bcc2 Deprecate in docs torch.svd and change svd -> linalg_svd (#57981)
Summary:
This PR adds a note to the documentation that torch.svd is deprecated together with an upgrade guide on how to use `torch.linalg.svd` and `torch.linalg.svdvals` (Lezcano's instructions from https://github.com/pytorch/pytorch/issues/57549).
In addition, all usage of the old svd function is replaced with a new one from torch.linalg module, except for the `at::linalg_pinv` function, that fails the XLA CI build (https://github.com/pytorch/xla/issues/2755, see failure in draft PR https://github.com/pytorch/pytorch/pull/57772).

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57981

Reviewed By: ngimel

Differential Revision: D28345558

Pulled By: mruberry

fbshipit-source-id: 02dd9ae6efe975026e80ca128e9b91dfc65d7213
2021-05-11 18:04:10 -07:00
lezcano
415ae54c31 Deprecate torch.eig (#57727)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57727

Reviewed By: bdhirsh

Differential Revision: D28317984

Pulled By: mruberry

fbshipit-source-id: fa1aa1b78fd3611ac208bca93e2b745a1bac41f1
2021-05-10 23:31:02 -07:00
Mike Ruberry
3c87fe9b14 Revert D28117714: [pytorch][PR] ATen lu_unpack. Required for making torch.lu_solve differentiable.
Test Plan: revert-hammer

Differential Revision:
D28117714 (5c67d8dfd3)

Original commit changeset: befd33db12ec

fbshipit-source-id: 295b2134935542a903a73f90a7998239dfe6cc81
2021-05-09 23:20:06 -07:00
Nikita Vedeneev
5c67d8dfd3 ATen lu_unpack. Required for making torch.lu_solve differentiable. (#46913)
Summary:
Backward methods for `torch.lu` and `torch.lu_solve` require the `torch.lu_unpack` method.
However, while `torch.lu` is a Python wrapper over a native function, so its gradient is implemented via `autograd.Function`,
`torch.lu_solve` is a native function, so it cannot access `torch.lu_unpack` as it is implemented in Python.

Hence this PR presents a native (ATen) `lu_unpack` version. It is also possible to update the gradients for `torch.lu` so that backward+JIT is supported (no JIT for `autograd.Function`) with this function.

~~The interface for this method is different from the original `torch.lu_unpack`, so it is decided to keep it hidden.~~

Pull Request resolved: https://github.com/pytorch/pytorch/pull/46913

Reviewed By: astaff

Differential Revision: D28117714

Pulled By: mruberry

fbshipit-source-id: befd33db12ecc147afacac792418b6f4948fa4a4
2021-05-09 19:12:56 -07:00
Nikita Shulga
3a66a1cb99 [clang-tidy] Exclude cppcoreguidelines-avoid-magic-numbers (#57841)
Summary:
Add cppcoreguidelines-avoid-magic-numbers exclusion to clang-tidy
Remove existing nolint warnings using following script:
```
for file in `git ls-files | grep -v \.py`; do gsed '/^ *\/\/ NOLINTNEXTLINE(cppcoreguidelines-avoid-magic-numbers)/d' -i  $file; done
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57841

Reviewed By: samestep

Differential Revision: D28295045

Pulled By: malfet

fbshipit-source-id: 7c6e8d1213c9593f169ed3df6a916498f1a97163
2021-05-07 20:02:33 -07:00
Peter Bell
2043093217 Add correction parameter to std/var (#50903)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50903

First part of #50010. Also fixes #51127.

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27911345

Pulled By: mruberry

fbshipit-source-id: 7138fddc935802918ab9ff19f4bc1b9f4d745d41
2021-05-07 14:40:28 -07:00
Alexander
6f2c0cccdd New: sparse complex: add linear algebra, addmm (#57129)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/57129

Test Plan: Imported from OSS

Reviewed By: janeyx99, astaff

Differential Revision: D28112701

Pulled By: ezyang

fbshipit-source-id: 1b253453dc19e908fb18d0b1a83738243e0a8d59
2021-05-07 05:37:48 -07:00
Heitor Schueroff
1f1e2dab6b Remove optional type for ord parameter in vector_norm (#57662)
Summary:
As per discussion here https://github.com/pytorch/pytorch/pull/57127#discussion_r624948215

Note that we cannot remove the optional type from the `dim` parameter because the default is to flatten the input tensor which cannot be easily captured by a value other than `None`

### BC Breaking Note
This PR changes the `ord` parameter of `torch.linalg.vector_norm` so that it no longer accepts `None` arguments. The default behavior of `2` is equivalent to the previous default of `None`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/57662

Reviewed By: albanD, mruberry

Differential Revision: D28228870

Pulled By: heitorschueroff

fbshipit-source-id: 040fd8055bbe013f64d3c8409bbb4b2c87c99d13
2021-05-06 17:53:25 -07:00
Peter Bell
33eea146ee torch.clamp with tensor min and max (#52695)
Summary:
Fixes gh-2793

Pull Request resolved: https://github.com/pytorch/pytorch/pull/52695

Reviewed By: mruberry

Differential Revision: D27395977

Pulled By: ezyang

fbshipit-source-id: f86aa240feb034d42e4c45447e72218f6a773c24
2021-05-03 12:56:16 -07:00