Commit Graph

184 Commits

Nikita Shulga
c9efb58223 Revert D33850228: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33850228 (23d03025dc)

Original commit changeset: 3cc33fb298e4

Original Phabricator Diff: D33850228 (23d03025dc)

fbshipit-source-id: 9436e7df73c2b2e2011f321674f24973316d3692
2022-01-31 09:38:13 -08:00
Ryan Spring
3a53b3e94f Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds an `approximate` string flag to Gelu
3. Enables the Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enables Tanh Gelu in NvFuser

```python
import torch

def normcdf(x):
    # standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * (1.0 + torch.erf(x * 0.7071067811865476))  # 0.7071... = 1/sqrt(2)

def gelu(x, approximate: str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```
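
Assuming the flag lands in the functional API as described, usage would look like the sketch below (hedged; only the `approximate` kwarg is new here):

```python
import torch
import torch.nn.functional as F

x = torch.randn(8)
y_exact = F.gelu(x)                      # default: exact erf-based GELU
y_tanh = F.gelu(x, approximate='tanh')   # tanh approximation
print(torch.allclose(y_exact, y_tanh, atol=1e-3))
```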

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: cpuhrsch

Differential Revision: D33850228

Pulled By: jbschlosser

fbshipit-source-id: 3cc33fb298e480d7ecc5c67716da019d60c6ab33
2022-01-31 09:00:32 -08:00
Joel Schlosser
e9fb2d1db1 Revert D33744717: [pytorch][PR] Implement Tanh Gelu Approximation
Test Plan: revert-hammer

Differential Revision:
D33744717 (f499ab9cef)

Original commit changeset: d64532a562ed

Original Phabricator Diff: D33744717 (f499ab9cef)

fbshipit-source-id: 396c3f63de5865f894dbc353d0790a01a624be93
2022-01-28 10:32:14 -08:00
Ryan Spring
4713dd9cca Implement Tanh Gelu Approximation (#61439)
Summary:
1. Implements https://github.com/pytorch/pytorch/issues/39853
2. Adds an `approximate` string flag to Gelu
3. Enables the Tanh Gelu approximation
4. Adds double backward support for Gelu
5. Enables Tanh Gelu in NvFuser

```python
import torch

def normcdf(x):
    # standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return 0.5 * (1.0 + torch.erf(x * 0.7071067811865476))  # 0.7071... = 1/sqrt(2)

def gelu(x, approximate: str = 'none'):
    if approximate == 'tanh':
        # sqrt(2/pi) = 0.7978845608028654
        return 0.5 * x * (1.0 + torch.tanh(0.7978845608028654 * (x + 0.044715 * torch.pow(x, 3.0))))
    else:
        return x * normcdf(x)
```

Linking XLA PR - https://github.com/pytorch/xla/pull/3039

Pull Request resolved: https://github.com/pytorch/pytorch/pull/61439

Reviewed By: mikaylagawarecki

Differential Revision: D33744717

Pulled By: jbschlosser

fbshipit-source-id: d64532a562ed53247bb4fa52bb16722634d5c187
2022-01-28 08:55:48 -08:00
kshitij12345
ba0407a86e index_backward: use out-of-place index_put if any input is subclass (#71779)
Summary:
Reference: https://github.com/pytorch/functorch/issues/393

Context:

The derivative of `__getitem__`/`index` is
f5a71ec2d6/tools/autograd/derivatives.yaml (L733-L734)

where `index_backward` is defined as
f5a71ec2d6/torch/csrc/autograd/FunctionsManual.cpp (L3892-L3894)

The problem arises when `grad` is not a BatchedTensor but one of the other inputs is. In that case, `grad.new_zeros` returns an unbatched tensor, and the call to the in-place `_index_put_impl_` errors, as it expects `zeros_like_self` to be batched.

To avoid this, we dispatch to the out-of-place `index_put` if any of the input tensors is subclassed; otherwise, we dispatch to the in-place `_index_put_impl_`.
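
A rough Python sketch of the resulting dispatch (the actual change is C++ in `FunctionsManual.cpp`; `is_tensor_subclass_like` is an illustrative stand-in for the C++ `areAnyTensorSubclassLike` check):

```python
def index_backward(grad, indices, zeros_like_self):
    # illustrative sketch, not the actual ATen code
    if any(is_tensor_subclass_like(t) for t in (grad, zeros_like_self, *indices)):
        # out-of-place index_put: works even when grad is unbatched
        # but an index (or zeros_like_self) is a BatchedTensor
        return zeros_like_self.index_put(indices, grad, accumulate=True)
    # fast path: in-place update (the real code calls _index_put_impl_)
    return zeros_like_self.index_put_(indices, grad, accumulate=True)
```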

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71779

Reviewed By: albanD

Differential Revision: D33790596

Pulled By: zou3519

fbshipit-source-id: 9d6d81b758740cab7b3db9b905f1e8053f82b835
2022-01-28 08:15:43 -08:00
soulitzer
73fd3e021c Fix forward AD for cudnn batch norm (#71901)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71901

We didn't catch this initially because CuDNN is not being tested on CI.

The following tests fail on master (if we build with CuDNN), but pass with this PR:
- `test_forward_mode_AD_nn_functional_batch_norm_cuda_float64`
- `test_forward_mode_AD_nn_functional_instance_norm_cuda_float64`

I don't think it is documented anywhere, but from the tests passing now, I'm going to guess that `result1` and `result2` return `mean` and `invstd` respectively. Previously, I thought the mean and variance were returned because the variables were named `saved_mean` and `saved_var`.

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33818652

Pulled By: soulitzer

fbshipit-source-id: ecee760f5aec620dc70f57de4fb3573c8f2f5f31
2022-01-27 15:52:47 -08:00
lezcano
391319ed8f Implement forward AD for linalg.svd and improve svd_backward (#70253)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70253

I included a derivation of the formula in the complex case, as it is
particularly tricky. As far as I know, this is the first time this formula
is derived in the literature.

I also implemented a more efficient and more accurate version of svd_backward.
More importantly, I added a lax check in the complex case, making sure the loss
function depends only on the subspaces spanned by the pairs of singular
vectors, and not on their joint phase.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751982

Pulled By: mruberry

fbshipit-source-id: c2a4a92a921a732357e99c01ccb563813b1af512
2022-01-27 10:37:08 -08:00
lezcano
a1860bd567 Rewrite svd and linalg.svd as structured kernels (#69827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69827

In general, the current pattern allows for implementing optimisations
for all the backends in a common place (see for example the optimisation
for empty matrices).

After this PR, `torch.svd` is implemented in terms of `linalg.svd` and
`linalg.svdvals`, as expected. This makes it differentiable in the case
when `compute_uv=False`, although this is not particularly important, as
`torch.svd` will eventually be deprecated.

This PR also instantiates smaller `U` / `V` when calling cusolver_gesvdj
in the cases when `full_matrices=False` or `compute_uv=False`.

The memory for the auxiliary `U` and `V` needed by some cuSOLVER routines in
the cases above is allocated via raw allocators rather than through fully
fledged tensors, as it's just a blob of memory that the algorithm requests.
As the code is better structured now, it was easier to see that `U` and
`Vh` needn't be allocated when calling `svd_cusolver_gesvd`.

Now `linalg.svdvals` works as expected wrt the `out=` parameter.
Note that in the test `test_svd_memory_allocation` we were
passing a tensor of the wrong size and dtype, and the test seemed to
pass...

This PR also changes the backward formula to avoid saving the input
matrix, as it's not necessary. In a follow up PR, I will clean the
backward formula and make it more numerically stable and efficient.

This PR also does a number of memory optimisations here and there, and fixes
the call to cusolver_gesvd, which was incorrect for m <= n. To test
this path, I compiled the code with a flag to unconditionally execute
the `if (!gesvdj_convergence_check.empty())` branch, and all the tests
passed.

I also took this chance to simplify the tests for these functions in
`test_linalg.py`, as we had lots of tests that were testing some
functionality that is already currently tested in the corresponding
OpInfos. I used xwang233's feature to test both MAGMA and CUDA
backends. This is particularly good for SVD, as cuSOLVER is always
chosen over MAGMA when available, so testing MAGMA otherwise would be
tricky.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33751983

Pulled By: mruberry

fbshipit-source-id: 11d48d977946345583d33d14fb11a170a7d14fd2
2022-01-27 10:35:47 -08:00
Mikayla Gawarecki
ddcddac726 Add new reduce options and autograd support for scatter_reduce (#71788)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/71788

Test Plan: Imported from OSS

Reviewed By: mikaylagawarecki

Differential Revision: D33778525

Pulled By: cpuhrsch

fbshipit-source-id: 47b8544e29df3075bc6ede894c59499a7ffec876
2022-01-27 09:34:01 -08:00
soulitzer
60765438e8 Add forward AD formulas for some losses (#71026)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71026

...and fmod

Testing:
- L1Loss: new module tests (linear in the real case only)
- SmoothL1Loss: new module tests
- MSELoss: tested - OpInfo + new module tests
- huberloss: tested - OpInfo + new module tests
- multi-margin-loss: new module tests
- kl-div: OpInfo + new module tests
- fmod: OpInfo
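
For reference, the forward AD formulas for these ops can be exercised through the public dual-number API; a minimal sketch using `mse_loss`:

```python
import torch
import torch.autograd.forward_ad as fwAD

x, target = torch.randn(5), torch.randn(5)
tangent = torch.randn(5)  # direction in which to compute the JVP

with fwAD.dual_level():
    dual_x = fwAD.make_dual(x, tangent)
    loss = torch.nn.functional.mse_loss(dual_x, target)
    primal, jvp = fwAD.unpack_dual(loss)
print(primal, jvp)
```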

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33485661

Pulled By: soulitzer

fbshipit-source-id: 542ef5148183b9f574d06b2e2e345d0d889537b7
2022-01-26 08:29:46 -08:00
lezcano
97585ae1e7 Simplify forward / backward AD for linalg.eigh and add checks (#70528)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70528

This PR adds checks for the backward of `linalg.eigh`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also makes its implementation parallel that of the (fwd/bwd) derivative of
`torch.linalg.eig`, and it makes most OpInfo tests pass.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530149

Pulled By: albanD

fbshipit-source-id: 1f368b8d450d4e9e8ae74d3881c78513c27eb956
2022-01-12 08:35:52 -08:00
lezcano
061be8d600 Correct forward AD for linalg.eig and add checks (#70527)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70527

This PR adds checks for the backward of `linalg.eig`, similar to those
deduced in https://github.com/pytorch/pytorch/pull/70253

It also modifies the function so that it does not save the input matrix,
as it's not necessary.

It also corrects the forward AD formula for it to be correct. Now all
the tests pass for `linalg.eig` and `linalg.eigvals`.

It also updates the docs to reflect better what's going on here.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D33530148

Pulled By: albanD

fbshipit-source-id: 984521a04f81ecb28ac1c4402b0243c63dd6959d
2022-01-12 08:30:55 -08:00
soulitzer
78994d13c0 Add forward AD formulas for {batch,layer,group}_norm (#70355)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/70355

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33405362

Pulled By: soulitzer

fbshipit-source-id: 55a92e88a04e7b15a0a223025d66c14f7db2a190
2022-01-10 13:52:16 -08:00
soulitzer
3051aabd0e Add forward AD formulas for convolution and some others (#69956)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69956

Test Plan: Imported from OSS

Reviewed By: albanD, bdhirsh

Differential Revision: D33235974

Pulled By: soulitzer

fbshipit-source-id: ea60d687edc5d62d92f3fd3cb6640421d32c908c
2022-01-06 08:39:51 -08:00
Amir Khojaste
748790588c Upgrading the loop to use irange (#70326)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70326

See D24145988 for context: it allows loops such as `for(int i = 0; i < 10; i++)` to be expressed as `for(const auto i : c10::irange(10))`. This is nice because it auto-types the loops and adds const-safety to the iteration variable.

Test Plan: buck run //caffe2/torch/fb/sparsenn:test

Reviewed By: r-barnes

Differential Revision: D33243400

fbshipit-source-id: b1f1b4163f4bf662031baea9e5268459b40c69a3
2022-01-06 07:06:53 -08:00
lezcano
a35b4b49d2 Add linalg.lu_factor (#66933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933

This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.

This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.

We add a test and an OpInfo for the new function.

This PR also adds documentation for this new function, in line with the
documentation in the rest of `torch.linalg`.

Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014
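
A minimal usage sketch of the new function, reusing the factorization through the existing `torch.lu_solve`:

```python
import torch

A = torch.randn(3, 3, dtype=torch.float64)
LU, pivots = torch.linalg.lu_factor(A)

# solve A x = b by reusing the factorization
b = torch.randn(3, 2, dtype=torch.float64)
x = torch.lu_solve(b, LU, pivots)
print(torch.allclose(A @ x, b))
```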

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32834069

Pulled By: mruberry

fbshipit-source-id: 51ef12535fa91d292f419acf83b800b86ee9c7eb
2022-01-05 20:32:12 -08:00
Richard Zou
29f1ccc8f0 Fix some Composite Compliance problems with binary_cross_entropy backward (#70198)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/70198

This PR fixes composite compliance problems with:
- binary_cross_entropy's backward formula
- binary_cross_entropy_with_logits's backward formula
- binary_cross_entropy's double backward formula

It does so by adding checks for areAnyTensorSubclassLike.

Test Plan:
- I tested everything with functorch.
- We are going to do https://github.com/pytorch/pytorch/issues/69530 in
the future so we have a way of testing this in core. I need the
binary_cross_entropy ones for something right now and didn't want to
wait until we come up with a solution for #69530.

Reviewed By: Chillee

Differential Revision: D33246995

Pulled By: zou3519

fbshipit-source-id: 310ed3196b937d01b189870b86a6c5f77f9258b4
2021-12-22 07:24:04 -08:00
Joel Schlosser
4d5dd00e61 Remove backward ops for cuDNN transposed convolution (#69902)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69902

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33093795

Pulled By: jbschlosser

fbshipit-source-id: 8b90150bd1996e48c0c888bdab4e95a849d10ef5
2021-12-15 17:48:25 -08:00
Joel Schlosser
3dc3651e0e Remove backward ops for cuDNN convolution (#69901)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69901

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33093796

Pulled By: jbschlosser

fbshipit-source-id: f5beab6f3078144b6c8e5c4c51d69823815a9f99
2021-12-15 17:46:49 -08:00
soulitzer
b399a4d7b9 Add some reduction forward AD formulas (#69661)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69661

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33020601

Pulled By: soulitzer

fbshipit-source-id: 110da6dcd490e5c3849cace62a777aa1a2b6982e
2021-12-14 23:33:43 -08:00
Richard Zou
41e1ab0785 Introduce isTensorSubclassLike; add special cases to backwards formulas (#69534)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69534

Something is TensorSubclassLike if it is a Tensor subclass or if it has
the same problems as Tensor subclasses. Today that just includes Tensor
Subclasses and meta tensors but may include other things in the future.

Some of our backwards formulas are incompatible with TensorSubclassLike
objects. For example, calling .data_ptr() is a problem because many
TensorSubclassLike objects don't have storage. Another problem is
in-place operations: performing `regular_tensor.inplace_(tensor_subclass)`
is ill-defined for such objects.

This PR adds special cases to the backward formulas for torch.max and
torch.clamp to handle this. The backward formulas for torch.max and
torch.clamp are not dispatcher operations so they cannot be overridden
and we hesitate to make them dispatcher operations for FC/BC concerns
and performance overhead concerns.

Furthermore, the old concept of "is this in-place operation vmap
compatible?" is subsumed by the more general "is this in-place operation
tensor-subclass compatible?" question, so I removed all instances of
isInplaceVmapCompatible and replaced them with isTensorSubclassLike
checks.
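
A hedged Python analogue of the guard pattern (the real formulas live in C++; the helper name and the exact formula below are illustrative):

```python
import torch

def is_tensor_subclass_like(t):
    # illustrative stand-in for the C++ isTensorSubclassLike
    return type(t) is not torch.Tensor or t.is_meta

def clamp_backward_sketch(grad, self, lo, hi):
    if is_tensor_subclass_like(self) or is_tensor_subclass_like(grad):
        # subclass-safe path: no in-place ops on possibly-subclassed tensors
        mask = (self >= lo) & (self <= hi)
    else:
        # fast path: build the mask in place
        mask = (self >= lo).logical_and_(self <= hi)
    return torch.where(mask, grad, torch.zeros_like(grad))
```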

Test Plan
- I tested the changes using functorch.
- It's possible to write a test for these in core (one has to make
a custom tensor subclass and then send it through the operation and then
invoke autograd), but I wanted to push the work to doing some
generic testing for backward formulas
(https://github.com/pytorch/pytorch/issues/69530) instead of doing some
one-off things now.

Test Plan: Imported from OSS

Reviewed By: mrshenli

Differential Revision: D32967727

Pulled By: zou3519

fbshipit-source-id: 30fda1a7581da4c55179b7a3ca05069150bbe2dc
2021-12-09 15:03:22 -08:00
lezcano
cafcf599d0 Deprecate torch.triangular_solve (#63570)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63570

There is a use of `at::triangular_solve_out` in the file
`torch/csrc/jit/tensorexpr/external_functions.cpp` that I have not dared
to move to `at::linalg_solve_triangular_out`.

**Deprecation note:**

This PR deprecates the `torch.triangular_solve` function in favor of
`torch.linalg.solve_triangular`. An upgrade guide is added to the
documentation for `torch.triangular_solve`.

Note that it DOES NOT remove `torch.triangular_solve`, but
`torch.triangular_solve` will be removed in a future PyTorch release.
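
A hedged migration sketch (note the swapped argument order and the namedtuple return of the old API):

```python
import torch

A = torch.randn(3, 3).tril()
A.diagonal().add_(3.0)  # keep the system well-conditioned
B = torch.randn(3, 2)

# deprecated API: returns (solution, cloned_coefficient)
X_old = torch.triangular_solve(B, A, upper=False).solution

# replacement
X_new = torch.linalg.solve_triangular(A, B, upper=False)
print(torch.allclose(X_old, X_new))
```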

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D32618035

Pulled By: anjali411

fbshipit-source-id: 0bfb48eeb6d96eff3e96e8a14818268cceb93c83
2021-12-02 13:24:55 -08:00
lezcano
f9e69af22e Modify LU_backward and lu_solve_backward to use linalg_solve_triangular (#63569)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63569

This PR also rewrites `lu_solve_backward` from scratch going from
solving 5 systems of equations to just 2.

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: mruberry

Differential Revision: D32618014

Pulled By: anjali411

fbshipit-source-id: 0e915bcf7045a4db43ffd076d807beac816c8538
2021-12-01 07:34:38 -08:00
Mike Ruberry
6ae34ea6f8 Revert D32521980: Add linalg.lu_factor
Test Plan: revert-hammer

Differential Revision:
D32521980 (b10929a14a)

Original commit changeset: 26a49ebd87f8

fbshipit-source-id: e1a6bb9c2ece9bd78190fe17e16a46e3358c5c82
2021-11-28 17:22:15 -08:00
lezcano
b10929a14a Add linalg.lu_factor (#66933)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66933

This PR exposes `torch.lu` as `torch.linalg.lu_factor` and
`torch.linalg.lu_factor_ex`.

This PR also adds support for matrices with zero elements both in
the size of the matrix and the batch. Note that this function simply
returns empty tensors of the correct size in this case.

We add a test and an OpInfo for the new function.

This PR also adds documentation for this new function, in line with the
documentation in the rest of `torch.linalg`.

Fixes https://github.com/pytorch/pytorch/issues/56590
Fixes https://github.com/pytorch/pytorch/issues/64014

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D32521980

Pulled By: mruberry

fbshipit-source-id: 26a49ebd87f8a41472f8cd4e9de4ddfb7f5581fb
2021-11-27 17:52:48 -08:00
lezcano
b46c89d950 Add linalg.solve_triangular (#63568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568

This PR adds the first solver with structure to `linalg`. This solver
has an API compatible with that of `linalg.solve` preparing these for a
possible future merge of the APIs. The new API:
- Just returns the solution, rather than the solution and a copy of `A`
- Removes the confusing `transpose` argument and replaces it with correct
handling of conj and strides within the call
- Adds a `left=True` kwarg. This can be achieved via transposes of the
inputs and the result, but it's exposed for convenience (see the sketch below).
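
A minimal sketch of the `left=False` equivalence mentioned above:

```python
import torch

A = torch.randn(3, 3).triu()
A.diagonal().add_(3.0)
B = torch.randn(2, 3)

# X @ A == B  (left=False) is the transpose of  A.mT @ X.mT == B.mT
X = torch.linalg.solve_triangular(A, B, upper=True, left=False)
X_ref = torch.linalg.solve_triangular(A.mT, B.mT, upper=False).mT
print(torch.allclose(X, X_ref))
```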

This PR also implements a dataflow that minimises the number of copies
needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the
conjugate and neg bits.

This algorithm is implemented for `solve_triangular` (which, for this, is
the most complex of all the solvers due to the `upper` parameters).
Once more solvers are added, we will factor out this calling algorithm,
so that all of them can take advantage of it.

Given the complexity of this algorithm, we implement some thorough
testing. We also added tests for all the backends, which was not done
before.

We also add forward AD support for `linalg.solve_triangular` and improve the
docs of `linalg.solve_triangular`. We also fix a few issues with those of
`torch.triangular_solve`.

Resolves https://github.com/pytorch/pytorch/issues/54258
Resolves https://github.com/pytorch/pytorch/issues/56327
Resolves https://github.com/pytorch/pytorch/issues/45734

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: jbschlosser

Differential Revision: D32588230

Pulled By: mruberry

fbshipit-source-id: 69e484849deb9ad7bb992cc97905df29c8915910
2021-11-22 12:41:06 -08:00
soulitzer
7bb401a4c9 Add forward AD support for miscellanous operators (#67820)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67820

Original PR here: https://github.com/pytorch/pytorch/pull/67040

Test Plan: Imported from OSS

Reviewed By: gchanan

Differential Revision: D32314423

Pulled By: soulitzer

fbshipit-source-id: ecd898dc903692cab084f6922a1d86986f957b1b
2021-11-19 14:31:06 -08:00
jiej
ca92111758 Add native_dropout (#63937)
Summary:
Adds native_dropout to have a reasonable target for torchscript in autodiff. native_dropout has scale and train as arguments in its signature; this makes native_dropout more consistent with other operators and removes conditionals in the autodiff definition.
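
For context, a sketch of the computation such an op performs (standard inverted dropout; the names and signature below are illustrative, not the ATen ones):

```python
import torch

def dropout_sketch(x, p, train):
    # inverted dropout: keep each element with prob 1-p, scale by 1/(1-p)
    if not train or p == 0.0:
        return x, torch.ones_like(x, dtype=torch.bool)
    mask = torch.rand_like(x) >= p
    return x * mask / (1.0 - p), mask
```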

cc gmagogsfm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63937

Reviewed By: mruberry

Differential Revision: D32477657

Pulled By: ngimel

fbshipit-source-id: d37b137a37acafa50990f60c77f5cea2818454e4
2021-11-18 19:41:10 -08:00
Jane Xu
9f4e004abd Revert D32283178: Add linalg.solve_triangular
Test Plan: revert-hammer

Differential Revision:
D32283178 (0706607abc)

Original commit changeset: deb672e6e52f

fbshipit-source-id: d2a3421292147426cc61c2f063b721acf9004755
2021-11-18 14:46:10 -08:00
lezcano
0706607abc Add linalg.solve_triangular (#63568)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63568

This PR adds the first solver with structure to `linalg`. This solver
has an API compatible with that of `linalg.solve` preparing these for a
possible future merge of the APIs. The new API:
- Just returns the solution, rather than the solution and a copy of `A`
- Removes the confusing `transpose` argument and replaces it with correct
handling of conj and strides within the call
- Adds a `left=True` kwarg. This can be achieved via transposes of the
inputs and the result, but it's exposed for convenience.

This PR also implements a dataflow that minimises the number of copies
needed before calling LAPACK / MAGMA / cuBLAS and takes advantage of the
conjugate and neg bits.

This algorithm is implemented for `solve_triangular` (which, for this, is
the most complex of all the solvers due to the `upper` parameters).
Once more solvers are added, we will factor out this calling algorithm,
so that all of them can take advantage of it.

Given the complexity of this algorithm, we implement some thorough
testing. We also added tests for all the backends, which was not done
before.

We also add forward AD support for `linalg.solve_triangular` and improve the
docs of `linalg.solve_triangular`. We also fix a few issues with those of
`torch.triangular_solve`.

Resolves https://github.com/pytorch/pytorch/issues/54258
Resolves https://github.com/pytorch/pytorch/issues/56327
Resolves https://github.com/pytorch/pytorch/issues/45734

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: zou3519, JacobSzwejbka

Differential Revision: D32283178

Pulled By: mruberry

fbshipit-source-id: deb672e6e52f58b76536ab4158073927a35e43a8
2021-11-18 09:45:51 -08:00
Nikita Vedeneev
857fed1f42 torch.linalg.qr: forward AD support (#67268)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67268

Reviewed By: ngimel

Differential Revision: D31960517

Pulled By: albanD

fbshipit-source-id: bfd1028a8d352f550efb420f9ca609c09f4a7484
2021-11-18 08:11:54 -08:00
Matthias Reis
4c346bd073 Added forward derivatives for neg, diag, inverse, linalg_eig (#67837)
Summary:
Recreated due to CI failures as per comment https://github.com/pytorch/pytorch/pull/67339#issuecomment-959893293

===

See also discussion in https://github.com/pytorch/pytorch/issues/10223, starting from [this](https://github.com/pytorch/pytorch/issues/10223#issuecomment-949499666) comment

The formulas for the derivatives are taken from https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf.

As indicated, the method linalg_eig_jvp should be used instead of linalg_eig_jvp_eigenvalues and linalg_eig_jvp_eigenvectors in the future. Due to a codegen limitation, this is not yet possible.

CC albanD Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67837

Reviewed By: mrshenli

Differential Revision: D32403662

Pulled By: soulitzer

fbshipit-source-id: 529cb93f865ce4cc2e24fa6f672d4234e7abe2b1
2021-11-16 20:32:47 -08:00
Masaki Kozuki
c5e5264be2 Disable TF32 in pinv_jvp and pinv_backward (#67948)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/67947
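
For context, TF32 truncates matmul precision on Ampere GPUs, which is too lossy for the numerically sensitive `pinv` derivative formulas. This PR disables it inside those formulas in C++; the user-facing Python switches for the same behaviour are:

```python
import torch

# globally disable TF32 for CUDA matmuls (cuDNN has a separate flag)
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```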

cc ptrblck xwang233 zasdfgbnm

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67948

Reviewed By: H-Huang

Differential Revision: D32251934

Pulled By: ngimel

fbshipit-source-id: a2b1a118337b38db61350c9e49f1ba19030d70ec
2021-11-08 22:33:29 -08:00
Natalia Gimelshein
98be5216e2 Revert D32104006: [pytorch][PR] Added forward derivatives for neg, diag, inverse, linalg_eig
Test Plan: revert-hammer

Differential Revision:
D32104006 (88c61b8d06)

Original commit changeset: 1f6ace09ee3e

fbshipit-source-id: f9f950b4177e1fe29b9059f4b5dfb9c8c67f479a
2021-11-03 12:40:00 -07:00
Matthias Reis
88c61b8d06 Added forward derivatives for neg, diag, inverse, linalg_eig (#67339)
Summary:
See also discussion in https://github.com/pytorch/pytorch/issues/10223, starting from [this](https://github.com/pytorch/pytorch/issues/10223#issuecomment-949499666) comment

The formulas for the derivatives are taken from https://people.maths.ox.ac.uk/gilesm/files/NA-08-01.pdf.

As indicated, the method linalg_eig_jvp should be used instead of linalg_eig_jvp_eigenvalues and linalg_eig_jvp_eigenvectors in the future. Due to a codegen limitation, this is not yet possible.

CC albanD Lezcano

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67339

Reviewed By: ejguan

Differential Revision: D32104006

Pulled By: albanD

fbshipit-source-id: 1f6ace09ee3e737b99520543b30550601809ceb5
2021-11-03 11:21:54 -07:00
Nikita Vedeneev
3c61700cf7 torch.linalg.householder_product: forward AD support (#67043)
Summary:
As per title.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/67043

Reviewed By: VitalyFedyunin

Differential Revision: D31897617

Pulled By: albanD

fbshipit-source-id: ef135fe3d9e5b9b2a541c355017f07cdb1309979
2021-10-26 08:34:00 -07:00
lezcano
d3fc3c4ded Implement forward AD for linalg.matrix_exp (#62716)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62716

cc jianyuh nikitaved pearu mruberry walterddr IvanYashchuk xwang233 Lezcano

Test Plan: Imported from OSS

Reviewed By: anjali411

Differential Revision: D31823231

Pulled By: mruberry

fbshipit-source-id: 6d19b8988dce773b5716f0522d06febfe167fead
2021-10-21 23:55:36 -07:00
lezcano
0974215c4d Prefer mT and mH over transpose(-2, -1) and transpose(-2, -1).conj() (#64181)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64181

This PR replaces all the calls to:
- `transpose(-2, -1)` or `transpose(-1, -2)` by `mT()` in C++ and `mT` in Python
- `conj().transpose(-2, -1)` or `transpose(-2, -1).conj()` or `conj().transpose(-1, -2)` or `transpose(-1, -2).conj()` by `mH()` in C++ and `mH` in Python.

It also simplifies two pieces of code, and fixes one bug where a pair
of parentheses were missing in the function `make_symmetric_matrices`.
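
As a quick check of the equivalences:

```python
import torch

A = torch.randn(2, 3, 4, dtype=torch.complex64)
print(torch.equal(A.mT, A.transpose(-2, -1)))
print(torch.equal(A.mH, A.transpose(-2, -1).conj()))
```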

Test Plan: Imported from OSS

Reviewed By: H-Huang

Differential Revision: D31692896

Pulled By: anjali411

fbshipit-source-id: e9112c42343663d442dc5bd53ff2b492094b434a
2021-10-18 13:02:25 -07:00
Nikita Vedeneev
7fad47e522 torch.linalg.lstsq: forward/backward AD support (#65054)
Summary:
As per title.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65054

Reviewed By: zou3519

Differential Revision: D31729468

Pulled By: albanD

fbshipit-source-id: ab7df824bc80128e7f64f6444c7a4baa4786c161
2021-10-18 11:28:44 -07:00
Nikita Vedeneev
06c37876b8 torch.linalg.householder_product faster backward (#63880)
Summary:
This PR implements a much more efficient algorithm, which achieves massive speed-ups, especially for batched and/or larger double-precision inputs.
Here are some benchmarks:

<details>

<summary>Testing script</summary>

```python
from IPython import get_ipython
import torch
import itertools

torch.manual_seed(13)
#torch.set_num_threads(1)

ipython = get_ipython()

cpu = torch.device('cpu')
cuda = torch.device('cuda')

def generate_input(shape, dtype=torch.double, device=cpu):
    eigvals = torch.rand(*shape[:-1], dtype=dtype, device=device)
    eigvecs = torch.rand(*shape, dtype=dtype, device=device)
    input = (eigvecs * eigvals.unsqueeze(-2)) @ eigvecs.inverse()
    input.requires_grad_(True)
    tau = torch.rand(*shape[:-1], dtype=dtype, device=device)
    tau.requires_grad_(True)
    return input, tau

def run_test(shape, device, dtype):
    print(f"shape: {shape}, device: {device}, dtype: {dtype}")
    a, tau = generate_input(shape, dtype=dtype, device=device)
    prod = torch.linalg.householder_product(a, tau)
    ones_prod = torch.ones_like(prod)

    command = "torch.autograd.backward((prod,), (ones_prod), retain_graph=True)"
    if device == cuda:
        command = command + "; torch.cuda.synchronize()"
    ipython.magic(f"timeit {command}")
    print()

dtypes = [torch.float, torch.double]
devices = [cpu, cuda]
#devices = [cuda]
sizes = [
    (10, 10),
    (1000, 10, 10),
    (100, 100),
    (1000, 100, 100),
    (1000, 1000),
    (10, 1000, 1000),
]

for device, dtype, size in itertools.product(devices, dtypes, sizes):
    run_test(size, device, dtype)

```

</details>

<details>

<summary>This PR, cuda float32</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float32
1.33 ms ± 1.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float32
1.52 ms ± 40.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (100, 100), device: cuda, dtype: torch.float32
10.8 ms ± 9.62 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float32
127 ms ± 8.45 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 1000), device: cuda, dtype: torch.float32
151 ms ± 127 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float32
981 ms ± 91.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

```

</details>

<details>

<summary>Master, cuda float32</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float32
1.64 ms ± 6.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float32
298 ms ± 463 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (100, 100), device: cuda, dtype: torch.float32
15.4 ms ± 41.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float32
5.36 s ± 711 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float32
1.64 s ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float32
15.7 s ± 43.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```

</details>

<details>

<summary>This PR, cuda float64</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float64
1.14 ms ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float64
2.22 ms ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cuda, dtype: torch.float64
10.6 ms ± 11.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float64
287 ms ± 84.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float64
236 ms ± 41.9 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float64
1.88 s ± 88.3 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cuda float64</summary>

```
shape: (10, 10), device: cuda, dtype: torch.float64
1.58 ms ± 8.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cuda, dtype: torch.float64
308 ms ± 213 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (100, 100), device: cuda, dtype: torch.float64
79 ms ± 14.5 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cuda, dtype: torch.float64
54.2 s ± 1.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cuda, dtype: torch.float64
31.5 s ± 698 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cuda, dtype: torch.float64
4min 45s ± 2.48 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>This PR, cpu float32</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float32
476 µs ± 21.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float32
5.1 ms ± 100 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float32
4.38 ms ± 4.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float32
1.55 s ± 6.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float32
745 ms ± 407 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float32
5.44 s ± 15.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cpu float32</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float32
387 µs ± 645 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float32
12.3 ms ± 23.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float32
39.4 ms ± 80.3 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float32
29.1 s ± 44.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float32
9.42 s ± 14.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float32
1min 50s ± 282 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>This PR, cpu float64</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float64
381 µs ± 761 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float64
6.19 ms ± 13.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float64
4.6 ms ± 3.26 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float64
2.59 s ± 8.25 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float64
1.07 s ± 5.09 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float64
14.4 s ± 13.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

<details>

<summary>Master, cpu float64</summary>

```
shape: (10, 10), device: cpu, dtype: torch.float64
395 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

shape: (1000, 10, 10), device: cpu, dtype: torch.float64
14.6 ms ± 9.76 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

shape: (100, 100), device: cpu, dtype: torch.float64
45.5 ms ± 154 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

shape: (1000, 100, 100), device: cpu, dtype: torch.float64
33.1 s ± 69.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (1000, 1000), device: cpu, dtype: torch.float64
19.3 s ± 80.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

shape: (10, 1000, 1000), device: cpu, dtype: torch.float64
3min 30s ± 1.29 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

```
</details>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/63880

Reviewed By: soulitzer

Differential Revision: D30639435

Pulled By: anjali411

fbshipit-source-id: 127789943ae56e2f1dd03e0fe76ef7b6db86bcf0
2021-10-15 09:54:30 -07:00
Peter Bell
5f45927d15 Autograd: Delay warnings until the end of backward execution (#66235)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/50209

This adds a new warning handler that stores all warnings in a shared
queue, which can be "replayed" at a later time and, crucially, on
another thread. Then, I use this inside the autograd engine to ensure
that warnings are processed by the handler registered on the main
thread.

For testing, I also add an operator that always warns in the backward
pass and test that the warning is a normal Python warning.
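
A minimal Python-level analogue of that test (the added operator warns from C++; here an autograd `Function` that warns in `backward` stands in for it):

```python
import warnings
import torch

class WarnsInBackward(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        warnings.warn("warning raised during the backward pass")
        return grad_output

x = torch.randn(3, requires_grad=True)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    WarnsInBackward.apply(x).sum().backward()
print([str(w.message) for w in caught])  # surfaces as a normal Python warning
```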

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66235

Reviewed By: ejguan

Differential Revision: D31505413

Pulled By: albanD

fbshipit-source-id: 1a7f60b038f55c20591c0748b9e86735b3fec2f9
2021-10-13 15:38:04 -07:00
Nikita Vedeneev
1b40daac74 pinv: forward/backward AD which is Frechet-defined in a rank-preserving neighborhood. (#66092)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/65911. Also enables complex support/tests for `linalg_pinv` in OpInfo.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66092

Reviewed By: ejguan

Differential Revision: D31503072

Pulled By: albanD

fbshipit-source-id: 52018e826826ae62beaad76becb5edf880be253f
2021-10-11 08:33:28 -07:00
Nikita Vedeneev
1d586e78c6 *_solve methods: implements forward AD (#65546)
Summary:
This PR adds forward AD for `*_solve` methods.
Additionally, `cholesky_solve` gets an OpInfo plus a bug fix for a case where wrong leading dimensions could be passed to LAPACK,
and `lu_solve` gets forward AD with 2x `lu_solve` instead of 1x `lu_solve` + 2x `triangular_solve`.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65546

Reviewed By: dagitses

Differential Revision: D31431847

Pulled By: albanD

fbshipit-source-id: 0e343e0d9da3c3d2051fca215fad289d77275251
2021-10-06 16:04:22 -07:00
soulitzer
4cdfceddd2 [Reland] Avoid saving self for softmax and log_softmax (#66018)
Summary:
Reland of https://github.com/pytorch/pytorch/pull/65242

The last attempt at the reland was automatically rebased onto stable, which did not yet have the revert commit.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/66018

Reviewed By: albanD

Differential Revision: D31348822

Pulled By: soulitzer

fbshipit-source-id: 881d701b404530c1352ac9245bd67264e1652b8a
2021-10-03 21:35:01 -07:00
Michael Suo
9ae63bd87c Revert D31238123: [pytorch][PR] Avoid saving self for softmax and log_softmax
Test Plan: revert-hammer

Differential Revision:
D31238123 (fb412bdd80)

Original commit changeset: afd319d3676d

fbshipit-source-id: b7980d653a4b8322a225f1dd08c2857ecbe5bc94
2021-09-30 11:34:14 -07:00
soulitzer
fb412bdd80 Avoid saving self for softmax and log_softmax (#65242)
Summary:
Fixes https://github.com/pytorch/pytorch/issues/64000
 - updates the double backward formula to compute the grad w.r.t. the output instead of self (see the sketch below)
 - ~~In some of the error messages, we still refer to the dtype of the input, even though we are now checking the dtype of the output~~
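
The key identity is that the softmax VJP can be written purely in terms of the output `y`, which is why the input never needs to be saved. A quick numeric check:

```python
import torch

x = torch.randn(4, requires_grad=True)
y = torch.softmax(x, dim=0)
g = torch.randn(4)  # upstream gradient dL/dy

# dL/dx = y * (g - sum(g * y))
manual = y * (g - (g * y).sum(dim=0, keepdim=True))
auto, = torch.autograd.grad(y, x, g)
print(torch.allclose(manual, auto))
```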

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65242

Reviewed By: albanD

Differential Revision: D31238123

Pulled By: soulitzer

fbshipit-source-id: afd319d3676d9ef8d81607e0e8c2a3e6d09f68e4
2021-09-29 18:16:12 -07:00
Mike Ruberry
0a0564a347 Revert D31206837: [pytorch][PR] *_solve methods: implements forward AD
Test Plan: revert-hammer

Differential Revision:
D31206837 (26e31f76b0)

Original commit changeset: 040beda97442

fbshipit-source-id: f28091327357af9f54f367eda6606240924b93ac
2021-09-28 23:31:16 -07:00
Nikita Vedeneev
26e31f76b0 *_solve methods: implements forward AD (#65546)
Summary:
This PR adds forward AD for `*_solve` methods.
Additionally, `cholesky_solve` gets an OpInfo plus a bug fix for a case where wrong leading dimensions could be passed to LAPACK,
and `lu_solve` gets forward AD with 2x `lu_solve` instead of 1x `lu_solve` + 2x `triangular_solve`.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/65546

Reviewed By: gchanan

Differential Revision: D31206837

Pulled By: albanD

fbshipit-source-id: 040beda97442e7a88a9df9abc7bb18313ce55bc3
2021-09-28 06:51:32 -07:00
Ivan Yashchuk
0aef44cb3d Add forward AD for torch.linalg.eigh (#62163)
Summary:
This PR adds forward mode differentiation for `torch.linalg.eigh` and a few other functions required for tests to pass.

For some reason running tests for `torch.linalg.eigvalsh` and complex `torch.linalg.eigh` hangs. These tests are skipped for now.

cc ezyang albanD zou3519 gqchen pearu nikitaved soulitzer Lezcano Varal7 jianyuh mruberry heitorschueroff walterddr IvanYashchuk xwang233

Pull Request resolved: https://github.com/pytorch/pytorch/pull/62163

Reviewed By: jbschlosser

Differential Revision: D30903988

Pulled By: albanD

fbshipit-source-id: d6a74adb9e6d2f4be8ac707848ecabf06d629823
2021-09-13 21:15:38 -07:00
Nikita Vedeneev
88fff22023 torch.lu: forward AD support (#64742)
Summary:
As per title.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/64742

Reviewed By: H-Huang

Differential Revision: D30841227

Pulled By: albanD

fbshipit-source-id: dc4d043ab94358594adb110fbbbb60750c98262a
2021-09-10 07:19:11 -07:00