# Summary
In preparation for the PyTorch 2.0 launch, this PR updates SDPA's API and makes the function a public `nn.functional` function.
## Changes
### API
Previously the function signature was:
`scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)`
Updated signature:
`scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor`
This PR removes the `need_attn_weights` optional boolean argument and updates the return type to a single tensor.
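The semantics of the updated single-tensor return can be sketched in plain Python (lists of vectors standing in for tensors; the helper names here are illustrative, not the actual fused kernels):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(query, key, value):
    """Reference semantics of the updated API: softmax(QK^T / sqrt(d)) V,
    returning only the attention output, never an (output, weights) tuple."""
    d = len(query[0])
    out = []
    for q in query:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in key]
        weights = softmax(scores)  # materialized here; fused kernels avoid this
        out.append([sum(w * v[j] for w, v in zip(weights, value))
                    for j in range(len(value[0]))])
    return out  # single result, matching the new signature
```

With a single key, the weights collapse to `[1.0]`, so the output equals the value row, which makes the reduced return type easy to check by hand.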
#### Reasoning:
The main goal of this function is to provide an easy interface for users to call into fused attention kernels, e.g. FlashAttention. The fused kernels do not currently support arbitrary `attn_mask` or dropout, but there is a PR to mem-efficient attention to enable these. We want to have the API surface ready for when the backing kernels get updated.
The fused kernels save on memory usage by not materializing the attention weights, and it is unlikely that a fast fused implementation will ever support returning them, so we are removing that option.
Discussed with folks at FAIR/xFormers, who +1'd this API change.
#### Make function Public
In preparation for the PyTorch 2.0 launch, we make the function public to start generating user feedback.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92189
Approved by: https://github.com/cpuhrsch
Summary: This diff modifies the implementation of the select operator so slices of the irregular dimension can be selected (e.g. nt[:,0,:]).
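A plain-Python sketch of what selecting along the irregular dimension means, with a nested tensor modeled as a list of variable-length 2-D components (`nested_select` is an illustrative name, not the actual operator):

```python
def nested_select(components, index):
    """Select `index` along the irregular dimension of a nested tensor
    represented as a list of 2-D components (lists of rows).
    Mirrors nt[:, index, :]: one row is taken from every component."""
    out = []
    for i, comp in enumerate(components):
        if not -len(comp) <= index < len(comp):
            # Components have different lengths, so the index must be
            # validated per component, not against a single shape.
            raise IndexError(f"index {index} out of range for component {i}")
        out.append(comp[index])
    return out
```

Because each component has its own length along the ragged dimension, bounds checking happens per component rather than against one global size.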
Test Plan:
Added new unit tests to test that the new functions work as intended (see them in diff). To test,
`buck test mode/dev-nosan //caffe2/test:nested`
Differential Revision: D41083993
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88585
Approved by: https://github.com/cpuhrsch
Summary: This diff merges both previous implementations of constructors for nested tensors (the one from lists of tensors and the one from arbitrary Python lists) and implements the merged constructor in PyTorch core, so no extensions are needed to construct NTs.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88213
Approved by: https://github.com/cpuhrsch
Fixes #87713
bmm for CPU supports non-contiguous nested tensor inputs, while bmm for CUDA currently does not.
The derivative for BMM:
```
- name: bmm(Tensor self, Tensor mat2) -> Tensor
self: grad.bmm(mat2.transpose(1, 2).conj())
mat2: self.transpose(1, 2).conj().bmm(grad)
result: self_t.bmm(mat2_p) + self_p.bmm(mat2_t)
```
When calling backward, it was impossible for this function to succeed, since the transposed inputs were always non-contiguous regardless of the user input. This PR adds contiguous() calls to the bmm_cuda implementation for nested tensors.
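The derivative formula above can be sketched in plain Python to show where the non-contiguous operands come from (real inputs, so the `conj()` calls are omitted; all names here are illustrative):

```python
def matmul(a, b):
    """Plain-Python 2-D matrix multiply on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a 2-D list. On real tensors this is a view, and it is
    exactly the operand that arrives non-contiguous at the CUDA kernel."""
    return [list(row) for row in zip(*a)]

def bmm(batch_a, batch_b):
    """Batched matmul over matching lists of matrices."""
    return [matmul(a, b) for a, b in zip(batch_a, batch_b)]

def bmm_backward(grad, batch_self, batch_mat2):
    """Backward of bmm per the derivative formula:
    grad_self = grad @ mat2^T, grad_mat2 = self^T @ grad."""
    grad_self = bmm(grad, [transpose(m) for m in batch_mat2])
    grad_mat2 = bmm([transpose(s) for s in batch_self], grad)
    return grad_self, grad_mat2
```

Since both backward inputs are transposes, every backward call feeds the kernel non-contiguous data, which is why materializing contiguous copies inside the CUDA path fixes the failure.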
This was not caught by tests because grad_check is currently only run on CPU in test_nestedtensor. This PR updates the autograd tests to also run on GPU.
As a result, I found one more issue: the backward for to_padded_tensor errors instead of calling the generic version.
cc @cpuhrsch @jbschlosser @bhosmer @mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88108
Approved by: https://github.com/cpuhrsch
Summary: This diff implements copy_ in order to allow pinned memory transfers for nested tensors, as well as fill_ and ones_like, to test whether nested tensors can be created with other factory functions.
Test Plan: Pass all CI and sandcastle jobs.
Reviewed By: mikekgfb
Differential Revision: D40689594
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87728
Approved by: https://github.com/cpuhrsch
Summary: This commit adds support for moving NestedTensors from CPU to GPU and back. The implementation requires implementing empty_like(), which is based on PR #83140.
Test Plan: Added a new unit test based on the unit test for the main .to() implementation. All unit tests must pass, as well as every sandcastle job.
Differential Revision: D40437585
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87146
Approved by: https://github.com/drisspg
Summary: In order to make the layer normalization implementation for nested tensors public, it needs to be generalized to accept a normalized_shape argument instead of assuming it to be the last dimension of the nested_tensor. This commit does that, as well as adding extra unit tests to ensure the implementation is correct.
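A plain-Python sketch of the generalization, for the 1-D `normalized_shape` case: the trailing dimension is validated against the requested shape instead of being silently assumed (illustrative only, not the real kernel):

```python
import math

def layer_norm(rows, normalized_shape, eps=1e-5):
    """Normalize each row over its trailing dimension, validating that
    dimension against the caller-supplied normalized_shape rather than
    assuming the last dimension of the nested tensor."""
    (dim,) = normalized_shape  # this sketch handles 1-D normalized_shape only
    out = []
    for row in rows:
        if len(row) != dim:
            raise ValueError(f"normalized_shape {normalized_shape} does not "
                             f"match trailing dimension {len(row)}")
        mean = sum(row) / dim
        var = sum((x - mean) ** 2 for x in row) / dim
        out.append([(x - mean) / math.sqrt(var + eps) for x in row])
    return out
```

For a nested tensor the components may have different numbers of rows, but they share the trailing dimension, so a single `normalized_shape` check covers all of them.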
Test Plan:
All unit tests designed to test different ways of using the function work:
`buck test //caffe2/test:nested -- test_layer_norm`
Differential Revision: D40105207
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86295
Approved by: https://github.com/drisspg
### this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension
Behavior before this PR:
1. `-1` allowed for implicit batch dimension
2. multiple `-1`s allowed for pre-existing dimensions
3. for new dimensions, `-1` is not allowed
It is worth noting that case 3 is basically unreachable: assuming a nested tensor has at least one ragged dimension, you would expect at least one `-1` in the proposed shape for the pre-existing dimensions.
Behavior after this PR:
1. batch dimension **must be specified**
2. **only one** `-1` allowed for pre-existing dimensions **this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension**
3. unchanged
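The post-PR rule for a single component can be sketched in plain Python: the batch dimension must already be given explicitly, and at most one `-1` (the single ragged dimension) is inferred from the element count (`resolve_shape` is an illustrative helper, not the actual implementation):

```python
def resolve_shape(numel, proposed):
    """Resolve a proposed per-component shape in which at most one entry
    is -1, mirroring the post-PR rule: only one -1, i.e. one ragged
    dimension, may be inferred."""
    if proposed.count(-1) > 1:
        raise ValueError("only one -1 (one ragged dimension) is allowed")
    known = 1
    for d in proposed:
        if d != -1:
            known *= d
    if -1 in proposed:
        if numel % known:
            raise ValueError(f"cannot infer -1: {numel} elements is not "
                             f"divisible by {known}")
        return [numel // known if d == -1 else d for d in proposed]
    if known != numel:
        raise ValueError("shape does not match number of elements")
    return list(proposed)
```

Allowing a second `-1` would make the factorization ambiguous per component, which is exactly why multiple ragged dimensions are rejected.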
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85691
Approved by: https://github.com/cpuhrsch
Previously, indexing a nested tensor when it requires_grad would raise an error because the backward formula for `select.int` uses `self.sizes()`. This PR fixes that by temporarily registering a `_nested_select_backward` function, which can be removed once we start using the SymInt approach to register kernels. For now, this functionality is needed for a POC that nested tensor can be an API for `segment_coo` and `segment_csr` in the torch_scatter repo.
```
a = torch.arange(10).reshape(2, 5).float()
b = torch.arange(12).reshape(2, 6).float()
nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(True)
nt[0]
# RuntimeError: Internal error: NestedTensorImpl doesn't support sizes. Please file an issue on https://github.com/pytorch/nestedtensor
```
whereas
```
nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(False)
nt[0]
```
would succeed
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83875
Approved by: https://github.com/albanD, https://github.com/drisspg
# Summary
Trying to do some cleanup of the code structure for nested tensors. This introduces a utility header and .cpp file that implement helper functions.
This is the initial PR in a larger cleanup. The next step would be separating out all the native functions that create nested tensors into their own file, since they do not in fact do math on nested tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84385
Approved by: https://github.com/mikaylagawarecki
## Summary
Add detach op for nested tensors. Nested tensors are not part of the composite explicit dispatch key set and therefore need to be added manually.
The detach test is failing only for dtype=torch.float32/torch.float16 and device=cuda. The chain of ops called is sum.backward() -> from_padded() -> unbind(), which populates the grad for a and b.
Does this potentially indicate that the CUDA implementation for one of these ops, likely from_padded(), is incorrect?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84078
Approved by: https://github.com/albanD
When the initial version came out, `NestedTensor` was not included in the `CompositeImplicitAutograd` key set, so we had to register dropout_nested to dropout and make it forward-only. Now is the time to improve it!
This PR removes dropout_nested; instead, native_dropout_nested is implemented along with native_dropout_backward_nested.
Side change: remove dropout__nested, since @cpuhrsch suggested leaving out nested in-place ops for now.
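The forward/backward pairing that makes dropout differentiable can be sketched in plain Python on flat lists (names and the seeded RNG are illustrative, not the real op):

```python
import random

def native_dropout(xs, p, seed=0):
    """Sketch of dropout forward: zero each element with probability p and
    scale survivors by 1/(1-p). Returns (output, mask) so the backward
    pass can reuse the same mask, as a native_dropout-style op does."""
    rng = random.Random(seed)  # seeded here only to keep the sketch deterministic
    scale = 1.0 / (1.0 - p)
    mask = [rng.random() >= p for _ in xs]
    out = [x * scale if keep else 0.0 for x, keep in zip(xs, mask)]
    return out, mask

def native_dropout_backward(grad, mask, scale):
    """Backward: route gradients only through kept elements, same scale."""
    return [g * scale if keep else 0.0 for g, keep in zip(grad, mask)]
```

Carrying the mask from forward to backward is what a forward-only registration cannot do, which is why the paired native ops replace the old dropout_nested.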
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83338
Approved by: https://github.com/jbschlosser
Per offline discussion, this will be updated to use expand once expand semantics for nested tensor have been fleshed out.
Next steps will be to add support for other features for forward sum mentioned on #82387 and likewise update the backward
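The expand-style backward being discussed can be sketched in plain Python, with the nested tensor as a list of 2-D components: summing over the ragged dimension forward, and broadcasting the gradient row back to each component's ragged length on the way back (illustrative helpers only):

```python
def nested_sum_dim1(components):
    """Forward: sum each 2-D component over its ragged dimension,
    producing one row per component."""
    return [[sum(col) for col in zip(*comp)] for comp in components]

def nested_sum_dim1_backward(grad_rows, components):
    """Backward: broadcast each gradient row back to the component's
    ragged length (the expand-like step referred to above)."""
    return [[list(g) for _ in comp] for g, comp in zip(grad_rows, components)]
```

Each component needs its own broadcast length, which is why proper expand semantics for nested tensors are a prerequisite for replacing this hand-rolled step.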
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625
Approved by: https://github.com/albanD