Commit Graph

92 Commits

Author SHA1 Message Date
Driss Guessous
5612aa6acd Fixes a layer_norm_nested backwards edge case. (#96788)
# Summary
Add a test and the fix for the case where the input NT to layernorm doesn't require grad.
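A minimal sketch of the edge case (the shapes and the padded-sum reduction here are illustrative, not from the PR):

```
import torch
import torch.nn as nn

# Edge case: the NT input itself does not require grad, but the
# LayerNorm parameters do, so backward must still produce their grads.
ln = nn.LayerNorm(4)
nt = torch.nested.nested_tensor([torch.randn(2, 4), torch.randn(3, 4)])

out = ln(nt)
loss = torch.nested.to_padded_tensor(out, 0.0).sum()
loss.backward()  # grads flow to ln.weight / ln.bias only
```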
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96788
Approved by: https://github.com/cpuhrsch
2023-03-15 17:16:13 +00:00
Joel Schlosser
30d56dd8c1 Support randn_like() for NT (#96528)
To satisfy an internal ask.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96528
Approved by: https://github.com/mikaylagawarecki, https://github.com/cpuhrsch
2023-03-13 19:39:51 +00:00
Joel Schlosser
024ea1a21e Support zeros_like() for NT (#96527)
This is used for the fake tensor fallbacks.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96527
Approved by: https://github.com/cpuhrsch
2023-03-13 15:15:08 +00:00
Mikayla Gawarecki
6930f30ccd Small bugfix in nested matmul bmm path head_dim acquisition (#95744)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95744
Approved by: https://github.com/drisspg
2023-03-01 03:27:08 +00:00
Joel Schlosser
68eec90cfd Support elementwise add / mul for [B, *] nested, [B, 1] dense (CUDA only) (#95620)
Small hack to reuse the 3D custom kernel from #88289 for [B, *] nested, [B, 1] dense elementwise add / mul. Simply treat the inputs as [B, *, 1], [B, 1, 1]. This is added to satisfy an internal ask.

Future work: full general broadcasting support between mixed nested / dense.
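A minimal sketch of the supported pattern (CUDA-only per the commit; shapes are illustrative):

```
import torch

# [B, *] nested plus [B, 1] dense is treated internally as [B, *, 1] + [B, 1, 1],
# so each nested component i gets the scalar dense[i, 0] broadcast over it.
a = torch.randn(3, device="cuda")
b = torch.randn(5, device="cuda")
nt = torch.nested.nested_tensor([a, b])    # nested [B, *], B = 2
dense = torch.randn(2, 1, device="cuda")   # dense [B, 1]

added = nt + dense
scaled = nt * dense
```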

Pull Request resolved: https://github.com/pytorch/pytorch/pull/95620
Approved by: https://github.com/cpuhrsch, https://github.com/drisspg
2023-02-27 21:07:09 +00:00
Huy Do
9b7abc4fac Run slow gradcheck tests sequentially (#95494)
Also redo https://github.com/pytorch/pytorch/pull/95246, as there are many more tests that still run OOM
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95494
Approved by: https://github.com/clee2000
2023-02-26 00:44:25 +00:00
Driss Guessous
0d7913c9c1 add backwards for layer norm nested (#94781)
Fixes #94702

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94781
Approved by: https://github.com/cpuhrsch
2023-02-16 01:42:57 +00:00
Driss Guessous
63bf7674fa add backwards for gelu and relu on nested tensors. (#94776)
Fixes #94701

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94776
Approved by: https://github.com/cpuhrsch
2023-02-14 18:42:06 +00:00
Mikayla Gawarecki
c7c7238976 Fix bug in unsqueeze_nested stride calculation (#88688)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88688
Approved by: https://github.com/cpuhrsch
2023-02-10 17:00:04 +00:00
Driss Guessous
bef2483ed8 [NestedTensor] Call contiguous in linear backward (#94317)
Fixes #94303

If the upstream grad passed to linear_backward was discontiguous, we would throw a torch check. This updates the implementation to instead call contiguous() and changes the check to an internal assert.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94317
Approved by: https://github.com/mikaylagawarecki
2023-02-07 23:43:46 +00:00
Driss Guessous
df14650f0b [SDPA] Update SDPA API and make function Public (#92189)
# Summary
In preparation for the PT 2.0 launch, this PR updates SDPA's API and makes the function a public nn.functional function.

## Changes
### API
Previously the function signature was:
`scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)`
Updated signature:
`scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor`

This PR removes the optional boolean `need_attn_weights` argument and updates the return type to a single tensor.
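For illustration, a minimal call against the updated signature (shapes are arbitrary):

```
import torch
import torch.nn.functional as F

# (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 128, 64)
k = torch.randn(2, 8, 128, 64)
v = torch.randn(2, 8, 128, 64)

# Returns a single Tensor; attention weights are never materialized.
out = F.scaled_dot_product_attention(q, k, v, attn_mask=None,
                                     dropout_p=0.0, is_causal=False)
```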

#### Reasoning:
The main goal of this function is to provide an easy interface for users to call into fused attention kernels, e.g. FlashAttention. The fused kernels do not currently support arbitrary attn_mask or dropout, but there is a PR for mem-efficient attention to enable these. We want to have the API surface ready for when the backing kernels get updated.

The fused kernels save on memory usage by not materializing the weights, and it is unlikely that a fast fused implementation will enable this feature, so we are removing it.

Discussed with folks at FAIR/xFormers, who +1'd this API change.

#### Make function Public
In preparation for the PT 2.0 launch, we make the function public to start generating user feedback.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92189
Approved by: https://github.com/cpuhrsch
2023-01-23 20:50:46 +00:00
Mikayla Gawarecki
5848704ef8 Removed unnecessary check in select_nested (#89150)
The implementation in #88585 should work for all dimensions. Removed the unnecessary check that constrained select to dims 0 and 1.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89150
Approved by: https://github.com/cpuhrsch
2022-11-16 22:11:37 +00:00
Antoni Viros i Martin
9f58e027a9 Add implementation for irregular dimension selection for nested tensors. (#88585)
Summary: This diff modifies the implementation of the select operator so slices of the irregular dimension can be selected (e.g. nt[:,0,:]).
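A sketch of the newly supported indexing, assuming components that are ragged in dim 1:

```
import torch

a = torch.randn(3, 4)
b = torch.randn(5, 4)
nt = torch.nested.nested_tensor([a, b])  # ragged in dim 1

# Select index 0 along the irregular dimension, per the nt[:, 0, :]
# example above: each component contributes its row 0.
row0 = nt[:, 0, :]
```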

Test Plan:
Added new unit tests to test that the new functions work as intended (see them in diff). To test,
`buck test mode/dev-nosan //caffe2/test:nested`

Differential Revision: D41083993

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88585
Approved by: https://github.com/cpuhrsch
2022-11-09 00:19:38 +00:00
Antoni Viros i Martin
c77368d416 Implement a constructor for nested_tensor that is similar to torch.tensor() (#88213)
Summary: This diff merges both previous implementations of constructors for nested tensors, the one from lists of tensors and the one with arbitrary Python lists, and implements it in PyTorch core so no extensions are needed to construct NTs.
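A sketch of the two merged construction paths (values are illustrative):

```
import torch

# From a list of tensors...
nt_a = torch.nested.nested_tensor([torch.randn(2, 5), torch.randn(3, 5)])

# ...or from arbitrary (ragged) Python lists, torch.tensor()-style.
nt_b = torch.nested.nested_tensor([[0.0, 1.0], [2.0, 3.0, 4.0]])
```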

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88213
Approved by: https://github.com/cpuhrsch
2022-11-08 00:03:18 +00:00
Christian Puhrsch
5e6ceebccb Add support for neg to NestedTensor (#88131)
Partially fixes #86889

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88131
Approved by: https://github.com/drisspg
2022-11-03 15:15:57 +00:00
Mikayla Gawarecki
d979caa87c Added add/mul for nested dense [B, *, D], [B, 1, D] case (CUDA-only) (#88289)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88289
Approved by: https://github.com/cpuhrsch
2022-11-03 01:29:25 +00:00
Christian Puhrsch
943b20e7ae Use tensor cores for NT bmm (#86856)
Copy of internal diff.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86856
Approved by: https://github.com/drisspg
2022-11-02 21:51:40 +00:00
PyTorch MergeBot
99c07735e4 Revert "Add support for neg to NestedTensor (#88131)"
This reverts commit 6a75a0d1a1.

Reverted https://github.com/pytorch/pytorch/pull/88131 on behalf of https://github.com/mehtanirav due to [Internal breakages](https://www.internalfb.com/intern/sandcastle/job/13510799692239080/insights)
2022-11-02 18:43:36 +00:00
Driss Guessous
560786ac20 call contiguous on BMM inputs for NT on CUDA (#88108)
Fixes #87713

BMM for CPU supports non-contiguous nested tensor inputs, while BMM for CUDA currently does not.

The derivative for BMM:
```
- name: bmm(Tensor self, Tensor mat2) -> Tensor
  self: grad.bmm(mat2.transpose(1, 2).conj())
  mat2: self.transpose(1, 2).conj().bmm(grad)
  result: self_t.bmm(mat2_p) + self_p.bmm(mat2_t)
```

When calling backward, it was impossible for this function to succeed, since the inputs were always discontiguous regardless of the user input. This adds contiguous calls to the BMM_cuda implementation for nested tensors.

This was not caught by tests because gradcheck is currently only run on CPU in test_nestedtensor. This PR updates the autograd test to also run on GPU.

As a result, I found one more issue: the backward for to_padded_tensor errored instead of calling the generic version.

cc @cpuhrsch @jbschlosser @bhosmer @mikaylagawarecki
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88108
Approved by: https://github.com/cpuhrsch
2022-11-01 03:14:27 +00:00
Christian Puhrsch
6a75a0d1a1 Add support for neg to NestedTensor (#88131)
Partially fixes #86889

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88131
Approved by: https://github.com/drisspg
2022-11-01 02:37:42 +00:00
Christian Puhrsch
b192e7e415 Support non-contiguous NestedTensors for elementwise ops (#87888)
Enables benchmarking of the math path of the SDP kernel.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87888
Approved by: https://github.com/drisspg
2022-10-28 11:26:17 +00:00
Antoni Viros i Martin
775fef51b7 Implement copy_, fill_, and ones_like for Nested Tensors backends (#87728)
Summary: This diff implements copy_ in order to allow pinned memory transfers for nested tensors, as well as fill_ and ones_like, to test whether nested tensors can be created with other factory functions.
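A sketch of the three ops under stated assumptions (a CUDA device; source and destination share the same nested structure):

```
import torch

# copy_: pinned host memory -> device, preserving the nested structure.
src = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)],
                                 pin_memory=True)
dst = torch.nested.nested_tensor([torch.empty(2, 3), torch.empty(4, 3)],
                                 device="cuda")
dst.copy_(src, non_blocking=True)

# The factory ops added alongside it.
ones = torch.ones_like(src)
src.fill_(0.0)
```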

Test Plan: Pass all CI and sandcastle jobs.

Reviewed By: mikekgfb

Differential Revision: D40689594

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87728
Approved by: https://github.com/cpuhrsch
2022-10-26 14:48:27 +00:00
Antoni Viros i Martin
d94e33f041 Add support for .to() for NestedTensor backends (#87146)
Summary: This commit adds support for moving NestedTensors from CPU to GPU and back. The implementation requires implementing empty_like(), which is based on PR #83140.
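A minimal sketch of the round trip (assumes a CUDA device):

```
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])
nt_cuda = nt.to("cuda")      # CPU -> GPU
nt_back = nt_cuda.to("cpu")  # GPU -> CPU
```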

Test Plan: Added a new unit test based on the unit test for the main .to() implementation. All unit tests must pass, as well as every sandcastle job.

Differential Revision: D40437585

Pull Request resolved: https://github.com/pytorch/pytorch/pull/87146
Approved by: https://github.com/drisspg
2022-10-20 03:46:50 +00:00
Christian Puhrsch
f6c6048b10 Use CUTLASS GEMM for NT bmm (#85894)
Copy of https://github.com/pytorch/pytorch/pull/85710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85894
Approved by: https://github.com/drisspg
2022-10-18 23:11:47 +00:00
Mikayla Gawarecki
ab69550678 Add nested squeeze.dim and unsqueeze (#86813)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86813
Approved by: https://github.com/drisspg
2022-10-13 16:05:36 +00:00
PyTorch MergeBot
d169f950da Revert "Use CUTLASS GEMM for NT bmm [OSS-only] (#85894)"
This reverts commit ef58a132f2.

Reverted https://github.com/pytorch/pytorch/pull/85894 on behalf of https://github.com/DanilBaibak due to Break internal build
2022-10-13 15:28:09 +00:00
Mikayla Gawarecki
2a75152537 [easy] Add nested tanh (#86826)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86826
Approved by: https://github.com/cpuhrsch
2022-10-13 00:48:08 +00:00
Christian Puhrsch
ef58a132f2 Use CUTLASS GEMM for NT bmm [OSS-only] (#85894)
OSS-only copy of https://github.com/pytorch/pytorch/pull/85710
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85894
Approved by: https://github.com/drisspg
2022-10-12 20:03:28 +00:00
Driss Guessous
16f65f178a Nested tensor forward only chunk operations (#85645)
# Summary

Taking over this PR: https://github.com/pytorch/pytorch/pull/83736

Adding support for chunk without autograd support
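A sketch of the forward-only op, assuming chunking along the last (regular) dimension with an evenly divisible size:

```
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 6), torch.randn(3, 6)])

# Two nested tensors, each with last dim 3; no autograd support yet.
pieces = nt.chunk(2, dim=-1)
```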
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85645
Approved by: https://github.com/cpuhrsch
2022-10-11 01:21:39 +00:00
Antoni Viros i Martin
cdbffa7f66 🦊 [AI Accelerators] Consolidate native_layer_norm for nested tensor (#86295)
Summary: In order to make the layer normalization implementation for nested tensors public, it needs to be generalized to accept a normalized_shape argument instead of assuming it to be the last dimension of the nested_tensor. This commit does that, as well as adding extra unit tests to ensure the implementation is correct.
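A sketch of the generalized call, passing an explicit normalized_shape (here the last dimension):

```
import torch
import torch.nn.functional as F

D = 8
nt = torch.nested.nested_tensor([torch.randn(3, D), torch.randn(5, D)])

weight = torch.ones(D)
bias = torch.zeros(D)
out = F.layer_norm(nt, (D,), weight, bias)
```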

Test Plan:
All unit tests designed to test different ways of using the function work:

`buck test //caffe2/test:nested -- test_layer_norm`

Differential Revision: D40105207

Pull Request resolved: https://github.com/pytorch/pytorch/pull/86295
Approved by: https://github.com/drisspg
2022-10-06 13:10:25 +00:00
Mikayla Gawarecki
01add6e288 Allow only one -1 in nested view/reshape (#85691)
###  this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension

Behavior before this PR:

1. `-1` allowed for implicit batch dimension
2. multiple `-1`s allowed for pre-existing dimensions
3.  for new dimensions, `-1` is not allowed

It is worth noting that case 3 is basically unreachable: assuming a nested tensor has at least one ragged dimension, you would expect at least one -1 in the proposed shape for the pre-existing dimensions.

Behavior after this PR:
1. batch dimension **must be specified**
2. **only one** `-1` allowed for pre-existing dimensions; **this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension**
3. unchanged
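A sketch of the post-PR rules (shapes are illustrative):

```
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 6), torch.randn(3, 6)])  # (B=2, ragged, 6)

# Batch dim given explicitly; exactly one -1, standing in for the single
# ragged dimension; new dims (splitting 6 -> 2 x 3) are fully specified.
out = nt.reshape(2, -1, 2, 3)
```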

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85691
Approved by: https://github.com/cpuhrsch
2022-09-28 22:29:40 +00:00
Mikayla Gawarecki
afaee00fec Add python nested_tensor and as_nested_tensor constructors in torch.nested (#85593)
Remove `torch.nested_tensor`, which has erroneous behavior w.r.t. gradients (the result could be either a leaf or not a leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now, but can move to pybind in the future (when we want to load from numpy/nested lists).

Discussed offline with @cpuhrsch: the pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc.
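A sketch of the two constructors side by side:

```
import torch

a, b = torch.randn(2, 5), torch.randn(3, 5)

# Copy-constructs a leaf nested tensor, in the vein of torch.tensor().
nt = torch.nested.nested_tensor([a, b], requires_grad=True)

# Preserves the autograd history of the inputs, in the vein of torch.as_tensor().
nt2 = torch.nested.as_nested_tensor([a, b])
```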

Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593
Approved by: https://github.com/drisspg, https://github.com/cpuhrsch
2022-09-28 20:15:02 +00:00
PyTorch MergeBot
fc8ba3a92d Revert "Allow only one -1 in nested view/reshape (#85691)"
This reverts commit 4c4e5f6106.

Reverted https://github.com/pytorch/pytorch/pull/85691 on behalf of https://github.com/atalman due to Causes github first merge conflict
2022-09-28 17:22:53 +00:00
Mikayla Gawarecki
4c4e5f6106 Allow only one -1 in nested view/reshape (#85691)
###  this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension

Behavior before this PR:

1. `-1` allowed for implicit batch dimension
2. multiple `-1`s allowed for pre-existing dimensions
3.  for new dimensions, `-1` is not allowed

It is worth noting that case 3 is basically unreachable: assuming a nested tensor has at least one ragged dimension, you would expect at least one -1 in the proposed shape for the pre-existing dimensions.

Behavior after this PR:
1. batch dimension **must be specified**
2. **only one** `-1` allowed for pre-existing dimensions; **this effectively means that we only allow reshaping/viewing of nt with ONE ragged dimension**
3. unchanged

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85691
Approved by: https://github.com/cpuhrsch
2022-09-27 17:16:54 +00:00
Mikayla Gawarecki
5e700803c2 Use fallback approach for nested matmul (#85311)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85311
Approved by: https://github.com/cpuhrsch, https://github.com/drisspg
2022-09-22 21:19:09 +00:00
PyTorch MergeBot
caa0ab557d Revert "Use fallback approach for nested matmul (#85311)"
This reverts commit 7c31f6e672.

Reverted https://github.com/pytorch/pytorch/pull/85311 on behalf of https://github.com/clee2000 due to broke lots of builds 7c31f6e672 even though the pr was green
2022-09-21 22:55:40 +00:00
Mikayla Gawarecki
7c31f6e672 Use fallback approach for nested matmul (#85311)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85311
Approved by: https://github.com/cpuhrsch, https://github.com/drisspg
2022-09-21 22:39:52 +00:00
Mikayla Gawarecki
77f1f98479 Re-introduce torch.Tensor.to_padded_tensor (#85293)
Differential Revision: [D39629004](https://our.internmc.facebook.com/intern/diff/D39629004)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85293
Approved by: https://github.com/cpuhrsch
2022-09-21 18:45:56 +00:00
drisspg
bda8a5729b [Nested Tensor] Create differentiable nt to tensor view functions (#83371)
This PR attempts to implement 2) "the safe way" of creating a view of a nested tensor that returns a regular tensor. The rest of the breakdown is here: https://fb.quip.com/J8QCAx41af11

https://gist.github.com/drisspg/8622e9c97d374fa920ac647e1167cabc
This is a short list of some edge cases. After some more work, I was able to address two of the test cases in the above gist. There are a few complex aspects here, on which I left defeated comments inline.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83371
Approved by: https://github.com/bdhirsh
2022-09-13 20:35:58 +00:00
Mikayla Gawarecki
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in
`torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.
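For illustration, the one migrated function in use (the input is built with the torch.nested.nested_tensor constructor from #85593):

```
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])

# Pad the ragged components out to a dense (2, 4, 3) tensor.
padded = torch.nested.to_padded_tensor(nt, padding=0.0)
```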

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
Mikayla Gawarecki
1cad744694 Enable select.int when NestedTensor requires grad (#83875)
Previously, indexing a nested tensor when it requires grad would raise an error, because the backward formula for `select.int` uses `self.sizes()`. This PR fixes that by temporarily registering a `_nested_select_backward` function, which can be removed when we start using the symint approach to register kernels. For now, this functionality is needed for creating a POC that nested tensor can be an API for `segment_coo` and `segment_csr` in the torch_scatter repo.

```
a = torch.arange(10).reshape(2, 5).float()
b = torch.arange(12).reshape(2, 6).float()
nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(True)
nt[0]
# RuntimeError: Internal error: NestedTensorImpl doesn't support sizes. Please file an issue on https://github.com/pytorch/nestedtensor
```

whereas

```
nt = torch.nested_tensor([a, b], dtype=torch.float).requires_grad_(False)
nt[0]
```
would succeed

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83875
Approved by: https://github.com/albanD, https://github.com/drisspg
2022-09-06 22:19:32 +00:00
Driss Guessous
f803fa9fc9 [Nested Tensor] Add a NestedTensorUtils header and cpp file for organization (#84385)
# Summary
Trying to do some cleanup of the code structure for nested tensors. This introduces a utility header and cpp file that implement helper functions.

This is the initial PR of more cleanup. The next would separate out all the native functions that create nested tensors into their own file, since they do not in fact do math on nested tensors.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84385
Approved by: https://github.com/mikaylagawarecki
2022-09-02 16:31:55 +00:00
YifanShenSZ
673b35c847 Better reshape with autograd support (#82754) (#84154)
The original author is @YifanShenSZ  and the original PR is: #82754
# Summary:
The previous reshape ([https://github.com/pytorch/pytorch/issues/80981](https://github.com/pytorch/pytorch/pull/80981)) is OK for forward, but needs improvement for backward: it needs to handle the "sometimes view, sometimes copy" behavior.

This pull request fixes it by:
1. add a new alias dispatch key `CompositeImplicitAutogradNestedTensor`, which ideally would work as nested-tensor version of `CompositeImplicitAutograd`
2. register `reshape_nested` to `reshape` by `CompositeImplicitAutogradNestedTensor`

Side changes:
* add contiguous memory format support to `clone_nested`
* add `view_nested`
* add `reshape_as_nested`

Fix issue [https://github.com/pytorch/pytorch/issues/83041](https://github.com/pytorch/pytorch/issues/83041)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82754

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

Reviewed By: albanD

Differential Revision: D39023822

Pulled By: drisspg

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84154
Approved by: https://github.com/bdhirsh, https://github.com/albanD
2022-09-01 20:01:39 +00:00
Driss Guessous
71369051ee [Nested Tensor] fix from_padded bug (#84217)
Fixes #84082

As explained in the issue, the problem was arising from the grad being non-contiguous and the fast kernel not handling this case gracefully. The other thing I could do is add a contiguous call at d144594512/aten/src/ATen/native/nested/cuda/NestedTensorTransformerFunctions.cpp (L45)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/84217
Approved by: https://github.com/albanD
2022-08-30 03:48:11 +00:00
Driss Guessous
2436cf8aa8 [Nested Tensor] detach (#84078)
## Summary
Add detach op for nested tensors. Nested tensors are not part of the composite explicit dispatch key set and therefore need to be added manually.

The detach test is failing only for dtype=torch.float32/torch.float16 and device=cuda. The chain of ops called is sum.backward() -> from_padded() -> unbind(). This populates the grad for a and b.

Does this potentially indicate that the CUDA implementation of one of these ops, likely from_padded(), is incorrect?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84078
Approved by: https://github.com/albanD
2022-08-29 09:12:26 +00:00
PyTorch MergeBot
f4f54c7ce1 Revert "[Nested Tensor] detach (#84078)"
This reverts commit 092fe71f33.

Reverted https://github.com/pytorch/pytorch/pull/84078 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
2022-08-28 15:30:21 +00:00
Driss Guessous
092fe71f33 [Nested Tensor] detach (#84078)
## Summary
Add detach op for nested tensors. Nested tensors are not part of the composite explicit dispatch key set and therefore need to be added manually.

The detach test is failing only for dtype=torch.float32/torch.float16 and device=cuda. The chain of ops called is sum.backward() -> from_padded() -> unbind(). This populates the grad for a and b.

Does this potentially indicate that the CUDA implementation of one of these ops, likely from_padded(), is incorrect?
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84078
Approved by: https://github.com/albanD
2022-08-27 03:00:55 +00:00
Yifan Shen
b3c99bef0c Support nested dropout autograd (#83338)
When the initial version came out, `NestedTensor` was not included in the `CompositeImplicitAutograd` key set, so we had to register dropout_nested to dropout and make it forward-only. Now is the time to improve it!

This pr removes dropout_nested; instead native_dropout_nested is implemented along with native_dropout_backward_nested.

Side change: remove dropout__nested, since @cpuhrsch suggested leaving out nested in-place ops for now.
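A sketch of dropout now participating in autograd (the padded-sum reduction is only there to obtain a scalar loss; the constructor is the torch.nested.nested_tensor from #85593):

```
import torch
import torch.nn.functional as F

nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)],
                                requires_grad=True)

out = F.dropout(nt, p=0.5, training=True)
loss = torch.nested.to_padded_tensor(out, 0.0).sum()
loss.backward()  # routes through native_dropout_backward_nested
```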
Pull Request resolved: https://github.com/pytorch/pytorch/pull/83338
Approved by: https://github.com/jbschlosser
2022-08-18 00:49:29 +00:00
Mikayla Gawarecki
bd0ad7a84f Add backward support for rudimentary NestedTensor.sum(dim) (#82625)
Per offline discussion, this will be updated to use expand once expand semantics for nested tensor have been fleshed out.

Next steps will be to add support for the other forward sum features mentioned in #82387 and likewise update the backward.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82625
Approved by: https://github.com/albanD
2022-08-17 18:12:00 +00:00
Driss Guessous
4b597019b7 [Nested Tensor] Created Nested Tensor to Nested Tensor Views (#82658)
# Summary
This PR is pulling out all the changes from #81838 specific to properly creating nested_tensor views. I will update this comment with a design doc once that has been made. This should enable proper creation of NestedTensor views: two nested tensors sharing the same buffer_ but with different NestedTensor metadata.

The function `create_nested_tensor_view` is a helper function for creating a new nested tensor whose storage aliases the base, causing the underlying storage to be shared, and which is therefore a view.

This function by itself is not differentiable, and therefore autograd does not track its uses. If a nested tensor function uses this helper in its implementation, the aten op must meet two requirements:
- The function must return a view of the input
- The function must be explicit and define its backward

## Testing
A bug was found when creating a base tensor out of inference mode and then creating a view in inference mode. A test has been added to this PR in order to show the effect of the change.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/82658
Approved by: https://github.com/albanD
2022-08-16 20:22:21 +00:00