Commit Graph

176 Commits

Author SHA1 Message Date
Joel Schlosser
cd6bfc7965 Proper view support for jagged layout NestedTensor (#113279)
This PR:
* Introduces an ATen op for creating true jagged views from a dense values buffer
    * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)`
    * This ops is implemented on the Python side using torch.library so we can return a subclass instance
    * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer`
    * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()`
    * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view
* Introduces an ATen op for accessing the `values` component of an NT via a view
    * `_nested_get_values(nt)`
* **Removes** the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively.
* Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly
    * Similarly, avoid `buffer_from_jagged()`, preferring `values()`
* Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack)

With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling.

Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922)
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279
Approved by: https://github.com/ezyang
2024-03-22 02:12:36 +00:00
PyTorch MergeBot
224beecee6 Revert "Proper view support for jagged layout NestedTensor (#113279)"
This reverts commit 5855c490f0.

Reverted https://github.com/pytorch/pytorch/pull/113279 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/113279#issuecomment-2013899762))
2024-03-21 22:03:01 +00:00
PyTorch MergeBot
12e7602cf9 Revert "Support for torch.nested.as_nested_tensor(t) (#113280)"
This reverts commit 17c9c70265.

Reverted https://github.com/pytorch/pytorch/pull/113280 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/113280#issuecomment-2013893099))
2024-03-21 22:00:44 +00:00
PyTorch MergeBot
816db3bd29 Revert "Public API for NJT construction from jagged components (#121518)"
This reverts commit d4dff9cf5e.

Reverted https://github.com/pytorch/pytorch/pull/121518 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/121518#issuecomment-2013879641))
2024-03-21 21:56:29 +00:00
Christian Puhrsch
16935de961 Support alias for NestedTensorCPU/CUDA (#117711)
Fixes #ISSUE_NUMBER

Co-authored-by: Vincent Moens <vmoens@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117711
Approved by: https://github.com/albanD
2024-03-21 16:05:52 +00:00
Joel Schlosser
d4dff9cf5e Public API for NJT construction from jagged components (#121518)
This PR introduces `torch.nested.nested_tensor_from_jagged(values, offsets=None, lengths=None, jagged_dim=1)` (bikeshedding welcome). This is intended to be the main entrypoint for getting an NJT from the `(values, offsets, lengths)` components. The returned NJT is a view of the `values` component.

Note that `torch.nested.nested_tensor()` / `torch.nested.as_nested_tensor()` already exist for constructing an NJT from a list of tensors.

TODO:
* Some doc formatting; suggestions welcome there
* Tests / examples using `jagged_dim != 1`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121518
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #113280
2024-03-21 04:14:17 +00:00
Joel Schlosser
17c9c70265 Support for torch.nested.as_nested_tensor(t) (#113280)
This PR adds support for tensor inputs to `as_nested_tensor()`. The tensor is treated as a batch of consistently-sized constituents. It utilizes `_nested_view_from_values_offsets()` to return a real view that allows for propagating gradients into inputs.
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113280
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2024-03-21 04:13:55 +00:00
Joel Schlosser
5855c490f0 Proper view support for jagged layout NestedTensor (#113279)
This PR:
* Introduces an ATen op for creating true jagged views from a dense values buffer
    * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)`
    * This ops is implemented on the Python side using torch.library so we can return a subclass instance
    * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer`
    * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()`
    * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view
* Introduces an ATen op for accessing the `values` component of an NT via a view
    * `_nested_get_values(nt)`
* **Removes** the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively.
* Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly
    * Similarly, avoid `buffer_from_jagged()`, preferring `values()`
* Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack)

With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling.

Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922)
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279
Approved by: https://github.com/ezyang
2024-03-20 23:45:34 +00:00
Joel Schlosser
ea8f6e2e54 Subclass view fake-ification via reified ViewFuncs (#118405)
This PR:
* Uses reified ViewFuncs to swap in fake tensors / symbolic SymInts for view replay during subclass view fake-ification
* Enables automatic dynamic on view bases -> fakeifies according to the resultant symbolic context instead of the old "all-static" approach
* Covers the following view types:
    * subclass -> dense
    * dense -> subclass
    * subclass -> subclass
* Dense -> dense views are handled the old way via an `as_strided()` call, as it's likely there is no view func available

Differential Revision: [D54269082](https://our.internmc.facebook.com/intern/diff/D54269082)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118405
Approved by: https://github.com/ezyang
2024-03-07 19:56:16 +00:00
Chengji Yao
0e604becc5 [NJT] support chunk on batch dim (#119713)
- support chunk op on batch dim
- support empty_like op
- add tests for the like ops

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119713
Approved by: https://github.com/jbschlosser
2024-03-05 17:57:50 +00:00
soulitzer
312ce35c1f Rename singleton int to nested int (#119661)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119661
Approved by: https://github.com/ezyang
2024-02-16 19:21:17 +00:00
Joel Schlosser
756cf2913d Fix NJT stride access in SDPA dispatcher logic (#119846)
`._stride` -> `._strides`

Adds test to cover this case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119846
Approved by: https://github.com/drisspg, https://github.com/ani300, https://github.com/soulitzer
ghstack dependencies: #119910
2024-02-14 22:37:52 +00:00
Joel Schlosser
0560c193a6 Fix meta registration for _flash_attention_forward() [ROCm forward fix] (#119910)
Addresses ROCm failures from #119812

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119910
Approved by: https://github.com/drisspg
2024-02-14 22:37:52 +00:00
Joel Schlosser
31e59766e7 Fix meta registration for _flash_attention_forward() (#119812)
Meta registration wrongly assumes 4D inputs, while the underlying op allows 3D inputs for the `mha_varlen_fwd()` case.
Testing: I added `detach()`es so the NJT test `test_sdpa_compile()` won't fail for a view-related reason. It should pass now with this fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119812
Approved by: https://github.com/drisspg
2024-02-14 02:38:53 +00:00
PyTorch MergeBot
8994f2367d Revert "Fix jagged NT softmax semantics (#119459)"
This reverts commit 6adadbaf79.

Reverted https://github.com/pytorch/pytorch/pull/119459 on behalf of https://github.com/malfet due to broke dynamo, see https://github.com/pytorch/pytorch/actions/runs/7835402753/job/21386634602 ([comment](https://github.com/pytorch/pytorch/pull/119459#issuecomment-1935246413))
2024-02-09 02:31:49 +00:00
Joel Schlosser
6adadbaf79 Fix jagged NT softmax semantics (#119459)
Before: `softmax` definition uses `jagged_unary_pointwise()` (wrong)
After: `softmax` impl adjusts the `dim` arg to account for the difference in dimensionality between the outer NT and the NT's `_values`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119459
Approved by: https://github.com/soulitzer
2024-02-08 20:13:12 +00:00
David Berard
278a0e1600 [NestedTensor] Support binary pointwise ops with >2 inputs (if inputs are non-tensors) (#119419)
It should usually be safe to run pointwise binary ops with >2 inputs. e.g. threshold_backward(tensor, tensor, scalar): we just operate on the values of the nested tensors, and pass in the other args as-is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119419
Approved by: https://github.com/soulitzer
2024-02-08 20:06:40 +00:00
Huy Do
3ed9df36a9 Clean up some obsolete TODOs in run_test and several test files (#119113)
* The TODOs in `test/test_nestedtensor.py` has been mitigated, I keep the issue for reference.
* ~~The TODOs in `test/test_ops_fwd_gradients.py` doesn't apply anymore~~
* The TODOs in `run_test.py` to support disabling C++ tests is probably not going to happen.  I have never seen a flaky C++ test that needs to be disabled before.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119113
Approved by: https://github.com/kit1980
2024-02-03 23:54:30 +00:00
David Berard
460950d3aa [Nested Tensor] Support ragged_idx != 1 on aten::is_same_size, aten::_to_copy (#118442)
is_same_size is needed internally; `_to_copy` should be easy because it doesn't support new layouts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118442
Approved by: https://github.com/cpuhrsch
2024-01-30 01:32:51 +00:00
David Berard
2842d3c9d3 [Nested Tensor] view: basic support for ragged_idx != 1 and _unsafe_view (#118317)
Uses case: `_unsafe_view` is used in aot_autograd to create a view that doesn't register as a view:

eebe7e1d37/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py (L470-L476)

If a transposed nested tensor (i.e. NT with ragged_idx != 1) encounters this code path, it previously would fail for two reasons: 1) because `_unsafe_view` isn't registered, and 2) because ragged_idx != 1 is not supported. This PR adds support for `_unsafe_view` (completely reusing the implementation of `view`; this just registers `_unsafe_view` as another op using the same implementation). It also adds support for ragged_idx != 1, but only for trivial cases where inp._size == size (the use case used by aot_autograd).

Tests: verify that the result of `_unsafe_view` doesn't have a `_base`, and that simple views on transposed NTs work.

Differential Revision: [D53096814](https://our.internmc.facebook.com/intern/diff/D53096814)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118317
Approved by: https://github.com/soulitzer
2024-01-26 17:29:37 +00:00
David Berard
52c5803088 [NestedTensor] Support ragged_idx != 1 in pointwise ops (#118157)
This PR allows pointwise ops to operate on tensors with ragged_idx != 1. It does this by passing the ragged_idx metadata into the construction of the returned NestedTensor when computing pointwise ops. The assumption is that: pointwise ops can operate directly on the values tensors, and the resulting tensor should have all the same metadata properties as the input tensors. For binary ops, a test is added to verify that adding two tensors with different ragged_idx cannot be added.

Previously:
* unary pointwise ops would error out when performed on nested tensors with ragged_idx != 1
* binary pointwise ops would produce tensors with nonsense shapes

Differential Revision: [D53032641](https://our.internmc.facebook.com/intern/diff/D53032641)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118157
Approved by: https://github.com/jbschlosser
2024-01-25 23:34:15 +00:00
YuqingJ
a97d00cca5 [Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)
Support this fallback by converting the jagged layout NT to strided layout NT, and the convert the result back to jagged layout NT.
This fallback might not be efficient since it uses unbind, contiguous and split.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116445
Approved by: https://github.com/soulitzer
2024-01-12 17:30:40 +00:00
PyTorch MergeBot
9f87760160 Revert "[Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)"
This reverts commit e55a778cbb.

Reverted https://github.com/pytorch/pytorch/pull/116445 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but i see it fails ROCm test in trunk due to an unsupported use case e55a778cbb ([comment](https://github.com/pytorch/pytorch/pull/116445#issuecomment-1888060036))
2024-01-11 22:21:45 +00:00
YuqingJ
e55a778cbb [Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)
Support this fallback by converting the jagged layout NT to strided layout NT, and the convert the result back to jagged layout NT.
This fallback might not be efficient since it uses unbind, contiguous and split.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116445
Approved by: https://github.com/soulitzer
2024-01-11 20:28:40 +00:00
Joel Schlosser
f70aeb4ffd Fix backward for reshape() on jagged layout NT (#117137)
Provides symbolic C++-side `reshape_as()` / `reshape()` decomps for jagged layout NTs to make the backwards pass work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117137
Approved by: https://github.com/soulitzer
2024-01-10 23:35:07 +00:00
Joel Schlosser
0b0c76bace Support squeeze.dim for jagged NT (#116891)
As title. Needed for `rev_view_func()` of `unsqueeze()`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116891
Approved by: https://github.com/soulitzer
ghstack dependencies: #115894, #116512
2024-01-06 01:00:53 +00:00
Joel Schlosser
3c21264c9b Introduce reverse view_funcs (#115894)
Part 2 of implementation for general [subclass view fake-ification](https://docs.google.com/document/d/1C5taWiplmX7nKiURXDOAZG2W5VNJ2iV0fQFq92H0Cxw).

Details:
* Codegen `rev_view_func()` alongside `view_func()`
    * Reverse view_func gives you a "base" from a "view": `rev_view_func(new_view) -> new_base` AKA it plays the original view backwards
* Utilizes the functional inverses defined in `FunctionalInverses.cpp`, passing `InverseReturnMode::AlwaysView`
* Manually implements functional inverses for `narrow()` and `chunk()`
* **NB: Multi-output views now set view_func() / rev_view_func() for each of the output views!**
    * Due to this, the `as_view()` overload that operates on a list of views is scrapped in favor of iteration via codegen

Example codegen in `ADInplaceOrViewTypeN.cpp`:
```cpp
at::Tensor narrow(c10::DispatchKeySet ks, const at::Tensor & self, int64_t dim, c10::SymInt start, c10::SymInt length) {
  auto _tmp = ([&]() {
    at::AutoDispatchBelowADInplaceOrView guard;
    return at::_ops::narrow::redispatch(ks & c10::after_ADInplaceOrView_keyset, self, dim, start, length);
  })();
  std::function<at::Tensor(const at::Tensor&)> func=nullptr;
  std::function<at::Tensor(const at::Tensor&)> rev_func=nullptr;
  if (false || !self.unsafeGetTensorImpl()->support_as_strided() ||
      c10::AutogradState::get_tls_state().get_view_replay_enabled()) {
    func = [=](const at::Tensor& input_base) {
      return at::_ops::narrow::call(input_base, dim, start, length);
    };
    rev_func = [=](const at::Tensor& input_view) {
      // NB: args from narrow() signature are passed along to the inverse
      return at::functionalization::FunctionalInverses::narrow_copy_inverse(self, input_view, at::functionalization::InverseReturnMode::AlwaysView, dim, start, length);
    };
  }
  auto result = as_view(/* base */ self, /* output */ _tmp, /* is_bw_differentiable */ true, /* is_fw_differentiable */ true, /* view_func */ func, /* rev_view_func */ rev_func, /* creation_meta */ InferenceMode::is_enabled() ? CreationMeta::INFERENCE_MODE : (at::GradMode::is_enabled() ? CreationMeta::DEFAULT : CreationMeta::NO_GRAD_MODE));
  return result;
}
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115894
Approved by: https://github.com/soulitzer
2024-01-05 16:48:12 +00:00
Joel Schlosser
ea3a5f8ddc Add chunk for jagged layout NT (#115842)
Nice to have for the [SDPA tutorial](https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115842
Approved by: https://github.com/soulitzer
ghstack dependencies: #115192, #116111
2023-12-20 20:13:20 +00:00
Joel Schlosser
29b198dcf8 Add markDynamoStrictTest to NT tests (#116111)
Decorates all NT tests with `@markDynamoStrictTest` to ensure we get the correct signal. Adds xfails where needed to get things passing.

Includes a fix in meta_utils.py for a bug that was breaking several python 3.11 tests. In particular, a dense tensor graph input that is a view of a strided NT would slip past Dynamo's check and break in meta-ification.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116111
Approved by: https://github.com/soulitzer, https://github.com/zou3519
ghstack dependencies: #115192
2023-12-20 20:13:20 +00:00
Joel Schlosser
1474eb5f29 Fix jagged composite impl of flatten() (#115192)
Need to handle this in `NestedTensor.__torch_function__()` since it's CompositeImplicit
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115192
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-12-19 19:15:21 +00:00
Joel Schlosser
bf62511e07 Reshape decomposition for jagged layout NT (#115191)
No more segfault from using `reshape()` on jagged NT :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115191
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-12-18 22:34:41 +00:00
Joel Schlosser
6fee208064 Handle -1 in jagged layout NT view ops (#115843)
Allows for inheriting the ragged and batch dims via -1:
```python
nt.view(-1, -1, D)
nt.expand(B, -1, D)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115843
Approved by: https://github.com/soulitzer
ghstack dependencies: #115636
2023-12-15 00:42:47 +00:00
Joel Schlosser
0ff155fb65 Fix SDPA for SAM (#115636)
Addresses the regression for Segment Anything Fast in https://github.com/pytorch-labs/segment-anything-fast/issues/99
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115636
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-12 18:52:38 +00:00
soulitzer
8885128dcc Fix backward for SDPA NT jagged layout (#115576)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115576
Approved by: https://github.com/jbschlosser, https://github.com/ani300
2023-12-12 18:35:40 +00:00
Yanbo Liang
da341d0d48 [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is splitted from #113009, please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-12-09 05:11:44 +00:00
PyTorch MergeBot
e8e4141773 Revert "[Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)"
This reverts commit e61d6b42f0.

Reverted https://github.com/pytorch/pytorch/pull/113432 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but it is failing dynamo tests in trunk e61d6b42f0, landrace? ([comment](https://github.com/pytorch/pytorch/pull/113432#issuecomment-1847787981))
2023-12-08 20:15:39 +00:00
Yanbo Liang
e61d6b42f0 [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is splitted from #113009, please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang, https://github.com/jansel
2023-12-08 17:15:14 +00:00
Joel Schlosser
3b01f30b20 Prevent invalid pointwise ops on jagged with transposed ragged dim (#115190)
TODO: tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115190
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-08 00:54:03 +00:00
Yanbo Liang
4620170008 [Dynamo] Revert multiple PRs since they triggered compilation stuck internally (#115126)
Revert the following PRs to mitigate internal compilation stuck:
#113432
#114016
#114507
#114196
#114739
#114669

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115126
Approved by: https://github.com/xush6528
2023-12-05 22:35:37 +00:00
Antoni Viros
1dc4588c6a Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114164
Approved by: https://github.com/jbschlosser
2023-12-05 06:33:45 +00:00
PyTorch MergeBot
5cfda9b7f8 Revert "Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)"
This reverts commit aafa8233a4.

Reverted https://github.com/pytorch/pytorch/pull/114164 on behalf of https://github.com/malfet due to Broke ROCM, see aafa8233a4 ([comment](https://github.com/pytorch/pytorch/pull/114164#issuecomment-1839798986))
2023-12-05 00:35:20 +00:00
Antoni Viros
aafa8233a4 Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114164
Approved by: https://github.com/jbschlosser
2023-12-04 21:54:02 +00:00
Yanbo Liang
033d7b670a [Dynamo][6.1/N] Refactor out TorchInGraphFunctionVariable and improve heuristic (#113432)
This is splitted from #113009, please check https://github.com/pytorch/pytorch/pull/113009#issuecomment-1804417925 for more details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113432
Approved by: https://github.com/ezyang
2023-11-17 23:42:00 +00:00
Joel Schlosser
ea39cc34f9 Refactor NestedTensor subclass to remove ragged_size from constructor (#113491)
This PR removes the need for passing `ragged_size` into the `NestedTensor` constructor. This was an artifact of fake-ification, where sometimes we needed the NT to have a symbolic singleton symint shape for the ragged dimension. The new way of achieving this is to also store mappings between fake / functional tensors -> symbolic symints in the ragged structure registry. Now the `NestedTensor` constructor can just query this registry for the `ragged_size`.

Old: `NestedTensor(values, offsets, *, ragged_size=None, **kwargs)`
New: `NestedTensor(values, offsets, **kwargs)`

This makes it possible to have a `_nested_view_from_values_offsets(values, offsets)` without needing to pass a `ragged_size`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113491
Approved by: https://github.com/ezyang, https://github.com/soulitzer
2023-11-14 19:32:21 +00:00
Antoni Viros
1aece432ba Implement narrow from a regular tensor to jagged tensor (#112770)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112770
Approved by: https://github.com/cpuhrsch
2023-11-13 19:09:59 +00:00
Yuqing Jiang
9f3e378125 [nested tensor]add split and layer_norm_backward operations (#113108)
Summary:
Add split and layer_norm_backward.

Note: It is non trivial to support split_with_sizes backward so adding the split operation to support the use case in the model.

Test Plan: unit tests

Differential Revision: D51052966

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113108
Approved by: https://github.com/soulitzer
2023-11-08 07:44:35 +00:00
soulitzer
538ec4942a Do not generate zero-numel NT by default in helper and improve to_padded_tensor msg (#113162)
Improvements: improves to_padded_tensor error message when passed a NT with zero numel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113162
Approved by: https://github.com/jbschlosser
ghstack dependencies: #113031, #112519, #113091
2023-11-07 19:56:26 +00:00
soulitzer
0c991acab0 Factor out test_nestedtensor setUp tearDown and call super (#113091)
Fixes https://github.com/pytorch/pytorch/issues/112845

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113091
Approved by: https://github.com/jbschlosser
ghstack dependencies: #113031, #112519
2023-11-07 19:56:26 +00:00
soulitzer
c2084da14a [NT] Backward support for broadcasting binary ops (#112519)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112519
Approved by: https://github.com/jbschlosser
ghstack dependencies: #113031
2023-11-07 00:03:21 +00:00
soulitzer
53fff56ab8 Graph break cleanly for test_nestedtensor (#112662)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112662
Approved by: https://github.com/jbschlosser
2023-11-03 07:20:43 +00:00