Commit Graph

185 Commits

Author SHA1 Message Date
Aaron Gokaslan
1d6c5972c1 [BE]: Optimize min/max/sum comprehensions C419 (#123960)
Automatic fixes that replace certain list comprehensions with generator expressions where appropriate so that they are immediately consumed. This is preview functionality in ruff for rule C419, and it was applied automatically.
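
For illustration, here is a hedged sketch of the kind of rewrite rule C419 performs (example code, not taken from the diff):

```python
nums = [3, -1, 4, -1, 5]

# Before: the list comprehension materializes an intermediate list.
total = sum([n * n for n in nums])
any_negative = any([n < 0 for n in nums])

# After: generator expressions are consumed immediately, with no intermediate list.
total = sum(n * n for n in nums)
any_negative = any(n < 0 for n in nums)
```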

Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123960
Approved by: https://github.com/malfet
2024-04-12 23:54:15 +00:00
Joel Schlosser
721dcaff94 Revert usage of NJT views in SDPA (#123215)
For internal purposes, this PR reverts the use of real views in SDPA -> autograd.Function "views" (i.e. `ViewBufferFromNested` and `ViewNestedFromBuffer`). This is a temporary fix to get the FIRST model launched and working.

**Note: this breaks some other Dynamo tests related to SDPA that rely on real views, but the breakage there isn't expected to be likely in a real-world scenario.**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123215
Approved by: https://github.com/YuqingJ
2024-04-04 18:45:47 +00:00
PyTorch MergeBot
63d17d3c90 Revert "Revert usage of NJT views in SDPA (#123215)"
This reverts commit 0fcddb5625.

Reverted https://github.com/pytorch/pytorch/pull/123215 on behalf of https://github.com/huydhn due to Sorry for reverting your PR but I think it needs to be skipped on ROCm 0fcddb5625 ([comment](https://github.com/pytorch/pytorch/pull/123215#issuecomment-2036080570))
2024-04-04 02:57:09 +00:00
Joel Schlosser
0fcddb5625 Revert usage of NJT views in SDPA (#123215)
For internal purposes, this PR reverts the use of real views in SDPA -> autograd.Function "views" (i.e. `ViewBufferFromNested` and `ViewNestedFromBuffer`). This is a temporary fix to get the FIRST model launched and working.

**Note: this breaks some other Dynamo tests related to SDPA that rely on real views, but the breakage there isn't expected to be likely in a real-world scenario.**

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123215
Approved by: https://github.com/YuqingJ
2024-04-03 23:25:31 +00:00
soulitzer
638b003cb7 [NJT] .to() properly updates device of offsets (#122797)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122797
Approved by: https://github.com/jbschlosser
2024-04-02 16:07:27 +00:00
Chirag Pandya
b6201a60c5 [BE] minor logging cleanup in distributed (#122921)
Summary:
    Minor logging cleanup in distributed library
    1. Don't use "f" formatted strings - address linter issues.
    2. Nits: Make use of unused `e` (error) in a few logs.
    3. Change info->debug as asked in issue #113545
    4. Nit: rename log -> logger in a few files for consistency
    5. Fix a linter error.

    Test Plan:
    1. Local build passes.
    2. Linter is happy.

    Reviewers: wanchaol

Pull Request resolved: https://github.com/pytorch/pytorch/pull/122921
Approved by: https://github.com/wanchaol
2024-03-29 03:34:01 +00:00
Joel Schlosser
6fc5ad931c Use zeros for NJT dummy to avoid messing with randomness (#122902)
Use of randomness was breaking vmap.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122902
Approved by: https://github.com/vmoens, https://github.com/zou3519
2024-03-28 22:09:31 +00:00
PyTorch MergeBot
4290a57e9c Revert "[NJT] .to() properly updates device of offsets (#122797)"
This reverts commit 3e7fd45b40.

Reverted https://github.com/pytorch/pytorch/pull/122797 on behalf of https://github.com/jeffdaily due to Sorry for reverting your change but it is failing CUDA and ROCm jobs in trunk. Please help take a look and reland the change ([comment](https://github.com/pytorch/pytorch/pull/122797#issuecomment-2025473181))
2024-03-28 15:17:45 +00:00
soulitzer
3e7fd45b40 [NJT] .to() properly updates device of offsets (#122797)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122797
Approved by: https://github.com/jbschlosser
2024-03-28 00:56:23 +00:00
Joel Schlosser
6767c04fde Forward fix for broken internal tests related to NJT view dummy (#122704)
(internal link) [example test breakage](https://www.internalfb.com/intern/test/562950061753019?ref_report_id=0)

Symptom: `type stub not overridden` for SymInt. The global NJT dummy relies on `SymInt.__mul__()` in its constructor. Lazily constructing the dummy avoids the race.
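
A minimal sketch of the lazy-construction pattern described here, with hypothetical names (`_make_njt_dummy` is illustrative, not the actual helper):

```python
_njt_dummy = None

def _get_njt_dummy():
    # Constructing the dummy on first use (rather than at module import)
    # ensures the SymInt type stubs have already been overridden,
    # avoiding the race described above.
    global _njt_dummy
    if _njt_dummy is None:
        _njt_dummy = _make_njt_dummy()  # hypothetical constructor
    return _njt_dummy
```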
Pull Request resolved: https://github.com/pytorch/pytorch/pull/122704
Approved by: https://github.com/soulitzer
2024-03-26 21:22:12 +00:00
Joel Schlosser
e6986e4317 Public API for NJT construction from jagged components (#121518)
This PR introduces `torch.nested.nested_tensor_from_jagged(values, offsets=None, lengths=None, jagged_dim=1)` (bikeshedding welcome). This is intended to be the main entrypoint for getting an NJT from the `(values, offsets, lengths)` components. The returned NJT is a view of the `values` component.

Note that `torch.nested.nested_tensor()` / `torch.nested.as_nested_tensor()` already exist for constructing an NJT from a list of tensors.

TODO:
* Some doc formatting; suggestions welcome there
* Tests / examples using `jagged_dim != 1`
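
A usage sketch, assuming the signature described above (shape output is illustrative):

```python
import torch

# Three jagged sequences of lengths 2, 3, and 1, each with 8 features,
# packed into a single (6, 8) values buffer; offsets mark the boundaries.
values = torch.randn(6, 8)
offsets = torch.tensor([0, 2, 5, 6])

nt = torch.nested.nested_tensor_from_jagged(values, offsets=offsets)
print(nt.shape)  # roughly (3, j1, 8), with a symbolic ragged dim
print(nt.values().data_ptr() == values.data_ptr())  # True: the NT views `values`
```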
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121518
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #113279, #113280
2024-03-22 14:48:22 +00:00
Joel Schlosser
470b44c048 Support for torch.nested.as_nested_tensor(t) (#113280)
This PR adds support for tensor inputs to `as_nested_tensor()`. The tensor is treated as a batch of consistently-sized constituents. It utilizes `_nested_view_from_values_offsets()` to return a real view that allows for propagating gradients into inputs.
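
A hedged example of the new tensor-input path (assuming the `layout` kwarg from #112304):

```python
import torch

t = torch.randn(3, 5, 4, requires_grad=True)  # batch of 3 equal-sized constituents
nt = torch.nested.as_nested_tensor(t, layout=torch.jagged)

# The NT is a real view, so gradients propagate back into t.
nt.values().sum().backward()
print(t.grad.shape)  # torch.Size([3, 5, 4])
```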
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113280
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
ghstack dependencies: #113279
2024-03-22 02:12:37 +00:00
Joel Schlosser
cd6bfc7965 Proper view support for jagged layout NestedTensor (#113279)
This PR:
* Introduces an ATen op for creating true jagged views from a dense values buffer
    * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)`
    * This op is implemented on the Python side using torch.library so that we can return a subclass instance
    * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer`
    * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()`
    * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view
* Introduces an ATen op for accessing the `values` component of an NT via a view
    * `_nested_get_values(nt)`
* **Removes** the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively.
* Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly
    * Similarly, avoid `buffer_from_jagged()`, preferring `values()`
* Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack)

With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling.

Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922)
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279
Approved by: https://github.com/ezyang
2024-03-22 02:12:36 +00:00
PyTorch MergeBot
224beecee6 Revert "Proper view support for jagged layout NestedTensor (#113279)"
This reverts commit 5855c490f0.

Reverted https://github.com/pytorch/pytorch/pull/113279 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/113279#issuecomment-2013899762))
2024-03-21 22:03:01 +00:00
PyTorch MergeBot
12e7602cf9 Revert "Support for torch.nested.as_nested_tensor(t) (#113280)"
This reverts commit 17c9c70265.

Reverted https://github.com/pytorch/pytorch/pull/113280 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/113280#issuecomment-2013893099))
2024-03-21 22:00:44 +00:00
PyTorch MergeBot
816db3bd29 Revert "Public API for NJT construction from jagged components (#121518)"
This reverts commit d4dff9cf5e.

Reverted https://github.com/pytorch/pytorch/pull/121518 on behalf of https://github.com/jbschlosser due to Need to fix BC thing ([comment](https://github.com/pytorch/pytorch/pull/121518#issuecomment-2013879641))
2024-03-21 21:56:29 +00:00
Joel Schlosser
d4dff9cf5e Public API for NJT construction from jagged components (#121518)
This PR introduces `torch.nested.nested_tensor_from_jagged(values, offsets=None, lengths=None, jagged_dim=1)` (bikeshedding welcome). This is intended to be the main entrypoint for getting an NJT from the `(values, offsets, lengths)` components. The returned NJT is a view of the `values` component.

Note that `torch.nested.nested_tensor()` / `torch.nested.as_nested_tensor()` already exist for constructing an NJT from a list of tensors.

TODO:
* Some doc formatting; suggestions welcome there
* Tests / examples using `jagged_dim != 1`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121518
Approved by: https://github.com/cpuhrsch
ghstack dependencies: #113280
2024-03-21 04:14:17 +00:00
Joel Schlosser
17c9c70265 Support for torch.nested.as_nested_tensor(t) (#113280)
This PR adds support for tensor inputs to `as_nested_tensor()`. The tensor is treated as a batch of consistently-sized constituents. It utilizes `_nested_view_from_values_offsets()` to return a real view that allows for propagating gradients into inputs.
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113280
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2024-03-21 04:13:55 +00:00
Joel Schlosser
5855c490f0 Proper view support for jagged layout NestedTensor (#113279)
This PR:
* Introduces an ATen op for creating true jagged views from a dense values buffer
    * `_nested_view_from_jagged(values, offsets, lengths, ragged_idx, dummy)`
    * This op is implemented on the Python side using torch.library so that we can return a subclass instance
    * `jagged_from_list()` now uses this instead of the old autograd.Function `NestedViewFromBuffer`
    * The latter op is used for non-contiguous JTs returned via `torch.nested.narrow()`
    * `dummy` is an awful hack to ensure that `NestedTensor.__torch_dispatch__()` is invoked for our view
* Introduces an ATen op for accessing the `values` component of an NT via a view
    * `_nested_get_values(nt)`
* **Removes** the autograd.Functions `ViewNestedFromBuffer` and `ViewBufferFromNested` in favor of `nested_from_values_offsets()` / `nested_from_values_offsets_lengths()` and `nt.values()`, respectively.
* Changes test code to prefer `as_nested_tensor()` over `jagged_from_list()` directly
    * Similarly, avoid `buffer_from_jagged()`, preferring `values()`
* Depends on general subclass view fake-ification on the PT2 side (handled solely in previous PRs in the stack)

With these changes, the semantics of jagged layout NTs are such that they are considered a true view of the underlying `values` buffer. This means views of jagged NTs are views of the underlying buffer as well, simplifying some handling.

Differential Revision: [D54269922](https://our.internmc.facebook.com/intern/diff/D54269922)
Co-authored-by: voznesenskym <voznesenskym@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/113279
Approved by: https://github.com/ezyang
2024-03-20 23:45:34 +00:00
Chengji Yao
0e604becc5 [NJT] support chunk on batch dim (#119713)
- support chunk op on batch dim
- support empty_like op
- add tests for the like ops

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119713
Approved by: https://github.com/jbschlosser
2024-03-05 17:57:50 +00:00
soulitzer
312ce35c1f Rename singleton int to nested int (#119661)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119661
Approved by: https://github.com/ezyang
2024-02-16 19:21:17 +00:00
Joel Schlosser
756cf2913d Fix NJT stride access in SDPA dispatcher logic (#119846)
`._stride` -> `._strides`

Adds test to cover this case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119846
Approved by: https://github.com/drisspg, https://github.com/ani300, https://github.com/soulitzer
ghstack dependencies: #119910
2024-02-14 22:37:52 +00:00
Joel Schlosser
0560c193a6 Fix meta registration for _flash_attention_forward() [ROCm forward fix] (#119910)
Addresses ROCm failures from #119812

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119910
Approved by: https://github.com/drisspg
2024-02-14 22:37:52 +00:00
Joel Schlosser
31e59766e7 Fix meta registration for _flash_attention_forward() (#119812)
Meta registration wrongly assumes 4D inputs, while the underlying op allows 3D inputs for the `mha_varlen_fwd()` case.
Testing: I added `detach()`es so the NJT test `test_sdpa_compile()` won't fail for a view-related reason. It should pass now with this fix.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119812
Approved by: https://github.com/drisspg
2024-02-14 02:38:53 +00:00
PyTorch MergeBot
8994f2367d Revert "Fix jagged NT softmax semantics (#119459)"
This reverts commit 6adadbaf79.

Reverted https://github.com/pytorch/pytorch/pull/119459 on behalf of https://github.com/malfet due to broke dynamo, see https://github.com/pytorch/pytorch/actions/runs/7835402753/job/21386634602 ([comment](https://github.com/pytorch/pytorch/pull/119459#issuecomment-1935246413))
2024-02-09 02:31:49 +00:00
Joel Schlosser
6adadbaf79 Fix jagged NT softmax semantics (#119459)
Before: `softmax` definition uses `jagged_unary_pointwise()` (wrong)
After: `softmax` impl adjusts the `dim` arg to account for the difference in dimensionality between the outer NT and the NT's `_values`
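
A rough sketch of the corrected semantics (helper names are hypothetical; the real implementation lives in the NJT op registry):

```python
def njt_softmax(nt, dim):
    # nt is (B, j, D): dim 0 is batch, dim 1 is ragged. The packed
    # `_values` buffer is (sum(j), D) -- one dim smaller -- so an outer
    # dim must be shifted down by one before applying the values-level op.
    assert dim > 1, "softmax over the batch or ragged dim is not supported here"
    return make_njt(nt._values.softmax(dim - 1), nt._offsets)  # hypothetical ctor
```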
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119459
Approved by: https://github.com/soulitzer
2024-02-08 20:13:12 +00:00
David Berard
278a0e1600 [NestedTensor] Support binary pointwise ops with >2 inputs (if inputs are non-tensors) (#119419)
It should usually be safe to run pointwise binary ops with >2 inputs, e.g. `threshold_backward(tensor, tensor, scalar)`: we just operate on the values of the nested tensors and pass the other args through as-is.
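
Sketched in rough pseudocode (names hypothetical), the dispatch rule is approximately:

```python
def jagged_binary_pointwise(op, nt1, nt2, *extra_args):
    # Operate directly on the packed values buffers; non-tensor extras
    # (e.g. the scalar in threshold_backward) are forwarded unchanged.
    out_values = op(nt1._values, nt2._values, *extra_args)
    return make_njt(out_values, nt1._offsets)  # hypothetical ctor
```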
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119419
Approved by: https://github.com/soulitzer
2024-02-08 20:06:40 +00:00
David Berard
460950d3aa [Nested Tensor] Support ragged_idx != 1 on aten::is_same_size, aten::_to_copy (#118442)
is_same_size is needed internally; `_to_copy` should be easy because it doesn't support new layouts.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118442
Approved by: https://github.com/cpuhrsch
2024-01-30 01:32:51 +00:00
David Berard
2842d3c9d3 [Nested Tensor] view: basic support for ragged_idx != 1 and _unsafe_view (#118317)
Use case: `_unsafe_view` is used in aot_autograd to create a view that doesn't register as a view:

eebe7e1d37/torch/_functorch/_aot_autograd/jit_compile_runtime_wrappers.py (L470-L476)

If a transposed nested tensor (i.e. NT with ragged_idx != 1) encounters this code path, it previously would fail for two reasons: 1) because `_unsafe_view` isn't registered, and 2) because ragged_idx != 1 is not supported. This PR adds support for `_unsafe_view` (completely reusing the implementation of `view`; this just registers `_unsafe_view` as another op using the same implementation). It also adds support for ragged_idx != 1, but only for trivial cases where inp._size == size (the use case used by aot_autograd).

Tests: verify that the result of `_unsafe_view` doesn't have a `_base`, and that simple views on transposed NTs work.

Differential Revision: [D53096814](https://our.internmc.facebook.com/intern/diff/D53096814)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118317
Approved by: https://github.com/soulitzer
2024-01-26 17:29:37 +00:00
albanD
24133e44b1 Fix return type hint for list types (#118238)
All single-element list types are `Tensor[]`, so they will always be Tuple.
I don't know of any way to easily access the pyi type and compare that to a real run so no testing here :(
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118238
Approved by: https://github.com/ezyang
2024-01-25 23:35:20 +00:00
David Berard
52c5803088 [NestedTensor] Support ragged_idx != 1 in pointwise ops (#118157)
This PR allows pointwise ops to operate on tensors with ragged_idx != 1. It does this by passing the ragged_idx metadata into the construction of the returned NestedTensor when computing pointwise ops. The assumption is that pointwise ops can operate directly on the values tensors, and the resulting tensor should have all the same metadata properties as the input tensors. For binary ops, a test is added to verify that two tensors with different ragged_idx cannot be added.

Previously:
* unary pointwise ops would error out when performed on nested tensors with ragged_idx != 1
* binary pointwise ops would produce tensors with nonsense shapes

Differential Revision: [D53032641](https://our.internmc.facebook.com/intern/diff/D53032641)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118157
Approved by: https://github.com/jbschlosser
2024-01-25 23:34:15 +00:00
drisspg
4e29f01bf2 Remove sdp_kernel and replace with sdpa_kernel in attention namespace (#114689)
# Summary
Simplification of Backend Selection

This PR deprecates the `torch.backends.cuda.sdp_kernel` context manager and replaces it with a new context manager, `torch.nn.attention.sdpa_kernel`, which also changes the API.

For `sdp_kernel`, one would specify the backend choice by disabling every kernel other than the one they wanted to run. The purpose of this backend manager was only to be a debugging tool: "turn off the math backend" and see if you can run one of the fused implementations.

Problems:
- This pattern makes sense if the majority of users don't care to know anything about the backends that can be run. However, users who reach for this context manager are explicitly trying to run a specific backend.
- This is not scalable. We are working on adding the cuDNN backend, and with this API, every new implementation means more backends to turn off when a user wants to explicitly run a given one.
- Discoverability of the current context manager. It is somewhat unintuitive that this backend manager lives in `torch/backends/cuda/__init__.py` when it now also controls the CPU fused kernel behavior. I think centralizing it in the attention namespace will be helpful.

Other concerns:
- Typically, backends (kernels) for operators are entirely hidden from users as implementation details of the framework. We have exposed this to users already, albeit not by default and with beta warnings. Does making backend choices even more explicit lead to problems when we eventually want to remove existing backends (perhaps input shapes will be covered by newer backends)?

A nice side effect is now that we aren't using the `BACKEND_MAP` in test_transformers many, many dynamo failures are passing for CPU tests.
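
A before/after sketch of the two APIs (assuming a CUDA machine where the flash kernel is available):

```python
import torch
import torch.nn.functional as F
from torch.nn.attention import SDPBackend, sdpa_kernel

q, k, v = (torch.randn(2, 8, 128, 64, device="cuda", dtype=torch.float16)
           for _ in range(3))

# Old (deprecated): pick a backend by turning all the others off.
with torch.backends.cuda.sdp_kernel(enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)

# New: name the backend you want directly.
with sdpa_kernel(SDPBackend.FLASH_ATTENTION):
    out = F.scaled_dot_product_attention(q, k, v)
```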

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114689
Approved by: https://github.com/cpuhrsch
2024-01-24 22:28:04 +00:00
YuqingJ
d8420c0b0c [Nested Tensor]Add helper functions to set max_seqlen/min_seqlen directly (#117815)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117815
Approved by: https://github.com/soulitzer
2024-01-23 01:00:45 +00:00
YuqingJ
a97d00cca5 [Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)
Support this fallback by converting the jagged layout NT to a strided layout NT, and then converting the result back to a jagged layout NT.
This fallback might not be efficient since it uses unbind, contiguous and split.
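
A heavily hedged sketch of the conversion half of the round trip (illustrative only; the real code also handles lengths and autograd):

```python
import torch

def jagged_to_strided(nt_jagged):
    # Unbind into per-sequence tensors, then rebuild as a strided-layout NT.
    return torch.nested.as_nested_tensor(
        [t.contiguous() for t in nt_jagged.unbind()])
```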

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116445
Approved by: https://github.com/soulitzer
2024-01-12 17:30:40 +00:00
PyTorch MergeBot
9f87760160 Revert "[Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)"
This reverts commit e55a778cbb.

Reverted https://github.com/pytorch/pytorch/pull/116445 on behalf of https://github.com/huydhn due to Sorry for reverting your change, but i see it fails ROCm test in trunk due to an unsupported use case e55a778cbb ([comment](https://github.com/pytorch/pytorch/pull/116445#issuecomment-1888060036))
2024-01-11 22:21:45 +00:00
YuqingJ
e55a778cbb [Nested Tensor]Support SDPA math fallback for jagged layout nested tensor (#116445)
Support this fallback by converting the jagged layout NT to a strided layout NT, and then converting the result back to a jagged layout NT.
This fallback might not be efficient since it uses unbind, contiguous and split.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/116445
Approved by: https://github.com/soulitzer
2024-01-11 20:28:40 +00:00
Joel Schlosser
f70aeb4ffd Fix backward for reshape() on jagged layout NT (#117137)
Provides symbolic C++-side `reshape_as()` / `reshape()` decomps for jagged layout NTs to make the backwards pass work.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117137
Approved by: https://github.com/soulitzer
2024-01-10 23:35:07 +00:00
Joel Schlosser
0b0c76bace Support squeeze.dim for jagged NT (#116891)
As title. Needed for `rev_view_func()` of `unsqueeze()`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116891
Approved by: https://github.com/soulitzer
ghstack dependencies: #115894, #116512
2024-01-06 01:00:53 +00:00
Joel Schlosser
ea3a5f8ddc Add chunk for jagged layout NT (#115842)
Nice to have for the [SDPA tutorial](https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115842
Approved by: https://github.com/soulitzer
ghstack dependencies: #115192, #116111
2023-12-20 20:13:20 +00:00
Joel Schlosser
1474eb5f29 Fix jagged composite impl of flatten() (#115192)
Need to handle this in `NestedTensor.__torch_function__()` since it's CompositeImplicit
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115192
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-12-19 19:15:21 +00:00
Joel Schlosser
bf62511e07 Reshape decomposition for jagged layout NT (#115191)
No more segfault from using `reshape()` on jagged NT :)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115191
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-12-18 22:34:41 +00:00
Joel Schlosser
6fee208064 Handle -1 in jagged layout NT view ops (#115843)
Allows for inheriting the ragged and batch dims via -1:
```python
nt.view(-1, -1, D)
nt.expand(B, -1, D)
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115843
Approved by: https://github.com/soulitzer
ghstack dependencies: #115636
2023-12-15 00:42:47 +00:00
Joel Schlosser
0ff155fb65 Fix SDPA for SAM (#115636)
Addresses the regression for Segment Anything Fast in https://github.com/pytorch-labs/segment-anything-fast/issues/99
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115636
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-12 18:52:38 +00:00
soulitzer
8885128dcc Fix backward for SDPA NT jagged layout (#115576)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115576
Approved by: https://github.com/jbschlosser, https://github.com/ani300
2023-12-12 18:35:40 +00:00
Yuqing Jiang
41b1919208 [nested_tensor]Python subclass NT overhead improvement (2/n): avoid getting from WeakTensorKeyDictionary twice during __init__ (#115450)
Summary:
Most NT operations end with creating a new NestedTensor, which is time-consuming. Trying to reduce overhead during the NestedTensor creation.

The ops return a new NestedTensor with the same offsets, so "tensor not in _tensor_symint_registry" would be false in most cases. The "in" (`__contains__`) check takes ~8 us. If we use "get" directly, we save a few us for most NT operations.
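
In plain-dict terms, the change is from a membership test plus a lookup to a single `.get()` (sketch; `compute_symint` is a hypothetical stand-in):

```python
# Before: `in` (~8 us) followed by a second lookup on the common hit path.
if tensor not in _tensor_symint_registry:
    _tensor_symint_registry[tensor] = compute_symint(tensor)  # rare miss path
symint = _tensor_symint_registry[tensor]

# After: one .get() covers the hit path; only misses pay for the insert.
symint = _tensor_symint_registry.get(tensor)
if symint is None:
    symint = compute_symint(tensor)
    _tensor_symint_registry[tensor] = symint
```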

Test Plan:
Before: get_tensor_symint takes ~15us (https://pxl.cl/3XF83)
After: get_tensor_symint takes ~10us (https://pxl.cl/3XFc9)

Differential Revision: D51992836

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115450
Approved by: https://github.com/soulitzer
2023-12-09 03:12:31 +00:00
Yuqing Jiang
e071d6a9eb [Nested tensor]avoid using shape in python subclass NT, use _size instead (#115371)
Summary:
Calling `tensor.shape` goes through `__torch_dispatch__`, which adds overhead.

Testing overhead difference in "NT + NT" operation:
**Before:**
the add operation takes ~300us
{F1167963824}
**After:**
the add operation takes ~200us
 {F1167964056}

Test Plan: unit tests in test_nestedtensor

Differential Revision: D51949135

Pull Request resolved: https://github.com/pytorch/pytorch/pull/115371
Approved by: https://github.com/soulitzer, https://github.com/jbschlosser
2023-12-08 02:08:36 +00:00
Joel Schlosser
3b01f30b20 Prevent invalid pointwise ops on jagged with transposed ragged dim (#115190)
TODO: tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115190
Approved by: https://github.com/soulitzer, https://github.com/ani300
2023-12-08 00:54:03 +00:00
Joel Schlosser
c99db5617a Introduce general metadata cache to jagged layout NestedTensor (#115212)
Slight refactor to:
* Lazily compute the min / max seq_len used for flash attention. This avoids unnecessary graph breaks / specialization when we're not accessing these.
* Store min / max seq_len in a general `metadata_cache`. Condensing these should make it easier to avoid specializing on these and on others we may add in the future.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/115212
Approved by: https://github.com/soulitzer, https://github.com/ani300
ghstack dependencies: #114311
2023-12-06 19:40:35 +00:00
Joel Schlosser
22704426c3 Expand dynamic dims support for traceable subclasses (#114311)
Continuation of #112185, following the design in this [doc](https://docs.google.com/document/d/1ipSxcTzEMMOAPvxP-YJlD5JBZZmIGgh8Q34ixtOUCRo).

Summary:
* Introduce `SubclassSymbolicPolicy` containing separate dynamic dim / constraint policies for the outer and inner tensors
    * Expand the automatic dynamic algorithm to recurse into inner tensors and produce one of these for a subclass instance
    * Maintain legacy behavior for subclasses by recursively calling `mark_dynamic()` on inner tensors *of the same dim as outer* when `mark_dynamic(outer, ...)` is called
    * Addresses this: 6a86cf00ad/torch/_dynamo/variables/builder.py (L1750)
* Add `outer_size` and `outer_stride` arguments to `__tensor_unflatten__()` so that you can find out what symbols were allocated for the outer size / stride (you are expected to return a tensor that compares equal to the outer symbols)
    * Signatures now:
    ```python
    # attrs is a list of inner tensor attributes on x; inner_tensor = getattr(x, attr)
    # ctx is anything useful for rebuilding the class we want to guard on
    attrs, ctx = x.__tensor_flatten__()
    ...
    # inner_tensors is a dict of {attr -> tensor}
    # ctx is taken unmodified from flattening and (eventually) guarded on
    # outer_size is the expected size of the output; possibly symbolic
    # outer_stride is the expected strides of the output; possibly symbolic
    y = MySubclass.__tensor_unflatten__(inner_tensors, ctx, outer_size, outer_stride)

    # at the __tensor_unflatten__() call-site in PT2, we assert y.shape == outer_size and y.stride() == outer_stride
    # the assert simplifies symbols when there are relationships between outer and inner symbols
    ```
    * Size info needed for `NestedTensor` at least, stride info needed for `DTensor` at least
    * Punting on `outer_storage_offset` because storage_offset handling is horribly broken in PT2 right now
* ~~Add new `__tensor_mark_dynamic__()` to allow overriding the behavior of mark_dynamic on a per-subclass basis~~ (booted to future work)
* ~~Add guards for tensor subclasses by calling `__tensor_flatten__()` in the guard to test equality on `ctx`~~
    * Now handled in #114469
* Next PR: add TENSOR_MATCH guards on inner tensors

Pull Request resolved: https://github.com/pytorch/pytorch/pull/114311
Approved by: https://github.com/ezyang, https://github.com/drisspg, https://github.com/voznesenskym, https://github.com/bdhirsh
2023-12-05 21:09:25 +00:00
Antoni Viros
1dc4588c6a Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114164
Approved by: https://github.com/jbschlosser
2023-12-05 06:33:45 +00:00
PyTorch MergeBot
5cfda9b7f8 Revert "Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)"
This reverts commit aafa8233a4.

Reverted https://github.com/pytorch/pytorch/pull/114164 on behalf of https://github.com/malfet due to Broke ROCM, see aafa8233a4 ([comment](https://github.com/pytorch/pytorch/pull/114164#issuecomment-1839798986))
2023-12-05 00:35:20 +00:00
Antoni Viros
aafa8233a4 Add an SDPA dispatcher for nested tensors with jagged layouts (#114164)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/114164
Approved by: https://github.com/jbschlosser
2023-12-04 21:54:02 +00:00
Joel Schlosser
2a8a7425be Fix to wrap jagged dims for split() / split_with_sizes() (#113591)
Still need OpInfo-style tests to catch things like this.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113591
Approved by: https://github.com/soulitzer
2023-11-14 19:36:08 +00:00
Joel Schlosser
ea39cc34f9 Refactor NestedTensor subclass to remove ragged_size from constructor (#113491)
This PR removes the need for passing `ragged_size` into the `NestedTensor` constructor. This was an artifact of fake-ification, where sometimes we needed the NT to have a symbolic singleton symint shape for the ragged dimension. The new way of achieving this is to also store mappings between fake / functional tensors -> symbolic symints in the ragged structure registry. Now the `NestedTensor` constructor can just query this registry for the `ragged_size`.

Old: `NestedTensor(values, offsets, *, ragged_size=None, **kwargs)`
New: `NestedTensor(values, offsets, **kwargs)`

This makes it possible to have a `_nested_view_from_values_offsets(values, offsets)` without needing to pass a `ragged_size`.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113491
Approved by: https://github.com/ezyang, https://github.com/soulitzer
2023-11-14 19:32:21 +00:00
Antoni Viros
1aece432ba Implement narrow from a regular tensor to jagged tensor (#112770)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112770
Approved by: https://github.com/cpuhrsch
2023-11-13 19:09:59 +00:00
Yuqing Jiang
9f3e378125 [nested tensor]add split and layer_norm_backward operations (#113108)
Summary:
Add split and layer_norm_backward.

Note: it is non-trivial to support the backward for split_with_sizes, so we add the split operation to support the use case in the model.

Test Plan: unit tests

Differential Revision: D51052966

Pull Request resolved: https://github.com/pytorch/pytorch/pull/113108
Approved by: https://github.com/soulitzer
2023-11-08 07:44:35 +00:00
soulitzer
c2084da14a [NT] Backward support for broadcasting binary ops (#112519)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112519
Approved by: https://github.com/jbschlosser
ghstack dependencies: #113031
2023-11-07 00:03:21 +00:00
Peter Bell
718035791d Prefer e.is_number over not e.free_symbols in SymPy (#112688)
We spend somewhere on the order of 1% of time in `sympy.Expr.free_symbols`, as it is called millions of times. Most of the time we actually just want to know "is this a constant?"; however, `e.is_constant()` is horribly slow. It turns out there is another property, `is_number`, that does what we want.

> property is_number:
>
> Returns True if self has no free symbols and no undefined functions (AppliedUndef, to be precise). It will be faster than if not self.free_symbols, however, since is_number will fail as soon as it hits a free symbol or undefined function.

Even further, we also avoid the overhead of building the unnecessary set object.
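
For example (standard SymPy behavior):

```python
import sympy

x = sympy.Symbol("x")

# Both answer "is this expression a constant?", but is_number can
# short-circuit, while free_symbols always builds a set.
print((x + 1).free_symbols)        # {x}
print((x + 1).is_number)           # False
print(sympy.Integer(3).is_number)  # True
```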

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112688
Approved by: https://github.com/lezcano
2023-11-06 20:05:13 +00:00
soulitzer
53fff56ab8 Graph break cleanly for test_nestedtensor (#112662)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112662
Approved by: https://github.com/jbschlosser
2023-11-03 07:20:43 +00:00
Yuqing Jiang
24f217ee64 [Nested tensor] Add more ops in Python subclass nested tensor (#112302)
Summary: Add dropout, split_with_sizes, and silu operations to the Python subclass nested tensor

Test Plan: unit tests

Differential Revision: D50676812

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112302
Approved by: https://github.com/soulitzer, https://github.com/jbschlosser
2023-10-31 20:57:05 +00:00
Joel Schlosser
2225e6361d Support for as_nested_tensor() with jagged layout + fixed nested_tensor() semantics (#112304)
This PR:
* Adds support for the `layout` kwarg to `torch.nested.as_nested_tensor()`
* Fixes `torch.nested.nested_tensor()`
    * It should accept a list of lists of scalars
    * It should not preserve autograd history
* Adds extensive testing for these two functions

Semantics for the two functions follow those of the strided layout:
* `torch.nested.nested_tensor(tensor_list, layout=torch.jagged)`: Creates a new jagged layout NT **with no autograd history**
    * `tensor_list` can be a list of Tensors or list of lists of scalars
* `torch.nested.as_nested_tensor(tensor_list, layout=torch.jagged)`: Creates a new jagged layout NT **preserving autograd history of `tensor_list`**
    * `tensor_list` must be a list of Tensors
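
A hedged usage sketch of the two entrypoints under these semantics:

```python
import torch

a = torch.randn(2, 3, requires_grad=True)
b = torch.randn(4, 3, requires_grad=True)

# Fresh NT with no autograd history (analogous to torch.tensor):
nt = torch.nested.nested_tensor([a, b], layout=torch.jagged)
print(nt.requires_grad)  # False

# NT preserving the autograd history of the inputs (analogous to torch.as_tensor):
nt2 = torch.nested.as_nested_tensor([a, b], layout=torch.jagged)
print(nt2.requires_grad)  # True
```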
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112304
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-10-28 02:34:27 +00:00
Antoni Viros
668c3b3f3b Add embedding op to jagged NT (#112288)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112288
Approved by: https://github.com/cpuhrsch
2023-10-28 01:29:17 +00:00
soulitzer
73170b23d4 Add compile support for NT unbind (#111531)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111531
Approved by: https://github.com/ezyang
2023-10-23 21:16:20 +00:00
Joel Schlosser
ba2ba9621c More NT subclass op support for SAM (#111253)
With this PR, we have full op support for SAM without needing to manually unwrap the subclass into a jagged buffer -> run ops -> rewrap. Specifically, this was previously happening in the MaskDecoder.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111253
Approved by: https://github.com/soulitzer, https://github.com/cpuhrsch
2023-10-18 21:21:28 +00:00
soulitzer
2dc1726ab7 Compile NestedTensor with AOTAutograd (#110529)
This PR has a number of changes that improve subclass support for AOTAutograd/Inductor in general:
- Previously, if a subclass did extra aliasing between graph outputs/inputs in a certain way, the partitioner would complain because grad_outputs are the outputs reused as-is. Now we do a `view_as(self)` to work around this.
- Use dense -> dense metadata when working with fwd_output_strides during backward. This is important since the stride information comes from Inductor, which sees the dense-to-dense graph.
- Inductor requires that the inputs to the compiled backward match some expected strides computed during compilation. We now make the inner tensors of the subclass contiguous (previously, we only made the subclass itself contiguous).

Changes specific to NestedTensor relevant to compilation:
- Properly handle the case where `__tensor_unflatten__` is passed non-symbolic dense tensors and with meta extracted from fake subclasses.
- Skip var_to_range logic for singleton int
- Skip size hint logic in inductor for singleton int

Pull Request resolved: https://github.com/pytorch/pytorch/pull/110529
Approved by: https://github.com/bdhirsh
2023-10-17 21:17:10 +00:00
Jesse Cai
4c01686027 Public API for constructing NT with jagged layout from tensor list (#111078)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/111078
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
ghstack dependencies: #109123
2023-10-13 03:27:41 +00:00
Joel Schlosser
8f90be4478 Expand NT subclass to support SAM (#109123)
This PR contains the changes needed to support using the NT jagged subclass within SAM. Note that a NT with multiple ragged dims is still required at the extremes for inputs / outputs, but the internal computation generally involves a single ragged dim, making the jagged layout usable.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109123
Approved by: https://github.com/cpuhrsch, https://github.com/soulitzer
2023-10-12 20:33:22 +00:00
soulitzer
110382bacf Make NestedTensor compilable with eager backend (#109171)
In this PR:
- Adds support for strides for jagged tensor (design doc for this coming soon)
- NestedTensor skips automatic dynamic
- Make use of @bdhirsh's subclass fakification logic by adding the __tensor_{un,}flatten__ functions.
- Additional logic for fakification: since the existing subclass fakification logic does not handle the case where the outer tensor has an additional dimension, we insert one-off logic to (1) insert an extra SingletonSymInt onto the fakified NestedTensor and (2) make sure we call track_symint on the sizes of both the inner and outer tensors during guard creation.

Remaining things that are weird:
- Still need to skip some logic in meta utils for some reason (I was going to write this up more, but decided not to since we're not able to do this anyway for an immediate reason: we cannot arbitrarily compare singleton ints. For now I'm just following Brian's advice from [here](https://github.com/pytorch/pytorch/pull/109171#discussion_r1328137070))

Pull Request resolved: https://github.com/pytorch/pytorch/pull/109171
Approved by: https://github.com/ezyang, https://github.com/bdhirsh
2023-10-11 04:47:10 +00:00
soulitzer
fda0a965c7 [reland] Support SingletonSymNode mul with coefficient (#110673)
reland of https://github.com/pytorch/pytorch/pull/110369
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110673
Approved by: https://github.com/ezyang
2023-10-10 19:37:17 +00:00
PyTorch MergeBot
1c3fae46ee Revert "Support SingletonSymNode mul with coefficient (#110369)"
This reverts commit eb8feb8ff8.

Reverted https://github.com/pytorch/pytorch/pull/110369 on behalf of https://github.com/PaliC due to bottom diff is causing a plethora of internal failures ([comment](https://github.com/pytorch/pytorch/pull/110369#issuecomment-1749802899))
2023-10-05 23:51:28 +00:00
soulitzer
eb8feb8ff8 Support SingletonSymNode mul with coefficient (#110369)
We want to be able to use SingletonSymNode to represent strides for Jagged layout tensor. The following is for 3D, but easily generalizable to higher dimensions.

Constraints:
- [B, x, D] (where x represents the "variably lengthed dim") can be strided in two ways [x, 1, sum(x)] and [dx, d, 1]. We need two different placeholder values depending on how the jagged tensor is strided.
- When doing operations we need the strides of output tensors to be expressable in terms of the strides and sizes of the inner tensors. Given [B, x, D] @ [D, D'], the output strides is [x * D', D', 1] rather than some opaque [x2, D', 1]. This constraint exists because if I'm tracing, I need a symint to represent the output stride. This symint needs to come from somewhere; I get it in several ways: (1) create a constant, (2) unbacked symint, (3) create a new input using a source, (4) output of an operation on an existing symint. It is clear that (4) is what we want here, which brings us to the design below.

Design:

Given the two constraints, the most straightforward way to implement this is to update SingletonSymNode to include a scalar factor, i.e. morally, SingletonSymNode represents `factor * [s_0, s_1, …, s_n]`. This enables us to symbolically compute strides from sizes.
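
A toy Python sketch of the idea (the real node lives in C++; names here are illustrative):

```python
class FactoredSingleton:
    """Morally represents factor * [s_0, s_1, ..., s_n]."""

    def __init__(self, sym_id, factor=1):
        self.sym_id, self.factor = sym_id, factor

    def __mul__(self, coeff):
        # Multiplying by a plain int just scales the factor, so an output
        # stride like x * D' stays expressible from existing symints.
        return FactoredSingleton(self.sym_id, self.factor * coeff)

    def __eq__(self, other):
        return (isinstance(other, FactoredSingleton)
                and self.sym_id == other.sym_id
                and self.factor == other.factor)
```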
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110369
Approved by: https://github.com/ezyang
ghstack dependencies: #110044
2023-10-04 22:56:15 +00:00
soulitzer
2bcff92540 Add NestedTensor python subclass (#108314)
Description coming soon

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108314
Approved by: https://github.com/jbschlosser
ghstack dependencies: #108808
2023-09-11 18:29:20 +00:00
Aaron Gokaslan
e2a3817dfd [BE] Enable C419 rule for any all shortcircuiting (#99890)
Apparently https://github.com/pytorch/pytorch/pull/78142 made TorchScript allow simple generator expressions, which lets us enable rules that replace unnecessary list comprehensions with generators in any/all. This was originally part of #99280, but I split it off into this PR so that it can be easily reverted should anything break.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/99890
Approved by: https://github.com/justinchuby, https://github.com/kit1980, https://github.com/malfet
2023-04-25 15:02:13 +00:00
rusty1s
19706356b5 Fix TorchScript support in as_nested_tensor (#97960)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97960
Approved by: https://github.com/cpuhrsch
2023-03-30 18:55:26 +00:00
Mikayla Gawarecki
ec61951f07 Fix inaccuracy in nt constructor documentation + broken rendering (#89152)
Rendering was broken and docstring seemed to be inaccurate

![Screen Shot 2022-11-16 at 2 16 28 PM](https://user-images.githubusercontent.com/35276741/202273588-a2da5b7b-1a6d-46bb-a74e-c0de9a0fd064.png)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/89152
Approved by: https://github.com/cpuhrsch
2022-11-16 22:32:46 +00:00
Antoni Viros i Martin
c77368d416 Implement a constructor for nested_tensor that is similar to torch.tensor() (#88213)
Summary: This diff merges both previous implementations of constructors for nested tensors, the one from lists of tensors and the one from arbitrary Python lists, and implements it in PyTorch core so no extensions are needed to construct NTs.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88213
Approved by: https://github.com/cpuhrsch
2022-11-08 00:03:18 +00:00
Frankie Robertson
afc9963865 Fix path to nested_tensor in example (#86891)
This appears to be a typo.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86891
Approved by: https://github.com/H-Huang
2022-10-13 17:42:32 +00:00
Mikayla Gawarecki
a77f2a95a7 Improve NestedTensor documentation (#85186)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85186
Approved by: https://github.com/cpuhrsch
2022-10-12 22:03:04 +00:00
Mikayla Gawarecki
afaee00fec Add python nested_tensor and as_nested_tensor constructors in torch.nested (#85593)
Remove `torch.nested_tensor`, which has erroneous behavior wrt gradients (could be either leaf or not leaf). Introduce `torch.nested.nested_tensor` and `torch.nested.as_nested_tensor` in the vein of `torch.tensor` and `torch.as_tensor`. Done in nested `__init__.py` for now but can move to pybind in the future (when we want to load from numpy/nested lists).

Discussed offline with @cpuhrsch and pybind constructor (https://github.com/pytorch/pytorch/pull/85536) was more gnarly than expected, so we can move to that when we do need loading from numpy etc.

Differential Revision: [D39806622](https://our.internmc.facebook.com/intern/diff/D39806622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85593
Approved by: https://github.com/drisspg, https://github.com/cpuhrsch
2022-09-28 20:15:02 +00:00
Mikayla Gawarecki
e217b30b0f Add torch.nested namespace (#84102)
First step towards #83775
- only `to_padded_tensor` is moved to the nested namespace for now
- following the schema used for `special`, `fft`, `linalg` and other namespaces, nested functions are registered in native_functions.yaml as `nested_{function_name}` and are bound to the desired Python name in `torch/nested/__init__.py`, and the desired C++ name in `torch/csrc/api/include/torch/nested.h`.

~~**Question**: should we keep the documentation for `Tensor.to_padded_tensor` or can this deleted since it is shared by `torch.nested.to_padded_tensor`?~~

[generated nested docs](https://docs-preview.pytorch.org/84102/nested.html?highlight=nested#module-torch.nested)

Differential Revision: [D39361148](https://our.internmc.facebook.com/intern/diff/D39361148)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/84102
Approved by: https://github.com/drisspg
2022-09-12 16:31:05 +00:00
Scott Wolchok
f9d0bc5338 [PyTorch] Delete NestedTensor Python wrapper (#74691)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74691

The wrapper just called through to methods on the underlying Tensor.
ghstack-source-id: 152433754

Test Plan: existing tests

Reviewed By: ezyang

Differential Revision: D34689789

fbshipit-source-id: cf53476780cf3ed00a3aa4add441300bfe8e27ce
(cherry picked from commit 5a9e5eb6bc13eb30be6e3c3bc4ac954c92704198)
2022-03-29 19:13:40 +00:00
Scott Wolchok
f7fe75c766 [PyTorch] Call self._impl.unbind directly in _nestedtensor wrapper (#74001)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74001

Incremental step toward removing this wrapper.
ghstack-source-id: 151540965

Test Plan: existing test_nestedtensor

Reviewed By: ezyang

Differential Revision: D34689382

fbshipit-source-id: 46eb203d4fc9ccc30008c7dc3ada24cab47b57d4
(cherry picked from commit 30e204362ca6a3da05e6d99726489879c9253a0e)
2022-03-17 18:04:50 +00:00
Scott Wolchok
79c3bbceb2 [PyTorch] Move NestedTensor printing to _tensor_str.py (#74000)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74000

Now that we're in-core, we can just customize this.
ghstack-source-id: 151540966

Test Plan: Existing test_nestedtensor seems to pass

Reviewed By: ezyang

Differential Revision: D34665270

fbshipit-source-id: 5097944a4dc4fe80cea2b8576f0123466dbeab43
(cherry picked from commit d0315f46f9906c904639f43f218e439407f5b2a7)
2022-03-17 18:04:50 +00:00
Scott Wolchok
88b19da49a [PyTorch] Make NestedTensor::dim() work (#73679)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73679

We can update the TensorImpl state used to track dim() just fine.

I'm not sure if this is sustainable; do we *want* callers to be able to muck with nested_size_tensor_ directly?
ghstack-source-id: 150349610

Test Plan: Updated test_nestedtensor.

Reviewed By: cpuhrsch

Differential Revision: D34570523

fbshipit-source-id: 739555d63226f925d6a502c9c742ce5f431cb6cc
(cherry picked from commit 1bb188162f3639f26a6204ad5d40f73e4c664a6d)
2022-03-04 19:35:41 +00:00
Christian Puhrsch
484c0de670 Minimal NestedTensor (#72881)
Summary:
This PR adds a minimal version of a NestedTensor. It introduces the general harness that future development can be built around.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/72881

Reviewed By: albanD

Differential Revision: D34259177

Pulled By: cpuhrsch

fbshipit-source-id: 0245c36f603424e20f3b09651043c207f526d760
(cherry picked from commit 10764e8d427f29b364567e4cbc86ed73c3933158)
2022-03-02 16:31:51 +00:00