# Summary
This change fixes a bug encountered while trying to add more backward formulas for nested tensor ops. If a derivative is defined that stores the "result" for use in the backward pass, the output of the forward op is saved using:
```
if (grad_fn) {
grad_fn->result_ = SavedVariable(result, true);
}
```
SavedVariable calls a series of functions that eventually call shallow_copy_and_detach; when c179597753/c10/core/TensorImpl.cpp (L533) is hit, it calls sizes_custom(), which is not implemented for nested tensors and errors out. I also noticed that, since the storage format of a nested tensor is different (not `storage_` but two tensors), we should actually be calling the NestedTensorImpl constructor.
This PR overrides shallow_copy_and_detach in the derived class and ensures that the shallow copy works correctly.
## Update
- Added the softmax derivative in this PR because it is a direct use case that was blocked by shallow_copy_and_detach not working correctly.
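For context, softmax is exactly the kind of derivative that needs the saved forward output: the backward can be written entirely in terms of the softmax result. A minimal sketch with plain (non-nested) tensors, just to show the formula that `result_` feeds:

```python
import torch

def softmax_backward(grad_output, result, dim):
    # grad_input = result * (grad_output - sum(grad_output * result, dim))
    return result * (grad_output - (grad_output * result).sum(dim, keepdim=True))

x = torch.randn(2, 5, requires_grad=True)
y = torch.softmax(x, dim=-1)
g = torch.randn_like(y)
y.backward(g)

manual = softmax_backward(g, y.detach(), dim=-1)
print(torch.allclose(x.grad, manual))  # True
```

The nested version of this derivative needs the same saved result, which is why it was blocked on the shallow-copy fix.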
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81838
Approved by: https://github.com/soulitzer
Nested tensor used to assume that the buffer memory is contiguous. However, some operations can break that assumption:
* reshape
* transpose
* slice
To be able to access the underlying tensors from a discontiguous buffer, we need three pieces of metadata:
* sizes of each tensor (`nested_size_tensor_`)
* strides of each tensor (`nested_stride_tensor_`)
* offset of each tensor (`offsets_`)
so we access each tensor via `buffer.as_strided(size, stride, offset)`.
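A small standalone sketch of that access pattern with plain tensors (made-up sizes/strides/offsets, not the real `NestedTensorImpl` internals):

```python
import torch

# A flat buffer holding two tensors of shapes (2, 3) and (4, 3), stored back to back.
buffer = torch.arange(18.)
sizes   = [(2, 3), (4, 3)]
strides = [(3, 1), (3, 1)]   # contiguous row-major strides for each constituent
offsets = [0, 6]             # element offset of each constituent into the buffer

tensors = [
    buffer.as_strided(size, stride, offset)
    for size, stride, offset in zip(sizes, strides, offsets)
]

# A transpose only has to change the per-tensor strides, never move the buffer,
# which is how the nested tensor can become discontiguous with respect to it.
transposed_first = buffer.as_strided((3, 2), (1, 3), 0)
print(torch.equal(transposed_first, tensors[0].t()))  # True
```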
This pull request introduces the offsets metadata, then adds reshape and transpose so that we can create discontiguous cases for testing. Unbind, select, dropout, softmax, and bmm are refactored to provide tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80981
Approved by: https://github.com/jbschlosser
Add a numel implementation for NestedTensor. Currently the construction of nested sizes and nested strides assumes contiguity. This implementation is based on safe_compute_numel(). Having a TORCH_CHECK in a for loop feels bad, but I don't really know how performant numel needs to be.
Since the nested size is stored as a tensor, `nested_size_tensor().cumprod(dim=1).sum(dim=0)[1].item()` would also get the job done.
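A quick standalone check of both routes on illustrative sizes (the `[1]` index in the one-liner works because each constituent here is 2-D, so the last cumprod column is the per-tensor numel):

```python
import torch

# Nested sizes for three 2-D constituents with shapes (2, 3), (4, 3), (1, 3).
nested_size = torch.tensor([[2, 3], [4, 3], [1, 3]])

# Loop-based version in the spirit of safe_compute_numel (the overflow
# TORCH_CHECK would sit inside this loop in the C++ implementation).
numel = 0
for size in nested_size.tolist():
    n = 1
    for d in size:
        n *= d
    numel += n

# One-liner from the description: last cumprod column per row, summed over rows.
assert numel == nested_size.cumprod(dim=1).sum(dim=0)[1].item()  # both give 21
```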
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80424
Approved by: https://github.com/cpuhrsch
This allows subclasses such as NestedTensorImpl to provide special behavior for `int64_t size(int64_t d)` that'll also be accessible by our Python frontend.
It follows the same pattern as sizes_custom.
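A rough sketch of what this enables on the Python side, assuming (as with today's nested tensors) that the batch dimension reports its size while ragged dimensions raise; the constructor shown is the current `torch.nested` API, used here only for illustration:

```python
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])

print(nt.size(0))  # 2: the batch dimension (number of constituent tensors)
# nt.size(1) raises, since dimension 1 is ragged across the constituents (2 vs. 4)
```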
Currently waiting for CI before asking for a review.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80236
Approved by: https://github.com/ezyang
In the transformer, the scale step in attention involves a `nested_tensor / scalar` operation. There are two ways to support it (see the sketch after this list):
1. directly support `nested_tensor / scalar`:
* pro: straightforward, good UX
* con: is dispatching `mul(nested tensor, regular tensor)` a good practice?
2. let user manually convert `scalar` to `nested_scalar = torch.nested_tensor([broadcast_scalar])`
* pro: dispatcher only has to deal with `mul(nested tensor, nested tensor)`
* con: confusing manual conversions, bad UX
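A sketch of the two options from the user's side, written against today's `torch.nested` namespace rather than the prototype constructor mentioned above, and assuming the scalar op is registered:

```python
import torch

# Attention scores for two variable-length sequences, as a nested tensor.
scores = torch.nested.nested_tensor([torch.randn(3, 3), torch.randn(5, 5)])
scale = 8.0

# Option 1: dispatch nested_tensor / scalar directly -- what the scale step wants to write.
scaled = scores / scale

# Option 2: make the user wrap the scalar first, so the kernel only ever sees
# nested / nested; this is the manual conversion the description calls confusing.
nested_scale = torch.nested.nested_tensor([torch.tensor([scale])])
```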
Pull Request resolved: https://github.com/pytorch/pytorch/pull/80284
Approved by: https://github.com/cpuhrsch
Two reasons to add the `nested_stride` metadata:
1. it will be used later in `reshape` and `transpose`
2. it reduces the computation needed to get the offsets and shapes required by `unbind`-like code, which will be used again and again in nested tensor operations
`unbind` and `select` are refactored to make use of `nested_stride`.
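For intuition, here is a hypothetical Python helper showing the per-tensor strides and offsets that `unbind`-like code would otherwise have to recompute on every call (the real logic lives in C++ inside NestedTensorImpl):

```python
import torch

def contiguous_strides_and_offsets(nested_size):
    """Derive row-major strides and buffer offsets for each constituent tensor."""
    strides, offsets, offset = [], [], 0
    for size in nested_size.tolist():
        stride = [1] * len(size)
        for i in range(len(size) - 2, -1, -1):
            stride[i] = stride[i + 1] * size[i + 1]
        strides.append(stride)
        offsets.append(offset)
        offset += stride[0] * size[0] if size else 1  # advance by this tensor's numel
    return strides, offsets

print(contiguous_strides_and_offsets(torch.tensor([[2, 3], [4, 3]])))
# ([[3, 1], [3, 1]], [0, 6])
```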
Pull Request resolved: https://github.com/pytorch/pytorch/pull/79831
Approved by: https://github.com/cpuhrsch, https://github.com/jbschlosser
This PR adds support for `SymInt`s in Python. Namely,
* `THPVariable_size` now returns `sym_sizes()`
* the Python arg parser is modified to parse PyObjects into ints and `SymbolicIntNode`s
* pybind11 bindings for `SymbolicIntNode` are added, so size expressions can be traced
* a large number of tests are added to demonstrate how to implement Python symints.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/78135
Approved by: https://github.com/ezyang
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76640
- Added an output_size argument to to_padded_tensor
- Modified add_padding_kernelLauncher and kernels to iterate over the padded tensor batch size instead of the nested tensor batch size
- No fast path for the CPU version
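Illustrative usage of the new argument, shown with today's `torch.nested` namespace (the PR itself targeted the earlier prototype entry point):

```python
import torch

nt = torch.nested.nested_tensor([torch.ones(2, 3), torch.ones(4, 3)])

# Default: pad each constituent up to the largest one -> shape (2, 4, 3).
padded = torch.nested.to_padded_tensor(nt, 0.0)

# With output_size, pad out to a requested overall shape, e.g. (2, 6, 3).
padded_big = torch.nested.to_padded_tensor(nt, 0.0, output_size=[2, 6, 3])
print(padded.shape, padded_big.shape)
```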
Test Plan:
buck test mode/dev-nosan //caffe2/test:nested
Performance test using N1763981:
{F728168808}
Reviewed By: cpuhrsch
Differential Revision: D36056902
fbshipit-source-id: d6df2939d6649128a7f43a2ef32d227870a8e583
(cherry picked from commit 09465f36f09d4d74c9b3303981d8cce0c7c1092a)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75808
Just as it is often difficult to write a single kernel that can handle both CPU and CUDA, it can be difficult to do the same for NestedTensor.
ghstack-source-id: 154171542
(Note: this ignores all push blocking failures!)
Test Plan: CI?
Reviewed By: bdhirsh
Differential Revision: D35603836
fbshipit-source-id: fb0ebb19d34531ed96ce176aca325f8e2b5f90e6
(cherry picked from commit 0bcd753f93c04256c1b745f84a74ecccf0dceef5)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73817
NestedTensorImpl doesn't have sizes(). Silently getting wrong results back from it is not conducive to efficient software development, so make it throw while still allowing sizes() to be inlined in the common case, just like is_contiguous(). Thanks ezyang for the reminder that we could do this.
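On the Python side the intended behavior looks roughly like this, assuming (as the change describes) that the full sizes() query throws rather than returning garbage:

```python
import torch

nt = torch.nested.nested_tensor([torch.randn(2, 3), torch.randn(4, 3)])
try:
    nt.size()  # full sizes are not well defined for ragged constituents
except RuntimeError as err:
    print("raises:", err)
```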
ghstack-source-id: 151302903
Test Plan: Updated test_nestedtensor.py
Reviewed By: ezyang
Differential Revision: D34660829
fbshipit-source-id: 1289f21127d6a8359893f9174f3c430a290f2c7f
(cherry picked from commit 7098b9fcfbd25a03bac19e1148426ff073810edd)
Summary:
This PR adds a minimal version of a NestedTensor. It introduces the general harness that future development can be built around.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72881
Reviewed By: albanD
Differential Revision: D34259177
Pulled By: cpuhrsch
fbshipit-source-id: 0245c36f603424e20f3b09651043c207f526d760
(cherry picked from commit 10764e8d427f29b364567e4cbc86ed73c3933158)